ADA Research Group

MultiETSC: Automated Machine Learning for Early Time Series Classification

Abstract

Early time series classification (EarlyTSC) involves the prediction of a class label based on partial observation of a given time series. In time-critical applications where data are observed over time, valuable time can often be saved at the cost of minor decreases in classification accuracy. Since accuracy and earliness are competing objectives, EarlyTSC algorithms must address this trade-off by deciding when enough data has been observed to produce a sufficiently reliable early classification, or simply: when to trigger the classification mechanism. Many algorithms have been proposed for this problem, using varying strategies for the classification and triggering mechanisms. Finding an optimal model or algorithm along with hyper-parameter settings for a given machine learning task has been the focus of a fast-moving research area known as automated machine learning (AutoML). In this thesis, we propose MultiETSC, an AutoML approach to EarlyTSC. This poses the challenge of optimising two conflicting objectives. To address it, we introduce an approach we dub multi-objective combined algorithm selection and hyper-parameter optimisation (MO-CASH). We compare our approach to hyper-parameter optimisation (HPO) on individual EarlyTSC algorithms, based on their dominated hypervolume, on 115 real-world and synthetic data sets from the UCR Time Series Classification Archive. We show that MultiETSC achieves a higher hypervolume than HPO on any of the individual algorithms in 43% of all cases, while the best single algorithm achieves the highest hypervolume only 29% of the time. Additionally, we demonstrate that MultiETSC outperforms a conceptually simpler single-objective (SO) optimisation approach, with the multi-objective (MO) approach achieving a higher hypervolume than the SO approach 85% of the time. Finally, we find that only the most recently proposed algorithms are able to outperform a naive fixed-time nearest neighbour approach.
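The dominated hypervolume used as the comparison metric above can be illustrated with a short Python sketch. This is not taken from the MultiETSC code base. It assumes that each configuration is scored by two values that are both minimised and scaled to [0, 1] (here called earliness and error rate) and that the reference point is (1, 1); the function names and the example scores are hypothetical.

```python
from typing import List, Tuple

# Assumption: each point is (earliness, error_rate), both minimised, both in [0, 1].
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by increasing earliness."""
    front: List[Point] = []
    best_error = float("inf")
    # Sorting by earliness (then error) lets a single sweep find the front:
    # a point is kept only if it improves on the lowest error seen so far.
    for earliness, error in sorted(set(points)):
        if error < best_error:
            front.append((earliness, error))
            best_error = error
    return front


def dominated_hypervolume(points: List[Point],
                          ref: Point = (1.0, 1.0)) -> float:
    """Area dominated by the points and bounded by the reference point."""
    # Points that do not improve on the reference point contribute nothing.
    front = [p for p in pareto_front(points) if p[0] < ref[0] and p[1] < ref[1]]
    volume = 0.0
    prev_error = ref[1]
    for earliness, error in front:
        # Each front point adds a rectangle below the previous error level.
        volume += (ref[0] - earliness) * (prev_error - error)
        prev_error = error
    return volume


if __name__ == "__main__":
    # Hypothetical scores for three configurations; the third is dominated.
    scores = [(0.2, 0.5), (0.5, 0.2), (0.6, 0.6)]
    print(pareto_front(scores))
    print(dominated_hypervolume(scores))
```

Under these assumptions, a larger hypervolume means that the set of configurations covers a better range of earliness and accuracy trade-offs, which is why it can serve to compare whole sets of solutions rather than single configurations.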

People

Software

MultiETSC is published on GitHub. Alternatively, you can download this snapshot (Feb 2021): [MultiETSC.zip, 26 MB]

Data

For the experimental evaluation of MultiETSC we used the UCR Time Series Classification Archive.

License

MultiETSC has been made available under the GPL-3.0 License.

Papers