MultiETSC: Automated Machine Learning for Early Time Series Classification

Abstract

Early time series classification (EarlyTSC) involves the prediction of a class label based on partial observation of a given time series. Most EarlyTSC algorithms consider the trade-off between accuracy and earliness as two competing objectives, using a single dedicated hyperparameter. Obtaining insights into this trade-off requires finding a set of non-dominated (Pareto-efficient) classifiers. So far, this has been approached through manual hyperparameter tuning. Since the trade-off hyperparameters only provide indirect control over the earliness-accuracy trade-off, manual tuning is tedious and tends to result in many sub-optimal hyperparameter settings. This complicates the search for optimal hyperparameter settings and forms a hurdle for the application of EarlyTSC to real-world problems. To address these issues, we propose an automated approach to hyperparameter tuning and algorithm selection for EarlyTSC, building on developments in the fast-moving research area known as automated machine learning (AutoML). To deal with the challenging task of optimising two conflicting objectives in early time series classification, we propose MultiETSC, a system for multi-objective algorithm selection and hyperparameter optimisation (MO-CASH) for EarlyTSC. MultiETSC can potentially leverage any existing or future EarlyTSC algorithm and produces a set of Pareto-optimal algorithm configurations from which a user can choose a posteriori. As an additional benefit, our proposed framework can incorporate and leverage time series classification algorithms not originally designed for EarlyTSC to improve performance on EarlyTSC; we demonstrate this property using a newly defined, "naïve" fixed-time algorithm. In an extensive empirical evaluation of our new approach on a benchmark of 115 data sets, we show that MultiETSC performs substantially better than baseline methods, ranking highest (avg. rank 1.98) compared to conceptually simpler single-algorithm (2.98) and single-objective alternatives (4.36).
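
To make the notion of a non-dominated (Pareto-efficient) set concrete, the following is a minimal sketch and not code from MultiETSC itself. It assumes each algorithm configuration has already been scored as a pair (earliness, error rate) in which lower is better for both values; the names `dominates`, `pareto_front` and the example scores are hypothetical.

```python
from typing import List, Tuple

# Assumption: each configuration is summarised as (earliness, error_rate),
# and both objectives are to be minimised.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by earliness."""
    front = [p for p in points if not any(dominates(q, p) for q in points)]
    return sorted(set(front))


# Hypothetical scores for five configurations (illustrative values only).
scores = [(0.2, 0.30), (0.4, 0.15), (0.5, 0.20), (0.8, 0.05), (0.9, 0.05)]
front = pareto_front(scores)
print(front)
```

In this toy example, (0.5, 0.20) is dominated by (0.4, 0.15) and (0.9, 0.05) by (0.8, 0.05), so three configurations remain for the user to choose from a posteriori. The quadratic scan is adequate for small sets; it is a design choice for readability, not efficiency.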

People

Software

MultiETSC is published on GitHub. Alternatively, you can download this snapshot (Feb 2021): [MultiETSC.zip, 26MB]

Data

For the experimental evaluation of MultiETSC we used the UCR Time Series Classification Archive.
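
Since archive data sets contain complete series, partial observation can be emulated by revealing only a prefix of each series. The sketch below illustrates this idea with a simple fixed-time classifier; it is an assumption-laden illustration, not the "naïve" fixed-time algorithm defined in the paper. It assumes equal-length series, uses 1-nearest-neighbour with Euclidean distance as a stand-in classifier, and the names `truncate` and `fixed_time_1nn` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def truncate(series: Sequence[float], fraction: float) -> List[float]:
    """Keep the first `fraction` of the series (always at least one point)."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    n = max(1, math.ceil(fraction * len(series)))
    return list(series[:n])


def fixed_time_1nn(
    train: List[Tuple[str, Sequence[float]]],
    query: Sequence[float],
    fraction: float,
) -> str:
    """Classify `query` from its prefix only, using 1-NN on equally truncated training series.

    Assumption: all series have the same length, so prefixes align point by point.
    """
    prefix = truncate(query, fraction)
    best_label, best_dist = "", math.inf
    for label, series in train:
        ref = truncate(series, fraction)
        # Squared Euclidean distance between the two prefixes.
        dist = sum((x - y) ** 2 for x, y in zip(prefix, ref))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy data (illustrative only): one rising and one falling series.
train = [("up", [0.0, 1.0, 2.0, 3.0]), ("down", [3.0, 2.0, 1.0, 0.0])]
query = [0.1, 0.9, 2.2, 2.8]
label = fixed_time_1nn(train, query, fraction=0.5)
print(label)
```

Here the `fraction` parameter plays the role of earliness: a smaller value yields an earlier decision based on less information, which is the trade-off against accuracy described in the abstract.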

License

MultiETSC has been made available under the GPL-3.0 License.

Papers

Gilles Ottervanger (supervisors: Can Wang, MSc & Dr. Mitra Baratchi & Prof.dr. Holger H. Hoos)
MultiETSC: Automated Machine Learning for Early Time Series Classification
Master's Thesis in Computer Science at Leiden Institute of Advanced Computer Science, Leiden University, 2021.
[ PDF ∙ BibTeX ]

Gilles Ottervanger, Mitra Baratchi and Holger H. Hoos.
MultiETSC: Automated Machine Learning for Early Time Series Classification.
Accepted for publication in Data Mining and Knowledge Discovery (25 manuscript pages), 2021, to appear.