Parallel Classification Method of Shapelet for Large-scale Time Series

  • Cao Yang, China University of Mining and Technology, China

Abstract

Time series shapelets are subsequences of a time series that maximally represent a class. Shapelet-based time series classification algorithms generally offer high accuracy and good interpretability, and the quality of the shapelet set is key to their performance. However, computing shapelets requires traversing a large number of candidates, and the resulting high time complexity makes traditional methods difficult to apply to large-scale data sets. To reduce the time cost of the shapelet algorithm on large-scale data sets while preserving its accuracy, this paper proposes a parallel variant of the traditional shapelet algorithm. A combination of clustering and sampling divides the large time series data set into several small samples, and the candidate shapelet sets of these samples are computed in parallel. A merge algorithm then evaluates the candidates against the original data set to select the most discriminative shapelets. Fifteen large-scale UCR data sets were selected to verify the algorithm. Comparison experiments show that the method greatly reduces training time on most of these data sets and effectively improves the classification accuracy of shapelet-based time series classification.
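The three stages named in the abstract (partition the data, compute candidates in parallel, merge against the original data set) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the clustering-and-sampling step is replaced here by a plain stratified shuffle, candidates are scored by information gain (a common shapelet quality measure, assumed rather than taken from the paper), and every function and parameter name (`stratified_partitions`, `local_k`, `final_k`, and so on) is hypothetical. A thread pool is used for portability; for CPU-bound work a process pool would be the usual substitute.

```python
from concurrent.futures import ThreadPoolExecutor
import math
import random


def subsequence_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any window of the series."""
    m = len(shapelet)
    best = math.inf
    for start in range(len(series) - m + 1):
        d = sum((series[start + i] - shapelet[i]) ** 2 for i in range(m))
        best = min(best, d)
    return math.sqrt(best)


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def information_gain(candidate, data):
    """Best information gain over all distance thresholds for one candidate."""
    scored = sorted(((subsequence_distance(x, candidate), y) for x, y in data),
                    key=lambda pair: pair[0])
    labels = [y for _, y in scored]
    base, best = entropy(labels), 0.0
    for k in range(1, len(labels)):
        if scored[k - 1][0] == scored[k][0]:
            continue  # no threshold can separate equal distances
        left, right = labels[:k], labels[k:]
        split = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        best = max(best, base - split)
    return best


def stratified_partitions(data, n_parts, seed=0):
    """Stand-in for clustering + sampling: deal each class round-robin into parts."""
    rng = random.Random(seed)
    by_class = {}
    for item in data:
        by_class.setdefault(item[1], []).append(item)
    parts = [[] for _ in range(n_parts)]
    for items in by_class.values():
        rng.shuffle(items)
        for i, item in enumerate(items):
            parts[i % n_parts].append(item)
    return parts


def candidates_from_sample(sample, length, local_k):
    """Keep the local_k windows of one small sample with the highest local gain."""
    windows = (tuple(x[s:s + length]) for x, _ in sample
               for s in range(len(x) - length + 1))
    unique = list(dict.fromkeys(windows))  # de-duplicate, keep first-seen order
    unique.sort(key=lambda c: information_gain(c, sample), reverse=True)
    return unique[:local_k]


def parallel_shapelets(data, length, n_parts=2, local_k=5, final_k=3, seed=0):
    """Partition, mine candidates concurrently, then re-rank on the full data set."""
    parts = stratified_partitions(data, n_parts, seed)
    with ThreadPoolExecutor() as pool:
        per_part = pool.map(lambda p: candidates_from_sample(p, length, local_k), parts)
        merged = list(dict.fromkeys(c for cands in per_part for c in cands))
    # Merge step: score every surviving candidate against the original data.
    merged.sort(key=lambda c: information_gain(c, data), reverse=True)
    return merged[:final_k]
```

Here `data` is assumed to be a list of `(series, label)` pairs. The expensive candidate search only ever sees a small sample, and the full data set is touched once, in the final re-ranking of the merged candidates.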

 

Published
2020-05-14
How to Cite
Yang, C. (2020). Parallel Classification Method of Shapelet for Large-scale Time Series. IJRDO - Journal of Computer Science Engineering, 6(4), 01-15. https://doi.org/10.53555/cse.v6i4.3344