{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,16]],"date-time":"2025-04-16T06:00:42Z","timestamp":1744783242520,"version":"3.37.3"},"reference-count":42,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,11,2]],"date-time":"2021-11-02T00:00:00Z","timestamp":1635811200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,11,2]],"date-time":"2021-11-02T00:00:00Z","timestamp":1635811200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100003329","name":"Ministerio de Econom\u00eda y Competitividad","doi-asserted-by":"publisher","award":["TIN2016-81113-R"],"award-info":[{"award-number":["TIN2016-81113-R"]}],"id":[{"id":"10.13039\/501100003329","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002878","name":"Consejer\u00eda de Econom\u00eda, Innovaci\u00f3n, Ciencia y Empleo, Junta de Andaluc\u00eda","doi-asserted-by":"publisher","award":["P12-TIC-2985","P18-TP-5168"],"award-info":[{"award-number":["P12-TIC-2985","P18-TP-5168"]}],"id":[{"id":"10.13039\/501100002878","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003329","name":"Ministerio de Econom\u00eda y Competitividad","doi-asserted-by":"publisher","award":["BES-2017-080137"],"award-info":[{"award-number":["BES-2017-080137"]}],"id":[{"id":"10.13039\/501100003329","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Research Foundation of Flanders","award":["170303\/12X1619N"],"award-info":[{"award-number":["170303\/12X1619N"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Comput Intell Syst"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Time series data are becoming 
increasingly important due to the interconnectedness of the world. Classical problems, which are getting bigger and bigger, require more and more resources for their processing, and Big Data technologies offer many solutions. Although the principal algorithms for traditional vector-based problems are available in Big Data environments, the lack of tools for time series processing in these environments needs to be addressed. In this work, we propose a scalable and distributed time series transformation for Big Data environments based on well-known time series features (SCMFTS), which allows practitioners to apply traditional vector-based algorithms to time series problems. The proposed transformation, along with the algorithms available in Spark, improved the best results in the state-of-the-art on the Wearable Stress and Affect Detection dataset, which is the biggest publicly available multivariate time series dataset in the University of California Irvine (UCI) Machine Learning Repository. In addition, SCMFTS showed a linear relationship between its runtime and the number of processed time series, demonstrating a linear scalable behavior, which is mandatory in Big Data environments. 
SCMFTS has been implemented in the Scala programming language for the Apache Spark framework, and the code is publicly available.<\/jats:p>","DOI":"10.1007\/s44196-021-00036-7","type":"journal-article","created":{"date-parts":[[2021,12,13]],"date-time":"2021-12-13T13:03:36Z","timestamp":1639400616000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["SCMFTS: Scalable and Distributed Complexity Measures and Features for Univariate and Multivariate Time Series in Big Data Environments"],"prefix":"10.1007","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3102-8367","authenticated-orcid":false,"given":"Francisco J.","family":"Bald\u00e1n","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Daniel","family":"Peralta","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yvan","family":"Saeys","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jos\u00e9 M.","family":"Ben\u00edtez","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,11,2]]},"reference":[{"key":"36_CR1","doi-asserted-by":"publisher","first-page":"416","DOI":"10.1016\/j.future.2018.05.021","volume":"87","author":"A Kobusi\u0144ska","year":"2018","unstructured":"Kobusi\u0144ska, A., Leung, C., Hsu, C.-H., Raghavendra, S., Chang, V.: Emerging trends, issues and challenges in Internet of Things, Big Data and cloud computing. Future Gener. Comput. Syst. 87, 416\u2013419 (2018)","journal-title":"Future Gener. Comput. 
Syst."},{"key":"36_CR2","doi-asserted-by":"publisher","first-page":"113704","DOI":"10.1016\/j.eswa.2020.113704","volume":"161","author":"SW Lee","year":"2020","unstructured":"Lee, S.W., Kim, H.Y.: Stock market forecasting with super-high dimensional time-series data using ConvLSTM, trend sampling, and specialized data augmentation. Expert Syst. Appl. 161, 113704 (2020)","journal-title":"Expert Syst. Appl."},{"key":"36_CR3","doi-asserted-by":"crossref","unstructured":"Kim, T.-Y., Cho, S.-B.: Predicting the household power consumption using CNN-LSTM hybrid networks. In: Intelligent Data Engineering and Automated Learning\u2014IDEAL 2018, pp. 481\u2013490 (2018)","DOI":"10.1007\/978-3-030-03493-1_50"},{"issue":"5","key":"36_CR4","doi-asserted-by":"publisher","first-page":"5257","DOI":"10.1007\/s12652-020-02003-0","volume":"12","author":"S Aarthy","year":"2021","unstructured":"Aarthy, S., Iqbal, J.M.: Time series real time Naive Bayes electrocardiogram signal classification for efficient disease prediction using fuzzy rules. J. Ambient Intell. Humaniz. Comput. 12(5), 5257\u20135267 (2021)","journal-title":"J. Ambient Intell. Humaniz. Comput."},{"issue":"2","key":"36_CR5","doi-asserted-by":"publisher","first-page":"1144","DOI":"10.2991\/ijcis.d.190930.003","volume":"12","author":"T Nguyen","year":"2019","unstructured":"Nguyen, T., Nguyen, T., Nguyen, B.M., Nguyen, G.: Efficient time-series forecasting using neural network and opposition-based coral reefs optimization. Int. J. Comput. Intell. Syst. 12(2), 1144\u20131161 (2019)","journal-title":"Int. J. Comput. Intell. Syst."},{"issue":"1","key":"36_CR6","doi-asserted-by":"publisher","first-page":"336","DOI":"10.2991\/ijcis.2017.10.1.23","volume":"10","author":"B Wu","year":"2017","unstructured":"Wu, B., Duan, T.: A performance comparison of neural networks in forecasting stock price trend. Int. J. Comput. Intell. Syst. 10(1), 336\u2013346 (2017)","journal-title":"Int. J. Comput. Intell. 
Syst."},{"key":"36_CR7","doi-asserted-by":"crossref","unstructured":"Viegas, J.L., Cepeda, N.M., Vieira, S.M.: Electricity fraud detection using committee semi-supervised learning. In: 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1\u20136 (2018)","DOI":"10.1109\/IJCNN.2018.8489389"},{"key":"36_CR8","doi-asserted-by":"crossref","unstructured":"Haddi, Z., Ananou, B., Trardi, Y., Pons, J.-F., Delliaux, S., Deharo, J.-C., Ouladsine, M.: Advanced machine learning coupled with heart-inter-beat derivatives for cardiac arrhythmia detection. In: 2020 American Control Conference (ACC), pp. 5433\u20135438 (2020)","DOI":"10.23919\/ACC45564.2020.9147991"},{"key":"36_CR9","doi-asserted-by":"crossref","unstructured":"Handhika, T., Murni, Lestari, D.P., Sari, I.: Multivariate time series classification analysis: state-of-the-art and future challenges. In: IOP Conference Series: Materials Science and Engineering, vol. 536, p. 012003 (2019)","DOI":"10.1088\/1757-899X\/536\/1\/012003"},{"key":"36_CR10","unstructured":"Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. In: Proceedings of the 6th Conference on Symposium on Operating Systems Design and Implementation, vol. 6, p. 10 (2004)"},{"key":"36_CR11","volume-title":"Learning Spark: Lightning-Fast Big Data Analytics","author":"M Hamstra","year":"2015","unstructured":"Hamstra, M., Karau, H., Zaharia, M., Konwinski, A., Wendell, P.: Learning Spark: Lightning-Fast Big Data Analytics. O\u2019Reilly Media, Inc., Sebastopol (2015)"},{"issue":"1","key":"36_CR12","first-page":"1235","volume":"17","author":"X Meng","year":"2016","unstructured":"Meng, X., Bradley, J., Yavuz, B., Sparks, E., Venkataraman, S., Liu, D., Freeman, J., Tsai, D., Amde, M., Owen, S., et al.: MLlib: machine learning in apache spark. J. Mach. Learn. Res. 17(1), 1235\u20131241 (2016)","journal-title":"J. Mach. Learn. Res."},{"key":"36_CR13","unstructured":"Packages, S.: 3rd Party Spark Packages (2019). 
https:\/\/spark-packages.org\/"},{"key":"36_CR14","unstructured":"Bald\u00e1n, F.J., Peralta, D., Saeys, Y., Ben\u00edtez, J.M.: Scalable complexity measures and features for times series classification package repository (2021). https:\/\/github.com\/fjbaldan\/SCMFTS\/"},{"key":"36_CR15","doi-asserted-by":"crossref","unstructured":"Rakthanmanon, T., Campana, B., Mueen, A., Batista, G., Westover, B., Zhu, Q., Zakaria, J., Keogh, E.: Searching and mining trillions of time series subsequences under dynamic time warping. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 262\u2013270 (2012)","DOI":"10.1145\/2339530.2339576"},{"key":"36_CR16","doi-asserted-by":"crossref","unstructured":"Rakthanmanon, T., Keogh, E.: Fast shapelets: a scalable algorithm for discovering time series shapelets. In: Proceedings of the 2013 SIAM International Conference on Data Mining, pp. 668\u2013676 (2013)","DOI":"10.1137\/1.9781611972832.74"},{"key":"36_CR17","doi-asserted-by":"crossref","unstructured":"Laptev, N., Amizadeh, S., Flint, I.: Generic and scalable framework for automated time-series anomaly detection. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1939\u20131947 (2015)","DOI":"10.1145\/2783258.2788611"},{"issue":"6","key":"36_CR18","doi-asserted-by":"publisher","first-page":"220","DOI":"10.3847\/1538-3881\/aa9332","volume":"154","author":"D Foreman-Mackey","year":"2017","unstructured":"Foreman-Mackey, D., Agol, E., Ambikasaran, S., Angus, R.: Fast and scalable Gaussian process modeling with applications to astronomical time series. Astron. J. 154(6), 220 (2017)","journal-title":"Astron. 
J."},{"issue":"3","key":"36_CR19","doi-asserted-by":"publisher","first-page":"607","DOI":"10.1007\/s10618-019-00617-3","volume":"33","author":"B Lucas","year":"2019","unstructured":"Lucas, B., Shifaz, A., Pelletier, C., O\u2019Neill, L., Zaidi, N., Goethals, B., Petitjean, F., Webb, G.I.: Proximity forest: an effective and scalable distance-based classifier for time series. Data Min. Knowl. Discov. 33(3), 607\u2013635 (2019)","journal-title":"Data Min. Knowl. Discov."},{"key":"36_CR20","doi-asserted-by":"publisher","first-page":"451","DOI":"10.1016\/j.ins.2018.10.028","volume":"496","author":"FJ Bald\u00e1n","year":"2019","unstructured":"Bald\u00e1n, F.J., Ben\u00edtez, J.M.: Distributed FastShapelet Transform: a Big Data time series classification algorithm. Inf. Sci. 496, 451\u2013463 (2019)","journal-title":"Inf. Sci."},{"key":"36_CR21","doi-asserted-by":"crossref","unstructured":"Lines, J., Davis, L.M., Hills, J., Bagnall, A.: A shapelet transform for time series classification. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 289\u2013297 (2012)","DOI":"10.1145\/2339530.2339579"},{"issue":"83","key":"36_CR22","doi-asserted-by":"publisher","first-page":"20130048","DOI":"10.1098\/rsif.2013.0048","volume":"10","author":"BD Fulcher","year":"2013","unstructured":"Fulcher, B.D., Little, M.A., Jones, N.S.: Highly comparative time-series analysis: the empirical structure of time series and their methods. J. R. Soc. Interface 10(83), 20130048 (2013)","journal-title":"J. R. Soc. Interface"},{"key":"36_CR23","doi-asserted-by":"crossref","unstructured":"Fulcher, B.D.: Feature-based time-series analysis (2017). arXiv preprint. arXiv:1709.08055","DOI":"10.1201\/9781315181080-4"},{"key":"36_CR24","unstructured":"Kang, Y., Hyndman, R.J., Li, F., et al.: Efficient generation of time series with diverse and controllable characteristics. 
Technical report, Monash University, Department of Econometrics and Business Statistics (2018)"},{"issue":"6","key":"36_CR25","doi-asserted-by":"publisher","first-page":"1821","DOI":"10.1007\/s10618-019-00647-x","volume":"33","author":"CH Lubba","year":"2019","unstructured":"Lubba, C.H., Sethi, S.S., Knaute, P., Schultz, S.R., Fulcher, B.D., Jones, N.S.: catch22: CAnonical Time-series CHaracteristics. Data Min. Knowl. Discov. 33(6), 1821\u20131852 (2019)","journal-title":"Data Min. Knowl. Discov."},{"key":"36_CR26","doi-asserted-by":"publisher","first-page":"106421","DOI":"10.1016\/j.asoc.2020.106421","volume":"93","author":"D Peralta","year":"2020","unstructured":"Peralta, D., Saeys, Y.: Robust unsupervised dimensionality reduction based on feature clustering for single-cell imaging data. Appl. Soft Comput. 93, 106421 (2020)","journal-title":"Appl. Soft Comput."},{"key":"36_CR27","unstructured":"Bald\u00e1n, F.J., Ben\u00edtez, J.M.: Complexity measures and features for times series classification (2020). arXiv preprint arXiv:2002.12036"},{"key":"36_CR28","doi-asserted-by":"publisher","first-page":"596","DOI":"10.1016\/j.ins.2021.05.024","volume":"569","author":"FJ Bald\u00e1n","year":"2021","unstructured":"Bald\u00e1n, F.J., Ben\u00edtez, J.M.: Multivariate times series classification through an interpretable representation. Inf. Sci. 569, 596\u2013614 (2021)","journal-title":"Inf. Sci."},{"key":"36_CR29","volume-title":"Hadoop: The Definitive Guide","author":"T White","year":"2012","unstructured":"White, T.: Hadoop: The Definitive Guide. O\u2019Reilly Media, Inc., Sebastopol (2012)"},{"key":"36_CR30","unstructured":"Flink, A.: Apache Flink (2019). http:\/\/flink.apache.org\/"},{"key":"36_CR31","unstructured":"Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauly, M., Franklin, M.J., Shenker, S., Stoica, I.: Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. 
In: Proceedings of the 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI 12), pp. 15\u201328 (2012)"},{"issue":"1","key":"36_CR32","first-page":"1","volume":"92","author":"DB Dahl","year":"2020","unstructured":"Dahl, D.B.: Integration of R and Scala using rscala. J. Stat. Softw. 92(1), 1\u201318 (2020)","journal-title":"J. Stat. Softw."},{"key":"36_CR33","unstructured":"Dua, D., Graff, C.: UCI Machine Learning Repository (2017). http:\/\/archive.ics.uci.edu\/ml"},{"key":"36_CR34","doi-asserted-by":"crossref","unstructured":"Schmidt, P., Reiss, A., Duerichen, R., Marberger, C., Van Laerhoven, K.: Introducing WESAD, a multimodal dataset for wearable stress and affect detection. In: Proceedings of the 20th ACM International Conference on Multimodal Interaction, pp. 400\u2013408 (2018)","DOI":"10.1145\/3242969.3242985"},{"key":"36_CR35","doi-asserted-by":"crossref","unstructured":"Bobade, P., Vani, M.: Stress detection with machine learning and deep learning using multimodal physiological data. In: 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA), pp. 51\u201357 (2020)","DOI":"10.1109\/ICIRCA48905.2020.9183244"},{"key":"36_CR36","doi-asserted-by":"crossref","unstructured":"Indikawati, F.I., Winiarti, S.: Stress detection from multimodal wearable sensor data. In: IOP Conference Series: Materials Science and Engineering, vol. 771, p. 012028 (2020)","DOI":"10.1088\/1757-899X\/771\/1\/012028"},{"key":"36_CR37","doi-asserted-by":"crossref","unstructured":"Lin, J., Pan, S., Lee, C.S., Oviatt, S.: An explainable deep fusion network for affect recognition using physiological signals. In: Proceedings of the 28th ACM International Conference on Information and Knowledge Management, pp. 
2069\u20132072 (2019)","DOI":"10.1145\/3357384.3358160"},{"issue":"2","key":"36_CR38","doi-asserted-by":"publisher","first-page":"1030","DOI":"10.1109\/JIOT.2020.3009358","volume":"8","author":"A Saeed","year":"2020","unstructured":"Saeed, A., Salim, F.D., Ozcelebi, T., Lukkien, J.: Federated self-supervised learning of multisensor representations for embedded intelligence. IEEE Internet Things J. 8(2), 1030\u20131040 (2020)","journal-title":"IEEE Internet Things J."},{"key":"36_CR39","doi-asserted-by":"crossref","unstructured":"Samyoun, S., Sayeed\u00a0Mondol, A., Stankovic, J.A.: Stress detection via sensor translation. In: 2020 16th International Conference on Distributed Computing in Sensor Systems (DCOSS), pp. 19\u201326 (2020)","DOI":"10.1109\/DCOSS49796.2020.00017"},{"key":"36_CR40","doi-asserted-by":"crossref","unstructured":"Esp\u00edndola, R.P., Ebecken, N.F.: On extending f-measure and g-mean metrics to multi-class problems. WIT Trans. Inf. Commun. Technol. 35 (2005)","DOI":"10.2495\/DATA050031"},{"issue":"7","key":"36_CR41","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1109\/MC.2008.209","volume":"41","author":"MD Hill","year":"2008","unstructured":"Hill, M.D., Marty, M.R.: Amdahl\u2019s law in the multicore era. Computer 41(7), 33\u201338 (2008)","journal-title":"Computer"},{"issue":"1","key":"36_CR42","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/sdata.2016.18","volume":"3","author":"MD Wilkinson","year":"2016","unstructured":"Wilkinson, M.D., Dumontier, M., Aalbersberg, I.J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L.B., Bourne, P.E., et al.: The fair guiding principles for scientific data management and stewardship. Sci. Data 3(1), 1\u20139 (2016)","journal-title":"Sci. 
Data"}],"container-title":["International Journal of Computational Intelligence Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44196-021-00036-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s44196-021-00036-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44196-021-00036-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,12,13]],"date-time":"2021-12-13T13:25:48Z","timestamp":1639401948000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s44196-021-00036-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11,2]]},"references-count":42,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["36"],"URL":"https:\/\/doi.org\/10.1007\/s44196-021-00036-7","relation":{},"ISSN":["1875-6883"],"issn-type":[{"type":"electronic","value":"1875-6883"}],"subject":[],"published":{"date-parts":[[2021,11,2]]},"assertion":[{"value":"3 September 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 October 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 November 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of Interest"}},{"value":"Not 
applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics Approval"}},{"value":"Not applicable.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent to Participate"}},{"value":"The authors consent to this work for publication.","order":5,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for Publication"}}],"article-number":"186"}}