{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,4]],"date-time":"2026-03-04T17:00:24Z","timestamp":1772643624518,"version":"3.50.1"},"reference-count":41,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,1,24]],"date-time":"2020-01-24T00:00:00Z","timestamp":1579824000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,1,24]],"date-time":"2020-01-24T00:00:00Z","timestamp":1579824000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100010665","name":"H2020 Marie Sklodowska-Curie Actions","doi-asserted-by":"publisher","award":["676157"],"award-info":[{"award-number":["676157"]}],"id":[{"id":"10.13039\/100010665","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["CCF Trans. Pervasive Comp. Interact."],"published-print":{"date-parts":[[2020,3]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The use of Convolutional Neural Networks (CNNs) as a feature learning method for Human Activity Recognition (HAR) is becoming more and more common. Unlike conventional machine learning methods, which require domain-specific expertise, CNNs can extract features automatically. On the other hand, CNNs require a training phase, making them prone to the cold-start problem. In this work, a case study is presented where the use of a pre-trained CNN feature extractor is evaluated under realistic conditions. The case study consists of two main steps: (1) different topologies and parameters are assessed to identify the best candidate models for HAR, thus obtaining a pre-trained CNN model. (2) The pre-trained model is then employed as a feature extractor, evaluating its use with a large-scale real-world dataset. 
Two CNN applications were considered: Inertial Measurement Unit (IMU) and audio based HAR. For the IMU data, balanced accuracy was 91.98% on the UCI-HAR dataset, and 67.51% on the real-world Extrasensory dataset. For the audio data, the balanced accuracy was 92.30% on the DCASE 2017 dataset, and 35.24% on the Extrasensory dataset.<\/jats:p>","DOI":"10.1007\/s42486-020-00026-2","type":"journal-article","created":{"date-parts":[[2020,1,24]],"date-time":"2020-01-24T11:02:57Z","timestamp":1579863777000},"page":"18-32","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":121,"title":["Feature learning for Human Activity Recognition using Convolutional Neural Networks"],"prefix":"10.1007","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1870-0203","authenticated-orcid":false,"given":"Federico","family":"Cruciani","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1102-5708","authenticated-orcid":false,"given":"Anastasios","family":"Vafeiadis","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0882-7902","authenticated-orcid":false,"given":"Chris","family":"Nugent","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2368-7354","authenticated-orcid":false,"given":"Ian","family":"Cleland","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9060-6262","authenticated-orcid":false,"given":"Paul","family":"McCullagh","sequence":"additional","affiliation":[]},{"given":"Konstantinos","family":"Votis","sequence":"additional","affiliation":[]},{"given":"Dimitrios","family":"Giakoumis","sequence":"additional","affiliation":[]},{"given":"Dimitrios","family":"Tzovaras","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0200-7989","authenticated-orcid":false,"given":"Liming","family":"Chen","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.o
rg\/0000-0001-6699-7331","authenticated-orcid":false,"given":"Raouf","family":"Hamzaoui","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,1,24]]},"reference":[{"key":"26_CR1","unstructured":"Abadi, M., Agarwal, A., et al.: TensorFlow: Large-scale machine learning on heterogeneous systems. https:\/\/www.tensorflow.org\/, software available from tensorflow.org (2015)"},{"issue":"10","key":"26_CR2","doi-asserted-by":"publisher","first-page":"1533","DOI":"10.1109\/TASLP.2014.2339736","volume":"22","author":"O Abdel-Hamid","year":"2014","unstructured":"Abdel-Hamid, O., Ar, Mohamed, Jiang, H., Deng, L., Penn, G., Yu, D.: Convolutional neural networks for speech recognition. IEEE\/ACM Trans. Audio Speech Lang. Process. 22(10), 1533\u20131545 (2014)","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"issue":"4","key":"26_CR3","doi-asserted-by":"publisher","first-page":"854","DOI":"10.3390\/s17040854","volume":"17","author":"R Alsina-Pag\u00e8s","year":"2017","unstructured":"Alsina-Pag\u00e8s, R., Navarro, J., Al\u00edas, F., Herv\u00e1s, M.: homesound: Real-time audio event detection based on high performance computing for behaviour and surveillance remote monitoring. Sensors 17(4), 854 (2017)","journal-title":"Sensors"},{"key":"26_CR4","unstructured":"Anguita, D., Ghio, A., Oneto, L., Parra, X., Reyes-Ortiz, J.L.: A public domain dataset for human activity recognition using smartphones. In: 21th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, ESANN (2013)"},{"issue":"3","key":"26_CR5","doi-asserted-by":"publisher","first-page":"521","DOI":"10.3390\/s19030521","volume":"19","author":"A Baldominos","year":"2019","unstructured":"Baldominos, A., Cervantes, A., Saez, Y., Isasi, P.: A comparison of machine learning and deep learning techniques for activity recognition using mobile devices. Sensors 19(3), 521 (2019). 
https:\/\/doi.org\/10.3390\/s19030521","journal-title":"Sensors"},{"issue":"June","key":"26_CR6","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2499621","volume":"1","author":"A Bulling","year":"2014","unstructured":"Bulling, A., Blanke, U., Schiele, B.: A tutorial on human activity recognition using body-worn inertial sensors. ACM Comput. Surv. (CSUR) 1(June), 1\u201333 (2014)","journal-title":"ACM Comput. Surv. (CSUR)"},{"key":"26_CR7","doi-asserted-by":"crossref","unstructured":"\u00c7akir, E., Virtanen, T.: End-to-end polyphonic sound event detection using convolutional recurrent neural networks with learned time-frequency representation input. In: 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1\u20137 (2018)","DOI":"10.1109\/IJCNN.2018.8489470"},{"key":"26_CR8","unstructured":"Chollet, F., et al.: Keras. (2015). https:\/\/keras.io"},{"key":"26_CR9","doi-asserted-by":"crossref","unstructured":"Cruciani, F., Sun, C., Zhang, S., Nugent, C., Li, C., Song, S., Cheng, C., Cleland, I., McCullagh, P.: A public domain dataset for human activity recognition in free-living. In: 2019 IEEE SmartWorld, 2nd SmarterAAL Workshop (2019a)","DOI":"10.1109\/SmartWorld-UIC-ATC-SCALCOM-IOP-SCI.2019.00071"},{"key":"26_CR10","doi-asserted-by":"crossref","unstructured":"Cruciani, F., Vafeiadis, A., Nugent, C., Cleland, I., McCullagh, P., Votis, K., Giakoumis, D., Tzovaras, D., Chen, L., Hamzaoui, R.: Comparing CNN and human crafted features for human activity recognition. In: 2019 IEEE SmartWorld, Ubiquitous Intelligence & Computing (2019b)","DOI":"10.1109\/SmartWorld-UIC-ATC-SCALCOM-IOP-SCI.2019.00190"},{"key":"26_CR11","unstructured":"Cruciani, F., Vafeiadis, A., et al.: Source code repository (2019c). 
https:\/\/github.com\/fcruciani\/cnn_rf_har"},{"issue":"1","key":"26_CR12","doi-asserted-by":"publisher","first-page":"321","DOI":"10.1109\/TSA.2005.854103","volume":"14","author":"AJ Eronen","year":"2006","unstructured":"Eronen, A.J., Peltonen, V.T., Tuomi, J.T., Klapuri, A.P., Fagerlund, S., Sorsa, T., Lorho, G., Huopaniemi, J.: Audio-based context recognition. IEEE Trans Audio Speech Lang Process 14(1), 321\u2013329 (2006)","journal-title":"IEEE Trans Audio Speech Lang Process"},{"key":"26_CR13","doi-asserted-by":"crossref","unstructured":"Espinilla, M., Medina, J., Salguero, A., Irvine, N., Donnelly, M., Cleland, I., Nugent, C.: Human Activity Recognition from the Acceleration Data of a Wearable Device. Which Features Are More Relevant by Activities? Proceedings vol. 2, no. 19, pp. 1242 (2018)","DOI":"10.3390\/proceedings2191242"},{"key":"26_CR14","doi-asserted-by":"crossref","unstructured":"Gemmeke, J.F., Ellis, D.P., Freedman, D., Jansen, A., Lawrence, W., Moore, R.C., Plakal, M., Ritter, M.: Audio set: An ontology and human-labeled dataset for audio events. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 776\u2013780. IEEE (2017)","DOI":"10.1109\/ICASSP.2017.7952261"},{"key":"26_CR15","doi-asserted-by":"crossref","unstructured":"Grais, E.M., Wierstorf, H., Ward, D., Plumbley, M.D.: Multi-resolution fully convolutional neural networks for monaural audio source separation. In: International Conference on Latent Variable Analysis and Signal Separation, pp. 340\u2013350. Springer (2018)","DOI":"10.1007\/978-3-319-93764-9_32"},{"issue":"11","key":"26_CR16","doi-asserted-by":"publisher","first-page":"2614","DOI":"10.1109\/TPAMI.2018.2861732","volume":"41","author":"SJ Huang","year":"2019","unstructured":"Huang, S.J., Gao, W., Zhou, Z.H.: Fast multi-instance multi-label learning. 
IEEE Trans Pattern Anal Mach Intell 41(11), 2614\u20132627 (2019)","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"2","key":"26_CR17","doi-asserted-by":"publisher","first-page":"145","DOI":"10.1007\/s12668-013-0088-3","volume":"3","author":"OD Incel","year":"2013","unstructured":"Incel, O.D., Kose, M., Ersoy, C.: A review and taxonomy of activity recognition on mobile phones. BioNanoScience 3(2), 145\u2013171 (2013)","journal-title":"BioNanoScience"},{"issue":"3","key":"26_CR18","doi-asserted-by":"publisher","first-page":"529","DOI":"10.3390\/s17030529","volume":"17","author":"M Janidarmian","year":"2017","unstructured":"Janidarmian, M., Fekr, A.R., Radecka, K., Zilic, Z.: A comprehensive analysis on wearable acceleration sensors in human activity recognition. Sensors 17(3), 529 (2017)","journal-title":"Sensors"},{"key":"26_CR19","unstructured":"Keskar, N.S., Socher, R.: Improving generalization performance by switching from adam to sgd. arXiv preprint arXiv:171207628 (2017)"},{"key":"26_CR20","unstructured":"Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. In: Proceedings of the 3rd International Conference for Learning Representations (ICLR-15) (2015)"},{"issue":"7553","key":"26_CR21","doi-asserted-by":"publisher","first-page":"436","DOI":"10.1038\/nature14539","volume":"521","author":"Y LeCun","year":"2015","unstructured":"LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436\u2013444 (2015). https:\/\/doi.org\/10.1038\/nature14539","journal-title":"Nature"},{"issue":"2","key":"26_CR22","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/JSEN.2017.2772718","volume":"18","author":"F Li","year":"2018","unstructured":"Li, F., Shirahama, K., Nisar, M.A., K\u00f6ping, L., Grzegorzek, M.: Comparison of feature learning methods for human activity recognition using wearable sensors. 
Sensors 18(2), 1\u201322 (2018)","journal-title":"Sensors"},{"key":"26_CR23","unstructured":"Mesaros, A., Heittola, T., Diment, A., Elizalde, B., Shah, A., Vincent, E., Raj, B., Virtanen, T.: Dcase 2017 challenge setup: Tasks, datasets and baseline system. In: DCASE 2017-Workshop on Detection and Classification of Acoustic Scenes and Events (2017)"},{"issue":"3","key":"26_CR24","doi-asserted-by":"publisher","first-page":"388","DOI":"10.1016\/j.bbe.2017.04.004","volume":"37","author":"J Morales","year":"2017","unstructured":"Morales, J., Akopian, D.: Physical activity recognition by smartphones, a survey. Biocybern. Biomed. Eng. 37(3), 388\u2013400 (2017)","journal-title":"Biocybern. Biomed. Eng."},{"issue":"8","key":"26_CR25","doi-asserted-by":"publisher","first-page":"1397","DOI":"10.3390\/app8081397","volume":"8","author":"V Morfi","year":"2018","unstructured":"Morfi, V., Stowell, D.: Deep learning for audio event detection and tagging on low-resource datasets. Appl. Sci. 8(8), 1397 (2018)","journal-title":"Appl. Sci."},{"key":"26_CR26","doi-asserted-by":"publisher","unstructured":"Moya Rueda, F., Grzeszick, R., Fink, G., Feldhorst, S., ten Hompel, M.: Convolutional neural networks for human activity recognition using body-worn sensors. Informatics 5(2), 26 (2018). https:\/\/doi.org\/10.3390\/informatics5020026. http:\/\/www.mdpi.com\/2227-9709\/5\/2\/26","DOI":"10.3390\/informatics5020026"},{"key":"26_CR27","unstructured":"Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807\u2013814 (2010)"},{"issue":"1","key":"26_CR28","doi-asserted-by":"publisher","first-page":"115","DOI":"10.3390\/s16010115","volume":"16","author":"FJ Ord\u00f3\u00f1ez","year":"2016","unstructured":"Ord\u00f3\u00f1ez, F.J., Roggen, D.: Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition. 
Sensors 16(1), 115 (2016)","journal-title":"Sensors"},{"key":"26_CR29","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825\u20132830 (2011)","journal-title":"J. Mach. Learn. Res."},{"key":"26_CR30","doi-asserted-by":"crossref","unstructured":"Peltonen, V., Tuomi, J., Klapuri, A., Huopaniemi, J., Sorsa, T.: Computational auditory scene recognition. In: 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, pp. 1941\u20131944 (2002)","DOI":"10.1109\/ICASSP.2002.5745009"},{"key":"26_CR31","unstructured":"Perttunen, M., Van Kleek, M., Lassila, O., Riekki, J.: Auditory context recognition using SVMs. In: Mobile Ubiquitous Computing, Systems, Services and Technologies, 2008. UBICOMM\u201908, IEEE, pp. 102\u2013108 (2008)"},{"key":"26_CR32","doi-asserted-by":"publisher","first-page":"e4568","DOI":"10.7717\/peerj.4568","volume":"6","author":"S Rajaraman","year":"2018","unstructured":"Rajaraman, S., Antani, S.K., Poostchi, M., Silamut, K., Hossain, M.A., Maude, R.J., Jaeger, S., Thoma, G.R.: Pre-trained convolutional neural networks as feature extractors toward improved malaria parasite detection in thin blood smear images. PeerJ 6, e4568 (2018)","journal-title":"PeerJ"},{"key":"26_CR33","doi-asserted-by":"publisher","first-page":"754","DOI":"10.1016\/j.neucom.2015.07.085","volume":"171","author":"JL Reyes-Ortiz","year":"2016","unstructured":"Reyes-Ortiz, J.L., Oneto, L., Sam\u00e0, A., Parra, X., Anguita, D.: Transition-aware human activity recognition using smartphones. 
Neurocomputing 171, 754\u2013767 (2016)","journal-title":"Neurocomputing"},{"key":"26_CR34","doi-asserted-by":"publisher","first-page":"235","DOI":"10.1016\/j.eswa.2016.04.032","volume":"59","author":"CA Ronao","year":"2016","unstructured":"Ronao, C.A., Cho, S.B.: Human activity recognition with smartphone sensors using deep learning neural networks. Expert Syst. Appl. 59, 235\u2013244 (2016)","journal-title":"Expert Syst. Appl."},{"key":"26_CR35","unstructured":"Saeed, A., Ozcelebi, T., Trajanovski, S., Lukkien, J.: Learning behavioral context recognition with multi-stream temporal convolutional networks. arXiv preprint arXiv:180808766 (2018)"},{"issue":"1","key":"26_CR36","first-page":"1929","volume":"15","author":"N Srivastava","year":"2014","unstructured":"Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929\u20131958 (2014)","journal-title":"J. Mach. Learn. Res."},{"issue":"1","key":"26_CR37","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3161192","volume":"1","author":"Y Vaizman","year":"2017","unstructured":"Vaizman, Y.: Context recognition in-the-wild: unified model for multi-modal sensors and multi-label classification. PACM Interact. Mob. Wearable Ubiquitous Technol. 1(1), 1\u201322 (2017). https:\/\/doi.org\/10.1145\/3161192","journal-title":"PACM Interact. Mob. Wearable Ubiquitous Technol."},{"issue":"4","key":"26_CR38","doi-asserted-by":"publisher","first-page":"62","DOI":"10.1109\/MPRV.2017.3971131","volume":"16","author":"Y Vaizman","year":"2017","unstructured":"Vaizman, Y., Ellis, K., Lanckriet, G.: Recognizing detailed human context in the wild from smartphones and smartwatches. IEEE Pervasive Comput. 16(4), 62\u201374 (2017). https:\/\/doi.org\/10.1109\/MPRV.2017.3971131. 
arXiv:1609.06354","journal-title":"IEEE Pervasive Comput."},{"issue":"6","key":"26_CR39","doi-asserted-by":"publisher","first-page":"1684","DOI":"10.1109\/TMM.2012.2199972","volume":"14","author":"X Valero","year":"2012","unstructured":"Valero, X., Alias, F.: Gammatone cepstral coefficients: biologically inspired features for non-speech audio classification. IEEE Trans. Multimedia 14(6), 1684\u20131689 (2012)","journal-title":"IEEE Trans. Multimedia"},{"key":"26_CR40","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.patcog.2018.03.025","volume":"81","author":"X Xia","year":"2018","unstructured":"Xia, X., Togneri, R., Sohel, F., Huang, D.: Random forest classification based acoustic event detection utilizing contextual-information and bottleneck features. Pattern Recognit. 81, 1\u201313 (2018)","journal-title":"Pattern Recognit."},{"key":"26_CR41","doi-asserted-by":"crossref","unstructured":"Zhao, X., Wang, D.: Analyzing noise robustness of MFCC and GFCC features in speaker identification. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 7204\u20137208. 
IEEE (2013)","DOI":"10.1109\/ICASSP.2013.6639061"}],"container-title":["CCF Transactions on Pervasive Computing and Interaction"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s42486-020-00026-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s42486-020-00026-2\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s42486-020-00026-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,1,23]],"date-time":"2021-01-23T01:12:03Z","timestamp":1611364323000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s42486-020-00026-2"}},"subtitle":["A case study for Inertial Measurement Unit and audio data"],"short-title":[],"issued":{"date-parts":[[2020,1,24]]},"references-count":41,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,3]]}},"alternative-id":["26"],"URL":"https:\/\/doi.org\/10.1007\/s42486-020-00026-2","relation":{},"ISSN":["2524-521X","2524-5228"],"issn-type":[{"value":"2524-521X","type":"print"},{"value":"2524-5228","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,1,24]]},"assertion":[{"value":"11 November 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"4 January 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 January 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Compliance with ethical standards"}},{"value":"The authors declare that they have no 
conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}