{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T03:20:58Z","timestamp":1760239258133,"version":"build-2065373602"},"reference-count":36,"publisher":"MDPI AG","issue":"21","license":[{"start":{"date-parts":[[2020,10,26]],"date-time":"2020-10-26T00:00:00Z","timestamp":1603670400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>In recent years, many researchers have shown increasing interest in music information retrieval (MIR) applications, with automatic chord recognition being one of the popular tasks. Many studies have achieved\/demonstrated considerable improvement using deep learning based models in automatic chord recognition problems. However, most of the existing models have focused on simple chord recognition, which classifies the root note with the major, minor, and seventh chords. Furthermore, in learning-based recognition, it is critical to collect high-quality and large amounts of training data to achieve the desired performance. In this paper, we present a multi-task learning (MTL) model for a guitar chord recognition task, where the model is trained using a relatively large-vocabulary guitar chord dataset. To solve data scarcity issues, a physical data augmentation method that directly records the chord dataset from a robotic performer is employed. Deep learning based MTL is proposed to improve the performance of automatic chord recognition with the proposed physical data augmentation dataset. The proposed MTL model is compared with four baseline models and its corresponding single-task learning model using two types of datasets, including a human dataset and a human combined with the augmented dataset. 
The proposed methods outperform the baseline models, and the proposed multi-task learning model achieves better scores than the corresponding single-task learning model on most metrics. The experimental results demonstrate that physical data augmentation is an effective method for increasing the dataset size for guitar chord recognition tasks.<\/jats:p>","DOI":"10.3390\/s20216077","type":"journal-article","created":{"date-parts":[[2020,10,26]],"date-time":"2020-10-26T10:38:35Z","timestamp":1603708715000},"page":"6077","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Guitar Chord Sensing and Recognition Using Multi-Task Learning and Physical Data Augmentation with Robotics"],"prefix":"10.3390","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2117-2904","authenticated-orcid":false,"given":"Gerelmaa","family":"Byambatsogt","sequence":"first","affiliation":[{"name":"Department of Computer Science and Electrical Engineering, Kumamoto University, Kumamoto 860-8555, Japan"},{"name":"Machine Intelligence Laboratory, National University of Mongolia, Ulaanbaatar 14201, Mongolia"}]},{"given":"Lodoiravsal","family":"Choimaa","sequence":"additional","affiliation":[{"name":"Machine Intelligence Laboratory, National University of Mongolia, Ulaanbaatar 14201, Mongolia"}]},{"given":"Gou","family":"Koutaki","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Electrical Engineering, Kumamoto University, Kumamoto 860-8555, Japan"}]}],"member":"1968","published-online":{"date-parts":[[2020,10,26]]},"reference":[{"key":"ref_1","unstructured":"Zhou, X., Lerch, A., and Bello, J.P. (2015, January 26\u201330). Chord recognition using deep learning. Proceedings of the 16th International Society for Music Information Retrieval Conference, M\u00e1laga, Spain."},{"key":"ref_2","unstructured":"Choi, K., Fazekas, G., Cho, K., and Sandler, M.B. (2017). 
A Tutorial on Deep Learning for Music Information Retrieval. CoRR, abs\/1709.04396."},{"key":"ref_3","unstructured":"Cheng, H.T., Yang, Y.H., Lin, Y.C., Liao, I.B., and Chen, H.H. (April, January 23). Automatic chord recognition for music classification and retrieval. Proceedings of the IEEE International Conference on Multimedia and Expo, Hannover, Germany."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Korzeniowski, F., and Widmer, G. (2016, January 13\u201316). A fully convolutional deep auditory model for musical chord recognition. Proceedings of the IEEE 26th International Workshop on Machine Learning for Signal Processing, Vietri sul Mare, Italy.","DOI":"10.1109\/MLSP.2016.7738895"},{"key":"ref_5","unstructured":"McFee, B., and Bello, J.P. (2017, January 23\u201327). Structured training for large-vocabulary chord recognition. Proceedings of the 18th International Society for Music Information Retrieval Conference, Suzhou, China."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Maruo, S., Yoshii, K., Itoyama, K., Mauch, M., and Goto, M. (2015, January 19\u201324). A feedback framework for improved chord recognition based on NMF-based approximate note transcription. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Brisbane, QLD, Australia.","DOI":"10.1109\/ICASSP.2015.7177959"},{"key":"ref_7","unstructured":"Jiang, N., Grosche, P., Konz, V., and M\u00fcller, M. (2011, January 22\u201324). Analyzing chroma feature types for automated chord recognition. Proceedings of the 42nd AES International Conference on Semantic Audio, Ilmenau, Germany."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Humphrey, E.J., and Bello, J.P. (2012, January 12\u201315). Rethinking automatic chord recognition with convolutional neural networks. 
Proceedings of the IEEE 11th International Conference on Machine Learning and Applications, Boca Raton, FL, USA.","DOI":"10.1109\/ICMLA.2012.220"},{"key":"ref_9","unstructured":"Ruder, S. (2017). An overview of multi-task learning in deep neural networks. arXiv."},{"key":"ref_10","unstructured":"Xi, Q., Bittner, R.M., Pauwels, J., Ye, X., and Bello, J.P. (2018, January 23\u201327). GuitarSet: A dataset for guitar transcription. Proceedings of the 19th International Society for Music Information Retrieval Conference, Paris, France."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Byambatsogt, G., Koutaki, G., and Choimaa, L. (2019, January 15\u201318). Improved Chord Recognition using Synthetic Data Generation by Robot. Proceedings of the IEEE 8th Global Conference on Consumer Electronics, Osaka, Japan.","DOI":"10.1109\/GCCE46687.2019.9015511"},{"key":"ref_12","unstructured":"Fujishima, T. (1999, January 22\u201327). Realtime Chord Recognition of Musical Sound: A System Using Common Lisp Music. Proceedings of the International Computer Music Conference, Beijing, China."},{"key":"ref_13","unstructured":"Boulanger-Lewandowski, N., Bengio, Y., and Vincent, P. (2013, January 4\u20138). Audio chord recognition with recurrent neural networks. Proceedings of the International Society for Music Information Retrieval Conference, Curitiba, PR, Brazil."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Choi, K., Fazekas, G., Sandler, M., and Cho, K. (2017, January 5\u20139). Convolutional recurrent neural networks for music classification. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, New Orleans, LA, USA.","DOI":"10.1109\/ICASSP.2017.7952585"},{"key":"ref_15","first-page":"30","article-title":"An overview of multi-task learning","volume":"5","author":"Yu","year":"2017","journal-title":"Natl. Sci. Rev."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Cipolla, R., Gal, Y., and Kendall, A. 
(2018, January 18\u201323). Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00781"},{"key":"ref_17","unstructured":"Liu, X., Gao, J., He, X., Deng, L., Duh, K., and Wang, Y. (June, January 31). Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval. Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, CO, USA."},{"key":"ref_18","unstructured":"Ramsundar, B., Kearnes, S., Riley, P., Webster, D., Konerding, D., and Pande, V. (2015). Massively Multitask Networks for Drug Discovery. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1508","DOI":"10.1109\/TNNLS.2016.2520964","article-title":"Modeling Disease Progression via Multisource Multitask Learners: A Case Study With Alzheimer\u2019s Disease","volume":"28","author":"Nie","year":"2017","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Yang, M., Su, L., and Yang, Y. (2016, January 13\u201316). Highlighting root notes in chord recognition using cepstral features and multi-task learning. Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, Jeju, South Korea.","DOI":"10.1109\/APSIPA.2016.7820865"},{"key":"ref_21","unstructured":"Bittner, R.M., McFee, B., and Bello, J.P. (2018). Multitask learning for fundamental frequency estimation in music. CoRR, abs\/1809.00381."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1145\/2818994","article-title":"A survey of robotic musicianship","volume":"59","author":"Bretan","year":"2016","journal-title":"Commun. ACM"},{"key":"ref_23","unstructured":"Singer, E., Larke, K., and Bianciardi, D. (2003, January 22\u201324). 
LEMUR GuitarBot: MIDI robotic string instrument. Proceedings of the Conference on New Interfaces for Musical Expression, Montreal, QC, Canada."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"McVay, J., Carnegie, D., and Murphy, J. (2015, January 17\u201319). An overview of MechBass: A four-string robotic bass guitar. Proceedings of the 6th International Conference on Automation, Robotics and Applications, Queenstown, New Zealand.","DOI":"10.1109\/ICARA.2015.7081144"},{"key":"ref_25","unstructured":"Vindriis, R., and Dale, A.C. (2016, January 11\u201315). StrumBot\u2014An Overview of a Strumming Guitar Robot. Proceedings of the Conference on New Interfaces for Musical Expression, Brisbane, Australia."},{"key":"ref_26","first-page":"2","article-title":"Data generation from robotic performer for chord recognition","volume":"141","author":"Byambatsogt","year":"2020","journal-title":"Inst. Electr. Eng. Jpn. (IEEJ) Trans. Electr. Inf. Syst."},{"key":"ref_27","unstructured":"Huang, S., Li, Q., Anil, C., Bao, X., Oore, S., and Grosse, R.B. (2019, January 6\u20139). TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) pipeline for musical timbre transfer. Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1533","DOI":"10.1109\/TASLP.2014.2339736","article-title":"Convolutional neural networks for speech recognition","volume":"22","author":"Mohamed","year":"2014","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"ref_29","unstructured":"Hanson, B.A., and Applebaum, T.H. (1990, January 3\u20136). Robust speaker-independent word recognition using static, dynamic and acceleration features: Experiments with Lombard and noisy speech. 
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Albuquerque, NM, USA."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"McFee, B., Raffel, C., Liang, D., Ellis, D.P., McVicar, M., Battenberg, E., and Nieto, O. (2015, January 6\u201312). librosa: Audio and music signal analysis in Python. Proceedings of the 14th Python in Science Conference, Austin, TX, USA.","DOI":"10.25080\/Majora-7b98e3ed-003"},{"key":"ref_31","unstructured":"Harte, C., Sandler, M., Abdallah, S., and Gomez, E. (2005, January 11\u201315). Symbolic representation of musical chords: A proposed syntax for text annotations. Proceedings of the 6th International Conference on Music Information Retrieval, London, UK."},{"key":"ref_32","unstructured":"Ioffe, S., and Szegedy, C. (2015, January 6\u201311). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. Proceedings of the 32nd International Conference on Machine Learning, Lille, France."},{"key":"ref_33","first-page":"1929","article-title":"Dropout: A Simple Way to Prevent Neural Networks from Overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"ref_34","unstructured":"Kingma, D.P., and Ba, J. (2015, January 7\u20139). Adam: A Method for Stochastic Optimization. Proceedings of the 3rd International Conference on Learning Representations, San Diego, CA, USA."},{"key":"ref_35","unstructured":"Raffel, C., McFee, B., Humphrey, E., Salamon, J., Nieto, O., Liang, D., and Ellis, D. (2014, January 27\u201331). mir_eval: A transparent implementation of common MIR metrics. Proceedings of the 15th International Society for Music Information Retrieval Conference, Taipei, Taiwan."},{"key":"ref_36","unstructured":"Simonyan, K., and Zisserman, A. (2015, January 7\u20139). Very Deep Convolutional Networks for Large-Scale Image Recognition. 
Proceedings of the 3rd International Conference on Learning Representations, San Diego, CA, USA."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/21\/6077\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:28:15Z","timestamp":1760178495000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/21\/6077"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,10,26]]},"references-count":36,"journal-issue":{"issue":"21","published-online":{"date-parts":[[2020,11]]}},"alternative-id":["s20216077"],"URL":"https:\/\/doi.org\/10.3390\/s20216077","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2020,10,26]]}}}