{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T14:55:29Z","timestamp":1753887329784,"version":"3.41.2"},"reference-count":36,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2018,8,1]],"date-time":"2018-08-01T00:00:00Z","timestamp":1533081600000},"content-version":"vor","delay-in-days":212,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Wroc\u0142aw University of Science and Technology"}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Complexity"],"published-print":{"date-parts":[[2018,1]]},"abstract":"<jats:p>Automatic retrieval of music information is an active area of research in which problems such as automatically assigning genres or descriptors of emotional content to music emerge. Recent advancements in the area rely on the use of deep learning, which allows researchers to operate on a low\u2010level description of the music. Deep neural network architectures can learn to build feature representations that summarize music files from data itself, rather than expert knowledge. In this paper, a novel approach to applying feature learning in combination with support vector machines to musical data is presented. A spectrogram of the music file, which is too complex to be processed by SVM, is first reduced to a compact representation by a recurrent neural network. An adjustment to loss function of the network is proposed so that the network learns to build a representation space that replicates a certain notion of similarity between annotations, rather than to explicitly make predictions. We evaluate the approach on five datasets, focusing on emotion recognition and complementing it with genre classification. In experiments, the proposed loss function adjustment is shown to improve results in classification and regression tasks, but only when the learned similarity notion corresponds to a kernel function employed within the SVM. These results suggest that adjusting deep learning methods to build data representations that target a specific classifier or regressor can open up new perspectives for the use of standard machine learning methods in music domain.<\/jats:p>","DOI":"10.1155\/2018\/1935938","type":"journal-article","created":{"date-parts":[[2018,8,1]],"date-time":"2018-08-01T23:33:00Z","timestamp":1533166380000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["Similarity\u2010Based Summarization of Music Files for Support Vector Machines"],"prefix":"10.1155","volume":"2018","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7816-1205","authenticated-orcid":false,"given":"Jan","family":"Jakubik","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Halina","family":"Kwa\u015bnicka","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"311","published-online":{"date-parts":[[2018,8]]},"reference":[{"key":"e_1_2_7_1_2","unstructured":"DefferrardM. MohantyS. P. CarrollS. F. andSalatheM. Learning to recognize musical genre from audio 2018 https:\/\/arxiv.org\/abs\/1803.05337v1."},{"key":"e_1_2_7_2_2","doi-asserted-by":"publisher","DOI":"10.2307\/829298"},{"key":"e_1_2_7_3_2","doi-asserted-by":"publisher","DOI":"10.1093\/oso\/9780192631886.003.0016"},{"key":"e_1_2_7_4_2","doi-asserted-by":"publisher","DOI":"10.1080\/02699939208411068"},{"key":"e_1_2_7_5_2","unstructured":"KimY. E. SchmidtE. M. MignecoR. MortonB. G. RichardsonP. ScottJ. SpeckJ. A. andTurnbulD. Music emotion recognition: a state of the art review Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR 2010) 2010 Utrecht Netherlands 255\u2013266."},{"key":"e_1_2_7_6_2","unstructured":"SkowronekJ. McKinneyM. andvan de ParS. A demonstrator for automatic music mood estimation Proceedings of the 8th International Conference on Music Information Retrieval (ISMIR 2007) 2007 Vienna Austria 345\u2013346."},{"key":"e_1_2_7_7_2","unstructured":"LaurierC. LartillotO. EerolaT. andToiviainenP. Exploring relationships between audio features and emotion in music Proceedings of the 7th Triennial Conference of European Society for the Cognitive Sciences of Music (ESCOM 2009) 2009 Jyv\u00e4skyl\u00e4 Finland 260\u2013264."},{"key":"e_1_2_7_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2007.911513"},{"key":"e_1_2_7_9_2","doi-asserted-by":"publisher","DOI":"10.1037\/1528-3542.8.4.494"},{"key":"e_1_2_7_10_2","doi-asserted-by":"publisher","DOI":"10.14257\/ijmue.2014.9.4.04"},{"key":"e_1_2_7_11_2","unstructured":"AljanakiA. WieringF. andVeltkampR. Computational modeling of induced emotion using gems Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014) 2014 Taipei Taiwan 373\u2013378."},{"key":"e_1_2_7_12_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10844-013-0248-5"},{"key":"e_1_2_7_13_2","doi-asserted-by":"publisher","DOI":"10.1037\/1528-3542.2.4.412"},{"key":"e_1_2_7_14_2","unstructured":"HenaffM. JarrettK. KavukcuogluK. andLeCunY. Unsupervised learning of sparse features for scalable audio classification Proceedings of the 12th International Society for Music Information Retrieval Conference (ISMIR 2011) 2011 Miami FL USA 681\u2013686."},{"key":"e_1_2_7_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/taslp.2014.2337842"},{"key":"e_1_2_7_16_2","doi-asserted-by":"crossref","unstructured":"JakubikJ.andKwa\u015bnickaH. Sparse coding methods for music induced emotion recognition Proceedings of the 2016 Federated Conference on Computer Science and Information Systems 2016 Gda\u0144sk Poland 53\u201360 https:\/\/doi.org\/10.15439\/2016F309 2-s2.0-85007158134.","DOI":"10.15439\/2016F309"},{"key":"e_1_2_7_17_2","doi-asserted-by":"crossref","unstructured":"ChoiY. FazekasG. SandlerM. andChoK. Convolutional recurrent neural networks for music classification 2017 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP) 2017 New Orleans LA USA 2392\u20132396 https:\/\/doi.org\/10.1109\/ICASSP.2017.7952585 2-s2.0-85023756452.","DOI":"10.1109\/ICASSP.2017.7952585"},{"key":"e_1_2_7_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2017.2713830"},{"key":"e_1_2_7_19_2","doi-asserted-by":"crossref","unstructured":"JakubikJ.andKwa\u015bnickaH. Music emotion analysis using semantic embedding recurrent neural networks 2017 IEEE International Conference on INnovations in Intelligent SysTems and Applications (INISTA) 2017 Gdynia Poland 271\u2013276 IEEEhttps:\/\/doi.org\/10.1109\/INISTA.2017.8001169 2-s2.0-85030220073.","DOI":"10.1109\/INISTA.2017.8001169"},{"key":"e_1_2_7_20_2","unstructured":"TangY. Deep learning using linear support vector machines International Conference on Machine Learning 2013: Challenges in Representation Learning Workshop 2013 Atlanta GA USA."},{"key":"e_1_2_7_21_2","unstructured":"ChoiK. FazekasG. SandlerM. B. andChoK. Transfer learning for music classification and regression tasks Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR 2017) 2017 Suzhou China 141\u2013149."},{"key":"e_1_2_7_22_2","doi-asserted-by":"crossref","unstructured":"GollerC.andKuchlerA. Learning task-dependent distributed representations by backpropagation through structure Proceedings of International Conference on Neural Networks (ICNN\u203296) 1996 Washington DC USA 347\u2013352 https:\/\/doi.org\/10.1109\/ICNN.1996.548916.","DOI":"10.1109\/ICNN.1996.548916"},{"key":"e_1_2_7_23_2","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_2_7_24_2","unstructured":"BahdanauD. ChoK. andBengioY. Neural machine translation by jointly learning to align and translate International Conference on Learning Representations (ICLR 2015) 2015 San Diego CA USA."},{"key":"e_1_2_7_25_2","unstructured":"ChungJ. GulcehreC. ChoK. andBengioY. Empirical evaluation of gated recurrent neural networks on sequence modeling 2014 https:\/\/arxiv.org\/abs\/1412.3555."},{"key":"e_1_2_7_26_2","unstructured":"WuH. MinM. R. andBaiB. Deep semantic embedding Proceedings of Workshop on Semantic Matching in Information Retrieval co-located with the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval (SMIR@SIGIR 2014) 2014 Gold Coast QLD Australia 46\u201352."},{"key":"e_1_2_7_27_2","unstructured":"SongY. DixonS. andPearceM. Evaluation of musical features for emotion classification Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR 2012) 2012 Porto Portugal 523\u2013528."},{"key":"e_1_2_7_28_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2015.03.004"},{"key":"e_1_2_7_29_2","doi-asserted-by":"crossref","unstructured":"SoleymaniM. CaroM. N. SchmidtE. M. ShaC. Y. andYangY. H. 1000 songs for emotional analysis of music Proceedings of the 2nd ACM International Workshop on Crowdsourcing for Multimedia - CrowdMM \u203213 2012 Barcelona Spain https:\/\/doi.org\/10.1145\/2506364.2506365 2-s2.0-84887500129.","DOI":"10.1145\/2506364.2506365"},{"key":"e_1_2_7_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSA.2002.800560"},{"key":"e_1_2_7_31_2","unstructured":"SturmB. L. The GTZAN dataset: its contents its faults their effects on evaluation and its future use 2013 https:\/\/arxiv.org\/abs\/1306.1461."},{"key":"e_1_2_7_32_2","unstructured":"SeyerlehnerK. WidmerG. andSchnitzerD. From rhythm patterns to perceived tempo Proceedings of the 8th International Conference on Music Information Retrieval (ISMIR 2007) 2007 Vienna Austria 519\u2013524."},{"key":"e_1_2_7_33_2","doi-asserted-by":"crossref","unstructured":"JakubikJ. Evaluation of gated recurrent neural networks in music classification tasks Information Systems Architecture and Technology: Proceedings of 38th International Conference on Information Systems Architecture And Technology ISAT 2017 2018 Szklarska Por\u0119ba Poland 27\u201337 Advances in Intelligent Systems and Computing https:\/\/doi.org\/10.1007\/978-3-319-67220-5_3 2-s2.0-85029501887.","DOI":"10.1007\/978-3-319-67220-5_3"},{"key":"e_1_2_7_34_2","unstructured":"ZeilerM. D. ADADELTA: an adaptive learning rate method 2012 https:\/\/arxiv.org\/abs\/1212.5701."},{"key":"e_1_2_7_35_2","unstructured":"Theano Development Team Theano: a python framework for fast computation of mathematical expressions 2016 https:\/\/arxiv.org\/abs\/1605.02688."},{"key":"e_1_2_7_36_2","unstructured":"LartillotO.andToiviainenP. A Matlab toolbox for musical feature extraction from audio International Conference on Digital Audio Effects (DAFX 2018) 2007 Aveiro Portugal 237\u2013244."}],"container-title":["Complexity"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/complexity\/2018\/1935938.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/complexity\/2018\/1935938.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/2018\/1935938","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,8]],"date-time":"2024-08-08T23:05:14Z","timestamp":1723158314000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1155\/2018\/1935938"}},"subtitle":[],"editor":[{"given":"Piotr","family":"J\u0119drzejowicz","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2018,1]]},"references-count":36,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2018,1]]}},"alternative-id":["10.1155\/2018\/1935938"],"URL":"https:\/\/doi.org\/10.1155\/2018\/1935938","archive":["Portico"],"relation":{},"ISSN":["1076-2787","1099-0526"],"issn-type":[{"type":"print","value":"1076-2787"},{"type":"electronic","value":"1099-0526"}],"subject":[],"published":{"date-parts":[[2018,1]]},"assertion":[{"value":"2018-04-19","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-07-04","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-08-01","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"1935938"}}