{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:32:09Z","timestamp":1750221129443,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":28,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,6,5]],"date-time":"2019-06-05T00:00:00Z","timestamp":1559692800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100004608","name":"Natural Science Foundation of Jiangsu Province","doi-asserted-by":"publisher","award":["BK20171345"],"award-info":[{"award-number":["BK20171345"]}],"id":[{"id":"10.13039\/501100004608","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61003113,61321491,61672273"],"award-info":[{"award-number":["61003113,61321491,61672273"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,6,5]]},"DOI":"10.1145\/3323873.3325031","type":"proceedings-article","created":{"date-parts":[[2019,6,10]],"date-time":"2019-06-10T12:10:58Z","timestamp":1560168658000},"page":"150-158","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["A Hierarchical Attentive Deep Neural Network Model for Semantic Music Annotation Integrating Multiple Music Representations"],"prefix":"10.1145","author":[{"given":"Qianqian","family":"Wang","sequence":"first","affiliation":[{"name":"Nanjing University, Nanjing, China"}]},{"given":"Feng","family":"Su","sequence":"additional","affiliation":[{"name":"Nanjing University, Nanjing, China"}]},{"given":"Yuyang","family":"Wang","sequence":"additional","affiliation":[{"name":"Nanjing University, Nanjing, China"}]}],"member":"320","published-online":{"date-parts":[[2019,6,5]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Jyh-Shing Roger Jang, and Costas S Iliopoulos","author":"Chang Kaichun K","year":"2010","unstructured":"Kaichun K Chang , Jyh-Shing Roger Jang, and Costas S Iliopoulos . 2010 . Music Genre Classification via Compressive Sampling. In ISMIR. 387--392. Kaichun K Chang, Jyh-Shing Roger Jang, and Costas S Iliopoulos. 2010. Music Genre Classification via Compressive Sampling. In ISMIR. 387--392."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1109\/TASL.2009.2022435","article-title":"On the Use of Anti-Word Models for Audio Music Annotation and Retrieval","volume":"17","author":"Chen Zhi-Sheng","year":"2009","unstructured":"Zhi-Sheng Chen and Jyh-Shing Roger Jang . 2009 . On the Use of Anti-Word Models for Audio Music Annotation and Retrieval . IEEE Transactions on Audio, Speech, and Language Processing , Vol. 17 , 8 (Nov 2009), 1547--1556. Zhi-Sheng Chen and Jyh-Shing Roger Jang. 2009. On the Use of Anti-Word Models for Audio Music Annotation and Retrieval. IEEE Transactions on Audio, Speech, and Language Processing, Vol. 17, 8 (Nov 2009), 1547--1556.","journal-title":"IEEE Transactions on Audio, Speech, and Language Processing"},{"key":"e_1_3_2_1_3_1","volume-title":"Automatic Tagging Using Deep Convolutional Neural Networks. In 17th International Society for Music Information Retrieval Conference .","author":"Choi Keunwoo","year":"2016","unstructured":"Keunwoo Choi , George Fazekas , and Mark Sandler . 2016 . Automatic Tagging Using Deep Convolutional Neural Networks. In 17th International Society for Music Information Retrieval Conference . Keunwoo Choi, George Fazekas, and Mark Sandler. 2016. Automatic Tagging Using Deep Convolutional Neural Networks. In 17th International Society for Music Information Retrieval Conference ."},{"key":"e_1_3_2_1_4_1","volume-title":"Convolutional Recurrent Neural Networks for Music Classification. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2392--2396","author":"Choi Keunwoo","year":"2017","unstructured":"Keunwoo Choi , Gyorgy Fazekas , Mark Sandler , and Kyunghyun Cho . 2017 . Convolutional Recurrent Neural Networks for Music Classification. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2392--2396 . Keunwoo Choi, Gyorgy Fazekas, Mark Sandler, and Kyunghyun Cho. 2017. Convolutional Recurrent Neural Networks for Music Classification. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2392--2396."},{"key":"e_1_3_2_1_5_1","volume-title":"Language Modeling with Gated Convolutional Networks. arXiv preprint arXiv:1612.08083","author":"Dauphin Yann N.","year":"2016","unstructured":"Yann N. Dauphin , Angela Fan , Michael Auli , and David Grangier . 2016. Language Modeling with Gated Convolutional Networks. arXiv preprint arXiv:1612.08083 ( 2016 ). Yann N. Dauphin, Angela Fan, Michael Auli, and David Grangier. 2016. Language Modeling with Gated Convolutional Networks. arXiv preprint arXiv:1612.08083 (2016)."},{"key":"e_1_3_2_1_6_1","volume-title":"Proceedings of the 14th international society for music information retrieval conference. 116--121","author":"Dieleman Sander","year":"2013","unstructured":"Sander Dieleman and Benjamin Schrauwen . 2013 . Multiscale Approaches to Music Audio Feature Learning . In Proceedings of the 14th international society for music information retrieval conference. 116--121 . Sander Dieleman and Benjamin Schrauwen. 2013. Multiscale Approaches to Music Audio Feature Learning. In Proceedings of the 14th international society for music information retrieval conference. 116--121."},{"key":"e_1_3_2_1_7_1","volume-title":"End-To-End Learning for Music Audio. In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 6964--6968","author":"Dieleman Sander","year":"2014","unstructured":"Sander Dieleman and Benjamin Schrauwen . 2014 . End-To-End Learning for Music Audio. In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 6964--6968 . Sander Dieleman and Benjamin Schrauwen. 2014. End-To-End Learning for Music Audio. In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 6964--6968."},{"key":"e_1_3_2_1_8_1","volume-title":"Ng","author":"Grosse Roger","year":"2012","unstructured":"Roger Grosse , Rajat Raina , Helen Kwong , and Andrew Y . Ng . 2012 . Shift-Invariance Sparse Coding for Audio Classification . arXiv preprint arXiv:1206.5241 (2012). Roger Grosse, Rajat Raina, Helen Kwong, and Andrew Y. Ng. 2012. Shift-Invariance Sparse Coding for Audio Classification. arXiv preprint arXiv:1206.5241 (2012)."},{"key":"e_1_3_2_1_9_1","volume-title":"Brains on Beats. In 30th Conference on Neural Information Processing Systems (NIPS","author":"G\u00fcccl\u00fc Umut","year":"2016","unstructured":"Umut G\u00fcccl\u00fc , Jordy Thielen , Michael Hanke , and Marcel A. J . van Gerven. 2016 . Brains on Beats. In 30th Conference on Neural Information Processing Systems (NIPS 2016 ). 2101--2109. Umut G\u00fcccl\u00fc, Jordy Thielen, Michael Hanke, and Marcel A. J. van Gerven. 2016. Brains on Beats. In 30th Conference on Neural Information Processing Systems (NIPS 2016). 2101--2109."},{"key":"e_1_3_2_1_10_1","unstructured":"Philippe Hamel Simon Lemieux Yoshua Bengio and Douglas Eck. 2011. Temporal Pooling and Multiscale Learning for Automatic Annotation and Ranking of Music Audio. In ISMIR. 729--734.  Philippe Hamel Simon Lemieux Yoshua Bengio and Douglas Eck. 2011. Temporal Pooling and Multiscale Learning for Automatic Annotation and Ranking of Music Audio. In ISMIR. 729--734."},{"key":"e_1_3_2_1_11_1","volume-title":"Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 770--778","author":"He Kaiming","year":"2016","unstructured":"Kaiming He , Xiangyu Zhang , Shaoqing Ren , and Jian Sun . 2016 . Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 770--778 . Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 770--778."},{"key":"e_1_3_2_1_12_1","volume-title":"Proceedings of the 32nd International Conference on Machine Learning. 448--456","author":"Ioffe Sergey","year":"2015","unstructured":"Sergey Ioffe and Christian Szegedy . 2015 . Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift . In Proceedings of the 32nd International Conference on Machine Learning. 448--456 . Sergey Ioffe and Christian Szegedy. 2015. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In Proceedings of the 32nd International Conference on Machine Learning. 448--456."},{"key":"e_1_3_2_1_13_1","volume-title":"Sample-Level CNN Architectures for Music Auto-Tagging Using Raw Waveforms. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 366--370","author":"Kim Taejun","year":"2018","unstructured":"Taejun Kim , Jongpil Lee , and Juhan Nam . 2018 . Sample-Level CNN Architectures for Music Auto-Tagging Using Raw Waveforms. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 366--370 . Taejun Kim, Jongpil Lee, and Juhan Nam. 2018. Sample-Level CNN Architectures for Music Auto-Tagging Using Raw Waveforms. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 366--370."},{"volume-title":"Adam: A Method for Stochastic Optimization. In the 3rd International Conference for Learning Representations .","author":"Diederik","key":"e_1_3_2_1_14_1","unstructured":"Diederik P. Kingma and Jimmy Lei Ba. 2015 . Adam: A Method for Stochastic Optimization. In the 3rd International Conference for Learning Representations . Diederik P. Kingma and Jimmy Lei Ba. 2015. Adam: A Method for Stochastic Optimization. In the 3rd International Conference for Learning Representations ."},{"volume-title":"Proceedings of the 10th International Conference on Music Information Retrieval (ISMIR). 387--392","author":"Law Edith","key":"e_1_3_2_1_15_1","unstructured":"Edith Law , Kris West , Michael Mandel , Mert Bay , and J. Stephen Downie . 2009. Evaluation of Algorithms Using Games: The Case of Music Tagging . In Proceedings of the 10th International Conference on Music Information Retrieval (ISMIR). 387--392 . Edith Law, Kris West, Michael Mandel, Mert Bay, and J. Stephen Downie. 2009. Evaluation of Algorithms Using Games: The Case of Music Tagging. In Proceedings of the 10th International Conference on Music Information Retrieval (ISMIR). 387--392."},{"volume-title":"Multi-Level and Multi-Scale Feature Aggregation Using Pretrained Convolutional Neural Networks for Music Auto-Tagging","author":"Lee Jongpil","key":"e_1_3_2_1_16_1","unstructured":"Jongpil Lee and Juhan Nam . 2017. Multi-Level and Multi-Scale Feature Aggregation Using Pretrained Convolutional Neural Networks for Music Auto-Tagging . In IEEE Signal Processing Letters . 1208--1212. Jongpil Lee and Juhan Nam. 2017. Multi-Level and Multi-Scale Feature Aggregation Using Pretrained Convolutional Neural Networks for Music Auto-Tagging. In IEEE Signal Processing Letters. 1208--1212."},{"key":"e_1_3_2_1_17_1","volume-title":"International Conference on Learning Representations .","author":"Lin Zhouhan","year":"2017","unstructured":"Zhouhan Lin , Minwei Feng , Cicero Nogueira dos Santos , Mo Yu , Bing Xiang , Bowen Zhou , and Yoshua Bengio . 2017 . A Structured Self-attentive Sentence Embedding . In International Conference on Learning Representations . Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, and Yoshua Bengio. 2017. A Structured Self-attentive Sentence Embedding. In International Conference on Learning Representations ."},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2964284.2964292"},{"key":"e_1_3_2_1_19_1","volume-title":"A Deep Bag-of-Features Model for Music Auto-Tagging. arXiv preprint arXiv:1508.04999","author":"Nam Juhan","year":"2015","unstructured":"Juhan Nam , Jorge Herrera , and Kyogu Lee . 2015. A Deep Bag-of-Features Model for Music Auto-Tagging. arXiv preprint arXiv:1508.04999 ( 2015 ). Juhan Nam, Jorge Herrera, and Kyogu Lee. 2015. A Deep Bag-of-Features Model for Music Auto-Tagging. arXiv preprint arXiv:1508.04999 (2015)."},{"key":"e_1_3_2_1_20_1","volume-title":"Proceedings of the 13th International Society for Music Information Retrieval Conference. 565--571","author":"Nam Juhan","year":"2012","unstructured":"Juhan Nam , Jorge Herrera , Malcolm Slaney , and Julius Smith . 2012 . Learning Sparse Feature Representations for Music Annotation and Retrieval . In Proceedings of the 13th International Society for Music Information Retrieval Conference. 565--571 . Juhan Nam, Jorge Herrera, Malcolm Slaney, and Julius Smith. 2012. Learning Sparse Feature Representations for Music Annotation and Retrieval. In Proceedings of the 13th International Society for Music Information Retrieval Conference. 565--571."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1631272.1631393"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICMLA.2011.102"},{"key":"e_1_3_2_1_24_1","first-page":"1","article-title":"Dropout: A Simple Way to Prevent Neural Networks from Overfitting","volume":"15","author":"Srivastava Nitish","year":"2014","unstructured":"Nitish Srivastava , Geoffrey Hinton , Alex Krizhevsky , Ilya Sutskever , and Ruslan Salakhutdinov . 2014 . Dropout: A Simple Way to Prevent Neural Networks from Overfitting . The Journal of Machine Learning Research , Vol. 15 , 1 (January 2014), 1929--1958. Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. The Journal of Machine Learning Research, Vol. 15, 1 (January 2014), 1929--1958.","journal-title":"The Journal of Machine Learning Research"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/1743384.1743400"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2007.913750"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSA.2002.800560"},{"key":"e_1_3_2_1_28_1","volume-title":"Conference of the International Society for Music Information Retrieval (ISMIR","author":"van den Oord Aaron","year":"2014","unstructured":"Aaron van den Oord , Sander Dieleman , and Benjamin Schrauwen . 2014 . Transfer Learning by Supervised Pre-Training for Audio-Based Music Classification . In Conference of the International Society for Music Information Retrieval (ISMIR 2014). 29--34. Aaron van den Oord, Sander Dieleman, and Benjamin Schrauwen. 2014. Transfer Learning by Supervised Pre-Training for Audio-Based Music Classification. In Conference of the International Society for Music Information Retrieval (ISMIR 2014). 29--34."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123266.3123327"}],"event":{"name":"ICMR '19: International Conference on Multimedia Retrieval","sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"location":"Ottawa ON Canada","acronym":"ICMR '19"},"container-title":["Proceedings of the 2019 on International Conference on Multimedia Retrieval"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3323873.3325031","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3323873.3325031","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:02:22Z","timestamp":1750208542000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3323873.3325031"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,6,5]]},"references-count":28,"alternative-id":["10.1145\/3323873.3325031","10.1145\/3323873"],"URL":"https:\/\/doi.org\/10.1145\/3323873.3325031","relation":{},"subject":[],"published":{"date-parts":[[2019,6,5]]},"assertion":[{"value":"2019-06-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}