{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:25:37Z","timestamp":1750220737195,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":18,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,5,28]],"date-time":"2020-05-28T00:00:00Z","timestamp":1590624000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"The Science, Technology and Innovation Commission of Shenzhen Municipality","award":["No. JCYJ20180306170414910"],"award-info":[{"award-number":["No. JCYJ20180306170414910"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,5,28]]},"DOI":"10.1145\/3404716.3404726","type":"proceedings-article","created":{"date-parts":[[2020,7,8]],"date-time":"2020-07-08T10:18:17Z","timestamp":1594203497000},"page":"86-91","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["U Recurrent Neural Network for Polyphonic Sound Event Detection and Localization"],"prefix":"10.1145","author":[{"given":"Lihong","family":"Pi","sequence":"first","affiliation":[{"name":"Tsinghua University, The Institute of Microelectronics, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xue","family":"Zheng","sequence":"additional","affiliation":[{"name":"Tsinghua University, The Institute of Microelectronics, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chun","family":"Zhang","sequence":"additional","affiliation":[{"name":"Tsinghua University, The Institute of Microelectronics, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ping","family":"Chen","sequence":"additional","affiliation":[{"name":"Beijing Yiemed Medical Technology Co., 
Ltd, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhe","family":"Wang","sequence":"additional","affiliation":[{"name":"Beijing Sanping Technology Co., Ltd, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiangyu","family":"Li","sequence":"additional","affiliation":[{"name":"Research Institute of Tsinghua University, Shenzhen, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,7,8]]},"reference":[{"unstructured":"http:\/\/dcase.community\/challenge2019\/task-sound-event-localization-and-detection.","key":"e_1_3_2_1_1_1"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_2_1","DOI":"10.1109\/JSTSP.2018.2885636"},{"key":"e_1_3_2_1_3_1","volume-title":"DCASE 2016 sound event detection system based on convolutional neural network. IEEE AASP Challenge: Detection and Classification of Acoustic Scenes and Events.","author":"Gorin A.","year":"2016","unstructured":"Gorin, A., Makhazhanov, N., & Shmyrev, N. (2016). DCASE 2016 sound event detection system based on convolutional neural network. IEEE AASP Challenge: Detection and Classification of Acoustic Scenes and Events."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_4_1","DOI":"10.1186\/s13636-015-0069-2"},
{"key":"e_1_3_2_1_5_1","volume-title":"2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 2742--2746)","author":"Wang Y.","year":"2016","unstructured":"Wang, Y., Neves, L., & Metze, F. (2016, March). Audio-based multimedia event detection using deep recurrent neural networks. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 2742--2746). IEEE."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_6_1","DOI":"10.1109\/ICASSP.2016.7472917"},{"key":"e_1_3_2_1_7_1","volume-title":"Sound event detection in multichannel audio using spatial and harmonic features. arXiv preprint arXiv:1706.02293","author":"Adavanne S.","year":"2017","unstructured":"Adavanne, S., Parascandolo, G., Pertil\u00e4, P., Heittola, T., & Virtanen, T. (2017). Sound event detection in multichannel audio using spatial and harmonic features. arXiv preprint arXiv:1706.02293."},
{"key":"e_1_3_2_1_8_1","first-page":"35","volume-title":"Proceedings of the Detection and Classification of Acoustic Scenes and Events 2016 Workshop (DCASE2016)","author":"Hayashi T.","year":"2016","unstructured":"Hayashi, T., Watanabe, S., Toda, T., Hori, T., Le Roux, J., & Takeda, K. (2016, September). Bidirectional LSTM-HMM hybrid system for polyphonic sound event detection. In Proceedings of the Detection and Classification of Acoustic Scenes and Events 2016 Workshop (DCASE2016) (pp. 35--39)."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_9_1","DOI":"10.1109\/TASLP.2017.2690575"},{"key":"e_1_3_2_1_10_1","volume-title":"Advances in neural information processing systems (pp. 91--99).","author":"Ren S.","year":"2015","unstructured":"Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems (pp. 91--99)."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_11_1","DOI":"10.1007\/978-3-319-24574-4_28"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_12_1","DOI":"10.1109\/TASSP.1976.1162830"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_13_1","DOI":"10.5555\/3298023.3298188"},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_14_1","DOI":"10.1109\/CVPR.2015.7298594"},
{"key":"e_1_3_2_1_15_1","volume-title":"Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167","author":"Ioffe S.","year":"2015","unstructured":"Ioffe, S., & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167."},{"key":"e_1_3_2_1_16_1","volume-title":"A stochastic approximation method. The annals of mathematical statistics, 400--407","author":"Robbins H.","year":"1951","unstructured":"Robbins, H., & Monro, S. (1951). A stochastic approximation method. The annals of mathematical statistics, 400--407."},{"key":"e_1_3_2_1_17_1","volume-title":"Polyphonic sound event detection and localization using a two-stage strategy. arXiv preprint arXiv:1905.00268","author":"Cao Y.","year":"2019","unstructured":"Cao, Y., Kong, Q., Iqbal, T., An, F., Wang, W., & Plumbley, M. D. (2019). Polyphonic sound event detection and localization using a two-stage strategy. arXiv preprint arXiv:1905.00268."},{"doi-asserted-by":"publisher","key":"e_1_3_2_1_18_1","DOI":"10.21437\/Interspeech.2017-1238"}],"event":{"sponsor":["Shenzhen University"],"acronym":"ICMSSP 2020","name":"ICMSSP 2020: 2020 5th International Conference on Multimedia Systems and Signal Processing","location":"Chengdu China"},"container-title":["Proceedings of the 2020 5th International Conference on Multimedia Systems and Signal 
Processing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3404716.3404726","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3404716.3404726","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:38:32Z","timestamp":1750199912000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3404716.3404726"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,5,28]]},"references-count":18,"alternative-id":["10.1145\/3404716.3404726","10.1145\/3404716"],"URL":"https:\/\/doi.org\/10.1145\/3404716.3404726","relation":{},"subject":[],"published":{"date-parts":[[2020,5,28]]},"assertion":[{"value":"2020-07-08","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}