{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T14:38:47Z","timestamp":1774449527746,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":48,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,7,6]],"date-time":"2022-07-06T00:00:00Z","timestamp":1657065600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"The 13th Five-Year All-Army Common Information System Equipment Pre-Research Project","award":["31514020501, 31514020503"],"award-info":[{"award-number":["31514020501, 31514020503"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,7,6]]},"DOI":"10.1145\/3477495.3531867","type":"proceedings-article","created":{"date-parts":[[2022,7,7]],"date-time":"2022-07-07T15:12:13Z","timestamp":1657206733000},"page":"938-948","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":39,"title":["Multimodal Entity Linking with Gated Hierarchical Fusion and Contrastive Training"],"prefix":"10.1145","author":[{"given":"Peng","family":"Wang","sequence":"first","affiliation":[{"name":"Southeast University, Nanjing, China"}]},{"given":"Jiangheng","family":"Wu","sequence":"additional","affiliation":[{"name":"Southeast University, Nanjing, China"}]},{"given":"Xiaohang","family":"Chen","sequence":"additional","affiliation":[{"name":"Southeast University, Nanjing, China"}]}],"member":"320","published-online":{"date-parts":[[2022,7,7]]},"reference":[{"key":"e_1_3_2_2_1_1","volume-title":"Proceedings of the 12th Conference on Language Resources and Evaluation. 4285--4292","author":"Adjali Omar","year":"2020","unstructured":"Omar Adjali, Romaric Besan\u00e7on, Olivier Ferret, 2020 a.
Building a Multimodal Entity Linking Dataset From Tweets. In Proceedings of the 12th Conference on Language Resources and Evaluation. 4285--4292. Omar Adjali, Romaric Besan\u00e7on, Olivier Ferret, et al. 2020 a. Building a Multimodal Entity Linking Dataset From Tweets. In Proceedings of the 12th Conference on Language Resources and Evaluation. 4285--4292."},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-45439-5_31"},{"key":"e_1_3_2_2_3_1","volume-title":"Abdulmotaleb El Saddik, et al.","author":"Atrey Pradeep K","year":"2010","unstructured":"Pradeep K Atrey, M Anwar Hossain, Abdulmotaleb El Saddik, et al. 2010. Multimodal Fusion for Multimedia Analysis: A Survey. Multimedia systems, Vol. 16 (2010), 345--379."},{"key":"e_1_3_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2684822.2685317"},{"key":"e_1_3_2_2_5_1","volume-title":"Improved baselines with momentum contrastive learning. arXiv preprint","author":"Chen Xinlei","year":"2020","unstructured":"Xinlei Chen, Haoqi Fan, Ross B. Girshick, and Kaiming He. 2020. Improved baselines with momentum contrastive learning. arXiv preprint (2020), 2003.04297."},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00129"},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v31i1.11155"},{"key":"e_1_3_2_2_8_1","volume-title":"Autoregressive Entity Retrieval. In International Conference on Learning Representations (ICLR).","author":"Cao Nicola De","year":"2021","unstructured":"Nicola De Cao, Gautier Izacard, Sebastian Riedel, and Fabio Petroni. 2021. Autoregressive Entity Retrieval.
In International Conference on Learning Representations (ICLR)."},{"key":"e_1_3_2_2_9_1","volume-title":"Proceedings of the 17th Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 4171--4186","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, et al. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 17th Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 4171--4186."},{"key":"e_1_3_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/775152.775178"},{"key":"e_1_3_2_2_11_1","volume-title":"Jan-Willem JR van't Klooster, et al.","author":"Dolmans Tenzing C","year":"2020","unstructured":"Tenzing C Dolmans, Mannes Poel, Jan-Willem JR van't Klooster, et al. 2020. Perceived Mental Workload Classification Using Intermediate Fusion Multimodal Deep Learning. Frontiers in human neuroscience, Vol. 14 (2020), 609096."},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/K17-1008"},{"key":"e_1_3_2_2_13_1","volume-title":"Proceedings of the 26th International Conference on Neural Information Processing Systems.
2121--2129","author":"Frome Andrea","year":"2013","unstructured":"Andrea Frome, Greg S Corrado, Jonathon Shlens, et al. 2013. DeViSE: A Deep Visual-semantic Embedding Model. In Proceedings of the 26th International Conference on Neural Information Processing Systems. 2121--2129."},{"key":"e_1_3_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.23919\/FUSION45008.2020.9190246"},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3474085.3475400"},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D17-1277"},{"key":"e_1_3_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2970398.2970406"},{"key":"e_1_3_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2396761.2396832"},{"key":"e_1_3_2_2_20_1","volume-title":"Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing. 782--792","author":"Hoffart Johannes","year":"2011","unstructured":"Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino, et al. 2011. Robust Disambiguation of Named Entities in Text. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing. 782--792."},{"key":"e_1_3_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1557019.1557073"},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1148"},{"key":"e_1_3_2_2_23_1","unstructured":"I. Loshchilov and F. Hutter. 2017. Fixing Weight Decay Regularization in Adam. CoRR (2017).
http:\/\/arxiv.org\/abs\/1711.05101"},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.5555\/3157096.3157129"},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/1321440.1321475"},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1458082.1458150"},{"key":"e_1_3_2_2_27_1","volume-title":"Proceedings of the 27th International Conference on Neural Information Processing Systems. 2204--2212","author":"Mnih Volodymyr","year":"2014","unstructured":"Volodymyr Mnih, Nicolas Heess, Alex Graves, et al. 2014. Recurrent Models of Visual Attention. In Proceedings of the 27th International Conference on Neural Information Processing Systems. 2204--2212."},{"key":"e_1_3_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1186"},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/s13735-021-00207-4"},{"key":"e_1_3_2_2_30_1","volume-title":"Proceedings of the 28th International Conference on Machine Learning. 689--696","author":"Ngiam Jiquan","year":"2011","unstructured":"Jiquan Ngiam, Aditya Khosla, Mingyu Kim, et al. 2011. Multimodal Deep Learning. In Proceedings of the 28th International Conference on Machine Learning.
689--696."},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1162"},{"key":"e_1_3_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3178876.3186012"},{"key":"e_1_3_2_2_33_1","volume-title":"Proceedings of the 9th International Conference on Language Resources and Evaluation. 3529--3533","author":"R\u00f6der Michael","year":"2014","unstructured":"Michael R\u00f6der, Ricardo Usbeck, Sebastian Hellmann, et al. 2014. N$^3$-A Collection of Datasets for Named Entity Recognition and Disambiguation in the NLP Interchange Format. In Proceedings of the 9th International Conference on Language Resources and Evaluation. 3529--3533."},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33018876"},{"key":"e_1_3_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2014.2327028"},{"key":"e_1_3_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1101149.1101236"},{"key":"e_1_3_2_2_37_1","volume-title":"Proceedings of the 25th International Conference on Neural Information Processing Systems. 2222--2230","author":"Srivastava Nitish","year":"2012","unstructured":"Nitish Srivastava and Ruslan Salakhutdinov. 2012. Multimodal Learning with Deep Boltzmann Machines. In Proceedings of the 25th International Conference on Neural Information Processing Systems.
2222--2230."},{"key":"e_1_3_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1656"},{"key":"e_1_3_2_2_39_1","volume-title":"Proceedings of the 31st Neural Information Processing Systems. 5998--6008","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, et al. 2017. Attention is All you Need. In Proceedings of the 31st Neural Information Processing Systems. 5998--6008."},{"key":"e_1_3_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2629489"},{"key":"e_1_3_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.bdr.2020.100159"},{"key":"e_1_3_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33017216"},{"key":"e_1_3_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W18-3309"},{"key":"e_1_3_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.519"},{"key":"e_1_3_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3413650"},{"key":"e_1_3_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P15-1128"},{"key":"e_1_3_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-73197-7_35"},{"key":"e_1_3_2_2_48_1","volume-title":"Proceedings of the 32nd AAAI Conference on Artificial Intelligence. 5674--5681","author":"Zhang Qi","year":"2018","unstructured":"Qi Zhang, Jinlan Fu, Xiaoyu Liu, et al. 2018. Adaptive Co-attention Network for Named Entity Recognition in Tweets. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence.
5674--5681."}],"event":{"name":"SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval","location":"Madrid Spain","acronym":"SIGIR '22","sponsor":["SIGIR ACM Special Interest Group on Information Retrieval"]},"container-title":["Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3477495.3531867","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3477495.3531867","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:10:27Z","timestamp":1750183827000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3477495.3531867"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,6]]},"references-count":48,"alternative-id":["10.1145\/3477495.3531867","10.1145\/3477495"],"URL":"https:\/\/doi.org\/10.1145\/3477495.3531867","relation":{},"subject":[],"published":{"date-parts":[[2022,7,6]]},"assertion":[{"value":"2022-07-07","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}