{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,23]],"date-time":"2025-08-23T05:24:44Z","timestamp":1755926684453,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":44,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,4,25]],"date-time":"2022-04-25T00:00:00Z","timestamp":1650844800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,4,25]]},"DOI":"10.1145\/3487553.3524204","type":"proceedings-article","created":{"date-parts":[[2022,8,16]],"date-time":"2022-08-16T22:41:30Z","timestamp":1660689690000},"page":"41-51","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Multilingual Semantic Sourcing using Product Images for Cross-lingual Alignment"],"prefix":"10.1145","author":[{"given":"Sourab","family":"Mangrulkar","sequence":"first","affiliation":[{"name":"Amazon, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ankith","family":"M S","sequence":"additional","affiliation":[{"name":"Amazon, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vivek","family":"Sembium","sequence":"additional","affiliation":[{"name":"Amazon, India"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,8,16]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3336191.3371852"},{"key":"e_1_3_2_1_2_1","unstructured":"Johannes Bjerva and Robert \u00d6stling. 2017. Cross-lingual Learning of Semantic Textual Similarity with Multilingual Word Representations. In NODALIDA.  Johannes Bjerva and Robert \u00d6stling. 2017. Cross-lingual Learning of Semantic Textual Similarity with Multilingual Word Representations. 
In NODALIDA."},{"key":"e_1_3_2_1_3_1","volume-title":"UNITER: UNiversal Image-TExt Representation Learning. In ECCV.","author":"Chen Yen-Chun","year":"2020","unstructured":"Yen-Chun Chen , Linjie Li , Licheng Yu , Ahmed\u00a0El Kholy , Faisal Ahmed , Zhe Gan , Yu Cheng , and Jingjing Liu . 2020 . UNITER: UNiversal Image-TExt Representation Learning. In ECCV. Yen-Chun Chen, Linjie Li, Licheng Yu, Ahmed\u00a0El Kholy, Faisal Ahmed, Zhe Gan, Yu Cheng, and Jingjing Liu. 2020. UNITER: UNiversal Image-TExt Representation Learning. In ECCV."},{"key":"e_1_3_2_1_4_1","volume-title":"ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. ArXiv abs\/2003.10555(2020).","author":"Clark Kevin","year":"2020","unstructured":"Kevin Clark , Minh-Thang Luong , Quoc\u00a0 V. Le , and Christopher\u00a0 D. Manning . 2020 . ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. ArXiv abs\/2003.10555(2020). Kevin Clark, Minh-Thang Luong, Quoc\u00a0V. Le, and Christopher\u00a0D. Manning. 2020. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. ArXiv abs\/2003.10555(2020)."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"crossref","unstructured":"Alexis Conneau Kartikay Khandelwal Naman Goyal Vishrav Chaudhary Guillaume Wenzek Francisco Guzm\u00e1n Edouard Grave Myle Ott Luke Zettlemoyer and Veselin Stoyanov. 2020. Unsupervised Cross-lingual Representation Learning at Scale. In ACL.  Alexis Conneau Kartikay Khandelwal Naman Goyal Vishrav Chaudhary Guillaume Wenzek Francisco Guzm\u00e1n Edouard Grave Myle Ott Luke Zettlemoyer and Veselin Stoyanov. 2020. Unsupervised Cross-lingual Representation Learning at Scale. In ACL.","DOI":"10.18653\/v1\/2020.acl-main.747"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"crossref","unstructured":"Karan Desai and Justin Johnson. 2021. VirTex: Learning Visual Representations from Textual Annotations. In CVPR.  Karan Desai and Justin Johnson. 2021. 
VirTex: Learning Visual Representations from Textual Annotations. In CVPR.","DOI":"10.1109\/CVPR46437.2021.01101"},{"key":"e_1_3_2_1_7_1","first-page":"4171","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","volume":"1","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, Minneapolis, Minnesota, 4171\u20134186. https:\/\/doi.org\/10.18653\/v1\/N19-1423"},{"key":"e_1_3_2_1_8_1","volume-title":"An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. ICLR","author":"Dosovitskiy Alexey","year":"2021","unstructured":"
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2021. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. ICLR (2021)."},{"key":"e_1_3_2_1_9_1","volume-title":"Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"He Kaiming","year":"2016","unstructured":"Kaiming He , X. Zhang , Shaoqing Ren , and Jian Sun . 2016 . Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016), 770\u2013778. Kaiming He, X. Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016), 770\u2013778."},{"key":"e_1_3_2_1_10_1","unstructured":"Pengcheng He Xiaodong Liu Jianfeng Gao and Weizhu Chen. 2021. DeBERTa: Decoding-enhanced BERT with Disentangled Attention. ArXiv abs\/2006.03654(2021).  Pengcheng He Xiaodong Liu Jianfeng Gao and Weizhu Chen. 2021. DeBERTa: Decoding-enhanced BERT with Disentangled Attention. ArXiv abs\/2006.03654(2021)."},{"key":"e_1_3_2_1_11_1","volume-title":"The 2020 SIGIR Workshop on eCommerce. ACM","author":"Hu Qie","year":"2020","unstructured":"Qie Hu , Hsiang-Fu Yu , Vishnu Narayanan , Ivan Davchev , Rahul Bhagat , and Inderjit\u00a0 S. Dhillon . 2020 . Query transformation for multi-lingual product search . In The 2020 SIGIR Workshop on eCommerce. ACM , San Diego, USA. https:\/\/www.amazon.science\/publications\/query-transformation-for-multi-lingual-product-search Qie Hu, Hsiang-Fu Yu, Vishnu Narayanan, Ivan Davchev, Rahul Bhagat, and Inderjit\u00a0S. Dhillon. 2020. Query transformation for multi-lingual product search. In The 2020 SIGIR Workshop on eCommerce. ACM, San Diego, USA. 
https:\/\/www.amazon.science\/publications\/query-transformation-for-multi-lingual-product-search"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2505515.2505665"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401075"},{"key":"e_1_3_2_1_14_1","volume-title":"3rd International Conference on Learning Representations, ICLR 2015","author":"Kingma Diederik\u00a0P.","year":"2015","unstructured":"Diederik\u00a0P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, Yoshua Bengio and Yann LeCun (Eds.). http:\/\/arxiv.org\/abs\/1412.6980"},{"key":"e_1_3_2_1_15_1","unstructured":"Guillaume Lample and Alexis Conneau. 2019. Cross-lingual Language Model Pretraining. In NeurIPS."},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/E14-1056"},{"key":"e_1_3_2_1_17_1","volume-title":"BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. arXiv preprint arXiv:1910.13461(2019).","author":"Lewis Mike","year":"2019","unstructured":"Mike Lewis , Yinhan Liu , Naman Goyal , Marjan Ghazvininejad , Abdelrahman Mohamed , Omer Levy , Veselin Stoyanov , and Luke Zettlemoyer . 2019 . BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. arXiv preprint arXiv:1910.13461(2019). Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, and Luke Zettlemoyer. 2019. 
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. arXiv preprint arXiv:1910.13461(2019)."},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Gen Li Nan Duan Yuejian Fang Daxin Jiang and Ming Zhou. 2020. Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training. In AAAI.  Gen Li Nan Duan Yuejian Fang Daxin Jiang and Ming Zhou. 2020. Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training. In AAAI.","DOI":"10.1609\/aaai.v34i07.6795"},{"key":"e_1_3_2_1_19_1","unstructured":"Liunian\u00a0Harold Li Mark Yatskar Da Yin Cho-Jui Hsieh and Kai-Wei Chang. 2019. VisualBERT: A Simple and Performant Baseline for Vision and Language. ArXiv abs\/1908.03557(2019).  Liunian\u00a0Harold Li Mark Yatskar Da Yin Cho-Jui Hsieh and Kai-Wei Chang. 2019. VisualBERT: A Simple and Performant Baseline for Vision and Language. ArXiv abs\/1908.03557(2019)."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/N15-1092"},{"key":"e_1_3_2_1_21_1","unstructured":"Yinhan Liu Jiatao Gu Naman Goyal Xian Li Sergey Edunov Marjan Ghazvininejad Mike Lewis and Luke Zettlemoyer. 2020. Multilingual Denoising Pre-training for Neural Machine Translation. (2020). arxiv:2001.08210\u00a0[cs.CL]  Yinhan Liu Jiatao Gu Naman Goyal Xian Li Sergey Edunov Marjan Ghazvininejad Mike Lewis and Luke Zettlemoyer. 2020. Multilingual Denoising Pre-training for Neural Machine Translation. (2020). arxiv:2001.08210\u00a0[cs.CL]"},{"key":"e_1_3_2_1_22_1","unstructured":"Yinhan Liu Myle Ott Naman Goyal Jingfei Du Mandar Joshi Danqi Chen Omer Levy Mike Lewis Luke Zettlemoyer and Veselin Stoyanov. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv preprint arXiv:1907.11692(2019).  Yinhan Liu Myle Ott Naman Goyal Jingfei Du Mandar Joshi Danqi Chen Omer Levy Mike Lewis Luke Zettlemoyer and Veselin Stoyanov. 2019. 
RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv preprint arXiv:1907.11692(2019)."},{"key":"e_1_3_2_1_23_1","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers","author":"Lu Hanqing","year":"2021","unstructured":"Hanqing Lu, Youna Hu, Tong Zhao, Tony Wu, Yiwei Song, and Bing Yin. 2021. Graph-based Multilingual Product Retrieval in E-Commerce Search. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers. Association for Computational Linguistics, Online, 146\u2013153. https:\/\/doi.org\/10.18653\/v1\/2021.naacl-industry.19"},{"key":"e_1_3_2_1_24_1","unstructured":"Jiasen Lu Dhruv Batra Devi Parikh and Stefan Lee. 2019. ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks. In NeurIPS.  Jiasen Lu Dhruv Batra Devi Parikh and Stefan Lee. 2019. ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks. 
In NeurIPS."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330759"},{"volume-title":"PyTorch: An Imperative Style","author":"Paszke Adam","key":"e_1_3_2_1_26_1","unstructured":"Adam Paszke , Sam Gross , Francisco Massa , Adam Lerer , James Bradbury , Gregory Chanan , Trevor Killeen , Zeming Lin , Natalia Gimelshein , Luca Antiga , Alban Desmaison , Andreas Kopf , Edward Yang , Zachary DeVito , Martin Raison , Alykhan Tejani , Sasank Chilamkurthy , Benoit Steiner , Lu Fang , Junjie Bai , and Soumith Chintala . 2019. PyTorch: An Imperative Style , High-Performance Deep Learning Library . In Advances in Neural Information Processing Systems 32, H.\u00a0Wallach, H.\u00a0Larochelle, A.\u00a0Beygelzimer, F.\u00a0d'Alch\u00e9-Buc, E.\u00a0Fox, and R.\u00a0Garnett (Eds.). Curran Associates, Inc., 8024\u20138035. http:\/\/papers.neurips.cc\/paper\/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32, H.\u00a0Wallach, H.\u00a0Larochelle, A.\u00a0Beygelzimer, F.\u00a0d'Alch\u00e9-Buc, E.\u00a0Fox, and R.\u00a0Garnett (Eds.). Curran Associates, Inc., 8024\u20138035. http:\/\/papers.neurips.cc\/paper\/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1493"},{"key":"e_1_3_2_1_28_1","unstructured":"Alec Radford Jong\u00a0Wook Kim Chris Hallacy A. Ramesh Gabriel Goh Sandhini Agarwal Girish Sastry Amanda Askell Pamela Mishkin Jack Clark Gretchen Krueger and Ilya Sutskever. 2021. 
Learning Transferable Visual Models From Natural Language Supervision. In ICML.  Alec Radford Jong\u00a0Wook Kim Chris Hallacy A. Ramesh Gabriel Goh Sandhini Agarwal Girish Sastry Amanda Askell Pamela Mishkin Jack Clark Gretchen Krueger and Ilya Sutskever. 2021. Learning Transferable Visual Models From Natural Language Supervision. In ICML."},{"key":"e_1_3_2_1_29_1","first-page":"1","article-title":"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer","volume":"21","author":"Raffel Colin","year":"2020","unstructured":"Colin Raffel , Noam Shazeer , Adam Roberts , Katherine Lee , Sharan Narang , Michael Matena , Yanqi Zhou , Wei Li , and Peter\u00a0 J. Liu . 2020 . Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer . Journal of Machine Learning Research 21 , 140 (2020), 1 \u2013 67 . http:\/\/jmlr.org\/papers\/v21\/20-074.html Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter\u00a0J. Liu. 2020. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Journal of Machine Learning Research 21, 140 (2020), 1\u201367. http:\/\/jmlr.org\/papers\/v21\/20-074.html","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"crossref","unstructured":"Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. In EMNLP\/IJCNLP.  Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. In EMNLP\/IJCNLP.","DOI":"10.18653\/v1\/D19-1410"},{"key":"e_1_3_2_1_31_1","unstructured":"Victor Sanh Lysandre Debut Julien Chaumond and Thomas Wolf. 2019. DistilBERT a distilled version of BERT: smaller faster cheaper and lighter. ArXiv abs\/1910.01108(2019).  Victor Sanh Lysandre Debut Julien Chaumond and Thomas Wolf. 2019. DistilBERT a distilled version of BERT: smaller faster cheaper and lighter. 
ArXiv abs\/1910.01108(2019)."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-2073"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"crossref","unstructured":"Yelong Shen Xiaodong He Jianfeng Gao Li Deng and Gregoire Mesnil. 2014. A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval. In CIKM. https:\/\/www.microsoft.com\/en-us\/research\/publication\/a-latent-semantic-model-with-convolutional-pooling-structure-for-information-retrieval\/  Yelong Shen Xiaodong He Jianfeng Gao Li Deng and Gregoire Mesnil. 2014. A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval. In CIKM. https:\/\/www.microsoft.com\/en-us\/research\/publication\/a-latent-semantic-model-with-convolutional-pooling-structure-for-information-retrieval\/","DOI":"10.1145\/2661829.2661935"},{"key":"e_1_3_2_1_34_1","unstructured":"Karan Singhal Karthik Raman and Balder ten Cate. 2019. Learning Multilingual Word Embeddings Using Image-Text Data. CoRR abs\/1905.12260(2019). arxiv:1905.12260http:\/\/arxiv.org\/abs\/1905.12260  Karan Singhal Karthik Raman and Balder ten Cate. 2019. Learning Multilingual Word Embeddings Using Image-Text Data. CoRR abs\/1905.12260(2019). arxiv:1905.12260http:\/\/arxiv.org\/abs\/1905.12260"},{"key":"e_1_3_2_1_35_1","volume-title":"VL-BERT: Pre-training of Generic Visual-Linguistic Representations. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=SygXPaEYvH","author":"Su Weijie","year":"2020","unstructured":"Weijie Su , Xizhou Zhu , Yue Cao , Bin Li , Lewei Lu , Furu Wei , and Jifeng Dai . 2020 . VL-BERT: Pre-training of Generic Visual-Linguistic Representations. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=SygXPaEYvH Weijie Su, Xizhou Zhu, Yue Cao, Bin Li, Lewei Lu, Furu Wei, and Jifeng Dai. 2020. VL-BERT: Pre-training of Generic Visual-Linguistic Representations. 
In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=SygXPaEYvH"},{"key":"e_1_3_2_1_36_1","volume-title":"Globetrotter: Unsupervised Multilingual Translation from Visual Alignment. CoRR abs\/2012.04631(2020). arxiv:2012.04631https:\/\/arxiv.org\/abs\/2012.04631","author":"Sur\u00eds D\u00eddac","year":"2020","unstructured":"D\u00eddac Sur\u00eds , Dave Epstein , and Carl Vondrick . 2020 . Globetrotter: Unsupervised Multilingual Translation from Visual Alignment. CoRR abs\/2012.04631(2020). arxiv:2012.04631https:\/\/arxiv.org\/abs\/2012.04631 D\u00eddac Sur\u00eds, Dave Epstein, and Carl Vondrick. 2020. Globetrotter: Unsupervised Multilingual Translation from Visual Alignment. CoRR abs\/2012.04631(2020). arxiv:2012.04631https:\/\/arxiv.org\/abs\/2012.04631"},{"key":"e_1_3_2_1_37_1","volume-title":"LXMERT: Learning Cross-Modality Encoder Representations from Transformers. In EMNLP.","author":"Tan Hao\u00a0Hao","year":"2019","unstructured":"Hao\u00a0Hao Tan and Mohit Bansal . 2019 . LXMERT: Learning Cross-Modality Encoder Representations from Transformers. In EMNLP. Hao\u00a0Hao Tan and Mohit Bansal. 2019. LXMERT: Learning Cross-Modality Encoder Representations from Transformers. In EMNLP."},{"key":"e_1_3_2_1_38_1","first-page":"2579","article-title":"Visualizing High-Dimensional Data Using t-SNE","volume":"9","author":"van der Maaten L.J.P.","year":"2008","unstructured":"L.J.P. van der Maaten and G.E. Hinton . 2008 . Visualizing High-Dimensional Data Using t-SNE . Journal of Machine Learning Research 9 (2008), 2579 \u2013 2605 . L.J.P. van der Maaten and G.E. Hinton. 2008. Visualizing High-Dimensional Data Using t-SNE. 
Journal of Machine Learning Research 9 (2008), 2579\u20132605.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_1_39_1","volume-title":"Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani , Noam Shazeer , Niki Parmar , Jakob Uszkoreit , Llion Jones , Aidan\u00a0 N. Gomez , Lukasz Kaiser , and Illia Polosukhin . 2017 . Attention is All you Need . In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017 , December 4-9, 2017, Long Beach, CA, USA, Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna\u00a0M. Wallach, Rob Fergus, S.\u00a0V.\u00a0N. Vishwanathan, and Roman Garnett (Eds.). 5998\u20136008. https:\/\/proceedings.neurips.cc\/paper\/ 2017\/hash\/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan\u00a0N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna\u00a0M. Wallach, Rob Fergus, S.\u00a0V.\u00a0N. Vishwanathan, and Roman Garnett (Eds.). 5998\u20136008. https:\/\/proceedings.neurips.cc\/paper\/2017\/hash\/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html"},{"key":"e_1_3_2_1_40_1","unstructured":"Thomas Wolf Lysandre Debut Victor Sanh Julien Chaumond Clement Delangue Anthony Moi Pierric Cistac Tim Rault R\u00e9mi Louf Morgan Funtowicz and Jamie Brew. 2019. HuggingFace\u2019s Transformers: State-of-the-art Natural Language Processing. CoRR abs\/1910.03771(2019). 
arxiv:1910.03771http:\/\/arxiv.org\/abs\/1910.03771  Thomas Wolf Lysandre Debut Victor Sanh Julien Chaumond Clement Delangue Anthony Moi Pierric Cistac Tim Rault R\u00e9mi Louf Morgan Funtowicz and Jamie Brew. 2019. HuggingFace\u2019s Transformers: State-of-the-art Natural Language Processing. CoRR abs\/1910.03771(2019). arxiv:1910.03771http:\/\/arxiv.org\/abs\/1910.03771"},{"key":"e_1_3_2_1_41_1","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Xue Linting","year":"2021","unstructured":"Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, and Colin Raffel. 2021. mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Online, 483\u2013498. https:\/\/doi.org\/10.18653\/v1\/2021.naacl-main.41"},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3442381.3449830"},{"key":"e_1_3_2_1_43_1","unstructured":"Yuhao Zhang Hang Jiang Y. Miura Christopher\u00a0D. Manning and C. Langlotz. 2020. Contrastive Learning of Medical Visual Representations from Paired Images and Text. ArXiv abs\/2010.00747(2020).  Yuhao Zhang Hang Jiang Y. Miura Christopher\u00a0D. Manning and C. Langlotz. 2020. Contrastive Learning of Medical Visual Representations from Paired Images and Text. 
ArXiv abs\/2010.00747(2020)."},{"key":"e_1_3_2_1_44_1","volume-title":"ERNIE: Enhanced Language Representation with Informative Entities. In ACL.","author":"Zhang Zhengyan","year":"2019","unstructured":"Zhengyan Zhang , Xu Han , Zhiyuan Liu , Xin Jiang , M. Sun , and Qun Liu . 2019 . ERNIE: Enhanced Language Representation with Informative Entities. In ACL. Zhengyan Zhang, Xu Han, Zhiyuan Liu, Xin Jiang, M. Sun, and Qun Liu. 2019. ERNIE: Enhanced Language Representation with Informative Entities. In ACL."}],"event":{"name":"WWW '22: The ACM Web Conference 2022","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web"],"location":"Virtual Event, Lyon France","acronym":"WWW '22"},"container-title":["Companion Proceedings of the Web Conference 2022"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3487553.3524204","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3487553.3524204","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:30:33Z","timestamp":1750188633000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3487553.3524204"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,4,25]]},"references-count":44,"alternative-id":["10.1145\/3487553.3524204","10.1145\/3487553"],"URL":"https:\/\/doi.org\/10.1145\/3487553.3524204","relation":{},"subject":[],"published":{"date-parts":[[2022,4,25]]},"assertion":[{"value":"2022-08-16","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}