{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T18:31:40Z","timestamp":1774463500794,"version":"3.50.1"},"reference-count":62,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2021,11,29]],"date-time":"2021-11-29T00:00:00Z","timestamp":1638144000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"European Union\u2019s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie","award":["721321"],"award-info":[{"award-number":["721321"]}]},{"name":"Soonchunhyang University Research Fund"},{"DOI":"10.13039\/501100002341","name":"Academy of Finland","doi-asserted-by":"crossref","award":["336033, 315896"],"award-info":[{"award-number":["336033, 315896"]}],"id":[{"id":"10.13039\/501100002341","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100014438","name":"Business Finland","doi-asserted-by":"crossref","award":["884\/31\/2018"],"award-info":[{"award-number":["884\/31\/2018"]}],"id":[{"id":"10.13039\/501100014438","id-type":"DOI","asserted-by":"crossref"}]},{"name":"EU H2020","award":["101016775"],"award-info":[{"award-number":["101016775"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Internet Technol."],"published-print":{"date-parts":[[2022,8,31]]},"abstract":"<jats:p>Self-attention mechanisms have recently been embraced for a broad range of text-matching applications. A self-attention model takes only one sentence as input, with no extra information; i.e., one can utilize the final hidden state or pooling. However, text-matching problems can be interpreted as either symmetrical or asymmetrical. 
For instance, paraphrase detection is a symmetrical task, while textual entailment classification and question-answer matching are considered asymmetrical tasks. In this article, we leverage the attractive properties of the self-attention mechanism and propose an attention-based network that incorporates three key components for inter-sequence attention: global pointwise features, preceding attentive features, and contextual features, while updating the rest of the components. We evaluate our model on two benchmark datasets covering the tasks of textual entailment and question-answer matching. The proposed efficient Self-attention-driven Network for Text Matching (SANTM) outperforms the state of the art on the Stanford Natural Language Inference and WikiQA datasets with far fewer parameters.<\/jats:p>","DOI":"10.1145\/3426971","type":"journal-article","created":{"date-parts":[[2021,11,29]],"date-time":"2021-11-29T23:40:36Z","timestamp":1638229236000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["SANTM: Efficient Self-attention-driven Network for Text Matching"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2851-4260","authenticated-orcid":false,"given":"Prayag","family":"Tiwari","sequence":"first","affiliation":[{"name":"Department of Information Engineering, University of Padova, Padova PD, Italy"}]},{"given":"Amit Kumar","family":"Jaiswal","sequence":"additional","affiliation":[{"name":"Institute for Research in Applicable Computing, University of Bedfordshire, United Kingdom"}]},{"given":"Sahil","family":"Garg","sequence":"additional","affiliation":[{"name":"\u00c9cole de technologie sup\u00e9rieure, Montr\u00e9al, QC H3C 1K3, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0604-3445","authenticated-orcid":false,"given":"Ilsun","family":"You","sequence":"additional","affiliation":[{"name":"Department of Information Security Engineering, 
Soonchunhyang University, Asan 31538, South Korea"}]}],"member":"320","published-online":{"date-parts":[[2021,11,29]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"crossref","unstructured":"Tamer Alkhouli Gabriel Bretschner and Hermann Ney. 2018. On the alignment problem in multi-head attention-based neural machine translation. In Proceedings of the Third Conference on Machine Translation: Research Papers . 177\u2013185.","DOI":"10.18653\/v1\/W18-6318"},{"key":"e_1_3_2_3_2","unstructured":"Jimmy Lei Ba Jamie Ryan Kiros and Geoffrey E. Hinton. 2016. Layer normalization. arXiv:1607.06450. Retrieved from https:\/\/arxiv.org\/abs\/1607.06450."},{"key":"e_1_3_2_4_2","doi-asserted-by":"crossref","unstructured":"Samuel R. Bowman Gabor Angeli Christopher Potts and Christopher D. Manning. 2015. A large annotated corpus for learning natural language inference. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing . 632\u2013642.","DOI":"10.18653\/v1\/D15-1075"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1171"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1224"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1152"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1053"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2078186"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/MNET.2018.1700286"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/MNET.2019.1800239"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1132"},{"key":"e_1_3_2_13_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Gong Yichen","year":"2018","unstructured":"Yichen Gong, Heng Luo, and Jian Zhang. 2018. Natural language inference over interaction space. 
In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N16-1108"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/MCOM.2018.1700622"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33016586"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/TWC.2019.2946140"},{"key":"e_1_3_2_19_2","article-title":"A distant supervision method based on paradigmatic relations for learning word embeddings","author":"Li Jianquan","year":"2020","unstructured":"Jianquan Li, Renfen Hu, Xiaokang Liu, Prayag Tiwari, Hari Mohan Pandey, Wei Chen, Benyou Wang, Yaohong Jin, and Kaicheng Yang. 2020. A distant supervision method based on paradigmatic relations for learning word embeddings. Neural Comput. Appl. 32, 12 (2020), 7759\u20137768.","journal-title":"Neural Comput. Appl."},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.5555\/1614025.1614027"},{"key":"e_1_3_2_21_2","unstructured":"Xiaodong Liu Pengcheng He Weizhu Chen and Jianfeng Gao. 2019. Multi-task deep neural networks for natural language understanding. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics . 4487\u20134496."},{"key":"e_1_3_2_22_2","unstructured":"Yang Liu Chengjie Sun Lei Lin and Xiaolong Wang. 2016. Learning natural language inference using bidirectional LSTM model and inner-attention. arXiv:1605.09090. 
Retrieved from https:\/\/arxiv.org\/abs\/1605.09090."},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1405"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.5555\/3045390.3045573"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1147"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-2022"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/E17-1038"},{"key":"e_1_3_2_28_2","doi-asserted-by":"crossref","unstructured":"Boyuan Pan Yazheng Yang Zhou Zhao Yueting Zhuang Deng Cai and Xiaofei He. 2019. Discourse marker augmented network with reinforcement learning for natural language inference. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) . 989\u2013999.","DOI":"10.18653\/v1\/P18-1091"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1244"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1202"},{"key":"e_1_3_2_31_2","unstructured":"Silvia Quarteroni and Suresh Manandhar. 2007. A chatbot-based interactive question answering system. Decalog. In Proceedings of the 11th Workshop on the Semantics and Pragmatics of Dialogue . 83\u201390."},{"key":"e_1_3_2_32_2","unstructured":"Alec Radford Karthik Narasimhan Tim Salimans and Ilya Sutskever. [n. d.]. Improving language understanding by generative pre-training."},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1264"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.5555\/3044805.3045035"},{"key":"e_1_3_2_35_2","unstructured":"Tim Rockt\u00e4schel Edward Grefenstette Karl Moritz Hermann Tom\u00e1\u0161 Ko\u010disk\u1ef3 and Phil Blunsom. 2015. Reasoning about entailment with neural attention. arXiv:1509.06664. 
Retrieved from https:\/\/arxiv.org\/abs\/1509.06664."},{"key":"e_1_3_2_36_2","unstructured":"Cicero dos Santos Ming Tan Bing Xiang and Bowen Zhou. 2016. Attentive pooling networks. arXiv:1602.03609. Retrieved from https:\/\/arxiv.org\/abs\/1602.03609."},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/2766462.2767738"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-04221-9_46"},{"key":"e_1_3_2_39_2","doi-asserted-by":"crossref","unstructured":"Peter Shaw Jakob Uszkoreit and Ashish Vaswani. 2018. Self-attention with relative position representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies Volume 2 (Short Papers) . 464\u2013468.","DOI":"10.18653\/v1\/N18-2074"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D17-1122"},{"key":"e_1_3_2_41_2","unstructured":"Rupesh Kumar Srivastava Klaus Greff and J\u00fcrgen Schmidhuber. 2015. Highway Networks. arXiv preprint arXiv:1505.00387."},{"key":"e_1_3_2_42_2","unstructured":"Kai Sheng Tai Richard Socher and Christopher D. Manning. 2015. Improved semantic representations from tree-structured long short-term memory networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) . 
1556\u20131566."},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.5555\/3304222.3304383"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1185"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/3159652.3159664"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1145\/3269206.3269304"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.330110051"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2904624"},{"key":"e_1_3_2_49_2","first-page":"1","article-title":"TermInformer: Unsupervised term mining and analysis in biomedical literature","author":"Tiwari Prayag","year":"2020","unstructured":"Prayag Tiwari, Sagar Uprety, Shahram Dehdashti, and M Shamim Hossain. 2020. TermInformer: Unsupervised term mining and analysis in biomedical literature. Neural Comput. Appl. (2020), 1\u201314.","journal-title":"Neural Comput. Appl."},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295349"},{"key":"e_1_3_2_51_2","unstructured":"Ivan Vendrov Ryan Kiros Sanja Fidler and Raquel Urtasun. 2015. Order-embeddings of images and language. arXiv:1511.06361. Retrieved from https:\/\/arxiv.org\/abs\/1511.06361."},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2019.105913"},{"key":"e_1_3_2_53_2","first-page":"22","volume-title":"Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL\u201907)","author":"Wang Mengqiu","year":"2007","unstructured":"Mengqiu Wang, Noah A. Smith, and Teruko Mitamura. 2007. What is the Jeopardy model? A quasi-synchronous grammar for QA. In Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL\u201907). 
22\u201332."},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N16-1170"},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.5555\/3171837.3171865"},{"key":"e_1_3_2_56_2","first-page":"1340","volume-title":"Proceedings of the 26th International Conference on Computational Linguistics: Technical Papers (COLING\u201916)","author":"Wang Zhiguo","year":"2016","unstructured":"Zhiguo Wang, Haitao Mi, and Abraham Ittycheriah. 2016. Sentence similarity learning by lexical decomposition and composition. In Proceedings of the 26th International Conference on Computational Linguistics: Technical Papers (COLING\u201916). 1340\u20131349."},{"key":"e_1_3_2_57_2","doi-asserted-by":"crossref","unstructured":"Runqi Yang Jianhai Zhang Xing Gao Feng Ji and Haiqing Chen. 2019. Simple and effective text matching with richer alignment features. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics . 4699\u20134709.","DOI":"10.18653\/v1\/P19-1465"},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D15-1237"},{"key":"e_1_3_2_59_2","unstructured":"Zhilin Yang Bhuwan Dhingra Ye Yuan Junjie Hu William W. Cohen and Ruslan Salakhutdinov. 2016. Words or characters? fine-grained gating for reading comprehension. arXiv:1611.01724. Retrieved from https:\/\/arxiv.org\/abs\/1611.01724."},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3358148"},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10590-1_53"},{"key":"e_1_3_2_62_2","unstructured":"Zhuosheng Zhang Yuwei Wu Zuchao Li Shexia He Hai Zhao Xi Zhou and Xiang Zhou. 2018. I know what you want: Semantic learning for text comprehension. arXiv:1809.02794. Retrieved from https:\/\/arxiv.org\/abs\/1809.02794."},{"key":"e_1_3_2_63_2","unstructured":"A. Rakhlin. 2016. Convolutional neural networks for sentence classification. 
GitHub ."}],"container-title":["ACM Transactions on Internet Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3426971","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3426971","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:02:24Z","timestamp":1750197744000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3426971"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11,29]]},"references-count":62,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,8,31]]}},"alternative-id":["10.1145\/3426971"],"URL":"https:\/\/doi.org\/10.1145\/3426971","relation":{},"ISSN":["1533-5399","1557-6051"],"issn-type":[{"value":"1533-5399","type":"print"},{"value":"1557-6051","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,11,29]]},"assertion":[{"value":"2020-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-09-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-11-29","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}