{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,8]],"date-time":"2025-12-08T22:35:43Z","timestamp":1765233343101,"version":"3.41.0"},"reference-count":45,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2023,2,20]],"date-time":"2023-02-20T00:00:00Z","timestamp":1676851200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"US National Science Foundation","doi-asserted-by":"crossref","award":["IIS-1838730"],"award-info":[{"award-number":["IIS-1838730"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2023,4,30]]},"abstract":"<jats:p>\n            Recent advancements in deep learning techniques have transformed the area of semantic text matching (STM). However, most state-of-the-art models are designed to operate with\n            <jats:italic>short<\/jats:italic>\n            documents such as tweets, user reviews, comments, and so on. These models have fundamental limitations when applied to long-form documents such as scientific papers, legal documents, and patents. When handling such long documents, there are three primary challenges: (i) the presence of different contexts for the same word throughout the document, (ii) small sections of contextually similar text between two documents, but dissimilar text in the remaining parts (this defies the basic understanding of \u201csimilarity\u201d), and (iii) the coarse nature of a single global similarity measure which fails to capture the heterogeneity of the document content. In this article, we describe\n            <jats:bold>CoLDE<\/jats:bold>\n            :\n            <jats:bold>Co<\/jats:bold>\n            ntrastive\n            <jats:bold>L<\/jats:bold>\n            ong\n            <jats:bold>D<\/jats:bold>\n            ocument\n            <jats:bold>E<\/jats:bold>\n            ncoder\u2014a transformer-based framework that addresses these challenges and allows for interpretable comparisons of long documents. CoLDE uses unique positional embeddings and a multi-headed chunkwise attention layer in conjunction with a supervised contrastive learning framework to capture similarity at three different levels: (i) high-level similarity scores between a pair of documents, (ii) similarity scores between different sections within and across documents, and (iii) similarity scores between different\n            <jats:italic>chunks<\/jats:italic>\n            in the same document and across other documents. These fine-grained similarity scores aid in better interpretability. We evaluate CoLDE on three long document datasets namely, ACL Anthology publications, Wikipedia articles, and USPTO patents. Besides outperforming the state-of-the-art methods on the document matching task, CoLDE is also robust to changes in document length and text perturbations and provides interpretable results. 
The code for the proposed model is publicly available at\n            <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"url\" xlink:href=\"https:\/\/github.com\/InterDigitalInc\/CoLDE\">https:\/\/github.com\/InterDigitalInc\/CoLDE<\/jats:ext-link>\n            .\n          <\/jats:p>","DOI":"10.1145\/3542822","type":"journal-article","created":{"date-parts":[[2022,6,11]],"date-time":"2022-06-11T22:42:51Z","timestamp":1654987371000},"page":"1-17","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Supervised Contrastive Learning for Interpretable Long-Form Document Matching"],"prefix":"10.1145","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2330-6806","authenticated-orcid":false,"given":"Akshita","family":"Jha","sequence":"first","affiliation":[{"name":"Virginia Tech, Arlington, VA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7586-0257","authenticated-orcid":false,"given":"Vineeth","family":"Rakesh","sequence":"additional","affiliation":[{"name":"InterDigital, Los Altos, CA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4615-7487","authenticated-orcid":false,"given":"Jaideep","family":"Chandrashekar","sequence":"additional","affiliation":[{"name":"InterDigital, Los Altos, CA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9232-6395","authenticated-orcid":false,"given":"Adithya","family":"Samavedhi","sequence":"additional","affiliation":[{"name":"Virginia Tech, Arlington, VA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2839-3662","authenticated-orcid":false,"given":"Chandan K.","family":"Reddy","sequence":"additional","affiliation":[{"name":"Virginia Tech, Arlington, VA"}]}],"member":"320","published-online":{"date-parts":[[2023,2,20]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"Ashutosh Adhikari Achyudh Ram Raphael Tang and Jimmy Lin. 2019. Docbert: Bert for document classification. arXiv:1904.08398. Retrieved from https:\/\/arxiv.org\/abs\/1904.08398."},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D17-1042"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1177"},{"key":"e_1_3_2_5_2","unstructured":"Iz Beltagy Matthew E. Peters and Arman Cohan. 2020. Longformer: The long-document transformer. arXiv:2004.05150. Retrieved from https:\/\/arxiv.org\/abs\/2004.05150."},{"key":"e_1_3_2_6_2","volume-title":"Proceedings of the ICLR (Poster)","author":"Chen Minmin","year":"2017","unstructured":"Minmin Chen. 2017. Efficient vector representation for documents through corruption. In Proceedings of the ICLR (Poster)."},{"key":"e_1_3_2_7_2","first-page":"1597","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Chen Ting","year":"2020","unstructured":"Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. 2020. A simple framework for contrastive learning of visual representations. In Proceedings of the International Conference on Machine Learning. 1597\u20131607."},{"key":"e_1_3_2_8_2","unstructured":"Rewon Child Scott Gray Alec Radford and Ilya Sutskever. 2019. Generating long sequences with sparse transformers. arXiv:1904.10509. Retrieved from https:\/\/arxiv.org\/abs\/1904.10509."},{"key":"e_1_3_2_9_2","unstructured":"Krzysztof Choromanski Valerii Likhosherstov David Dohan Xingyou Song Andreea Gane Tamas Sarlos Peter Hawkins Jared Davis Afroz Mohiuddin Lukasz Kaiser David Belanger Lucy Colwell and Adrian Weller. 2020. Rethinking attention with performers. 
In Proceedings of the International Conference on Learning Representations ."},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1285"},{"key":"e_1_3_2_11_2","first-page":"4171","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 4171\u20134186."},{"key":"e_1_3_2_12_2","first-page":"12792","volume-title":"Advances in Neural Information Processing Systems","author":"Ding Ming","year":"2020","unstructured":"Ming Ding, Chang Zhou, Hongxia Yang, and Jie Tang. 2020. CogLTX: Applying BERT to Long Texts. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (Eds.). Vol. 33. Curran Associates, Inc., 12792\u201312804. https:\/\/proceedings.neurips.cc\/paper\/2020\/file\/96671501524948bc3937b4b30d0e57b9Paper.pdf."},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.552"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1537"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.acl-long.72"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN.2005.1556215"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/2983323.2983769"},{"key":"e_1_3_2_18_2","unstructured":"Jonathan Ho Nal Kalchbrenner Dirk Weissenborn and Tim Salimans. 2019. Axial attention in multidimensional transformers. arXiv:1912.12180. Retrieved from https:\/\/arxiv.org\/abs\/1912.12180."},{"key":"e_1_3_2_19_2","first-page":"2042","volume-title":"Advances in Neural Information Processing Systems","author":"Hu Baotian","year":"2014","unstructured":"Baotian Hu, Zhengdong Lu, Hang Li, and Qingcai Chen. 2014. Convolutional Neural Network Architectures for Matching Natural Language Sentences. In Advances in Neural Information Processing Systems, Z. Ghahramani, M. Welling, C. Cortes, N. Lawrence, and K. Q. Weinberger (Eds.). Vol. 27. Curran Associates, Inc. 2042\u20132050. https:\/\/proceedings.neurips.cc\/paper\/2014\/file\/b9d487a30398d42ecff55c228ed5652b-Paper.pdf."},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/2505515.2505665"},{"key":"e_1_3_2_21_2","first-page":"3543","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.","author":"Jain Sarthak","year":"2019","unstructured":"Sarthak Jain and Byron C. Wallace. 2019. Attention is not explanation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.3543\u20133556."},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/3308558.3313707"},{"key":"e_1_3_2_23_2","first-page":"5156","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Katharopoulos Angelos","year":"2020","unstructured":"Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, and Fran\u00e7ois Fleuret. 2020. Transformers are RNNs: Fast autoregressive transformers with linear attention. 
In Proceedings of the International Conference on Machine Learning. 5156\u20135165."},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/2806416.2806475"},{"key":"e_1_3_2_25_2","first-page":"18661","volume-title":"Advances in Neural Information Processing Systems","author":"Khosla Prannay","year":"2020","unstructured":"Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, and Dilip Krishnan. 2020. Supervised Contrastive Learning. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (Eds.). Vol. 33. Curran Associates, Inc., 18661\u201318673. https:\/\/proceedings.neurips.cc\/paper\/2020\/file\/d89a66c7c80a29b1bdbab0f2a1a94af8-Paper.pdf."},{"key":"e_1_3_2_26_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Kitaev Nikita","year":"2019","unstructured":"Nikita Kitaev, Lukasz Kaiser, and Anselm Levskaya. 2019. Reformer: The efficient transformer. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_27_2","first-page":"1188","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Le Quoc","year":"2014","unstructured":"Quoc Le and Tomas Mikolov. 2014. Distributed representations of sentences and documents. In Proceedings of the International Conference on Machine Learning. 1188\u20131196."},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1500"},{"key":"e_1_3_2_29_2","unstructured":"Dongsheng Luo Wei Cheng Jingchao Ni Wenchao Yu Xuchao Zhang Bo Zong Yanchi Liu Zhengzhang Chen Dongjin Song Haifeng Chen and Xiang Zhang. 2021. Unsupervised document embedding via contrastive augmentation. arXiv:2103.14542. Retrieved from https:\/\/arxiv.org\/abs\/2103.14542."},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/3038912.3052579"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v30i1.10341"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58545-7_19"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.findings-emnlp.232"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10579-012-9211-2"},{"key":"e_1_3_2_35_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Rae Jack W.","year":"2019","unstructured":"Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Chloe Hillier, and Timothy P. Lillicrap. 2019. Compressive transformers for long-range sequence modelling. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1410"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2017\/371"},{"key":"e_1_3_2_38_2","first-page":"3319","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Sundararajan Mukund","year":"2017","unstructured":"Mukund Sundararajan, Ankur Taly, and Qiqi Yan. 2017. Axiomatic attribution for deep networks. In Proceedings of the International Conference on Machine Learning. 
3319\u20133328."},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295349"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1002"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/2983323.2983818"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/3340531.3411908"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N16-1174"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/3159652.3159685"},{"key":"e_1_3_2_45_2","first-page":"17283","volume-title":"Advances in Neural Information Processing Systems","author":"Zaheer Manzil","year":"2020","unstructured":"Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, and Amr Ahmed. 2020. Big Bird: Transformers for Longer Sequences. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (Eds.). Vol. 33. Curran Associates, Inc., 17283\u201317297. https:\/\/proceedings.neurips.cc\/paper\/2020\/file\/c8512d142a2d849725f31a9a7a361ab9-Paper.pdf."},{"key":"e_1_3_2_46_2","first-page":"12437","volume-title":"Proceedings of the 38th International Conference on Machine Learning,","author":"Zhang Hang","year":"2021","unstructured":"Hang Zhang, Yeyun Gong, Yelong Shen, Weisheng Li, Jiancheng Lv, Nan Duan, and Weizhu Chen. 2021. Poolingformer: Long document modeling with pooling attention. In Proceedings of the 38th International Conference on Machine Learning,Marina Meila and Tong Zhang (Eds.). PMLR, 12437\u201312446. Retrieved from https:\/\/proceedings.mlr.press\/v139\/zhang21h.html."}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3542822","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3542822","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:02:22Z","timestamp":1750186942000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3542822"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,20]]},"references-count":45,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,4,30]]}},"alternative-id":["10.1145\/3542822"],"URL":"https:\/\/doi.org\/10.1145\/3542822","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"type":"print","value":"1556-4681"},{"type":"electronic","value":"1556-472X"}],"subject":[],"published":{"date-parts":[[2023,2,20]]},"assertion":[{"value":"2021-10-30","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-05-08","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-02-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}