{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T22:18:38Z","timestamp":1757629118669,"version":"3.44.0"},"reference-count":72,"publisher":"Association for Computing Machinery (ACM)","issue":"9","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2025,9,30]]},"abstract":"<jats:p>Plagiarism, the unauthorized reuse of text, fueled by the ease of access to online content, is a pressing concern for academia, publishers, and authors. Paraphrasing, a common tactic in textual plagiarism, compounds the problem further. The automatic detection of paraphrased plagiarism in text documents is a fundamental task in Natural Language Processing (NLP), crucial for maintaining academic integrity and authenticity. This article presents an extensive investigation into Urdu sentential paraphrased plagiarism detection leveraging advanced Deep Neural Networks (DNNs) and Large Language Models (LLMs). The study builds upon the foundational work and proposes modifications to the Deep Text Reuse and Paraphrased Plagiarism Detection (D-TRaPPD) architecture to incorporate state-of-the-art pre-trained LLMs. The proposed approach, SELLM-D-TRaPPD, integrates various language models, including contextualized sentence embedding-based LLMs, language-agnostic and multilingual transformer-based LLMs, and multilingual knowledge-distilled transformer-based LLMs. We evaluated these models against three benchmark Urdu sentential paraphrase corpora\u2014Urdu Sentential Paraphrase Corpus, Urdu Short Text Reuse Corpus, and Semi-automatic Urdu Sentential Paraphrase Corpus. The results demonstrate the effectiveness of SELLM-D-TRaPPD with LLMs, achieving F1 scores of 92.09%, 96.70%, and 98.23%, respectively. A comparative analysis with existing state-of-the-art methods shows significant performance improvements, establishing SELLM-D-TRaPPD as the new leading approach for Urdu sentential paraphrased plagiarism detection. These findings highlight the value of leveraging advanced neural network architectures and pre-trained LLMs in improving the accuracy and effectiveness of paraphrased plagiarism detection in Urdu, addressing a crucial gap in Urdu NLP research.<\/jats:p>","DOI":"10.1145\/3748320","type":"journal-article","created":{"date-parts":[[2025,7,11]],"date-time":"2025-07-11T11:08:06Z","timestamp":1752232086000},"page":"1-20","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Urdu Sentential Paraphrased Plagiarism Detection Using Large Language Models"],"prefix":"10.1145","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2304-8459","authenticated-orcid":false,"given":"Hafiz Rizwan","family":"Iqbal","sequence":"first","affiliation":[{"name":"Information Technology University","place":["Lahore, Pakistan"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-5944-514X","authenticated-orcid":false,"given":"Muhammad","family":"Sharjeel","sequence":"additional","affiliation":[{"name":"COMSATS University Islamabad, Lahore Campus","place":["Lahore, Pakistan"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6427-3823","authenticated-orcid":false,"given":"Jawad","family":"Shafi","sequence":"additional","affiliation":[{"name":"COMSATS University Islamabad, Lahore Campus","place":["Lahore, Pakistan"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-8430-5956","authenticated-orcid":false,"given":"Usama","family":"Mehmood","sequence":"additional","affiliation":[{"name":"Information Technology University","place":["Lahore, Pakistan"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0124-9783","authenticated-orcid":false,"given":"Agha Ali","family":"Raza","sequence":"additional","affiliation":[{"name":"Lahore University of Management Sciences","place":["Lahore, Pakistan"]}]}],"member":"320","published-online":{"date-parts":[[2025,9,10]]},"reference":[{"issue":"6","key":"e_1_3_3_2_2","doi-asserted-by":"crossref","first-page":"922","DOI":"10.1016\/j.ipm.2018.06.005","article-title":"A deep network model for paraphrase detection in short text messages","volume":"54","author":"Agarwal Basant","year":"2018","unstructured":"Basant Agarwal, Heri Ramampiaro, Helge Langseth, and Massimiliano Ruocco. 2018. A deep network model for paraphrase detection in short text messages. Information Processing & Management 54, 6 (2018), 922\u2013937.","journal-title":"Information Processing & Management"},{"key":"e_1_3_3_3_2","first-page":"1586","volume-title":"Proceedings of the IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)","author":"Al-Bataineh Hesham","year":"2019","unstructured":"Hesham Al-Bataineh, Wael Farhan, Ahmad Mustafa, Haitham Seelawi, and Hussein T. Al-Natsheh. 2019. Deep contextualized pairwise semantic similarity for Arabic language questions. In Proceedings of the IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI). 1586\u20131591."},{"issue":"2","key":"e_1_3_3_4_2","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1109\/TSMCC.2011.2134847","article-title":"Understanding plagiarism linguistic patterns, textual features, and detection methods","volume":"42","author":"Alzahrani Salha M.","year":"2011","unstructured":"Salha M. Alzahrani, Naomie Salim, and Ajith Abraham. 2011. Understanding plagiarism linguistic patterns, textual features, and detection methods. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 42, 2 (2011), 133\u2013149.","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews)"},{"issue":"1","key":"e_1_3_3_5_2","first-page":"101","article-title":"Transfer fine-tuning of BERT with phrasal paraphrases","volume":"66","author":"Arase Yuki","year":"2021","unstructured":"Yuki Arase and Junichi Tsujii. 2021. Transfer fine-tuning of BERT with phrasal paraphrases. Computer Speech & Language 66, 1 (2021), 101\u2013164.","journal-title":"Computer Speech & Language"},{"key":"e_1_3_3_6_2","first-page":"1","volume-title":"Proceedings of the 5th International Conference on Learning Representations (ICLR)","author":"Arora Sanjeev","year":"2017","unstructured":"Sanjeev Arora, Yingyu Liang, and Tengyu Ma. 2017. A simple but tough-to-beat baseline for sentence embeddings. In Proceedings of the 5th International Conference on Learning Representations (ICLR). 1\u201316."},{"key":"e_1_3_3_7_2","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00288"},{"key":"e_1_3_3_8_2","article-title":"On the Mono-and Cross-Language Detection of Text Re-Use and Plagiarism. Thesis","author":"Barr\u00f3n-Cedeno Alberto","year":"2012","unstructured":"Alberto Barr\u00f3n-Cedeno. 2012. On the Mono-and Cross-Language Detection of Text Re-Use and Plagiarism. Thesis. Departmento de Sistemas Inform\u00e1ticos y Computaci\u00f3n, Universidad Polit\u00e9cnica de Valencia.","journal-title":"Departmento de Sistemas Inform\u00e1ticos y Computaci\u00f3n, Universidad Polit\u00e9cnica de Valencia"},{"key":"e_1_3_3_9_2","doi-asserted-by":"publisher","DOI":"10.1162\/COLI_a_00153"},{"key":"e_1_3_3_10_2","article-title":"Paraphrase acquisition via crowdsourcing and machine learning","volume":"4","author":"Burrows Steven","year":"2013","unstructured":"Steven Burrows, Martin Potthast, and Benno Stein. 2013. Paraphrase acquisition via crowdsourcing and machine learning. ACM Transactions on Intelligent Systems and Technology (TIST) 4 (2013).","journal-title":"ACM Transactions on Intelligent Systems and Technology (TIST)"},{"key":"e_1_3_3_11_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.compedu.2008.12.001"},{"key":"e_1_3_3_12_2","first-page":"1","volume-title":"Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)","author":"Cer Daniel","year":"2017","unstructured":"Daniel Cer, Mona Diab, Eneko Agirre, Inigo Lopez-Gazpio, and Lucia Specia. 2017. Semeval-2017 task 1: Semantic textual similarity-multilingual and cross-lingual focused evaluation. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). 1\u201314."},{"issue":"2","key":"e_1_3_3_13_2","first-page":"1","article-title":"Detecting AI content in responses generated by ChatGPT, YouChat, and Chatsonic: The case of five AI content detection tools","volume":"6","author":"Chaka Chaka","year":"2023","unstructured":"Chaka Chaka. 2023. Detecting AI content in responses generated by ChatGPT, YouChat, and Chatsonic: The case of five AI content detection tools. Journal of Applied Learning and Teaching 6, 2 (2023), 1\u201311.","journal-title":"Journal of Applied Learning and Teaching"},{"key":"e_1_3_3_14_2","first-page":"55","volume-title":"Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies","author":"Che Wanxiang","year":"2018","unstructured":"Wanxiang Che, Yijia Liu, Yuxuan Wang, Bo Zheng, and Ting Liu. 2018. Towards better UD parsing: Deep contextualized word embeddings, ensemble, and treebank concatenation. In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. 55\u201364."},{"key":"e_1_3_3_15_2","first-page":"1","article-title":"Plagiarism: Taxonomy, tools and detection techniques","author":"Chowdhury Hussain A.","year":"2018","unstructured":"Hussain A. Chowdhury and Dhruba K. Bhattacharyya. 2018. Plagiarism: Taxonomy, tools and detection techniques. In Proceedings of the 19th National Convention on Knowledge, Library and Information Networking (NACLIN). 1\u201317.","journal-title":"In Proceedings of the 19th National Convention on Knowledge, Library and Information Networking (NACLIN)"},{"issue":"2","key":"e_1_3_3_16_2","first-page":"85","article-title":"ChatGPT as a tool for developing paraphrasing skills among ESL learners","volume":"11","author":"Chui Ho Chui","year":"2023","unstructured":"Ho Chui Chui. 2023. ChatGPT as a tool for developing paraphrasing skills among ESL learners. Journal of Creative Practices in Language Learning and Teaching (CPLT) 11, 2 (2023), 85\u2013105.","journal-title":"Journal of Creative Practices in Language Learning and Teaching (CPLT)"},{"key":"e_1_3_3_17_2","first-page":"1249","article-title":"Corpora and text re-use","author":"Clough Paul","year":"2009","unstructured":"Paul Clough and Robert Gaizauskas. 2009. Corpora and text re-use. In Handbook of Corpus Linguistics, Handbooks of Linguistics and Communication Science. 1249\u20131271.","journal-title":"Handbook of Corpus Linguistics, Handbooks of Linguistics and Communication Science"},{"key":"e_1_3_3_18_2","doi-asserted-by":"publisher","DOI":"10.1162\/coli.08-003-R1-07-044"},{"key":"e_1_3_3_19_2","doi-asserted-by":"crossref","first-page":"8440","DOI":"10.18653\/v1\/2020.acl-main.747","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), Online","author":"Conneau Alexis","year":"2020","unstructured":"Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzm\u00e1n, Edouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov. 2020. Unsupervised cross-lingual representation learning at scale. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), Online. 8440\u20138451."},{"issue":"1","key":"e_1_3_3_20_2","first-page":"1","article-title":"Cross-lingual language model pretraining","volume":"32","author":"Conneau Alexis","year":"2019","unstructured":"Alexis Conneau and Guillaume Lample. 2019. Cross-lingual language model pretraining. Advances in Neural Information Processing Systems 32, 1 (2019), 1\u201310.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_3_21_2","first-page":"4087","volume-title":"Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC)","author":"Demir Seniz","year":"2012","unstructured":"Seniz Demir, Ilknur Durgar El-Kahlout, Erdem Unal, and Hamza Kaya. 2012. Turkish paraphrase corpus. In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC). 4087\u20134091."},{"key":"e_1_3_3_22_2","first-page":"4171","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 4171\u20134186."},{"issue":"8887","key":"e_1_3_3_23_2","first-page":"1","article-title":"A hybrid model for paraphrase detection combines pros of text similarity with deep learning","volume":"975","author":"Desouki Mohamed I. El","year":"2019","unstructured":"Mohamed I. El Desouki, Wael H. Gomaa, and Hawaf Abdalhakim. 2019. A hybrid model for paraphrase detection combines pros of text similarity with deep learning. International Journal of Computer Applications 975, 8887 (2019), 1\u201306.","journal-title":"International Journal of Computer Applications"},{"issue":"9","key":"e_1_3_3_24_2","first-page":"470","article-title":"A new online plagiarism detection system based on deep learning","volume":"11","author":"Hambi Faouzia Benabbou El Mostafa","year":"2020","unstructured":"Faouzia Benabbou El Mostafa Hambi. 2020. A new online plagiarism detection system based on deep learning. International Journal of Advanced Computer Sciences and Applications 11, 9 (2020), 470\u2013478.","journal-title":"International Journal of Advanced Computer Sciences and Applications"},{"key":"e_1_3_3_25_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-023-15703-4"},{"key":"e_1_3_3_26_2","first-page":"91","volume-title":"Proceedings of the International Conference on Web Research (ICWR)","author":"Emami Zahra Sadat Hosseini Moghadam","year":"2021","unstructured":"Zahra Sadat Hosseini Moghadam Emami, Shohreh Tabatabayiseifi, Mohammad Izadi, and Mohammad Tavakoli. 2021. Designing a deep neural network model for finding semantic similarity between short Persian texts using a parallel corpus. In Proceedings of the International Conference on Web Research (ICWR). 91\u201396."},{"key":"e_1_3_3_27_2","first-page":"149","volume-title":"Proceedings of the Forum for Information Retrieval Evaluation (FIRE) (Working Notes)","author":"Esteki Fezeh","year":"2016","unstructured":"Fezeh Esteki and Faramarz Safi Esfahani. 2016. A plagiarism detection approach based on SVM for Persian texts. In Proceedings of the Forum for Information Retrieval Evaluation (FIRE) (Working Notes). 149\u2013153."},{"key":"e_1_3_3_28_2","first-page":"1608","volume-title":"Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Fader Anthony","year":"2013","unstructured":"Anthony Fader, Luke Zettlemoyer, and Oren Etzioni. 2013. Paraphrase-driven learning for open question answering. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, 1608\u20131618."},{"key":"e_1_3_3_29_2","first-page":"271","volume-title":"Proceedings of the 21st Nordic Conference on Computational Linguistics","author":"Fares Murhaf","year":"2017","unstructured":"Murhaf Fares, Andrey Kutuzov, Stephan Oepen, and Erik Velldal. 2017. Word vectors, reuse, and replicability: Towards a community repository of large-text resources. In Proceedings of the 21st Nordic Conference on Computational Linguistics. 271\u2013276."},{"key":"e_1_3_3_30_2","unstructured":"Fangxiaoyu Feng Yinfei Yang Daniel Cer Naveen Arivazhagan and Wei Wang. 2020. Language-agnostic BERT sentence embedding. arXiv:2007.01852. Retrieved from https:\/\/arxiv.org\/abs\/2007.01852"},{"issue":"1","key":"e_1_3_3_31_2","first-page":"1","article-title":"Academic plagiarism detection: A systematic literature review","volume":"52","author":"Folt\u1ef3nek Tom\u00e1\u0161","year":"2019","unstructured":"Tom\u00e1\u0161 Folt\u1ef3nek, Norman Meuschke, and Bela Gipp. 2019. Academic plagiarism detection: A systematic literature review. ACM Computing Surveys 52, 1 (2019), 1\u201342.","journal-title":"ACM Computing Surveys"},{"key":"e_1_3_3_32_2","first-page":"816","volume-title":"Proceedings of the iConference, 15th International Conference, iConference 2020","author":"Folt\u1ef3nek Tom\u00e1\u0161","year":"2020","unstructured":"Tom\u00e1\u0161 Folt\u1ef3nek, Terry Ruas, Philipp Scharpf, Norman Meuschke, Moritz Schubotz, William Grosky, and Bela Gipp. 2020. Detecting machine-obfuscated plagiarism. In Proceedings of the iConference, 15th International Conference, iConference 2020. 816\u2013827."},{"key":"e_1_3_3_33_2","first-page":"758","volume-title":"Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT)","author":"Ganitkevitch Juri","year":"2013","unstructured":"Juri Ganitkevitch, Benjamin Van Durme, and Chris Callison-Burch. 2013. PPDB: The paraphrase database. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT). 758\u2013764."},{"key":"e_1_3_3_34_2","doi-asserted-by":"publisher","DOI":"10.5555\/3176748.3176757"},{"key":"e_1_3_3_35_2","article-title":"Neural Network Methods in Natural Language Processing. Morgan & Claypool Publishers","author":"Goldberg Yoav","year":"2017","unstructured":"Yoav Goldberg and Graeme Hirst. 2017. Neural Network Methods in Natural Language Processing. Morgan & Claypool Publishers. 9781627052986 (Zitiert auf Seite 69) ([n. d.]).","journal-title":"9781627052986 (Zitiert auf Seite 69)"},{"key":"e_1_3_3_36_2","first-page":"3483","volume-title":"Proceedings of the International Conference on Language Resources and Evaluation (LREC)","author":"Grave Edouard","year":"2018","unstructured":"Edouard Grave, Piotr Bojanowski, Prakhar Gupta, Armand Joulin, and Tomas Mikolov. 2018. Learning word vectors for 157 languages. In Proceedings of the International Conference on Language Resources and Evaluation (LREC). 3483\u20133487."},{"key":"e_1_3_3_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/3586009"},{"key":"e_1_3_3_38_2","first-page":"964","volume-title":"Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC)","author":"Haider Samar","year":"2018","unstructured":"Samar Haider. 2018. Urdu word embeddings. In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC). 964\u2013968."},{"key":"e_1_3_3_39_2","doi-asserted-by":"publisher","DOI":"10.25046\/aj050559"},{"key":"e_1_3_3_40_2","first-page":"1","article-title":"Cross-language Urdu\u2013English (clue) text alignment corpus","author":"Hanif Israr","year":"2015","unstructured":"Israr Hanif, Rao Muhammad Adeel Nawab, Affiffa Arbab, Huma Jamshed, Sara Riaz, and Ehsan Ullah Munir. 2015. Cross-language Urdu\u2013English (clue) text alignment corpus. Cross-Language Urdu-English (CLUE) Text Alignment Corpus: Notebook for PAN at CLEF, Toulouse, France (2015), 1\u201309.","journal-title":"Cross-Language Urdu-English (CLUE) Text Alignment Corpus: Notebook for PAN at CLEF, Toulouse, France"},{"issue":"1","key":"e_1_3_3_41_2","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1016\/j.engappai.2015.07.011","article-title":"On retrieving intelligently plagiarized documents using semantic similarity","volume":"45","author":"Hussain Syed Fawad","year":"2015","unstructured":"Syed Fawad Hussain and Asif Suryani. 2015. On retrieving intelligently plagiarized documents using semantic similarity. Engineering Applications of Artificial Intelligence 45, 1 (2015), 246\u2013258.","journal-title":"Engineering Applications of Artificial Intelligence"},{"issue":"1","key":"e_1_3_3_42_2","first-page":"354","article-title":"Urdu paraphrase detection: A novel DNN-based implementation using a semi-automatically generated corpus","volume":"30","author":"Iqbal Hafiz Rizwan","year":"2023","unstructured":"Hafiz Rizwan Iqbal, Rashad Maqsood, Agha Ali Raza, and Saeed-Ul Hassan. 2023. Urdu paraphrase detection: A novel DNN-based implementation using a semi-automatically generated corpus. Natural Language Engineering 30, 1 (2023), 354\u2013384.","journal-title":"Natural Language Engineering"},{"key":"e_1_3_3_43_2","volume-title":"Proceedings of the 3rd International Conference on Learning Representations","author":"Kingma Diederik P.","year":"2015","unstructured":"Diederik P. Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations."},{"key":"e_1_3_3_44_2","first-page":"1","volume-title":"Proceedings of the 2021 RIVF International Conference on Computing and Communication Technologies (RIVF)","author":"Le Huong T.","year":"2021","unstructured":"Huong T. Le, Dung T. Cao, Trung H. Bui, Long T. Luong, and Huy Q. Nguyen. 2021. Improve quora question pair dataset for question similarity task. In Proceedings of the 2021 RIVF International Conference on Computing and Communication Technologies (RIVF). IEEE, 1\u20135."},{"key":"e_1_3_3_45_2","unstructured":"Yinhan Liu Myle Ott Naman Goyal Jingfei Du Mandar Joshi Danqi Chen Omer Levy Mike Lewis Luke Zettlemoyer and Veselin Stoyanov. 2019. RoBERTa: A robustly optimized BERT pretraining approach. arXiv:1907.11692. Retrieved from https:\/\/arxiv.org\/abs\/1907.11692"},{"key":"e_1_3_3_46_2","first-page":"545","volume-title":"Proceedings of the Pacific Asia Conference on Language, Information and Computation (PACLIC)","author":"Mahmoud Adnen","year":"2022","unstructured":"Adnen Mahmoud and Mounir Zrigui. 2022. Siamese AraBERT-LSTM model based approach for Arabic paraphrase detection. In Proceedings of the Pacific Asia Conference on Language, Information and Computation (PACLIC). 545\u2013553."},{"issue":"8","key":"e_1_3_3_47_2","first-page":"1050","article-title":"Plagiarism-A survey.","volume":"12","author":"Maurer Hermann A.","year":"2006","unstructured":"Hermann A. Maurer, Frank Kappe, and Bilal Zaka. 2006. Plagiarism-A survey. Journal of Universal Computer Science 12, 8 (2006), 1050\u20131084.","journal-title":"Journal of Universal Computer Science"},{"issue":"1","key":"e_1_3_3_48_2","first-page":"121","article-title":"Urdu text reuse detection at phrasal level using sentence transformer-based approach","volume":"234","author":"Mehak Gull","year":"2023","unstructured":"Gull Mehak, Iqra Muneer, and Rao Muhammad Adeel Nawab. 2023. Urdu text reuse detection at phrasal level using sentence transformer-based approach. Expert Systems with Applications 234, 1 (2023), 121\u2013163.","journal-title":"Expert Systems with Applications"},{"key":"e_1_3_3_49_2","first-page":"1","volume-title":"Proceedings of the International Conference on Learning Representations (ICLR)","author":"Mikolov Tomas","year":"2013","unstructured":"Tomas Mikolov, Kai Chen, G. Corrado, and J. Dean. 2013. Efficient estimation of word representations in vector space. In Proceedings of the International Conference on Learning Representations (ICLR). 1\u201312."},{"key":"e_1_3_3_50_2","first-page":"156","volume-title":"Proceedings of the 4th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, Online","author":"Mori Yusuke","year":"2020","unstructured":"Yusuke Mori, Hiroaki Yamane, Yusuke Mukuta, and Tatsuya Harada. 2020. Finding and generating a missing part for story completion. In Proceedings of the 4th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, Online. 156\u2013166."},{"key":"e_1_3_3_51_2","volume-title":"Mono-and Cross-Lingual Paraphrased Text Reuse and Extrinsic Plagiarism Detection.","author":"Muhammad Sharjeel","year":"2020","unstructured":"Sharjeel Muhammad. 2020. Mono-and Cross-Lingual Paraphrased Text Reuse and Extrinsic Plagiarism Detection.Thesis. Ph. D. Dissertation. Lancaster University, U.K."},{"key":"e_1_3_3_52_2","volume-title":"Proceedings of the International Conference on International Conference on Machine Learning","author":"Nair Vinod","year":"2010","unstructured":"Vinod Nair and Geoffrey E. Hinton. 2010. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the International Conference on International Conference on Machine Learning."},{"issue":"1","key":"e_1_3_3_53_2","first-page":"109","article-title":"Mono-lingual text reuse detection for the Urdu language at lexical level","volume":"136","author":"Noreen Ayesha","year":"2024","unstructured":"Ayesha Noreen, Iqra Muneer, and Rao Muhammad Adeel Nawab. 2024. Mono-lingual text reuse detection for the Urdu language at lexical level. Engineering Applications of Artificial Intelligence 136, 1 (2024), 109\u2013123.","journal-title":"Engineering Applications of Artificial Intelligence"},{"key":"e_1_3_3_54_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1202"},{"key":"e_1_3_3_55_2","first-page":"3484","volume-title":"Proceedings of the 12th Language Resources and Evaluation Conference (LREC)","author":"Qasmi Namoos Hayat","year":"2020","unstructured":"Namoos Hayat Qasmi, Haris Bin Zia, Awais Athar, and Agha Ali Raza. 2020. SimplifyUR: Unsupervised lexical text simplification for Urdu. In Proceedings of the 12th Language Resources and Evaluation Conference (LREC). 3484\u20133489."},{"key":"e_1_3_3_56_2","unstructured":"Alec Radford Karthik Narasimhan Tim Salimans and Ilya Sutskever. 2018. Improving language understanding by generative pre-training. (2018) Preprint. 1\u201312."},{"key":"e_1_3_3_57_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1410"},{"key":"e_1_3_3_58_2","doi-asserted-by":"crossref","first-page":"4512","DOI":"10.18653\/v1\/2020.emnlp-main.365","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online","author":"Reimers Nils","year":"2020","unstructured":"Nils Reimers and Iryna Gurevych. 2020. Making monolingual sentence embeddings multilingual using knowledge distillation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online. 4512\u20134525."},{"issue":"1","key":"e_1_3_3_59_2","first-page":"7412","article-title":"Measuring short text reuse for the Urdu language","volume":"6","author":"Sameen Sara","year":"2017","unstructured":"Sara Sameen, Muhammad Sharjeel, Rao Muhammad Adeel Nawab, Paul Rayson, and Iqra Muneer. 2017. Measuring short text reuse for the Urdu language. IEEE Access 6, 1 (2017), 7412\u20137421.","journal-title":"IEEE Access"},{"key":"e_1_3_3_60_2","volume-title":"An Urdu Semantic Tagger-Lexicons, Corpora, Methods and Tools.","author":"Shafi Jawad","year":"2019","unstructured":"Jawad Shafi. 2019. An Urdu Semantic Tagger-Lexicons, Corpora, Methods and Tools.Thesis. The Lancaster University, United Kingdom."},{"key":"e_1_3_3_61_2","doi-asserted-by":"publisher","DOI":"10.1145\/3582496"},{"key":"e_1_3_3_62_2","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324921000425"},{"key":"e_1_3_3_63_2","volume-title":"Mono-and Cross-Lingual Paraphrased Text Reuse and Extrinsic Plagiarism Detection","author":"Sharjeel Muhammad","year":"2020","unstructured":"Muhammad Sharjeel. 2020. Mono-and Cross-Lingual Paraphrased Text Reuse and Extrinsic Plagiarism Detection. Thesis. Lancaster University, U.K."},{"key":"e_1_3_3_64_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10579-016-9367-2"},{"key":"e_1_3_3_65_2","doi-asserted-by":"publisher","DOI":"10.5555\/2627435.2670313"},{"key":"e_1_3_3_66_2","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295349"},{"issue":"5","key":"e_1_3_3_67_2","first-page":"1","article-title":"Corpus-based paraphrase detection experiments and review","volume":"11","author":"Vrbanec Tedo","year":"2020","unstructured":"Tedo Vrbanec and Ana Me\u0161trovi\u0107. 2020. Corpus-based paraphrase detection experiments and review. Information 11, 5 (2020), 1\u201324.","journal-title":"Information"},{"key":"e_1_3_3_68_2","first-page":"1","volume-title":"Proceedings of the 4th International Conference on Learning Representations (ICLR)","author":"Wieting John","year":"2016","unstructured":"John Wieting, Mohit Bansal, Kevin Gimpel, and Karen Livescu. 2016. Towards universal paraphrastic sentence embeddings. In Proceedings of the 4th International Conference on Learning Representations (ICLR). 1\u201319."},{"key":"e_1_3_3_69_2","doi-asserted-by":"crossref","first-page":"2078","DOI":"10.18653\/v1\/P17-1190","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (ACL)","author":"Wieting John","year":"2017","unstructured":"John Wieting and Kevin Gimpel. 2017. Revisiting recurrent networks for paraphrastic sentence embeddings. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (ACL). 2078\u20132088."},{"key":"e_1_3_3_70_2","doi-asserted-by":"crossref","first-page":"87","DOI":"10.18653\/v1\/2020.acl-demos.12","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Online","author":"Yang Yinfei","year":"2020","unstructured":"Yinfei Yang, Daniel Cer, Amin Ahmad, Mandy Guo, Jax Law, Noah Constant, Gustavo Hernandez Abrego, Steve Yuan, Chris Tar, Yun-Hsuan Sung, et\u00a0al. 2020. Multilingual universal sentence encoder for semantic retrieval. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Online. 87\u201394."},{"key":"e_1_3_3_71_2","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324922000031"},{"key":"e_1_3_3_72_2","first-page":"316","volume-title":"Proceedings of the International Conference on Artificial Intelligence and Robotics (QICAR)","author":"Zareshahi Ali","year":"2024","unstructured":"Ali Zareshahi, MohammadAli Javadzade, and Esmaeel Bastami. 2024. Measuring semantic similarity of Persian sentences using ParsBERT model. In Proceedings of the International Conference on Artificial Intelligence and Robotics (QICAR). 316\u2013321."},{"key":"e_1_3_3_73_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W17-2512"}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3748320","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,10]],"date-time":"2025-09-10T13:29:18Z","timestamp":1757510958000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3748320"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,10]]},"references-count":72,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2025,9,30]]}},"alternative-id":["10.1145\/3748320"],"URL":"https:\/\/doi.org\/10.1145\/3748320","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"type":"print","value":"2375-4699"},{"type":"electronic","value":"2375-4702"}],"subject":[],"published":{"date-parts":[[2025,9,10]]},"assertion":[{"value":"2024-06-30","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-12-01","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-09-10","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}