{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,7]],"date-time":"2026-06-07T07:35:49Z","timestamp":1780817749503,"version":"3.54.1"},"reference-count":33,"publisher":"Association for Computing Machinery (ACM)","issue":"12","license":[{"start":{"date-parts":[[2023,12,19]],"date-time":"2023-12-19T00:00:00Z","timestamp":1702944000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2023,12,31]]},"abstract":"<jats:p>Named-entity Recognition (NER) is challenging for languages with low digital resources. The main difficulties arise from the scarcity of annotated corpora and the consequent problematic training of an effective NER Model. We propose a customized model based on linguistic properties to compensate for this lack of resources in low-resource languages like Persian. According to pronoun-dropping and subject-object-verb word order specifications of Persian, we propose new weighted relative positional encoding in the self-attention mechanism. Using the pointwise mutual information factor, we inject co-occurrence information into context representation. We trained and tested our model on three different datasets: Arman, Peyma, and ParsTwiNER, and our method achieved 94.16%, 93.36%, and 84.49% word-level F1 scores, respectively. The experiments showed that our proposed model performs better than other Persian NER models. Ablation Study and Case Study also showed that our method can converge faster and is less prone to overfitting.<\/jats:p>","DOI":"10.1145\/3633513","type":"journal-article","created":{"date-parts":[[2023,11,22]],"date-time":"2023-11-22T12:35:26Z","timestamp":1700656526000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Named Entity Recognition in Persian Language based on Self-attention Mechanism with Weighted Relational Position Encoding"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3360-6837","authenticated-orcid":false,"given":"Ebrahim","family":"Ganjalipour","sequence":"first","affiliation":[{"name":"Department of Applied Mathematics and Computer Science, Lahijan Branch Islamic Azad University, Lahijan, Iran"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1664-5471","authenticated-orcid":false,"given":"Amir Hossein","family":"Refahi Sheikhani","sequence":"additional","affiliation":[{"name":"Department of Applied Mathematics and Computer Science, Lahijan Branch Islamic Azad University, Lahijan, Iran"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0081-8008","authenticated-orcid":false,"given":"Sohrab","family":"Kordrostami","sequence":"additional","affiliation":[{"name":"Department of Applied Mathematics and Computer Science, Lahijan Branch Islamic Azad University, Lahijan, Iran"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0592-1795","authenticated-orcid":false,"given":"Ali Asghar","family":"Hosseinzadeh","sequence":"additional","affiliation":[{"name":"Department of Applied Mathematics and Computer Science, Lahijan Branch Islamic Azad University, Lahijan, Iran"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2023,12,19]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2021.10.101"},{"key":"e_1_3_1_3_2","article-title":"Attention is all you need","volume":"30","author":"Vaswani A.","year":"2017","unstructured":"A. Vaswani et al. 2017. Attention is all you need. Adv. Neural Info. Process. Syst. 30 (2017).","journal-title":"Adv. Neural Info. Process. Syst."},{"key":"e_1_3_1_4_2","doi-asserted-by":"crossref","unstructured":"P. Shaw J. Uszkoreit and A. Vaswani. 2018. Self-attention with relative position representations. Retrieved from https:\/\/arXiv:1803.02155","DOI":"10.18653\/v1\/N18-2074"},{"key":"e_1_3_1_5_2","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Devlin J.","year":"2019","unstructured":"J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies."},{"key":"e_1_3_1_6_2","doi-asserted-by":"crossref","unstructured":"A. Conneau et al. 2019. Unsupervised cross-lingual representation learning at scale. Retrieved from https:\/\/arXiv:1911.02116","DOI":"10.18653\/v1\/2020.acl-main.747"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.5555\/89086.89095"},{"key":"e_1_3_1_8_2","volume-title":"Proceedings of the 3rd Workshop on Computational Approaches to Linguistic Code-Switching","author":"Wang C.","year":"2018","unstructured":"C. Wang, K. Cho, and D. Kiela. 2018. Code-switched named entity recognition with embedding attention. In Proceedings of the 3rd Workshop on Computational Approaches to Linguistic Code-Switching."},{"key":"e_1_3_1_9_2","volume-title":"Proceedings of the 15th National Computer Society of Iran Conference","author":"Mortazavi P. S.","year":"2009","unstructured":"P. S. Mortazavi and M. Shamsfard. 2009. Named entity recognition in Persian texts. In Proceedings of the 15th National Computer Society of Iran Conference."},{"key":"e_1_3_1_10_2","volume-title":"Proceedings of the Signal and Data Processing Conference","author":"Rahati-Ghoochani S.","year":"2010","unstructured":"S. Rahati-Ghoochani, S. A. Esfahani, and J. Nader. 2010. Persian name entity recognition and classification. In Proceedings of the Signal and Data Processing Conference."},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.5120\/17510-8062"},{"key":"e_1_3_1_12_2","volume-title":"Proceedings of the 7th Conference on Information and Knowledge Technology (IKT\u201915)","author":"Ahmadi F.","year":"2015","unstructured":"F. Ahmadi and H. Moradi. 2015. A hybrid method for Persian named entity recognition. In Proceedings of the 7th Conference on Information and Knowledge Technology (IKT\u201915)."},{"key":"e_1_3_1_13_2","volume-title":"Proceedings of the 26th International Conference on Computational Linguistics: Technical Papers (COLING\u201916)","author":"Poostchi H.","year":"2016","unstructured":"H. Poostchi, E. Z. Borzeshi, M. Abdous, and M. Piccardi. 2016. PersoNER: Persian named-entity recognition. In Proceedings of the 26th International Conference on Computational Linguistics: Technical Papers (COLING\u201916)."},{"key":"e_1_3_1_14_2","first-page":"381","volume-title":"Proceedings of the 9th International Symposium on Telecommunications (IST\u201918).","author":"Bokaei M. H.","year":"2018","unstructured":"M. H. Bokaei and M. Mahmoudi. 2018. Improved deep Persian named entity recognition. In Proceedings of the 9th International Symposium on Telecommunications (IST\u201918). 381\u2013386."},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.29252\/jsdp.16.1.91"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.29252\/jsdp.16.4.93"},{"key":"e_1_3_1_17_2","first-page":"75","volume-title":"Proceedings of the Congress on Intelligent Systems","author":"Balouchzahi F.","year":"2020","unstructured":"F. Balouchzahi and H. Shashirekha. 2020. PUNER-Parsi ULMFiT for named-entity recognition in Persian texts. In Proceedings of the Congress on Intelligent Systems. Springer, 75\u201388"},{"issue":"3","key":"e_1_3_1_18_2","first-page":"1","article-title":"Joined type length encoding for nested named entity recognition","volume":"21","author":"Sheikhaei M. S.","year":"2021","unstructured":"M. S. Sheikhaei, H. Zafari, and Y. Tian. 2021. Joined type length encoding for nested named entity recognition. Trans. Asian Low-Resour. Lang. Info. Process. 21, 3 (2021), 1\u201323.","journal-title":"Trans. Asian Low-Resour. Lang. Info. Process."},{"key":"e_1_3_1_19_2","unstructured":"A. Radford K. Narasimhan T. Salimans and I. Sutskever. 2018. Improving language understanding by generative pre-training. http:\/\/scholar.google.com\/scholar_lookup?hl=en&publication_year=2018&author=Alec+Radford&author=Karthik+Narasimhan&author=Tim+Salimans&author=Ilya+Sutskever&title=Improving+language+understanding+by+generative+pre-training"},{"key":"e_1_3_1_20_2","unstructured":"J. Devlin M.-W. Chang K. Lee and K. Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. Retrieved from https:\/\/arXiv:1810.04805"},{"key":"e_1_3_1_21_2","unstructured":"N. Taghizadeh Z. Borhanifard M. GolestaniPour and H. Faili. 2020. NSURL-2019 task 7: Named entity recognition (NER) in Farsi. Retrieved from https:\/\/arXiv:2003.09029"},{"key":"e_1_3_1_22_2","volume-title":"Proceedings of the 1st International Workshop on NLP Solutions for Under Resourced Languages Co-located with ICNLSP","author":"Mohseni M.","year":"2019","unstructured":"M. Mohseni and A. Tebbifakhr. 2019. MorphoBERT: A Persian NER system with BERT and morphological analysis. In Proceedings of the 1st International Workshop on NLP Solutions for Under Resourced Languages Co-located with ICNLSP."},{"key":"e_1_3_1_23_2","unstructured":"E. Taher S. A. Hoseini and M. Shamsfard. 2020. Beheshti-NER: Persian named entity recognition using BERT. Retrieved from https:\/\/arXiv:2003.08875"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11063-021-10528-4"},{"key":"e_1_3_1_25_2","doi-asserted-by":"crossref","first-page":"131","DOI":"10.18653\/v1\/2021.wnut-1.16","volume-title":"Proceedings of the 7th Workshop on Noisy User-generated Text (W-NUT\u201921)","author":"Aghajani M.","year":"2021","unstructured":"M. Aghajani, A. Badri, and H. Beigy. 2021. ParsTwiNER: A corpus for named entity recognition at informal Persian. In Proceedings of the 7th Workshop on Noisy User-generated Text (W-NUT\u201921). 131\u2013136."},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.4218\/etrij.2021-0269"},{"key":"e_1_3_1_27_2","doi-asserted-by":"crossref","unstructured":"T. Kudo and J. Richardson. 2018. Sentencepiece: A simple and language independent subword tokenizer and detokenizer for neural text processing. Retrieved from https:\/\/arXiv:1808.06226","DOI":"10.18653\/v1\/D18-2012"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.5555\/1861751.1861756"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.3758\/BF03193020"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/W14-1503"},{"key":"e_1_3_1_31_2","unstructured":"Y. Liu et al. 2019. Roberta: A robustly optimized bert pretraining approach. Retrieved from https:\/\/arXiv:1907.11692"},{"key":"e_1_3_1_32_2","unstructured":"E. F. Sang and F. De Meulder. 2003. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. Retrieved from https:\/\/arxiv.org\/abs\/cs\/0306050"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.29252\/jsdp.14.3.127"},{"key":"e_1_3_1_34_2","first-page":"79","volume-title":"Proceedings of the IEEE 16th International Conference on Cognitive Informatics and Cognitive Computing (ICCI*CC\u201917)","author":"Dashtipour K.","year":"2017","unstructured":"K. Dashtipour, M. Gogate, A. Adeel, A. Algarafi, N. Howard, and A. Hussain. 2017. Persian named entity recognition. In Proceedings of the IEEE 16th International Conference on Cognitive Informatics and Cognitive Computing (ICCI*CC\u201917). IEEE, 79\u201383."}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3633513","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3633513","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T22:50:08Z","timestamp":1750287008000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3633513"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,19]]},"references-count":33,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2023,12,31]]}},"alternative-id":["10.1145\/3633513"],"URL":"https:\/\/doi.org\/10.1145\/3633513","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"value":"2375-4699","type":"print"},{"value":"2375-4702","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,12,19]]},"assertion":[{"value":"2023-01-20","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-11-17","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-12-19","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}