{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T06:53:52Z","timestamp":1777704832611,"version":"3.51.4"},"reference-count":36,"publisher":"SAGE Publications","issue":"2","license":[{"start":{"date-parts":[[2020,6,12]],"date-time":"2020-06-12T00:00:00Z","timestamp":1591920000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"published-print":{"date-parts":[[2020,8,31]]},"abstract":"<jats:p>Automatic validation of compositionality vs non-compositionality is a very challenging problem in NLP. Very few papers in the literature report results on this problem. Recently, some new approaches to this linguistic task have arisen. One approach that caught our attention is based on what its authors call \u201clexical domain\u201d. In this paper, we analyze the use of Pointwise Mutual Information for constructing thesauri on the fly, which can then be employed instead of dictionaries for determining whether a given phraseological unit is compositional. The experimental results reported in this paper show that this dissimilarity measure (PMI) can effectively be used when determining the compositionality of a given verbal phraseological unit. 
Moreover, we show that the use of thesauri improves the results obtained in comparison with those experiments employing dictionaries, highlighting the use of self-constructed lexical resources which are, in fact, taking advantage of the same vocabulary of the target dataset.<\/jats:p>","DOI":"10.3233\/jifs-179872","type":"journal-article","created":{"date-parts":[[2020,6,12]],"date-time":"2020-06-12T12:39:57Z","timestamp":1591965597000},"page":"2061-2070","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":0,"title":["Using automatic constructed thesauri instead of dictionaries in the verbal phraseological units validation task"],"prefix":"10.1177","volume":"39","author":[{"given":"David","family":"Pinto","sequence":"first","affiliation":[{"name":"Faculty of Computer Science, Benem\u00e9rita Universidad Aut\u00f3noma de Puebla, PUE, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Bel\u00e9m","family":"Priego","sequence":"additional","affiliation":[{"name":"Department of Systems, Universidad Aut\u00f3noma Metropolitana Unidad Azcapotzalco, CDMX, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2020,6,12]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"crossref","unstructured":"AgirreE. and RigauG. Word sense disambiguation using conceptual density In Proceedings of the 16th Conference on Computational Linguistics - Volume 1 COLING \u201996 (1996) pages 16\u201322 Stroudsburg PA USA. Association for Computational Linguistics.","DOI":"10.3115\/992628.992635"},{"key":"e_1_3_1_3_2","doi-asserted-by":"crossref","unstructured":"Avenda\u00f1oD.P. Jim\u00e9nez-SalazarH. and RossoP. Clustering abstracts of scientific texts using the transition point technique. In Gelbukh A. F. 
editor Computational Linguistics and Intelligent Text Processing 7th International Conference CICLing 2006 Mexico City Mexico February 19\u201325 2006 Proceedings volume 3878 of Lecture Notes in Computer Science (2006) pp. 536\u2013546. Springer.","DOI":"10.1007\/11671299_55"},{"key":"e_1_3_1_4_2","doi-asserted-by":"crossref","unstructured":"BaldwinT. and VillavicencioA. Extracting the unextractable: A case study on verb-particles (2002).","DOI":"10.3115\/1118853.1118854"},{"key":"e_1_3_1_5_2","unstructured":"BiemannC. and GiesbrechtE. Distributional semantics and compositionality 2011: Shared task description and results. In Proceedings of the Workshop on Distributional Semantics and Compositionality DiSCo \u201911 (2011) pp. 21\u201328 Stroudsburg PA USA. Association for Computational Linguistics."},{"key":"e_1_3_1_6_2","unstructured":"CaseliH.D.M. VillavicencioA. MachadoA. and FinattoM.J. Statistically-driven alignment-based multiword expression identification for technical domains Association for Computational Linguistics. Accessed on 2020\/01\/23. (2009)."},{"key":"e_1_3_1_7_2","unstructured":"ChouekaY. Looking for needles in a haystack or locating interesting collocational expressions in large textual databases. In C. Fluhr and D.E. Walker editors RIAO (1988) pp. 609\u2013624. CID."},{"issue":"1","key":"e_1_3_1_8_2","first-page":"22","article-title":"Word association norms, mutual information, and lexicography","volume":"16","author":"Church K.W.","year":"1990","unstructured":"ChurchK.W. and HanksP., Word association norms, mutual information, and lexicography, Computational Linguistics16(1) (1990), 22\u201329.","journal-title":"Computational Linguistics"},{"key":"e_1_3_1_9_2","doi-asserted-by":"crossref","unstructured":"ConstantM. and NivreJ. A transition-based system for joint lexical and syntactic analysis In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) pp. 161\u2013171 Berlin Germany. 
Association for Computational Linguistics. (2016).","DOI":"10.18653\/v1\/P16-1016"},{"key":"e_1_3_1_10_2","unstructured":"DailleB. Study and implementation of combined techniques for automatic extraction of terminology. In The Balancing Act: Combining Symbolic and Statistical Approaches to Language. (1994)."},{"key":"e_1_3_1_11_2","doi-asserted-by":"crossref","unstructured":"DiasG. Multiword unit hybrid extraction. In Proceedings of the ACL 2003 Workshop on Multiword Expressions: Analysis Acquisition and Treatment pages 41\u201348 Sapporo Japan. Association for Computational Linguistics. (2003).","DOI":"10.3115\/1119282.1119288"},{"key":"e_1_3_1_12_2","unstructured":"EhrenR. LichteT. and SamihY. Mumpitz at PARSEME shared task 2018: A bidirectional LSTM for the identification of verbal multiword expressions. In Proceedings of the Joint Workshop on Linguistic Annotation Multiword Expressions and Constructions (LAW-MWE-CxG-2018) (2018). pp. 261\u2013267 Santa Fe New Mexico USA. Association for Computational Linguistics."},{"key":"e_1_3_1_13_2","doi-asserted-by":"crossref","unstructured":"GharbiehW. BhavsarV. and CookP. Deep learning models for multiword expression identification. In Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017) pages 54\u201364 Vancouver Canada. Association for Computational Linguistics. (2017).","DOI":"10.18653\/v1\/S17-1006"},{"key":"e_1_3_1_14_2","doi-asserted-by":"crossref","unstructured":"JacqueminC. Recycling terms into a partial parser In Fourth Conference on Applied Natural Language Processing pp. 113\u2013118 Stuttgart Germany. Association for Computational Linguistics. (1994).","DOI":"10.3115\/974358.974384"},{"key":"e_1_3_1_15_2","unstructured":"JiangJ.J. and ConrathD.W. Semantic similarity based on corpus statistics and lexical taxonomy. In Proc of 10th International Conference on Research in Computational Linguistics ROCLING 97. (1997)."},{"key":"e_1_3_1_16_2","unstructured":"KolesnikovaO. 
and GelbukhA.F. Measuring noncompositionality of verb-noun collocations using lexical functions and WordNet hypernyms. In O. Pichardo-Lagunas O. Herrera-Alc\u00e1ntara and G. Arroyo-Figueroa editors Advances in Artificial Intelligence and Its Applications - 14th Mexican International Conference on Artificial Intelligence MICAI 2015 Cuernavaca Morelos Mexico October 25-31 2015. Proceedings Part II volume 9414 of Lecture Notes in Computer Science (2015) pp. 3\u201325. Springer."},{"key":"e_1_3_1_17_2","volume-title":"Combining Local Context and WordNet Similarity for Word Sense Identification","author":"Leacock C.","unstructured":"LeacockC. and ChodorowM., Combining Local Context and WordNet Similarity for Word Sense Identification. In FellbaumC., editor, WordNet: An electronic lexical database., chapter 13, (1998), pp. 265\u2013283. MIT Press."},{"key":"e_1_3_1_18_2","doi-asserted-by":"crossref","unstructured":"LeskM. Automatic sense disambiguation using machine readable dictionaries: How to tell a pine cone from an ice cream cone. In Proceedings of the 5th Annual International Conference on Systems Documentation SIGDOC \u201986 (1986) pp. 24\u201326 New York NY USA. ACM.","DOI":"10.1145\/318723.318728"},{"key":"e_1_3_1_19_2","unstructured":"LinD. An information-theoretic definition of similarity. In Proceedings of the Fifteenth International Conference on Machine Learning ICML \u201998 (1998) pp. 296\u2013304 San Francisco CA USA. Morgan Kaufmann Publishers Inc."},{"key":"e_1_3_1_20_2","unstructured":"ManningC.D. and Sch\u00fctzeH. Foundations of Statistical Natural Language Processing. MIT Press Cambridge MA USA. (1999)."},{"key":"e_1_3_1_21_2","unstructured":"MarkantonatouS. RamischC. SavaryA. and VinczeV. editors (2017). Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017) Valencia Spain. Association for Computational Linguistics."},{"key":"e_1_3_1_22_2","unstructured":"MichielsA. and DufourN. 
Defi a tool for automatic multi-word unit recognition Meaning Assignment and Translation Selection. (1998)."},{"key":"e_1_3_1_23_2","unstructured":"PastorG. Manual de fraseolog\u00eda espa\u00f1ola. Gredos Madrid. (1996)."},{"key":"e_1_3_1_24_2","unstructured":"Pinto Avenda\u00f1oD. On clustering and evaluation of narrow domain short-text corpora Procesamiento del Lenguaje Natural 42. (2009)."},{"issue":"4","key":"e_1_3_1_25_2","doi-asserted-by":"crossref","DOI":"10.13053\/cys-19-4-2328","article-title":"Identification of verbal phraseological units in mexican news stories","volume":"19","author":"Priego S\u00e1nchez B.","year":"2015","unstructured":"Priego S\u00e1nchezB. and PintoD., Identification of verbal phraseological units in mexican news stories, Computaci\u00f3n y Sistemas19(4) (2015).","journal-title":"Computaci\u00f3n y Sistemas"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.3233\/JIFS-179009"},{"key":"e_1_3_1_27_2","first-page":"1","article-title":"Compositionality versus non-compositionality verification based on lexical domain for verbal phraseological units","volume":"34","author":"Priego S\u00e1nchez B.","year":"2018","unstructured":"Priego S\u00e1nchezB., PintoD. and SinghV., Compositionality versus non-compositionality verification based on lexical domain for verbal phraseological units, Journal of Intelligent & Fuzzy Systems34 (2018), 1\u20139.","journal-title":"Journal of Intelligent & Fuzzy Systems"},{"key":"e_1_3_1_28_2","unstructured":"RamischC. CordeiroS.R. SavaryA. VinczeV. BarbuV. MititeluA. BhatiaM. BuljanM. CanditoP. GantarV. GiouliT. G\u00fcng\u00f6rA. HawwariA. I\u00f1urrietaU. Kovalevskait\u0117J. KrekS. LichteT. LiebeskindC. MontiJ. Parra Escart\u00ednC. Qasemi ZadehB. RamischR. SchneiderN. StoyanovaI. VaidyaA. and WalshA. Edition 1.1 of the PARSEME shared task on automatic identification of verbal multiword expressions. 
In Proceedings of the Joint Workshop on Linguistic Annotation Multiword Expressions and Constructions (LAW-MWE-CxG-2018) (2018) pp. 222\u2013240 Santa Fe New Mexico USA. Association for Computational Linguistics."},{"key":"e_1_3_1_29_2","doi-asserted-by":"crossref","unstructured":"RamischC. de Medeiros CaseliH. VillavicencioA. MachadoA. and FinattoM.J. A hybrid approach for multiword expression identification. In T.A.S. Pardo A. Branco A. Klautau R. Vieira and V.L.S. de Lima editors Computational Processing of the Portuguese Language (2010) pp. 65\u201374 Berlin Heidelberg. Springer Berlin Heidelberg.","DOI":"10.1007\/978-3-642-12320-7_9"},{"key":"e_1_3_1_30_2","unstructured":"ResnikP. Using information content to evaluate semantic similarity in a taxonomy. In Proceedings of the 14th International Joint Conference on Artificial Intelligence - Volume 1 IJCAI\u201995 pages 448\u2013453 San Francisco CA USA. Morgan Kaufmann Publishers Inc. (1995)."},{"issue":"1993","key":"e_1_3_1_31_2","first-page":"143","article-title":"Retrieving collocations from text: Xtract","volume":"19","author":"Smadja F.","unstructured":"SmadjaF., Retrieving collocations from text: Xtract, Computational Linguistics19(1993), 143\u2013177.","journal-title":"Computational Linguistics"},{"key":"e_1_3_1_32_2","doi-asserted-by":"crossref","unstructured":"TapanainenP. PiitulainenJ. and J\u00e4rvinenT. Idiomatic object usage and support verbs. In Proceedings of the 17th International Conference on Computational Linguistics - Volume 2 COLING\u2019 98 (1998) pp. 1289\u20131293 Stroudsburg PA USA. Association for Computational Linguistics.","DOI":"10.3115\/980432.980779"},{"key":"e_1_3_1_33_2","unstructured":"TaslimipoorS. and RohanianO. SHOMA at PARSEME shared task on automatic identification of VMWEs: Neural multiword expression tagging with high generalisation. CoRR abs\/1809.03056. 
(2018)."},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.13053\/rcs-85-1-5"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1145\/2483691.2483695"},{"key":"e_1_3_1_36_2","doi-asserted-by":"crossref","unstructured":"WuZ. and PalmerM. Verbs semantics and lexical selection In Proceedings of the 32Nd Annual Meeting on Association for Computational Linguistics ACL \u201994 pp. 133\u2013138 Stroudsburg PA USA. Association for Computational Linguistics. (1994).","DOI":"10.3115\/981732.981751"},{"key":"e_1_3_1_37_2","doi-asserted-by":"crossref","unstructured":"YazdaniM. FarahmandM. and HendersonJ.H. Learning semantic composition to detect non-compositionality of multiword expressions. In EMNLP. (2015).","DOI":"10.18653\/v1\/D15-1201"}],"container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-179872","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.3233\/JIFS-179872","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-179872","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T09:42:07Z","timestamp":1777455727000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.3233\/JIFS-179872"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6,12]]},"references-count":36,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2020,8,31]]}},"alternative-id":["10.3233\/JIFS-179872"],"URL":"https:\/\/doi.org\/10.3233\/jifs-179872","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"subject"
:[],"published":{"date-parts":[[2020,6,12]]}}}