{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T03:06:07Z","timestamp":1777431967454,"version":"3.51.4"},"reference-count":47,"publisher":"SAGE Publications","issue":"4","license":[{"start":{"date-parts":[[2024,11,1]],"date-time":"2024-11-01T00:00:00Z","timestamp":1730419200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Applied Ontology"],"published-print":{"date-parts":[[2024,11]]},"abstract":"<jats:p>Classifying domain entities into their respective top-level ontology concepts is a complex problem that typically demands manual analysis and deep expertise in the domain of interest and ontology engineering. Using an efficient approach to classify domain entities enhances data integration, interoperability, and the semantic clarity of ontologies, which are crucial for structured knowledge representation and modeling. Based on this, our main motivation is to help an ontology engineer with an automated approach to classify domain entities into top-level ontology concepts using informal definitions of these domain entities during the ontology development process. In this context, we hypothesize that the informal definitions encapsulate semantic information crucial for associating domain entities with specific top-level ontology concepts. Our approach leverages state-of-the-art language models to explore our hypothesis across multiple languages and informal definitions from different knowledge resources. In order to evaluate our proposal, we extracted multi-label datasets from the alignment of the OntoWordNet ontology and the BabelNet semantic network, covering the entire structure of the Dolce-Lite-Plus top-level ontology from most generic to most specific concepts. These datasets contain several different textual representation approaches of domain entities, including terms, example sentences, and informal definitions. Our experiments conducted 3 study cases, investigating the effectiveness of our proposal across different textual representation approaches, languages, and knowledge resources. We demonstrate that the best results are achieved using a classification pipeline with a K-Nearest Neighbor (KNN) method to classify the embedding representation of informal definitions from the Mistral large language model. The findings underscore the potential of informal definitions in reflecting top-level ontology concepts and point towards developing automated tools that could significantly aid ontology engineers during the ontology development process.<\/jats:p>","DOI":"10.3233\/ao-240032","type":"journal-article","created":{"date-parts":[[2024,7,5]],"date-time":"2024-07-05T11:43:43Z","timestamp":1720179823000},"page":"311-333","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":1,"title":["How to Classify Domain Entities Into Top-Level Ontology Concepts Using Large Language Models"],"prefix":"10.1177","volume":"19","author":[{"given":"Alcides","family":"Lopes","sequence":"first","affiliation":[{"name":"Institute of Informatics, Universidade Federal do Rio Grande do Sul, Porto Alegre, 91501970, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Joel","family":"Carbonera","sequence":"additional","affiliation":[{"name":"Institute of Informatics, Universidade Federal do Rio Grande do Sul, Porto Alegre, 91501970, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fabricio","family":"Rodrigues","sequence":"additional","affiliation":[{"name":"Institute of Informatics, Universidade Federal do Rio Grande do Sul, Porto Alegre, 91501970, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Luan","family":"Garcia","sequence":"additional","affiliation":[{"name":"Institute of Informatics, Universidade Federal do Rio Grande do Sul, Porto Alegre, 91501970, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mara","family":"Abel","sequence":"additional","affiliation":[{"name":"Institute of Informatics, Universidade Federal do Rio Grande do Sul, Porto Alegre, 91501970, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2025,3,17]]},"reference":[{"key":"e_1_3_3_2_1","doi-asserted-by":"crossref","unstructured":"Arp R. Smith B. & Spear A.D. (2015). Building Ontologies with Basic Formal Ontology. Mit Press.","DOI":"10.7551\/mitpress\/9780262527811.001.0001"},{"key":"e_1_3_3_3_1","doi-asserted-by":"crossref","unstructured":"Babaei Giglou H. D\u2019Souza J. & Auer S. (2023). LLMs4OL: Large language models for ontology learning. In International Semantic Web Conference (pp. 408\u2013427). Springer.","DOI":"10.1007\/978-3-031-47240-4_22"},{"key":"e_1_3_3_4_1","doi-asserted-by":"publisher","DOI":"10.3233\/AO-210259"},{"key":"e_1_3_3_5_1","doi-asserted-by":"crossref","unstructured":"Chen J. He Y. Geng Y. Jim\u00e9nez-Ruiz E. Dong H. & Horrocks I. (2023). Contextual semantic embeddings for ontology subsumption prediction. World Wide Web 1\u201323.","DOI":"10.1007\/s11280-023-01169-9"},{"key":"e_1_3_3_6_1","doi-asserted-by":"publisher","unstructured":"Cicconeto F. Vieira L.V. Abel M. dos Santos Alvarenga R. Carbonera J.L. & Garcia L.F. (2022). GeoReservoir: An ontology for deep-marine depositional system geometry description. Computers & Geosciences 159 105005. doi:10.1016\/j.cageo.2021.105005.","DOI":"10.1016\/j.cageo.2021.105005"},{"key":"e_1_3_3_7_1","unstructured":"Devlin J. Chang M.-W. Lee K. & Toutanova K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint. arXiv:1810.04805."},{"key":"e_1_3_3_8_1","doi-asserted-by":"crossref","unstructured":"Gangemi A. Guarino N. Masolo C. Oltramari A. & Schneider L. (2002). Sweetening ontologies with DOLCE. In International Conference on Knowledge Engineering and Knowledge Management (pp. 166\u2013181). Springer.","DOI":"10.1007\/3-540-45810-7_18"},{"key":"e_1_3_3_9_1","doi-asserted-by":"publisher","unstructured":"Gangemi A. Navigli R. & Velardi P. (2003). The OntoWordNet project: Extension and axiomatization of conceptual relations in WordNet. In Meersman R. Tari Z. Schmidt D.C. (Eds.) On the Move to Meaningful Internet Systems 2003: CoopIS DOA and ODBASE Berlin Heidelberg: Springer Berlin Heidelberg (pp. 820\u2013838). doi:10.1007\/978-3-540-39964-3_52.","DOI":"10.1007\/978-3-540-39964-3_52"},{"key":"e_1_3_3_10_1","unstructured":"Guarino N. (1998). Formal Ontology in Information Systems: Proceedings of the First International Conference (FOIS\u201998) (Vol.\u00a046). IOS Press."},{"key":"e_1_3_3_11_1","doi-asserted-by":"crossref","unstructured":"Guarino N. Oberle D. & Staab S. (2009). What is an ontology? In Handbook on Ontologies.","DOI":"10.1007\/978-3-540-92673-3_0"},{"key":"e_1_3_3_12_1","doi-asserted-by":"publisher","DOI":"10.3233\/AO-210256"},{"key":"e_1_3_3_13_1","doi-asserted-by":"publisher","DOI":"10.1080\/00437956.1954.11659520"},{"key":"e_1_3_3_14_1","doi-asserted-by":"crossref","unstructured":"He Y. Chen J. Antonyrajah D. & Horrocks I. (2022). BERTMap: A BERT-based ontology alignment system. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol.\u00a036 pp. 5684\u20135691).","DOI":"10.1609\/aaai.v36i5.20510"},{"key":"e_1_3_3_15_1","doi-asserted-by":"crossref","unstructured":"He Y. Chen J. Dong H. Horrocks I. Allocca C. Kim T. & Sapkota B. (2023). DeepOnto: A Python Package for Ontology Engineering with Deep Learning.","DOI":"10.3233\/SW-243568"},{"key":"e_1_3_3_16_1","unstructured":"Jiang A.Q. Sablayrolles A. Mensch A. Bamford C. Chaplot D.S. de las Casas D. Bressand F. Lengyel G. Lample G. Saulnier L. Lavaud L.R. Lachaux M.-A. Stock P. Scao T.L. Lavril T. Wang T. Lacroix T. & Sayed W.E. (2023). Mistral 7B."},{"key":"e_1_3_3_17_1","unstructured":"Jiang A.Q. Sablayrolles A. Roux A. Mensch A. Savary B. Bamford C. Chaplot D.S. de las Casas D. Hanna E.B. Bressand F. Lengyel G. Bour G. Lample G. Lavaud L.R. Saulnier L. Lachaux M.-A. Stock P. Subramanian S. Yang S. Antoniak S. Scao T.L. Gervet T. Lavril T. Wang T. Lacroix T. & Sayed W.E. (2024). Mixtral of Experts."},{"key":"e_1_3_3_18_1","doi-asserted-by":"publisher","unstructured":"Khadir A.C. Aliane H. & Guessoum A. (2021). Ontology learning: Grand tour and challenges. Computer Science Review 39 100339. doi:10.1016\/j.cosrev.2020.100339.","DOI":"10.1016\/j.cosrev.2020.100339"},{"key":"e_1_3_3_19_1","unstructured":"Kulvatunyou B. Drobnjakovic M. Ameri F. Will C. & Smith B. (2022). The Industrial Ontologies Foundry (IOF) Core Ontology. Formal Ontologies Meet Industry (FOMI) 2022 Tarbes FR. https:\/\/tsapps.nist.gov\/publication\/get_pdf.cfm?pub_id=935068."},{"key":"e_1_3_3_20_1","unstructured":"Lan Z. Chen M. Goodman S. Gimpel K. Sharma P. & Soricut R. (2019). Albert: A lite bert for self-supervised learning of language representations. arXiv preprint. arXiv:1909.11942."},{"key":"e_1_3_3_21_1","doi-asserted-by":"crossref","unstructured":"Lewis M. Liu Y. Goyal N. Ghazvininejad M. Mohamed A. Levy O. Stoyanov V. & Zettlemoyer L. (2019). Bart: Denoising sequence-to-sequence pre-training for natural language generation translation and comprehension. arXiv preprint. arXiv:1910.13461.","DOI":"10.18653\/v1\/2020.acl-main.703"},{"key":"e_1_3_3_22_1","unstructured":"Liu Y. Ott M. Goyal N. Du J. Joshi M. Chen D. Levy O. Lewis M. Zettlemoyer L. & Stoyanov V. (2019). Roberta: A robustly optimized bert pretraining approach. arXiv preprint. arXiv:1907.11692."},{"key":"e_1_3_3_23_1","doi-asserted-by":"publisher","unstructured":"Lopes A. Carbonera J. Schmidt D. Garcia L. Rodrigues F. & Abel M. (2023). Using terms and informal definitions to classify domain entities into top-level ontology concepts: An approach based on language models. Knowledge-Based Systems 265 110385. doi:10.1016\/j.knosys.2023.110385.","DOI":"10.1016\/j.knosys.2023.110385"},{"key":"e_1_3_3_24_1","doi-asserted-by":"publisher","unstructured":"Lopes A. Carbonera J.L. Schimidt D. & Abel M. (2022). Predicting the top-level ontological concepts of domain entities using word embeddings informal definitions and deep learning. Expert Systems with Applications 203 117291. doi:10.1016\/j.eswa.2022.117291.","DOI":"10.1016\/j.eswa.2022.117291"},{"key":"e_1_3_3_25_1","unstructured":"Mikolov T. Chen K. Corrado G. & Dean J. (2013b). Efficient estimation of word representations in vector space. In Proceedings of the International Conference on Learning Representations (ICLR)."},{"key":"e_1_3_3_26_1","unstructured":"Mikolov T. Sutskever I. Chen K. Corrado G.S. & Dean J. (2013a). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems (pp. 3111\u20133119)."},{"key":"e_1_3_3_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/219717.219748"},{"key":"e_1_3_3_28_1","doi-asserted-by":"crossref","unstructured":"Navigli R. Bevilacqua M. Conia S. Montagnini D. & Cecconi F. (2021). Ten years of BabelNet: A survey. In IJCAI (pp. 4559\u20134567).","DOI":"10.24963\/ijcai.2021\/620"},{"key":"e_1_3_3_29_1","doi-asserted-by":"publisher","DOI":"10.1162\/089120104323093276"},{"key":"e_1_3_3_30_1","doi-asserted-by":"crossref","unstructured":"Niles I. & Pease A. (2001). Towards a standard upper ontology. In Proceedings of the International Conference on Formal Ontology in Information Systems-Volume 2001 (pp. 2\u20139).","DOI":"10.1145\/505168.505170"},{"key":"e_1_3_3_31_1","doi-asserted-by":"publisher","DOI":"10.3233\/AO-220262"},{"key":"e_1_3_3_32_1","doi-asserted-by":"crossref","unstructured":"Pennington J. Socher R. & Manning C.D. (2014). Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1532\u20131543).","DOI":"10.3115\/v1\/D14-1162"},{"key":"e_1_3_3_33_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.robot.2013.04.005"},{"key":"e_1_3_3_34_1","doi-asserted-by":"publisher","unstructured":"Qu Y. Perrin M. Torabi A. Abel M. & Giese M. (2024). GeoFault: A well-founded fault ontology for interoperability in geological modeling. Computers & Geosciences 182 105478. doi:10.1016\/j.cageo.2023.105478.","DOI":"10.1016\/j.cageo.2023.105478"},{"key":"e_1_3_3_35_1","unstructured":"Radford A. Narasimhan K. Salimans T. & Sutskever I. (2018). Improving language understanding by generative pre-training. https:\/\/s3-us-west-2.amazonaws.com\/openai-assets\/research-covers\/language-unsupervised\/language_understanding_paper.pdf."},{"key":"e_1_3_3_36_1","unstructured":"Raffel C. Shazeer N. Roberts A. Lee K. Narang S. Matena M. Zhou Y. Li W. & Liu P.J. (2023). Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer."},{"key":"e_1_3_3_37_1","unstructured":"Robinson R. (1950). Definition. Oxford: Clarendon Press."},{"key":"e_1_3_3_38_1","doi-asserted-by":"crossref","unstructured":"Rodrigues F.H. Lopes A.G. dos Santos N.O. Garcia L.F. Carbonera J.L. & Abel M. (2023). On the use of ChatGPT for classifying domain terms according to upper ontologies. In International Conference on Conceptual Modeling (pp. 249\u2013258). Springer.","DOI":"10.1007\/978-3-031-47112-4_24"},{"key":"e_1_3_3_39_1","unstructured":"Sanh V. Debut L. Chaumond J. & Wolf T. (2020). DistilBERT a distilled version of BERT: Smaller faster cheaper and lighter."},{"key":"e_1_3_3_40_1","first-page":"173","article-title":"Definitions in ontologies","volume":"2016","author":"Sepp\u00e4l\u00e4 S.","year":"2016","unstructured":"Sepp\u00e4l\u00e4 S., Ruttenberg A., Schreiber Y. & Smith B. (2016). Definitions in ontologies. Cahiers de Lexicologie, 2016, 173\u2013205.","journal-title":"Cahiers de Lexicologie"},{"key":"e_1_3_3_41_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0169-023X(97)00056-6"},{"key":"e_1_3_3_42_1","doi-asserted-by":"crossref","unstructured":"Su\u00e1rez-Figueroa M.C. G\u00f3mez-P\u00e9rez A. & Fern\u00e1ndez-L\u00f3pez M. (2011). The NeOn methodology for ontology engineering. In Ontology Engineering in a Networked World (pp. 9\u201334). Springer.","DOI":"10.1007\/978-3-642-24794-1_2"},{"key":"e_1_3_3_43_1","doi-asserted-by":"publisher","DOI":"10.3233\/AO-150145"},{"key":"e_1_3_3_44_1","unstructured":"Team G. Mesnard T. Hardin C. Dadashi R. Bhupatiraju S. Pathak S. Sifre L. Rivi\u00e8re M. Kale M.S. Love J. et al.\u00a0(2024). Gemma: Open models based on gemini research and technology. arXiv preprint. arXiv:2403.08295."},{"key":"e_1_3_3_45_1","unstructured":"Touvron H. Lavril T. Izacard G. Martinet X. Lachaux M.-A. Lacroix T. Rozi\u00e8re B. Goyal N. Hambro E. Azhar F. Rodriguez A. Joulin A. Grave E. & Lample G. (2023b). LLaMA: Open and Efficient Foundation Language Models."},{"key":"e_1_3_3_46_1","unstructured":"Touvron H. Martin L. Stone K. Albert P. Almahairi A. Babaei Y. Bashlykov N. Batra S. Bhargava P. Bhosale S. Bikel D. Blecher L. Ferrer C.C. Chen M. Cucurull G. Esiobu D. Fernandes J. Fu J. Fu W. Fuller B. Gao C. Goswami V. Goyal N. Hartshorn A. Hosseini S. Hou R. Inan H. Kardas M. Kerkez V. Khabsa M. Kloumann I. Korenev A. Koura P.S. Lachaux M.-A. Lavril T. Lee J. Liskovich D. Lu Y. Mao Y. Martinet X. Mihaylov T. Mishra P. Molybog I. Nie Y. Poulton A. Reizenstein J. Rungta R. Saladi K. Schelten A. Silva R. Smith E.M. Subramanian R. Tan X.E. Tang B. Taylor R. Williams A. Kuan J.X. Xu P. Yan Z. Zarov I. Zhang Y. Fan A. Kambadur M. Narang S. Rodriguez A. Stojnic R. Edunov S. & Scialom T. (Eds.) (2023a). Llama 2: Open Foundation and Fine-Tuned Chat Models."},{"key":"e_1_3_3_47_1","unstructured":"Vaswani A. Shazeer N. Parmar N. Uszkoreit J. Jones L. Gomez A.N. Kaiser \u0141. & Polosukhin I. (2017). Attention is all you need. Advances in neural information processing systems 30."},{"key":"e_1_3_3_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/2333112.2333115"}],"container-title":["Applied Ontology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/AO-240032","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.3233\/AO-240032","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/AO-240032","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T11:46:30Z","timestamp":1777376790000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.3233\/AO-240032"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11]]},"references-count":47,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,11]]}},"alternative-id":["10.3233\/AO-240032"],"URL":"https:\/\/doi.org\/10.3233\/ao-240032","relation":{},"ISSN":["1570-5838","1875-8533"],"issn-type":[{"value":"1570-5838","type":"print"},{"value":"1875-8533","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,11]]}}}