{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,3,7]],"date-time":"2024-03-07T01:24:28Z","timestamp":1709774668845},"reference-count":24,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2010,8,20]],"date-time":"2010-08-20T00:00:00Z","timestamp":1282262400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/2.0"},{"start":{"date-parts":[[2010,8,20]],"date-time":"2010-08-20T00:00:00Z","timestamp":1282262400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/2.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Braz Comput Soc"],"published-print":{"date-parts":[[2010,11]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>The need for domain ontologies motivates the research on structured information extraction from texts. A\u00a0foundational part of this process is the identification of domain relevant compound terms. This paper presents an evaluation of compound terms extraction from a <jats:italic>corpus<\/jats:italic> of the domain of Pediatrics. Bigrams and trigrams were automatically extracted from a <jats:italic>corpus<\/jats:italic> composed by 283 texts from a Portuguese journal, Jornal de Pediatria, using three different extraction methods. Considering that these methods generate an elevated number of candidates, we analyzed the quality of the resulting terms according to different methods and cut-off points. The evaluation is reported by metrics such as precision, recall and f-measure, which are computed on the basis of a hand-made reference list of domain relevant compounds.<\/jats:p>","DOI":"10.1007\/s13173-010-0020-4","type":"journal-article","created":{"date-parts":[[2010,8,19]],"date-time":"2010-08-19T21:42:53Z","timestamp":1282254173000},"page":"247-259","update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Extracting compound terms from domain corpora"],"prefix":"10.1007","volume":"16","author":[{"given":"Lucelene","family":"Lopes","sequence":"first","affiliation":[]},{"given":"Renata","family":"Vieira","sequence":"additional","affiliation":[]},{"given":"Maria Jos\u00e9","family":"Finatto","sequence":"additional","affiliation":[]},{"given":"Daniel","family":"Martins","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2010,8,20]]},"reference":[{"key":"20_CR1","first-page":"380","volume":"4139","author":"S Aubin","year":"2006","unstructured":"Aubin S, Hamon T (2006) Improving term extraction with terminological resources. In: FinTAL 2006. LNAI, vol\u00a04139, pp\u00a0380\u2013387","journal-title":"LNAI, vol"},{"key":"20_CR2","doi-asserted-by":"crossref","unstructured":"Baptista J, Batista F, Mamede N (2006) Building a dictionary of anthroponyms. In: Proceedings of 7th PROPOR","DOI":"10.1007\/11751984_3"},{"key":"20_CR3","series-title":"Proceedings of the 4th LREC","first-page":"1313","volume-title":"BootCaT: Bootstrapping Corpora and Terms from the Web","author":"M Baroni","year":"2004","unstructured":"Baroni M, Bernadini S (2004) BootCaT: Bootstrapping Corpora and Terms from the Web. In: Proceedings of the 4th LREC, pp\u00a01313\u20131316"},{"key":"20_CR4","unstructured":"Bas\u00e9gio T (2006) Uma abordagem semi-autom\u00e1tica para identifica\u00e7\u00e3o de estruturas ontol\u00f3gicas a partir de textos na l\u00edngua Portuguesa do Brasil. Dissertation (MSc), PUCRS"},{"key":"20_CR5","unstructured":"Bick E (2000) The parsing system \u201cPalavras\u201d: automatic grammatical analysis of Portuguese in a constraint grammar framework. PhD thesis, Arhus University"},{"key":"20_CR6","unstructured":"Bourigault D (2002) UPERY: un outil d\u2019analyse distributionnelle \u00e9tendue pour la construction d\u2019ontologies a partir de corpus. In: TALN, Nancy"},{"issue":"1","key":"20_CR7","first-page":"1","volume":"43","author":"D Bourigault","year":"2002","unstructured":"Bourigault D, Lame G (2002) Analyse distributionnelle et structuration de terminologie\u2014application a la construction d\u2019une ontologie documentaire du Droit. In: TAL, vol\u00a043(1), pp\u00a01\u201322","journal-title":"TAL"},{"key":"20_CR8","unstructured":"Bourigault D, Fabre C, Fr\u00e9rot C, Jacques M, Ozdowska S (2005) SYNTEX, analyseur syntaxique de\u00a0corpus. In: TALN, Dourdan"},{"key":"20_CR9","series-title":"Frontiers in artificial intelligence and applications","volume-title":"Ontology learning from text: methods, evaluation and applications","author":"P Buitelaar","year":"2005","unstructured":"Buitelaar P, Cimiano P, Magnini B (2005) Ontology learning from text: An overview. In: Buitelaar P, Cimiano P, Magnini B (eds) Ontology learning from text: methods, evaluation and applications. Frontiers in artificial intelligence and applications, vol 123. IOS Press, Amsterdam"},{"key":"20_CR10","unstructured":"Coulthard RJ (2005) The application of corpus methodology to translation: the JPED parallel corpus and the pediatrics comparable corpus. Dissertation (MSc), UFSC"},{"key":"20_CR11","first-page":"626","volume":"5351","author":"B Fortuna","year":"2008","unstructured":"Fortuna B, Lavrac N, Velardi P (2008) Advancing topic ontology learning through term extraction. In: PRICAI 2008. LNAI, vol\u00a05351, pp\u00a0626\u2013635","journal-title":"LNAI"},{"key":"20_CR12","doi-asserted-by":"crossref","unstructured":"Hulth A (2004) Enhancing linguistically oriented automatic keyword extraction. In: HLT-NAACL, ACL","DOI":"10.3115\/1613984.1613989"},{"key":"20_CR13","unstructured":"Ide N, Bonhomme P, Romary L (2000) Xces: An xml-based encoding standart for linguistic corpora. In: Proceedings of the second LREC"},{"key":"20_CR14","series-title":"Proceedings of the 13th ACM CIKM","first-page":"615","volume-title":"Distributional term representations: an experimental comparison","author":"A Lavelli","year":"2004","unstructured":"Lavelli A, Sebastiani F, Zanoli R (2004) Distributional term representations: an experimental comparison. In: Proceedings of the 13th ACM CIKM, pp\u00a0615\u2013624"},{"issue":"1","key":"20_CR15","first-page":"72","volume":"3","author":"L Lopes","year":"2009","unstructured":"Lopes L, Vieira R, Finatto MJ, Zanette A, Martins D, Ribeiro LC Jr (2009) Automatic extraction of composite terms for construction of ontologies: an experiment in the health care area. RECIIS\u2014Electron J Commun Inf Innov Health 3(1):72\u201384","journal-title":"RECIIS\u2014Electron J Commun Inf Innov Health"},{"key":"20_CR16","unstructured":"Lopes L, Fernandes P, Vieira R, Fedrizzi G (2009) ExATOlp: An automatic tool for term extraction from Portuguese language corpora. In: Proceedings of the fourth language & technology conference: human language technologies as a challenge for computer science and linguistics, LTC\u201909, Faculty of Mathematics and Computer Science of Adam Mickiewicz University, November, 2009, pp\u00a0427\u2013431"},{"key":"20_CR17","unstructured":"Lopes L, Oliveira LH, Vieira R. (2010) Portuguese term extraction methods: comparing linguistic and statistical approaches. In: Proceedings of the 9th PROPOR"},{"key":"20_CR18","doi-asserted-by":"crossref","unstructured":"Maedche A, Staab S (2000) Semi-automatic engineering of ontologies from text. In: Proceedings of the 12th SEKE","DOI":"10.1007\/3-540-39967-4_14"},{"key":"20_CR19","volume-title":"Foundations of statistical natural language processing","author":"CD Manning","year":"1999","unstructured":"Manning CD Schutze H (1999) Foundations of statistical natural language processing. MIT Press, Cambridge"},{"key":"20_CR20","unstructured":"Navigli R, Velardi P (2002) Semantic interpretation of terminological strings. In: Proceedings of the 6th TKE, INIST-CNRS, Vandoeuvre-l\u00e8s-Nancy, France"},{"key":"20_CR21","series-title":"Studies in fuzziness and soft computing","volume-title":"Knowlodge mining","author":"MT Pazienza","year":"2005","unstructured":"Pazienza MT, Pennacchiotti M, Zanzotto FM (2005) Terminology extraction: an analysis of linguistic and statistical approaches. In: Sirmakessis S (ed) Knowlodge mining. Studies in fuzziness and soft computing, vol 185. Springer, Berlin"},{"key":"20_CR22","doi-asserted-by":"crossref","unstructured":"Park Y, Bird R, Bougarev B (2002) Automatic glossary extraction: Beyond terminology identification. In: Proceedings of the 19th COLING, Taipei, Taiwan","DOI":"10.3115\/1072228.1072370"},{"key":"20_CR23","unstructured":"Ribeiro LC (2008) OntoLP: Constru\u00e7\u00e3o semi-autom\u00e1tica de ontologias a partir de textos da l\u00edngua portuguesa. Dissertation (MSc), UNISINOS"},{"key":"20_CR24","unstructured":"Suchanek FM, Ifrim G, Andweikum G (2006) Leila: Learning to extract information by linguistic analysis. In: Proceedings of the 2nd workshop on ontology learning and population. Association for computational linguistics"}],"container-title":["Journal of the Brazilian Computer Society"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13173-010-0020-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s13173-010-0020-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s13173-010-0020-4","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13173-010-0020-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,8,31]],"date-time":"2021-08-31T21:02:24Z","timestamp":1630443744000},"score":1,"resource":{"primary":{"URL":"https:\/\/journal-bcs.springeropen.com\/articles\/10.1007\/s13173-010-0020-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,8,20]]},"references-count":24,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2010,11]]}},"alternative-id":["20"],"URL":"https:\/\/doi.org\/10.1007\/s13173-010-0020-4","relation":{},"ISSN":["0104-6500","1678-4804"],"issn-type":[{"value":"0104-6500","type":"print"},{"value":"1678-4804","type":"electronic"}],"subject":[],"published":{"date-parts":[[2010,8,20]]},"assertion":[{"value":"10 February 2010","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 August 2010","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 August 2010","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}