{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,12]],"date-time":"2026-05-12T20:15:26Z","timestamp":1778616926332,"version":"3.51.4"},"reference-count":44,"publisher":"Cambridge University Press (CUP)","issue":"4","license":[{"start":{"date-parts":[[2020,1,28]],"date-time":"2020-01-28T00:00:00Z","timestamp":1580169600000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/www.cambridge.org\/core\/terms"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2020,7]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In this paper, we present two approaches and the implemented system for bilingual terminology extraction that rely on an aligned bilingual domain corpus, a terminology extractor for a target language, and a tool for chunk alignment. The two approaches differ in the way terminology for the source language is obtained: the first relies on an existing domain terminology lexicon, while the second one uses a term extraction tool. For both approaches, four experiments were performed with two parameters being varied. In the experiments presented in this paper, the source language was English, and the target language Serbian, and a selected domain was Library and Information Science, for which an aligned corpus exists, as well as a bilingual terminological dictionary. For term extraction, we used the <jats:sc>FlexiTerm<\/jats:sc> tool for the source language and a shallow parser for the target language, while for word alignment we used GIZA++. The evaluation results show that for the first approach the F<jats:sup>1<\/jats:sup> score varies from 29.43% to 51.15%, while for the second it varies from 61.03% to 71.03%. On the basis of the evaluation results, we developed a binary classifier that decides whether a candidate pair, composed of aligned source and target terms, is valid. We trained and evaluated different classifiers on a list of manually labeled candidate pairs obtained after the implementation of our extraction system. The best results in a fivefold cross-validation setting were achieved with the Radial Basis Function Support Vector Machine classifier, giving a F<jats:sup>1<\/jats:sup> score of 82.09% and accuracy of 78.49%.<\/jats:p>","DOI":"10.1017\/s1351324919000615","type":"journal-article","created":{"date-parts":[[2020,1,28]],"date-time":"2020-01-28T12:49:20Z","timestamp":1580215760000},"page":"455-479","source":"Crossref","is-referenced-by-count":6,"title":["Two approaches to compilation of bilingual multi-word terminology lists from lexical resources"],"prefix":"10.1017","volume":"26","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2714-427X","authenticated-orcid":false,"given":"Branislava","family":"\u0160andrih","sequence":"first","affiliation":[]},{"given":"Cvetana","family":"Krstev","sequence":"additional","affiliation":[]},{"given":"Ranka","family":"Stankovi\u0107","sequence":"additional","affiliation":[]}],"member":"56","published-online":{"date-parts":[[2020,1,28]]},"reference":[{"key":"S1351324919000615_ref44","doi-asserted-by":"publisher","DOI":"10.1186\/s12859-015-0606-0"},{"key":"S1351324919000615_ref43","unstructured":"Vitas, D. , Popovi\u0107, L. , Krstev, C. , Obradovi\u0107, I. , Pavlovi\u0107 La\u017aeti\u0107, G. and Stanojevi\u0107 M. (2012). Srpski jezik u digitalnom dobu \u2013 The Serbian Language in the Digital Age. META-NET White Paper Series. Rehm, G. and Uszkoreit, H. (Series eds). Springer. Available at http:\/\/www.meta-net.eu\/whitepapers"},{"key":"S1351324919000615_ref42","volume-title":"Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC\u201908)","author":"Vintar","year":"2008"},{"key":"S1351324919000615_ref41","unstructured":"Tsvetkov, Y. and Wintner, S. (2010). Extraction of multi-word expressions from small parallel corpora. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters, COLING \u201910, pp. 1256\u20131264, Stroudsburg, PA, USA. Association for Computational Linguistics."},{"key":"S1351324919000615_ref40","unstructured":"Thurmair, G. and Aleksi\u0107, V. (2012). Creating term and lexicon entries from phrase tables. In Proceedings of the 16th Conference of the European Association for Machine Translation (EAMT 2012), Trento, Italy."},{"key":"S1351324919000615_ref39","unstructured":"Stankovi\u0107, R. , Obradovi\u0107, I. , Krstev, C. and Vitas, D. (2011). Production of morphological dictionaries of multi-word units using a multipurpose tool. In Jassem, K. , Fuglewicz, P. W. , Piasecki, M. and Przepirkowski, A. (eds), Proceedings of the Computational Linguistics-Applications Conference, October 17\u201319, 2011. Jachranka, Poland, pp. 77\u201384, Polish Information Processing Society."},{"key":"S1351324919000615_ref38","unstructured":"Stankovi\u0107, R. , Krstev, C. , Lazi\u0107, B. and Vorkapi\u0107, D. (2015). A bilingual digital library for academic and entrepreneurial knowledge management. In Proceeding of 10th International Forum on Knowledge Asset Dynamics \u2013 IFKAD 2015: Culture, Innovation and Entrepreneurship: connecting the knowledge dots, Bari, Italy, 10\u201312 June 2015, pp. 1778\u20131788. Bari (2015). ISSN: 2280-787X"},{"key":"S1351324919000615_ref36","unstructured":"Stankovi\u0107, R. , Krstev, C. , Obradovi\u0107, I. , Lazi\u0107, B. and Trtovac, A. (2016). Rule-based automatic multi-word term extraction and lemmatization. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), Paris, France: European Language Resources Association (ELRA). Available at http:\/\/www.lrec-conf.org\/proceedings\/lrec2016\/pdf\/1033_Paper.pdf"},{"key":"S1351324919000615_ref35","doi-asserted-by":"publisher","DOI":"10.1186\/2041-1480-4-27"},{"key":"S1351324919000615_ref34","unstructured":"Semmar, N. (2018). A hybrid approach for automatic extraction of bilingual multiword expressions from parallel corpora. In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018), Paris, France. European Language Resources Association (ELRA). Available at http:\/\/www.lrec-conf.org\/proceedings\/lrec2018\/pdf\/958.pdf"},{"key":"S1351324919000615_ref33","first-page":"41","volume-title":"IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence","volume":"3","author":"Rish","year":"2001"},{"key":"S1351324919000615_ref32","unstructured":"Princeton University (2010). About WordNet. Princeton University."},{"key":"S1351324919000615_ref31","unstructured":"Pinnis, M. , Ljube\u0161i\u0107, N. , Stefanescu, D. , Skadina, I. , Tadi\u0107, M. and Gornostay, T. (2012). Term extraction, tagging, and mapping tools for under-resourced languages. In Proceedings of the 10th Conference on Terminology and Knowledge Engineering (TKE 2012), June, pp. 20-21."},{"key":"S1351324919000615_ref30","unstructured":"Pianta, E. , Girardi, C. and Zanoli, R. (2008). The TextPro tool suite. In: Proceedings of 6th edition of the Language Resources and Evaluation Conference."},{"key":"S1351324919000615_ref29","unstructured":"Papineni, K. , Roukos, S. , Ward, T. and Zhu, W. J. (2002). BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 311\u2013318. Association for Computational Linguistics."},{"key":"S1351324919000615_ref28","doi-asserted-by":"publisher","DOI":"10.1007\/s10590-017-9201-7"},{"key":"S1351324919000615_ref26","first-page":"317","article-title":"Bilingual lexicon extraction from arabic-english parallel corpora with a view to machine translation.","volume":"7","author":"Sabtan","year":"2016","journal-title":"Arab World English Journal"},{"key":"S1351324919000615_ref25","first-page":"18","article-title":"Classification and regression by random forest","volume":"2","author":"Liaw","year":"2002","journal-title":"R News"},{"key":"S1351324919000615_ref24","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-45563-0_46"},{"key":"S1351324919000615_ref23","unstructured":"Krstev, C. , \u0160andrih, B. , Stankovi\u0107, R. and Mladenovi\u0107, M. (2018). Using english baits to catch serbian multi-word terminology. In Chair, N. C. C. , Choukri, K. , Cieri, C. , Declerck, T. , Goggi, S. , Hasida, K. , Isahara, H. Maegaard, B. , Mariani, J. , Mazo, H. , Moreno, A. , Odijk, J. , Piperidis, S. and Tokunaga, T. (eds), Proceedings of the11th International Conference on Language Resources and Evaluation (LREC 2018), Paris, France: European Language Resources Association (ELRA). http:\/\/www.lrec-conf.org\/proceedings\/lrec2018\/pdf\/384.pdf"},{"key":"S1351324919000615_ref21","volume-title":"Processing of Serbian. Automata, Texts and Electronic Dictionaries","author":"Krstev","year":"2008"},{"key":"S1351324919000615_ref20","unstructured":"Kova\u010devi\u0107, L. , Begeni\u0161i\u0107, D. D. and Injac-Malba\u0161a V. (2014). Dictionary of Library and Information Sciences."},{"key":"S1351324919000615_ref17","first-page":"137","volume-title":"European Conference on Machine Learning","author":"Joachims","year":"1998"},{"key":"S1351324919000615_ref15","doi-asserted-by":"publisher","DOI":"10.1002\/9781118548387"},{"key":"S1351324919000615_ref9","doi-asserted-by":"publisher","DOI":"10.1214\/aos\/1013203451"},{"key":"S1351324919000615_ref12","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324915000431"},{"key":"S1351324919000615_ref4","first-page":"267","volume":"2","author":"Baldwin","year":"2010","journal-title":"Handbook of Natural Language Processing"},{"key":"S1351324919000615_ref3","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324917000195"},{"key":"S1351324919000615_ref2","volume-title":"The English Language in the Digital Age","author":"Ananiadou","year":"2012"},{"key":"S1351324919000615_ref37","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-53640-8_10"},{"key":"S1351324919000615_ref1","first-page":"402","article-title":"Extracting bilingual terminologies from comparable corpora.","volume":"1","author":"Aker","year":"2013","journal-title":"Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics"},{"key":"S1351324919000615_ref7","unstructured":"Eibe, F. , Hall, M. and Witten, I. (2016). The WEKA Workbench. Online Appendix for \u201cData Mining: Practical Machine Learning Tools and Techniques\u201d, Morgan Kaufmann, Fourth edition."},{"key":"S1351324919000615_ref13","volume-title":"Proceedings of the 17th International Conference on Intelligent Text Processing and Computational Linguistics (CICLING 2016)","author":"Hamon","year":"2016"},{"key":"S1351324919000615_ref27","unstructured":"Och, F. J. and Ney, H. (2000). Improved statistical alignment models. In 38 th Annual Meeting on Association for Computational Linguistics, pp. 440\u2013447. Stroudsburg, PA: Association for Computational Linguistics."},{"key":"S1351324919000615_ref8","doi-asserted-by":"publisher","DOI":"10.4000\/books.aaccademia.1473"},{"key":"S1351324919000615_ref16","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324916000127"},{"key":"S1351324919000615_ref19","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-11397-5_4"},{"key":"S1351324919000615_ref18","first-page":"177","volume-title":"Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions","author":"Koehn","year":"2007"},{"key":"S1351324919000615_ref10","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-10745-0_61"},{"key":"S1351324919000615_ref5","volume-title":"Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC\u201912)","author":"Bouamor","year":"2012"},{"key":"S1351324919000615_ref14","unstructured":"Hazem, A. and Morin, E. (2016). Efficient data selection for bilingual terminology extraction from comparable corpora. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pp. 3401\u20133411."},{"key":"S1351324919000615_ref22","unstructured":"Krstev, C. , (2014). Serbian WordNet. University of Belgrade, HLT Group and JeRTeh. Available at http:\/\/korpus.matf.bg.ac.rs\/r22"},{"key":"S1351324919000615_ref6","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-4003"},{"key":"S1351324919000615_ref11","first-page":"327","volume-title":"Cognitive Studies \u2013 \u00c9tudes Cognitives","volume":"15","author":"Garab\u00edk","year":"2015"}],"container-title":["Natural Language Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324919000615","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,6,10]],"date-time":"2020-06-10T13:27:18Z","timestamp":1591795638000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324919000615\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,1,28]]},"references-count":44,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,7]]}},"alternative-id":["S1351324919000615"],"URL":"https:\/\/doi.org\/10.1017\/s1351324919000615","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"value":"1351-3249","type":"print"},{"value":"1469-8110","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,1,28]]}}}