{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,20]],"date-time":"2026-05-20T21:20:34Z","timestamp":1779312034914,"version":"3.51.4"},"reference-count":65,"publisher":"Cambridge University Press (CUP)","issue":"1","license":[{"start":{"date-parts":[[2018,8,6]],"date-time":"2018-08-06T00:00:00Z","timestamp":1533513600000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/www.cambridge.org\/core\/terms"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2019,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>This work focuses on the rapid development of linguistic annotation tools for low-resource languages (languages that have no labeled training data). We experiment with several cross-lingual annotation projection methods using recurrent neural networks (RNN) models. The distinctive feature of our approach is that our multilingual word representation requires only a parallel corpus between source and target languages. More precisely, our approach has the following characteristics: (a) it does not use word alignment information, (b) it does not assume any knowledge about target languages (one requirement is that the two languages (source and target) are not too syntactically divergent), which makes it applicable to a wide range of low-resource languages, (c) it provides authentic multilingual taggers (one tagger for<jats:italic>N<\/jats:italic>languages). We investigate both uni and bidirectional RNN models and propose a method to include external information (for instance, low-level information from part-of-speech tags) in the RNN to train higher level taggers (for instance, Super Sense taggers). We demonstrate the validity and genericity of our model by using parallel corpora (obtained by manual or automatic translation). 
Our experiments are conducted to induce cross-lingual part-of-speech and Super Sense taggers. We also use our approach in a weakly supervised context, and it shows an excellent potential for very low-resource settings (less than 1k training utterances).<\/jats:p>","DOI":"10.1017\/s1351324918000293","type":"journal-article","created":{"date-parts":[[2018,8,6]],"date-time":"2018-08-06T09:29:45Z","timestamp":1533547785000},"page":"43-67","source":"Crossref","is-referenced-by-count":12,"title":["A neural approach for inducing multilingual resources and natural language processing tools for low-resource languages"],"prefix":"10.1017","volume":"25","author":[{"given":"O.","family":"ZENNAKI","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"N.","family":"SEMMAR","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"L.","family":"BESACIER","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"56","published-online":{"date-parts":[[2018,8,6]]},"reference":[{"key":"S1351324918000293_ref025","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-24797-2_2"},{"key":"S1351324918000293_ref065","doi-asserted-by":"crossref","unstructured":"Yarowsky D. , Ngai G. , and Wicentowski R. 2001. Inducing multilingual text analysis tools via robust projection across aligned corpora. In Proceedings of the 1st International Conference on Human Language Technology Research, pp. 
1\u20138.","DOI":"10.3115\/1072133.1072187"},{"key":"S1351324918000293_ref022","doi-asserted-by":"crossref","DOI":"10.4324\/9781315841366","volume-title":"Corpus Annotation: Linguistic Information from Computer Text Corpora","author":"Garside","year":"1997"},{"key":"S1351324918000293_ref061","first-page":"111","article-title":"Annotation automatique de corpus: panorama et \u00e9tat de la technique","volume":"4","author":"Veronis","year":"2000","journal-title":"Ing\u00e9nierie des langues"},{"key":"S1351324918000293_ref055","unstructured":"Sutskever I. , Vinyals O. , and Le Q. V. 2014. Sequence to sequence learning with neural networks. In Proceedings of the Advances in Neural Information Processing Systems, pp. 3104\u20133112."},{"key":"S1351324918000293_ref059","unstructured":"Titov I. , and Klementiev A. 2012. Crosslingual induction of semantic roles. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, vol. 1, pp. 647\u2013656."},{"key":"S1351324918000293_ref058","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1162\/tacl_a_00205","article-title":"Token and type constraints for cross-lingual part-of-speech tagging","volume":"1","author":"T\u00e4ckstr\u00f6m","year":"2013","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"S1351324918000293_ref057","unstructured":"T\u00e4ckstr\u00f6m O. , McDonald R. , and Nivre J. 2013. Target language adaptation of discriminative transfer parsers. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, pp. 1061\u20131071."},{"key":"S1351324918000293_ref056","unstructured":"T\u00e4ckstr\u00f6m O. , McDonald R. , and Uszkoreit J. 2012. Cross-lingual word clusters for direct transfer of linguistic structure. 
In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, pp. 477\u2013487."},{"key":"S1351324918000293_ref052","doi-asserted-by":"publisher","DOI":"10.1109\/78.650093"},{"key":"S1351324918000293_ref051","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1992.4.2.243"},{"key":"S1351324918000293_ref049","unstructured":"Salah M. H. , Blanchon H. , Zrigui M. , and Schwab D. 2016. Am\u00e9lioration de la traduction automatique dun corpus annot\u00e9. In Proceedings of the 23rd TALN (Traitement Automatique des Langues Naturelles) Conference."},{"key":"S1351324918000293_ref048","doi-asserted-by":"crossref","unstructured":"Rumelhart D. E. , Hinton G. E. , and Williams R. J. 1985. Learning internal representations by error propagation. DTIC Document. No. ICS-8506. California Univ San Diego La Jolla Inst for Cognitive Science.","DOI":"10.21236\/ADA164453"},{"key":"S1351324918000293_ref064","doi-asserted-by":"crossref","unstructured":"Wisniewski G. , P\u00e9cheux N. , Gahbiche-Braham S. , and Yvon F. 2014. Cross-lingual part-of-speech tagging through ambiguous learning. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, vol. 14, pp. 1779\u20131785.","DOI":"10.3115\/v1\/D14-1187"},{"key":"S1351324918000293_ref045","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2009.191"},{"key":"S1351324918000293_ref036","unstructured":"Manion S. L. , and Sainudiin R. 2013. DAEBAK!: peripheral diversity for multilingual word sense disambiguation. In Proceedings of SemEval, pp. 250\u2013254."},{"key":"S1351324918000293_ref044","unstructured":"Pado S. , and Pitel G. . 2007. Annotation pr\u00e9cise du fran\u00e7ais en s\u00e9mantique de r\u00f4les par projection cross-linguistique. In Actes de la 14e conf\u00e9rence sur le Traitement Automatique des Langues Naturelles (communications orales), pp. 
271\u2013280."},{"key":"S1351324918000293_ref042","unstructured":"Navigli R. , Jurgens D. , and Vannella D. 2013. Semeval-2013: Multilingual word sense disambiguation. In Proceedings of the Second Joint Conference on Lexical and Computational Semantics, vol. 2, pp. 222\u2013231."},{"key":"S1351324918000293_ref041","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2012.07.001"},{"key":"S1351324918000293_ref038","unstructured":"Mikolov T. , Sutskever I. , Chen K. , Corrado G. S. , and Dean J. 2013. Distributed representations of words and phrases and their compositionality. In Proceedings of the 26th International Conference on Advances in Neural Information Processing Systems, pp. 3111\u20133119."},{"key":"S1351324918000293_ref063","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"Van der Maaten","year":"2008","journal-title":"Journal of Machine Learning Research"},{"key":"S1351324918000293_ref035","doi-asserted-by":"crossref","unstructured":"Luong T. , Pham H. , and Manning C. D. 2015. Bilingual word representations with monolingual quality in mind. In Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing, pp. 151\u2013159.","DOI":"10.3115\/v1\/W15-1521"},{"key":"S1351324918000293_ref032","doi-asserted-by":"crossref","unstructured":"Koehn P. , Hoang H. , Birch A. , Callison-Burch C. , Federico M. , Bertoldi N. , Cowan B. , Shen W. , Moran C. , and Zens R. , Dyer C. , Bojar O. , Constantin A. , and Herbst E. 2007. Moses: Open source toolkit for statistical machine translation. In Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions, Association for Computational Linguistics, pp. 177\u2013180.","DOI":"10.3115\/1557769.1557821"},{"key":"S1351324918000293_ref019","doi-asserted-by":"publisher","DOI":"10.1207\/s15516709cog1402_1"},{"key":"S1351324918000293_ref023","doi-asserted-by":"crossref","unstructured":"Gouws S. , and S\u00f8gaard A. 2015. 
Simple task-specific bilingual word embeddings. In Proceedings of the 14th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1386\u20131390.","DOI":"10.3115\/v1\/N15-1157"},{"key":"S1351324918000293_ref008","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2013.07.008"},{"key":"S1351324918000293_ref039","doi-asserted-by":"crossref","unstructured":"Miller G. A. , Leacock C. , Tengi R. , and Bunker R. T. 1993. A semantic concordance. In Proceedings of the Workshop on Human Language Technology, Association for Computational Linguistics, pp. 303\u2013308.","DOI":"10.3115\/1075671.1075742"},{"key":"S1351324918000293_ref050","unstructured":"Schmid H. 1995. Treetagger - a language independent part-of-speech tagger. Institut f\u00fcr Maschinelle Sprachverarbeitung, Universit\u00e4t Stuttgart, vol. 46, p. 28. Available at http:\/\/www.cis.uni-muenchen.de\/~schmid\/tools\/TreeTagger\/"},{"key":"S1351324918000293_ref046","doi-asserted-by":"publisher","DOI":"10.1145\/3099556"},{"key":"S1351324918000293_ref018","unstructured":"Durrett G. , Pauls A. , and Klein D. 2012. Syntactic transfer using a bilingual lexicon. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Association for Computational Linguistics, pp. 1\u201311."},{"key":"S1351324918000293_ref028","unstructured":"Jiang W. , Liu Q. , and L\u00fc Y. 2011. Relaxed cross-lingual projection of constituent syntax. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, pp. 1192\u20131201."},{"key":"S1351324918000293_ref013","doi-asserted-by":"crossref","unstructured":"Cho K. , van Merri\u00ebnboer B. , Bahdanau D. , and Bengio Y. 2014. 
On the properties of neural machine translation: encoder\u2013decoder approaches. In Proceedings of the Syntax, Semantics and Structure in Statistical Translation, pp. 103\u2013111.","DOI":"10.3115\/v1\/W14-4012"},{"key":"S1351324918000293_ref029","doi-asserted-by":"publisher","DOI":"10.1162\/COLI_a_00210"},{"key":"S1351324918000293_ref062","unstructured":"Veronis J. , Hamon O. , Ayache C. , Belmouhoub R. , Kraif O. , Laurent D. , Nguyen T. M. H. , Semmar N. , Stuck F. , and Zaghouani W. 2008. Arcade II Action de recherche concert\u00e9e sur l\u2019alignement de documents et son \u00e9valuation. Chapitre2, Editions Herm\u00e9s."},{"key":"S1351324918000293_ref015","first-page":"2493","article-title":"Natural language processing (almost) from scratch","volume":"12","author":"Collobert","year":"2011","journal-title":"Journal of Machine Learning Research"},{"key":"S1351324918000293_ref024","unstructured":"Gouws S. , Bengio Y. , and Corrado G. 2015. BilBOWA: fast bilingual distributed representations without word alignments. In Proceedings of the 32nd International Conference on Machine Learning, pp. 748\u2013756."},{"key":"S1351324918000293_ref027","doi-asserted-by":"crossref","unstructured":"Henderson J. 2004. Discriminative training of a neural network statistical parser. In Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, Association for Computational Linguistics, pp. 95\u2013102.","DOI":"10.3115\/1218955.1218968"},{"key":"S1351324918000293_ref016","first-page":"600","volume-title":"Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies","author":"Das","year":"2011"},{"key":"S1351324918000293_ref020","doi-asserted-by":"publisher","DOI":"10.7551\/mitpress\/7287.001.0001"},{"key":"S1351324918000293_ref003","unstructured":"Aufrant L. , Wisniewski G. , and Yvon F. 2016. 
Zero-resource dependency parsing: boosting delexicalized cross-lingual transfer with linguistic knowledge. In Proceedings of the 26th International Conference on Computational Linguistics, pp. 119\u2013130."},{"key":"S1351324918000293_ref043","doi-asserted-by":"crossref","unstructured":"Och F. J. , and Ney H. 2000. Improved statistical alignment models. In Proceedings of the 38th Annual Meeting on Association for Computational Linguistics, Association for Computational Linguistics, pp. 440\u2013447.","DOI":"10.3115\/1075218.1075274"},{"key":"S1351324918000293_ref037","unstructured":"Mikolov T. , Karafi\u00e1t M. , Burget L. , Cernock\u1ef3 J. , and Khudanpur S. 2010. Recurrent neural network based language model. In Proceedings of the 11th Annual Conference of the International Speech Communication Association, pp. 1045\u20131048."},{"key":"S1351324918000293_ref034","unstructured":"Li S. , Gra\u00e7a J. V. , and Taskar B. 2012. Wiki-ly supervised part-of-speech tagging. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Association for Computational Linguistics, pp. 1389\u20131398."},{"key":"S1351324918000293_ref009","unstructured":"Besacier L. , Lecouteux B. , Azouzi M. , and Luong N.-Q. 2012. The LIG English to French machine translation system for IWSLT 2012. In Proceedings of the 9th International Workshop on Spoken Language Translation, pp. 102\u2013108."},{"key":"S1351324918000293_ref040","unstructured":"Nasiruddin M. , Tchechmedjiev A. , Blanchon H. , and Schwab D. 2015. Cr\u00e9ation rapide et efficace dun syst\u00e8me de d\u00e9sambigu\u00efsation lexicale pour une langue peu dot\u00e9e. In Proceedings of the 22nd TALN (Traitement Automatique des Langues Naturelles) Conference."},{"key":"S1351324918000293_ref005","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-33486-6_6"},{"key":"S1351324918000293_ref007","unstructured":"B\u00e9rard A. , Servan C. 
, Pietquin O , and Besacier L. 2016. MultiVec: a multilingual and multilevel representation learning toolkit for NLP. In Proceedings of the 10th Edition of the Language Resources and Evaluation Conference, pp. 4188\u20134192."},{"key":"S1351324918000293_ref002","doi-asserted-by":"crossref","unstructured":"Annesi P. , and Basili R. 2010. Cross-lingual alignment of FrameNet annotations through Hidden Markov Models. In Proceedings of the International Conference on Intelligent Text Processing and Computational Linguistics, Springer, Berlin, Heidelberg, pp. 12\u201325.","DOI":"10.1007\/978-3-642-12116-6_2"},{"key":"S1351324918000293_ref004","first-page":"1137","article-title":"A neural probabilistic language model","volume":"3","author":"Bengio","year":"2003","journal-title":"Journal of Machine Learning Research"},{"key":"S1351324918000293_ref006","doi-asserted-by":"crossref","unstructured":"Bentivogli L. , Forner P. , and Pianta E. 2004. Evaluating cross-language annotation transfer in the multisemcor corpus. In Proceedings of the 20th International Conference on Computational Linguistics, Association for Computational Linguistics, pp. 364\u2013371.","DOI":"10.3115\/1220355.1220408"},{"key":"S1351324918000293_ref026","first-page":"249","volume-title":"Enriching the Integration of Semantic Resources Based on Wordnet","author":"Guti\u00e9rrez V\u00e1zquez","year":"2011"},{"key":"S1351324918000293_ref030","unstructured":"Kim S. , Toutanova K. , and Yu H. 2012. Multilingual named entity recognition using parallel data and metadata from wikipedia. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, vol. 1, pp. 694\u2013702."},{"key":"S1351324918000293_ref011","first-page":"263","article-title":"The mathematics of statistical machine translation: parameter estimation","volume":"19","author":"Brown","year":"1993","journal-title":"Computational Linguistics"},{"key":"S1351324918000293_ref060","unstructured":"Van der Plas L. 
, and Apidianaki M. 2014. Cross-lingual word sense disambiguation for predicate labelling of french. In Proceedings of the 21st TALN (Traitement Automatique des Langues Naturelles) Conference, pp. 46\u201355."},{"key":"S1351324918000293_ref031","first-page":"79","article-title":"Europarl: a parallel corpus for statistical machine translation","volume":"5","author":"Koehn","year":"2005","journal-title":"MT Summit"},{"key":"S1351324918000293_ref012","doi-asserted-by":"crossref","unstructured":"Buchholz S. , and Marsi E. 2006. CoNLL-X shared task on multilingual dependency parsing. In Proceedings of the 10th Conference on Computational Natural Language Learning, Association for Computational Linguistics, pp. 149\u2013164.","DOI":"10.3115\/1596276.1596305"},{"key":"S1351324918000293_ref017","unstructured":"Duong L. , Cook P. , Bird S. , and Pecina P. 2013. Simpler unsupervised POS tagging with bilingual projections. In Proceedings of the Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, vol. 2, pp. 634\u2013639."},{"key":"S1351324918000293_ref033","volume-title":"A Standard Corpus of Present-Day Edited American English, for Use with Digital Computers","author":"Kucera","year":"1979"},{"key":"S1351324918000293_ref021","doi-asserted-by":"publisher","DOI":"10.1162\/coli.2007.33.3.293"},{"key":"S1351324918000293_ref001","unstructured":"Al-Rfou R. , Perozzi B. , and Skiena S. 2013. Polyglot: distributed word representations for multilingual nlp. In Proceedings of the SIGNLL Conference on Computational Natural Language Learning, pp. 183\u2013192."},{"key":"S1351324918000293_ref010","doi-asserted-by":"crossref","unstructured":"Brants T. 2000. TnT: a statistical part-of-speech tagger. In Proceedings of the 6th Conference on Applied Natural Language Processing, Association for Computational Linguistics, pp. 
224\u2013231.","DOI":"10.3115\/974147.974178"},{"key":"S1351324918000293_ref054","doi-asserted-by":"crossref","unstructured":"Sundermeyer M. , Oparin I. , Gauvain J.-L. , Freiberg B. , Schluter R. , and Ney H. 2013. Comparison of feedforward and recurrent neural network language models. In IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 8430\u20138434.","DOI":"10.1109\/ICASSP.2013.6639310"},{"key":"S1351324918000293_ref053","unstructured":"Schwab D. , Goulian J. , Tchechmedjiev A. , and Blanchon H. 2012. Ant colony algorithm for the unsupervised word sense disambiguation of texts: comparison and evaluation. In Proceedings of the 25th International Conference on Computational Linguistics, pp. 2389\u20132404."},{"key":"S1351324918000293_ref014","doi-asserted-by":"crossref","unstructured":"Ciaramita M. , and Altun Y. 2006. Broad-coverage sense disambiguation and information extraction with a supersense sequence tagger. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, pp. 594\u2013602.","DOI":"10.3115\/1610075.1610158"},{"key":"S1351324918000293_ref047","unstructured":"Petrov S. , Das D. , and McDonald R. 2012. A universal part-of-speech tagset. In Proceedings of the 8th International Conference on Language Resources and Evaluation, European Language Resources Association, pp. 
2089\u20132096."}],"container-title":["Natural Language Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324918000293","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,8,28]],"date-time":"2022-08-28T20:03:08Z","timestamp":1661716988000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324918000293\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,8,6]]},"references-count":65,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2019,1]]}},"alternative-id":["S1351324918000293"],"URL":"https:\/\/doi.org\/10.1017\/s1351324918000293","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"value":"1351-3249","type":"print"},{"value":"1469-8110","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,8,6]]}}}