{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,14]],"date-time":"2025-10-14T00:33:10Z","timestamp":1760401990358,"version":"build-2065373602"},"reference-count":45,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2020,1,27]],"date-time":"2020-01-27T00:00:00Z","timestamp":1580083200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001691","name":"Japan Society for the Promotion of Science","doi-asserted-by":"publisher","award":["18H03341","17H00759","17H04706"],"award-info":[{"award-number":["18H03341","17H00759","17H04706"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>The most challenging issue with low-resource languages is the difficulty of obtaining enough language resources. In this paper, we propose a language service framework for low-resource languages that enables the automatic creation and customization of new resources from existing ones. To achieve this goal, we first introduce a service-oriented language infrastructure, the Language Grid; it realizes new language services by supporting the sharing and combining of language resources. We then show the applicability of the Language Grid to low-resource languages. Furthermore, we describe how we can now realize the automation and customization of language services. 
Finally, we illustrate our design concept by detailing a case study of automating and customizing bilingual dictionary induction for low-resource Turkic languages and Indonesian ethnic languages.<\/jats:p>","DOI":"10.3390\/info11020067","type":"journal-article","created":{"date-parts":[[2020,1,27]],"date-time":"2020-01-27T11:41:57Z","timestamp":1580125317000},"page":"67","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Towards Language Service Creation and Customization for Low-Resource Languages"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9462-0216","authenticated-orcid":false,"given":"Donghui","family":"Lin","sequence":"first","affiliation":[{"name":"Department of Social Informatics, Kyoto University, Kyoto 606-8501, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8310-2007","authenticated-orcid":false,"given":"Yohei","family":"Murakami","sequence":"additional","affiliation":[{"name":"Faculty of Information Science and Engineering, Ritsumeikan University, Shiga 525-8577, Japan"}]},{"given":"Toru","family":"Ishida","sequence":"additional","affiliation":[{"name":"School of Creative Science and Engineering, Waseda University, Tokyo 169-8555, Japan"}]}],"member":"1968","published-online":{"date-parts":[[2020,1,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1006\/jaar.1998.0328","article-title":"Explaining global patterns of language diversity","volume":"17","author":"Nettle","year":"1998","journal-title":"J. Anthropol. Archaeol."},{"key":"ref_2","unstructured":"(2019, November 28). List of Wikipedias. Available online: https:\/\/meta.wikimedia.org\/wiki\/List_of_Wikipedias."},{"key":"ref_3","unstructured":"(2019, November 28). LRE Map. 
Available online: http:\/\/lremap.elra.info."},{"key":"ref_4","unstructured":"Calzolari, N., Del Gratta, R., Francopoulo, G., Mariani, J., Rubino, F., Russo, I., and Soria, C. (2012, January 23\u201325). The LRE Map. Harmonising community descriptions of resources. Proceedings of the Eighth International Conference on Language Resources and Evaluation, Istanbul, Turkey."},{"key":"ref_5","unstructured":"(2019, November 28). Google Translate. Available online: http:\/\/translate.google.com\/."},{"key":"ref_6","unstructured":"Del Gratta, R., Frontini, F., Khan, A.F., Mariani, J., and Soria, C. (2014, January 26). The LREMap for under-resourced languages. Proceedings of the Workshop on Collaboration and Computing for Under-Resourced Languages in the Linked Open Data Era, Reykjavik, Iceland."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Zoph, B., Yuret, D., May, J., and Knight, K. (2016, January 1\u20135). Transfer learning for low-resource neural machine translation. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, TX, USA.","DOI":"10.18653\/v1\/D16-1163"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Gu, J., Hassan, H., Devlin, J., and Li, V.O. (2018, January 1\u20136). Universal neural machine translation for extremely low resource languages. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), New Orleans, LA, USA.","DOI":"10.18653\/v1\/N18-1032"},{"key":"ref_9","unstructured":"Tiedemann, J. (2012, January 23\u201327). Character-based pivot translation for under-resourced languages and domains. Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, Avignon, France."},{"key":"ref_10","unstructured":"Farhath, F., Theivendiram, P., Ranathunga, S., Jayasena, S., and Dias, G. (2018, January 7\u201312). 
Improving domain-specific SMT for low-resourced languages using data from different domains. Proceedings of the Eleventh International Conference on Language Resources and Evaluation, Miyazaki, Japan."},{"key":"ref_11","unstructured":"Honnet, P.E., Popescu-Belis, A., Musat, C., and Baeriswyl, M. (2018, January 7\u201312). Machine translation of low-resource spoken dialects: Strategies for normalizing Swiss German. Proceedings of the Eleventh International Conference on Language Resources and Evaluation, Miyazaki, Japan."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1162\/tacl_a_00100","article-title":"Multilingual projection for parsing truly low-resource languages","volume":"4","author":"Alonso","year":"2016","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"ref_13","unstructured":"Garrette, D., Mielens, J., and Baldridge, J. (2013, January 4\u20139). Real-world semi-supervised learning of POS-taggers for low-resource languages. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Volume 1 (Long Papers), Sofia, Bulgaria."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Duong, L., Cohn, T., Bird, S., and Cook, P. (2015, January 17\u201321). A neural network model for low-resource universal dependency parsing. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal.","DOI":"10.18653\/v1\/D15-1040"},{"key":"ref_15","unstructured":"Lim, K., Partanen, N., and Poibeau, T. (2018, January 7\u201312). Multilingual dependency parsing for low-resource languages: Case studies on North Saami and Komi-Zyrian. Proceedings of the Eleventh International Conference on Language Resources and Evaluation, Miyazaki, Japan."},{"key":"ref_16","unstructured":"Gales, M.J., Knill, K.M., Ragni, A., and Rath, S.P. (2014, January 14\u201316). Speech recognition and keyword spotting for low-resource languages: BABEL project research at CUED. 
Proceedings of the Workshop on Spoken Language Technologies for Under-Resourced Languages, St. Petersburg, Russia."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Wang, H., Ragni, A., Gales, M., Knill, K., Woodland, P., and Zhang, C. (2015, January 6\u201310). Joint decoding of tandem and hybrid systems for improved keyword spotting on low resource languages. Proceedings of the 16th Annual Conference of the International Speech Communication Association, Dresden, Germany.","DOI":"10.21437\/Interspeech.2015-726"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Adams, O., Makarucha, A., Neubig, G., Bird, S., and Cohn, T. (2017, January 3\u20137). Cross-lingual word embeddings for low-resource language modeling. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1 (Long Papers), Valencia, Spain.","DOI":"10.18653\/v1\/E17-1088"},{"key":"ref_19","unstructured":"Andrews, N., Dredze, M., Van Durme, B., and Eisner, J. (August, January 30). Bayesian modeling of lexical resources for low-resource settings. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics: Volume 1 (Long Papers), Vancouver, BC, Canada."},{"key":"ref_20","unstructured":"Irvine, A., and Klementiev, A. (2010, January 6). Using Mechanical Turk to annotate lexicons for less commonly used languages. Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon\u2019s Mechanical Turk, Los Angeles, CA, USA."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1016\/j.specom.2013.07.001","article-title":"A smartphone-based ASR data collection tool for under-resourced languages","volume":"56","author":"Davel","year":"2014","journal-title":"Speech Commun."},{"key":"ref_22","unstructured":"Fraisse, A., Jenn, R., and Fishkin, S.F. (2018, January 12). 
Building multilingual parallel corpora for under-resourced languages using translated fictional texts. Proceedings of the 3rd Workshop on Collaboration and Computing for Under-Resourced Languages: Sustaining Knowledge Diversity in the Digital Age, Miyazaki, Japan."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Fraisse, A., Zhang, Z., Zhai, A., Jenn, R., Fisher Fishkin, S., Zweigenbaum, P., Favier, L., and Mustafa El Hadi, W. (2019). A sustainable and open access knowledge organization model to preserve cultural heritage and language diversity. Information, 10.","DOI":"10.3390\/info10100303"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Ishida, T. (2011). The Language Grid: Service-Oriented Collective Intelligence for Language Resource Interoperability, Springer.","DOI":"10.1007\/978-3-642-21178-2"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Murakami, Y., Lin, D., and Ishida, T. (2018). Services Computing for Language Resources, Springer.","DOI":"10.1007\/978-981-10-7793-7"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Murakami, Y., Lin, D., Tanaka, M., Nakaguchi, T., and Ishida, T. (2011). Service Grid architecture. The Language Grid: Service-oriented Collective Intelligence for Language Resource Interoperability, Springer.","DOI":"10.1007\/978-3-642-21178-2_2"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"72","DOI":"10.1109\/MC.2018.2701643","article-title":"Language service infrastructure on the Web: The Language Grid","volume":"51","author":"Ishida","year":"2018","journal-title":"Computer"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Ishida, T., Murakami, Y., and Lin, D. (2011). The Language Grid: Service-oriented approach to sharing language resources. The Language Grid: Service-Oriented Collective Intelligence for Language Resource Interoperability, Springer.","DOI":"10.1007\/978-3-642-21178-2"},{"key":"ref_29","unstructured":"Lin, D., Murakami, Y., and Ishida, T. 
(2018, January 7\u201312). A framework for multi-language service design with the Language Grid. Proceedings of the Eleventh International Conference on Language Resources and Evaluation, Miyazaki, Japan."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Murakami, Y., Lin, D., and Ishida, T. (2014). Service-oriented architecture for interoperability of multilanguage services. Towards the Multilingual Semantic Web, Springer.","DOI":"10.1007\/978-3-662-43585-4_19"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Murakami, Y., Nakaguchi, T., Lin, D., and Ishida, T. (2018). Federated grid architecture for language services. Services Computing for Language Resources, Springer.","DOI":"10.1007\/978-981-10-7793-7"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Wushouer, M., Ishida, T., and Lin, D. (2013, January 16\u201318). A heuristic framework for pivot-based bilingual dictionary induction. Proceedings of the 2013 International Conference on Culture and Computing, Kyoto, Japan.","DOI":"10.1109\/CultureComputing.2013.27"},{"key":"ref_33","unstructured":"Wushouer, M., Ishida, T., Lin, D., and Hirayama, K. (2014, January 26\u201331). Bilingual dictionary induction as an optimization problem. Proceedings of the Ninth International Conference on Language Resources and Evaluation, Reykjavik, Iceland."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1145\/3138815","article-title":"A generalized constraint approach to bilingual dictionary induction for low-resource language families","volume":"17","author":"Nasution","year":"2018","journal-title":"ACM Trans. Asian Low-Resour. Lang. Inf. Process."},{"key":"ref_35","unstructured":"Kaji, H., Tamamura, S., and Erdenebat, D. (2008, January 28\u201330). Automatic construction of a Japanese-Chinese dictionary via English. 
Proceedings of the Sixth International Conference on Language Resources and Evaluation, Marrakech, Morocco."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1007\/s10579-007-9038-4","article-title":"Combining linguistic resources to create a machine-tractable Japanese-Malay dictionary","volume":"42","author":"Bond","year":"2008","journal-title":"Lang. Resour. Eval."},{"key":"ref_37","first-page":"862","article-title":"Bilingual dictionary generation for low-resourced language pairs","volume":"Volume 2","author":"Shoichi","year":"2009","journal-title":"Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"297","DOI":"10.3115\/991886.991937","article-title":"Construction of a bilingual dictionary intermediated by a third language","volume":"Volume 1","author":"Tanaka","year":"1994","journal-title":"Proceedings of the 15th Conference on Computational Linguistics"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Mann, G.S., and Yarowsky, D. (2001, January 2\u20137). Multipath translation lexicon induction via bridge languages. Proceedings of the Second Meeting of the North American Chapter of the Association for Computational Linguistics on Language Technologies, Pittsburgh, PA, USA.","DOI":"10.3115\/1073336.1073356"},{"key":"ref_40","unstructured":"Tanaka, R., Murakami, Y., and Ishida, T. (2009, January 11\u201317). Context-based approach for pivot translation services. Proceedings of the 21st International Joint Conference on Artificial Intelligence, Pasadena, CA, USA."},{"key":"ref_41","unstructured":"Matsuno, J., and Ishida, T. (2011, January 16\u201322). Constraint optimization approach to context based word selection. Proceedings of the 22nd International Joint Conference on Artificial Intelligence, Barcelona, Spain."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Wushouer, M., Lin, D., Ishida, T., and Hirayama, K. 
(2014). Pivot-based bilingual dictionary extraction from multiple dictionary resources. 2014 Pacific Rim International Conference on Artificial Intelligence, Springer.","DOI":"10.1007\/978-3-319-13560-1_18"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1145\/2723144","article-title":"A constraint approach to pivot-based bilingual dictionary induction","volume":"15","author":"Wushouer","year":"2016","journal-title":"ACM Trans. Asian Low-Resour. Lang. Inf. Process."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"012001","DOI":"10.1088\/1742-6596\/1192\/1\/012001","article-title":"Indonesia Language Sphere: An ecosystem for dictionary development for low-resource languages","volume":"1192","author":"Murakami","year":"2019","journal-title":"J. Phys. Conf. Ser."},{"key":"ref_45","unstructured":"Nasution, A.H., Murakami, Y., and Ishida, T. (2018, January 7\u201312). Designing a collaborative process to create bilingual dictionaries of Indonesian ethnic languages. Proceedings of the Eleventh International Conference on Language Resources and Evaluation, Miyazaki, Japan."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/11\/2\/67\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T13:20:42Z","timestamp":1760361642000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/11\/2\/67"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,1,27]]},"references-count":45,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2020,2]]}},"alternative-id":["info11020067"],"URL":"https:\/\/doi.org\/10.3390\/info11020067","relation":{},"ISSN":["2078-2489"],"issn-type":[{"type":"electronic","value":"2078-2489"}],"subject":[],"published":{"date-parts":[[2020,1,27]]}}}