{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2022,6,8]],"date-time":"2022-06-08T17:11:09Z","timestamp":1654708269624},"reference-count":40,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2021,10,19]],"date-time":"2021-10-19T00:00:00Z","timestamp":1634601600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"Agence Nationale de la Recherche of France","award":["ANR-17-CE27-0011-BIM"],"award-info":[{"award-number":["ANR-17-CE27-0011-BIM"]}]},{"name":"Ministry of Science, Innovation, and Universities of Spain","award":["RTI2018-098082-J-I00"],"award-info":[{"award-number":["RTI2018-098082-J-I00"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,5,25]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>This article presents the elaboration of a morphosyntactically annotated diachronic corpus of Basque, and the first results obtained in the processing of historical varieties of this language with computational techniques. The corpus size is around one million words, expanding from the 15th to the mid-18th century and encompassing the most significant written production in all historical dialects. Morphosyntactic tagging allows for systematic searches at different levels of complexity; additionally, a rich set of metadata enables searches based on sociohistorical criteria too. This is not only the first tagged corpus of historical Basque but also a means to improve language processing tools by analyzing historical varieties more or less distant from the present-day standard language. 
Moreover, this project aims to set a model for further works in the historical corpora of Basque and inform similar projects on other languages.<\/jats:p>","DOI":"10.1093\/llc\/fqab066","type":"journal-article","created":{"date-parts":[[2021,6,22]],"date-time":"2021-06-22T19:12:23Z","timestamp":1624389143000},"page":"391-404","source":"Crossref","is-referenced-by-count":0,"title":["The first annotated corpus of historical Basque"],"prefix":"10.1093","volume":"37","author":[{"given":"Ainara","family":"Estarrona","sequence":"first","affiliation":[{"name":"Department of Computer Languages and Systems, Faculty of Computer Science, University of the Basque Country , Donostia, Pa\u00eds Vasco, Spain"}]},{"given":"Izaskun","family":"Etxeberria","sequence":"additional","affiliation":[{"name":"Faculty of Computer Science, University of the Basque Country , Donostia, Pa\u00eds Vasco, Spain"}]},{"given":"Ander","family":"Soraluze","sequence":"additional","affiliation":[{"name":"Faculty of Computer Science, University of the Basque Country , Donostia, Pa\u00eds Vasco, Spain"}]},{"given":"Ricardo","family":"Etxepare","sequence":"additional","affiliation":[{"name":"IKER-CNRS , Bayonne, Aquitaine, France"}]},{"given":"Manuel","family":"Padilla-Moyano","sequence":"additional","affiliation":[{"name":"University of the Basque Country , Donostia, Pa\u00eds Vasco, Spain"}]}],"member":"286","published-online":{"date-parts":[[2021,10,19]]},"reference":[{"key":"2022053116092419000_fqab066-B1","first-page":"447","volume-title":"Proceedings of the Second International Conference on Language Resources and Evaluation","author":"Aduriz","year":"2000"},{"key":"2022053116092419000_fqab066-B2","first-page":"1","volume-title":"IRCS Workshop on Linguistic Databases","author":"Aldezabal","year":"2001"},{"key":"2022053116092419000_fqab066-B3","first-page":"1","volume-title":"LREC-2002 Customizing Knowledge in NLP Applications 
Workshop","author":"Alegria","year":"2002"},{"key":"2022053116092419000_fqab066-B4","first-page":"25","article-title":"Lessons from the development of a named entity recognizer for Basque","volume":"36","author":"Alegria","year":"2006","journal-title":"Procesamiento del Lenguaje Natural"},{"key":"2022053116092419000_fqab066-B5","first-page":"5","article-title":"Chunk and clause identification for Basque by Filtering and Ranking with Perceptrons","volume":"41","author":"Alegria","year":"2008","journal-title":"Procesamiento del Lenguaje Natural"},{"issue":"4","key":"2022053116092419000_fqab066-B6","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1093\/llc\/11.4.193","article-title":"Automatic morphological analysis of Basque","volume":"11","author":"Alegria","year":"1996","journal-title":"Literary and Linguistic Computing"},{"key":"2022053116092419000_fqab066-B7","first-page":"1","volume-title":"Proceedings of the Workshop on Constraint Grammar - Methods, Tools and Applications; at NODALIDA 2015","author":"Arriola","year":"2015"},{"key":"2022053116092419000_fqab066-B8","first-page":"242","volume-title":"Corpus Linguistics. An International Handbook","author":"Claridge","year":"2009"},{"key":"2022053116092419000_fqab066-B9","volume-title":"Aldaera linguistikoen normalizazioa inferentzia fonologikoa eta morfologikoa erabiliz [= Normalization of linguistic variants using phonological and morphological inferences]. 
Doctoral dissertation","author":"Etxeberria","year":"2016"},{"key":"2022053116092419000_fqab066-B41","first-page":"1064","year":"2016"},{"issue":"2","key":"2022053116092419000_fqab066-B10","doi-asserted-by":"crossref","first-page":"307","DOI":"10.1017\/S1351324918000505","article-title":"Weighted finite-state transducers for normalization of historical texts","volume":"25","author":"Etxeberria","year":"2019","journal-title":"Natural Language Engineering"},{"key":"2022053116092419000_fqab066-B11","first-page":"380","article-title":"Combining stochastic and rule-based methods for disambiguation in agglutinative languages","volume-title":"Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1","author":"Ezeiza","year":"1998"},{"key":"2022053116092419000_fqab066-B12","author":"Galves","year":"2017"},{"key":"2022053116092419000_fqab066-B13","first-page":"235","volume-title":"Towards a History of Basque Language","author":"G\u00f3mez","year":"1995"},{"key":"2022053116092419000_fqab066-B14","volume-title":"Historia de la lengua vasca","author":"Gorrochategui","year":"2018"},{"key":"2022053116092419000_fqab066-B15","first-page":"154","volume-title":"Corpus linguistics. 
An International Handbook","author":"Hunston","year":"2009"},{"key":"2022053116092419000_fqab066-B16","author":"Kroch","year":"2004"},{"key":"2022053116092419000_fqab066-B17","author":"Kroch","year":"2016"},{"key":"2022053116092419000_fqab066-B18","author":"Kroch","year":"2000"},{"key":"2022053116092419000_fqab066-B19","volume-title":"Le syst\u00e8me du verbe basque au XVIe si\u00e8cle","author":"Lafon","year":"1944"},{"key":"2022053116092419000_fqab066-B20","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1080\/00437956.1951.11659408","article-title":"Concordances morphologiques entre le basque et les langues caucasiques","volume":"7","author":"Lafon","year":"1951\u20131952","journal-title":"Word"},{"key":"2022053116092419000_fqab066-B21","first-page":"189","volume-title":"Towards a History of Basque Language","author":"Lakarra","year":"1995"},{"key":"2022053116092419000_fqab066-B22","first-page":"407","article-title":"Proleg\u00f3menos a la reconstrucci\u00f3n de segundo grado y an\u00e1lisis del cambio tipol\u00f3gico en (proto) vasco","volume":"5","author":"Lakarra","year":"2005","journal-title":"Palaeohispanica"},{"key":"2022053116092419000_fqab066-B23","first-page":"229","article-title":"Protovasco, munda y otros: reconstrucci\u00f3n interna y tipolog\u00eda hol\u00edstica diacr\u00f3nica","volume":"21","author":"Lakarra","year":"2006","journal-title":"Oihenart: cuadernos de lengua y literatura"},{"key":"2022053116092419000_fqab066-B24","author":"Lash","year":"2014"},{"key":"2022053116092419000_fqab066-B25","volume-title":"Towards a History of Basque Morphology: Articles and demonstratives","author":"Manterola","year":"2015"},{"key":"2022053116092419000_fqab066-B26","doi-asserted-by":"crossref","DOI":"10.3726\/978-3-653-02701-3","volume-title":"Basque and Proto-Basque. 
Language-Internal and Typological Approaches to Linguistic Reconstruction","author":"Mart\u00ednez-Areta","year":"2013"},{"key":"2022053116092419000_fqab066-B27","volume-title":"Textos Arcaicos Vascos","author":"Michelena","year":"1964"},{"key":"2022053116092419000_fqab066-B28","volume-title":"Fon\u00e9tica Hist\u00f3rica Vasca","author":"Michelena","year":"1977"},{"key":"2022053116092419000_fqab066-B29","volume-title":"Orotariko Euskal Hiztegia (Basque General Dictionary)","author":"Michelena","year":"1987"},{"key":"2022053116092419000_fqab066-B30","volume-title":"The Routledge Handbook of Corpus Linguistics","author":"Nelson","year":"2010"},{"key":"2022053116092419000_fqab066-B31","first-page":"45","volume-title":"Proceedings of the 10th International Workshop on Finite State Methods and Natural Language Processing","author":"Novak","year":"2012"},{"issue":"6","key":"2022053116092419000_fqab066-B32","doi-asserted-by":"crossref","first-page":"907","DOI":"10.1017\/S1351324915000315","article-title":"Phonetisaurus: Exploring grapheme-to-phoneme conversion with joint n-gram models in the WFST framework","volume":"22","author":"Novak","year":"2016","journal-title":"Natural Language Engineering"},{"key":"2022053116092419000_fqab066-B33","volume-title":"Historia Social de la Literatura Vasca","author":"Sarasola","year":"1976"},{"key":"2022053116092419000_fqab066-B34","volume-title":"Contribuci\u00f3n al estudio y edici\u00f3n de textos antiguos vascos","author":"Sarasola","year":"1983"},{"key":"2022053116092419000_fqab066-B35","volume-title":"Euskal Testu Zaharrak (I) (Old Basque texts)","author":"Satr\u00fastegui","year":"1987"},{"key":"2022053116092419000_fqab066-B36","volume-title":"Die Iberische Deklination","author":"Schuchardt","year":"1907"},{"key":"2022053116092419000_fqab066-B38","volume-title":"The History of Basque","author":"Trask","year":"1997"},{"key":"2022053116092419000_fqab066-B39","first-page":"565","article-title":"De la possibilit\u00e9 d\u2019une 
parent\u00e9 entre le basque et les langues caucasiques","volume":"15","author":"Uhlenbeck","year":"1924","journal-title":"Revista Internacional de Estudios Vascos"},{"key":"2022053116092419000_fqab066-B40","author":"Wallenberg","year":"2011"}],"container-title":["Digital Scholarship in the Humanities"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/dsh\/article-pdf\/37\/2\/391\/43841929\/fqab066.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/dsh\/article-pdf\/37\/2\/391\/43841929\/fqab066.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,31]],"date-time":"2022-05-31T16:20:48Z","timestamp":1654014048000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/dsh\/article\/37\/2\/391\/6400647"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,19]]},"references-count":40,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2021,10,19]]},"published-print":{"date-parts":[[2022,5,25]]}},"URL":"https:\/\/doi.org\/10.1093\/llc\/fqab066","relation":{},"ISSN":["2055-7671","2055-768X"],"issn-type":[{"value":"2055-7671","type":"print"},{"value":"2055-768X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,6,1]]},"published":{"date-parts":[[2021,10,19]]}}}