{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T06:25:07Z","timestamp":1777703107959,"version":"3.51.4"},"reference-count":34,"publisher":"SAGE Publications","issue":"5","license":[{"start":{"date-parts":[[2019,5,14]],"date-time":"2019-05-14T00:00:00Z","timestamp":1557792000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Intelligent & Fuzzy Systems"],"published-print":{"date-parts":[[2019,5,14]]},"abstract":"<jats:p>User generated data in social networks is often not written in its standard form. This kind of text can lead to large dispersion in the datasets and can lead to inconsistent data. Therefore, normalization of such kind of texts is a crucial preprocessing step for common Natural Language Processing tools. In this paper we explore the state-of-the-art of the machine translation approach to normalize text under low-resource conditions. We also propose an auxiliary task for the sequence-to-sequence (seq2seq) neural architecture novel to the text normalization task, that improves the base seq2seq model up to 5%. 
This increase of performance closes the gap between statistical machine translation approaches and neural ones for low-resource text normalization.<\/jats:p>","DOI":"10.3233\/jifs-179039","type":"journal-article","created":{"date-parts":[[2019,5,14]],"date-time":"2019-05-14T12:09:16Z","timestamp":1557835756000},"page":"4921-4929","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":5,"title":["Low-resource neural character-based noisy text normalization"],"prefix":"10.1177","volume":"36","author":[{"given":"Manuel","family":"Mager","sequence":"first","affiliation":[{"name":"Institute for Natural Language Processing, University of Stuttgart, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"M\u00f3nica Jasso","family":"Rosales","sequence":"additional","affiliation":[{"name":"Facultad de Filosof\u00eda y Letras, Universidad Nacional Aut\u00f3noma de M\u00e9xico"},{"name":"Instituto de Ingenier\u00eda, Universidad Nacional Aut\u00f3noma de M\u00e9xico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"\u00d6zlem","family":"\u00c7etino\u011flu","sequence":"additional","affiliation":[{"name":"Institute for Natural Language Processing, University of Stuttgart, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ivan","family":"Meza","sequence":"additional","affiliation":[{"name":"Instituto de Investigaciones en Matem\u00e1ticas Aplicadas y en Sistemas, Universidad Nacional Aut\u00f3noma de M\u00e9xico"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2019,5,14]]},"reference":[{"key":"e_1_3_3_2_2","doi-asserted-by":"publisher","DOI":"10.3115\/1273073.1273078"},{"key":"e_1_3_3_3_2","doi-asserted-by":"publisher","DOI":"10.1145\/2501115.2501131"},{"key":"e_1_3_3_4_2","unstructured":"BahdanauD. ChoK. and BengioY. 
Neural machine translation by jointly learning to align and translate ICLR 2015."},{"key":"e_1_3_3_5_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W15-4319"},{"key":"e_1_3_3_6_2","article-title":"Multilingual neural machine translation with task-specific attention","author":"Blackwood G.","year":"2018","unstructured":"BlackwoodG., BallesterosM. and WardT., Multilingual neural machine translation with task-specific attention, arXiv preprint arXiv:1806.03280 (2018).","journal-title":"arXiv preprint arXiv:1806.03280"},{"key":"e_1_3_3_7_2","first-page":"131","volume-title":"Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers","author":"Bollmann M.","year":"2016","unstructured":"BollmannM. and S\u00f8gaardA., Improving historical spelling normalization with bi-directional lstms and multi-task learning, In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, 2016, pp. 131\u2013139."},{"issue":"2","key":"e_1_3_3_8_2","first-page":"263","article-title":"The mathematics of statistical machine translation: Parameter estimation","volume":"19","author":"Brown P.F.","year":"1993","unstructured":"BrownP.F., PietraV.J.D., PietraS.A.D. and MercerR.L., The mathematics of statistical machine translation: Parameter estimation, Computational Linguistics19(2) (1993), 263\u2013311.","journal-title":"Computational Linguistics"},{"key":"e_1_3_3_9_2","article-title":"On the properties of neural machine translation: Encoder\u2013decoder approaches","author":"Cho K.","year":"2014","unstructured":"ChoK., van MerrienboerB., BahdanauD. and BengioY., On the properties of neural machine translation: Encoder\u2013decoder approaches, In SSST, 2014.","journal-title":"SSST"},{"key":"e_1_3_3_10_2","unstructured":"DomingoM. and CasacubertaF. 
Spelling normalization of historical documents by using a machine translation approach 2018."},{"key":"e_1_3_3_11_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W18-3501"},{"key":"e_1_3_3_12_2","first-page":"82","volume-title":"Proceedings of the First workshop on Unsupervised Learning in NLP","author":"Gouws S.","year":"2011","unstructured":"GouwsS., HovyD. and MetzlerD., Unsupervised mining of lexical variants from noisy text, In Proceedings of the First workshop on Unsupervised Learning in NLP, Edinburgh, Scotland, 2011, pp. 82\u201390."},{"key":"e_1_3_3_13_2","first-page":"368","volume-title":"Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1","author":"Han B.","year":"2011","unstructured":"HanB. and BaldwinT., Lexical normalisation of short text messages: Makn sens a# twitter, In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1, 2011, pp. 368\u2013378."},{"key":"e_1_3_3_14_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W15-4313"},{"key":"e_1_3_3_15_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/E17-1049"},{"key":"e_1_3_3_16_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W18-3401"},{"key":"e_1_3_3_17_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1005"},{"key":"e_1_3_3_18_2","article-title":"Adam: A method for stochastic optimization","author":"Kingma D.P.","year":"2014","unstructured":"KingmaD.P. 
and BaJ., Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980 (2014).","journal-title":"arXiv preprint arXiv:1412.6980"},{"key":"e_1_3_3_19_2","doi-asserted-by":"publisher","DOI":"10.3115\/1557769.1557821"},{"key":"e_1_3_3_20_2","first-page":"12","volume-title":"Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language","author":"Korchagina N.","year":"2017","unstructured":"KorchaginaN., Normalizing medieval german texts: From rules to deep learning, In Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language, 2017, pp. 12\u201317."},{"key":"e_1_3_3_21_2","first-page":"282","volume-title":"Proceedings of the Eighteenth International Conference on Machine Learning, ICML \u201901","author":"Lafferty J.D.","year":"2001","unstructured":"LaffertyJ.D., McCallumA. and PereiraF.C.N., Conditional random fields: Probabilistic models for segmenting and labeling sequence data, In Proceedings of the Eighteenth International Conference on Machine Learning, ICML \u201901, San Francisco, CA, USA, 2001, pp. 282\u2013289. Morgan Kaufmann Publishers Inc. ISBN 1-55860-778-1. URL http:\/\/dl.acm.org\/citation.cfm?id=645530.655813."},{"key":"e_1_3_3_22_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W15-4323"},{"key":"e_1_3_3_23_2","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P14-3012"},{"key":"e_1_3_3_24_2","first-page":"1035","volume-title":"Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Liu F.","year":"2012","unstructured":"LiuF., WengF. and JiangX., A broad-coverage normalization system for social media language, In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Jeju Island, Korea, 2012, pp. 1035\u20131044."},{"key":"e_1_3_3_25_2","unstructured":"Lopez Lude\u00f1aV. San Segundo Hern\u00e1ndezR. Montero Mart\u00ednezJ.M. Barra ChicoteR. and Lorenzo TruebaJ. 
Architecture for text normalization using statistical machine translation techniques SpringerverlagBerlin Heidelberg (2011)."},{"key":"e_1_3_3_26_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W15-4317"},{"key":"e_1_3_3_27_2","doi-asserted-by":"publisher","DOI":"10.1162\/089120103321337421"},{"key":"e_1_3_3_28_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-2067"},{"key":"e_1_3_3_29_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1194"},{"key":"e_1_3_3_30_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W15-4303"},{"key":"e_1_3_3_31_2","first-page":"1320","volume-title":"Proceedings of the 27th International Conference on Computational Linguistics","author":"Tang G.","year":"2018","unstructured":"TangG., CapF., PetterssonE. and NivreJ., An evaluation of neural machine translation models on historical spelling normalization, In Proceedings of the 27th International Conference on Computational Linguistics, 2018, pp. 1320\u20131331. Association for Computational Linguistics. URL http:\/\/aclweb.org\/anthology\/C18-1112."},{"key":"e_1_3_3_32_2","article-title":"Monoise: Modeling noise using a modular normalization system","author":"van der Goot R.","year":"2017","unstructured":"van der GootR. and van NoordG., Monoise: Modeling noise using a modular normalization system, arXiv preprint arXiv:1710.03476 (2017).","journal-title":"arXiv preprint arXiv:1710.03476"},{"key":"e_1_3_3_33_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W17-4404"},{"key":"e_1_3_3_34_2","first-page":"5998","article-title":"\u0141. Kaiser and I. Polosukhin, Attention is all you need","author":"Vaswani A.","year":"2017","unstructured":"VaswaniA., ShazeerN., ParmarN., UszkoreitJ., JonesL. and GomezA.N., \u0141. Kaiser and I. Polosukhin, Attention is all you need, In Advances in Neural Information Processing Systems, 2017, pp. 
5998\u20136008.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_3_35_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.protcy.2014.11.024"}],"container-title":["Journal of Intelligent & Fuzzy Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-179039","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.3233\/JIFS-179039","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-179039","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T09:37:43Z","timestamp":1777455463000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.3233\/JIFS-179039"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,5,14]]},"references-count":34,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2019,5,14]]}},"alternative-id":["10.3233\/JIFS-179039"],"URL":"https:\/\/doi.org\/10.3233\/jifs-179039","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,5,14]]}}}