{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,21]],"date-time":"2026-02-21T07:40:11Z","timestamp":1771659611609,"version":"3.50.1"},"reference-count":31,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2019,5,8]],"date-time":"2019-05-08T00:00:00Z","timestamp":1557273600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2019,5,8]],"date-time":"2019-05-08T00:00:00Z","timestamp":1557273600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["npj Digit. Med."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Unlike human medical records, most of the veterinary records are free text without standard diagnosis coding. The lack of systematic coding is a major barrier to the growing interest in leveraging veterinary records for public health and translational research. Recent machine learning effort is limited to predicting 42 top-level diagnosis categories from veterinary notes. Here we develop a large-scale algorithm to automatically predict all 4577 standard veterinary diagnosis codes from free text. We train our algorithm on a curated dataset of over 100\u2009K expert labeled veterinary notes and over one million unlabeled notes. Our algorithm is based on the adapted Transformer architecture and we demonstrate that large-scale language modeling on the unlabeled notes via pretraining and as an auxiliary objective during supervised learning greatly improves performance. We systematically evaluate the performance of the model and several baselines in challenging settings where algorithms trained on one hospital are evaluated in a different hospital with substantial domain shift. In addition, we show that hierarchical training can address severe data imbalances for fine-grained diagnosis with a few training cases, and we provide interpretation for what is learned by the deep network. Our algorithm addresses an important challenge in veterinary medicine, and our model and experiments add insights into the power of unsupervised learning for clinical natural language processing.<\/jats:p>","DOI":"10.1038\/s41746-019-0113-1","type":"journal-article","created":{"date-parts":[[2019,5,9]],"date-time":"2019-05-09T15:22:50Z","timestamp":1557415370000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":26,"title":["VetTag: improving automated veterinary diagnosis coding via large-scale language modeling"],"prefix":"10.1038","volume":"2","author":[{"given":"Yuhui","family":"Zhang","sequence":"first","affiliation":[]},{"given":"Allen","family":"Nie","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9436-0637","authenticated-orcid":false,"given":"Ashley","family":"Zehnder","sequence":"additional","affiliation":[]},{"given":"Rodney L.","family":"Page","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8880-4764","authenticated-orcid":false,"given":"James","family":"Zou","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2019,5,8]]},"reference":[{"key":"113_CR1","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1038\/s41746-018-0029-1","volume":"1","author":"A Rajkomar","year":"2018","unstructured":"Rajkomar, A. et al. Scalable and accurate deep learning with electronic health records. NPJ Dig. Med. 1, 18 (2018).","journal-title":"NPJ Dig. Med."},{"key":"113_CR2","doi-asserted-by":"publisher","first-page":"1589","DOI":"10.1109\/JBHI.2017.2767063","volume":"22","author":"B Shickel","year":"2018","unstructured":"Shickel, B., Tighe, P. J., Bihorac, A. & Rashidi, P. Deep ehr: a survey of recent advances in deep learning techniques for electronic health record (ehr) analysis. IEEE J. Biomed. Health Inform. 22, 1589\u20131604 (2018).","journal-title":"IEEE J. Biomed. Health Inform."},{"key":"113_CR3","doi-asserted-by":"publisher","first-page":"2133","DOI":"10.1158\/1078-0432.CCR-15-2347","volume":"22","author":"AK LeBlanc","year":"2016","unstructured":"LeBlanc, A. K., Mazcko, C. N. & Khanna, C. Defining the value of a comparative approach to cancer drug development. Clin. Cancer Res. 22, 2133\u20132138 (2016).","journal-title":"Clin. Cancer Res."},{"key":"113_CR4","doi-asserted-by":"publisher","first-page":"241","DOI":"10.1007\/s12031-007-9023-9","volume":"34","author":"M Vainzof","year":"2008","unstructured":"Vainzof, M. et al. Animal models for genetic neuromuscular diseases. J. Mol. Neurosci. 34, 241\u2013248 (2008).","journal-title":"J. Mol. Neurosci."},{"key":"113_CR5","doi-asserted-by":"publisher","first-page":"764621","DOI":"10.1155\/2012\/764621","volume":"2012","author":"MH Gregory","year":"2012","unstructured":"Gregory, M. H. et al. A review of translational animal models for knee osteoarthritis. Arthritis 2012, 764621 (2012).","journal-title":"Arthritis"},{"key":"113_CR6","first-page":"509","volume":"90","author":"CA Adin","year":"2017","unstructured":"Adin, C. A. & Gilor, C. Focus: Comparative medicine: the diabetic dog as a translational model for human islet transplantation. Yale J. Biol. Med. 90, 509 (2017).","journal-title":"Yale J. Biol. Med."},{"key":"113_CR7","doi-asserted-by":"publisher","first-page":"308ps21","DOI":"10.1126\/scitranslmed.aaa9116","volume":"7","author":"A Kol","year":"2015","unstructured":"Kol, A. et al. Companion animals: Translational scientist\u2019s new best friends. Sci. Transl. Med. 7, 308ps21 (2015).","journal-title":"Sci. Transl. Med."},{"key":"113_CR8","first-page":"183","volume":"10","author":"S Velupillai","year":"2015","unstructured":"Velupillai, S., Mowery, D., South, B. R., Kvist, M. & Dalianis, H. Recent advances in clinical natural language processing in support of semantic analysis. Yearb. Med. Inform. 10, 183 (2015).","journal-title":"Yearb. Med. Inform."},{"key":"113_CR9","doi-asserted-by":"publisher","first-page":"224","DOI":"10.15265\/IY-2016-017","volume":"25","author":"D Demner-Fushman","year":"2016","unstructured":"Demner-Fushman, D. & Elhadad, N. Aspiring to unintended consequences of natural language processing: a review of recent developments in clinical and consumer-generated text processing. Yearb. Med. Inform. 25, 224\u2013233 (2016).","journal-title":"Yearb. Med. Inform."},{"key":"113_CR10","doi-asserted-by":"publisher","first-page":"156","DOI":"10.1016\/j.jbi.2015.10.001","volume":"58","author":"R Pivovarov","year":"2015","unstructured":"Pivovarov, R. et al. Learning probabilistic phenotypes from heterogeneous ehr data. J. Biomed. Inform. 58, 156\u2013165 (2015).","journal-title":"J. Biomed. Inform."},{"key":"113_CR11","unstructured":"Lipton, Z. C., Kale, D. C., Elkan, C. & Wetzel, R. Learning to diagnose with lstm recurrent neural networks. arXiv preprint arXiv:1511.03677 (2015)."},{"key":"113_CR12","unstructured":"Baumel, T., Nassour-Kassis, J., Cohen, R., Elhadad, M. & Elhadad, N. Multi-label classification of patient notes: case study on icd code assignment. In Workshops at the Thirty-Second AAAI Conference on Artificial Intelligence (2018)."},{"key":"113_CR13","doi-asserted-by":"crossref","unstructured":"Prakash, A. et al. Condensed memory networks for clinical diagnostic inferencing. In AAAI, 3274\u20133280 (2017).","DOI":"10.1609\/aaai.v31i1.10964"},{"key":"113_CR14","unstructured":"Peters, M. E. et al. Deep contextualized word representations. arXiv preprint arXiv:1802.05365 (2018)."},{"key":"113_CR15","unstructured":"Radford, A., Narasimhan, K., Salimans, T. & Sutskever, I. Improving language understanding by generative pre-training (2018)."},{"key":"113_CR16","doi-asserted-by":"publisher","first-page":"60","DOI":"10.1038\/s41746-018-0067-8","volume":"1","author":"A Nie","year":"2018","unstructured":"Nie, A. et al. Deeptag: inferring diagnoses from veterinary clinical notes. NPJ Dig. Med. 1, 60 (2018).","journal-title":"NPJ Dig. Med."},{"key":"113_CR17","doi-asserted-by":"publisher","first-page":"231","DOI":"10.1136\/amiajnl-2013-002159","volume":"21","author":"A Perotte","year":"2013","unstructured":"Perotte, A. et al. Diagnosis code assignment: models and evaluation metrics. J. Am. Med. Inform. Assoc. 21, 231\u2013237 (2013).","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"113_CR18","unstructured":"Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems, 5998\u20136008 (2017)."},{"key":"113_CR19","doi-asserted-by":"crossref","unstructured":"Kim, Y. Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882 (2014).","DOI":"10.3115\/v1\/D14-1181"},{"key":"113_CR20","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"S Hochreiter","year":"1997","unstructured":"Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735\u20131780 (1997).","journal-title":"Neural Comput."},{"key":"113_CR21","doi-asserted-by":"publisher","DOI":"10.1038\/sdata.2016.35","volume":"3","author":"AEW Johnson","year":"2016","unstructured":"Johnson, A. E. W. et al. Mimic-iii, a freely accessible critical care database. Sci. data 3, 160035 (2016).","journal-title":"Sci. data"},{"key":"113_CR22","doi-asserted-by":"crossref","unstructured":"Mullenbach, J., Wiegreffe, S., Duke, J., Sun, J. & Eisenstein, J. Explainable prediction of medical codes from clinical text. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 1101\u20131111 (2018).","DOI":"10.18653\/v1\/N18-1100"},{"key":"113_CR23","unstructured":"Kaiser, L. et al. One model to learn them all. arXiv preprint arXiv:1706.05137 (2017)."},{"key":"113_CR24","doi-asserted-by":"publisher","first-page":"229","DOI":"10.1136\/jamia.2009.002733","volume":"17","author":"AR Aronson","year":"2010","unstructured":"Aronson, A. R. & Lang, F.-M. An overview of metamap: historical perspective and recent advances. J. Am. Med. Inform. Assoc. 17, 229\u2013236 (2010).","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"113_CR25","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa, F. et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825\u20132830 (2011).","journal-title":"J. Mach. Learn. Res."},{"key":"113_CR26","volume-title":"Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition","author":"D Jurafsky","year":"2000","unstructured":"Jurafsky, D. & Martin, J. H. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. 1st edn (Prentice Hall PTR, Upper Saddle River, NJ, USA, 2000).","edition":"1st edn"},{"key":"113_CR27","unstructured":"Yang, Z., Dai, Z., Salakhutdinov, R. & Cohen, W. W. Breaking the softmax bottleneck: A high-rank RNN language model. In International Conference on Learning Representations (2018)."},{"key":"113_CR28","doi-asserted-by":"crossref","unstructured":"Bird, S. & Loper, E. Nltk: the natural language toolkit. In Proceedings of the ACL 2004 on Interactive poster and demonstration sessions, 31. Association for Computational Linguistics (2004).","DOI":"10.3115\/1219044.1219075"},{"key":"113_CR29","doi-asserted-by":"crossref","unstructured":"Sennrich, R., Haddow, B. & Birch, A. Neural machine translation of rare words with subword units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Volume 1 (Long Papers), 1715\u20131725 (2016).","DOI":"10.18653\/v1\/P16-1162"},{"key":"113_CR30","first-page":"279","volume":"121","author":"K Donnelly","year":"2006","unstructured":"Donnelly, K. Snomed-ct: The advanced terminology and coding system for ehealth. Stud. Health Technol. Inform. 121, 279 (2006).","journal-title":"Stud. Health Technol. Inform."},{"key":"113_CR31","doi-asserted-by":"publisher","first-page":"1620","DOI":"10.1111\/j.1475-6773.2005.00444.x","volume":"40","author":"KJ O\u2019malley","year":"2005","unstructured":"O\u2019malley, K. J. et al. Measuring diagnoses: Icd code accuracy. Health Serv. Res. 40, 1620\u20131639 (2005).","journal-title":"Health Serv. Res."}],"container-title":["npj Digital Medicine"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s41746-019-0113-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-019-0113-1","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-019-0113-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,17]],"date-time":"2022-12-17T18:27:33Z","timestamp":1671301653000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s41746-019-0113-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,5,8]]},"references-count":31,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2019,12]]}},"alternative-id":["113"],"URL":"https:\/\/doi.org\/10.1038\/s41746-019-0113-1","relation":{},"ISSN":["2398-6352"],"issn-type":[{"value":"2398-6352","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,5,8]]},"assertion":[{"value":"25 January 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 April 2019","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 May 2019","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"35"}}