{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,18]],"date-time":"2026-06-18T15:28:40Z","timestamp":1781796520193,"version":"3.54.5"},"reference-count":34,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,7,1]],"date-time":"2021-07-01T00:00:00Z","timestamp":1625097600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,7,1]],"date-time":"2021-07-01T00:00:00Z","timestamp":1625097600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100000925","name":"Department of Health | National Health and Medical Research Council","doi-asserted-by":"publisher","award":["1134919"],"award-info":[{"award-number":["1134919"]}],"id":[{"id":"10.13039\/501100000925","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000943","name":"Commonwealth Scientific and Industrial Research Organisation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000943","id-type":"DOI","asserted-by":"publisher"}]},{"name":"University of Melbourne, Melbourne School of Engineering"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["npj Digit. Med."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>As healthcare providers receive fixed amounts of reimbursement for given services under DRG (Diagnosis-Related Groups) payment, DRG codes are valuable for cost monitoring and resource allocation. However, coding is typically performed retrospectively post-discharge. We seek to predict DRGs and DRG-based case mix index (CMI) at early inpatient admission using routine clinical text to estimate hospital cost in an acute setting. 
We examined a deep learning-based natural language processing (NLP) model to automatically predict per-episode DRGs and corresponding cost-reflecting weights on two cohorts (paid under Medicare Severity (MS) DRG or All Patient Refined (APR) DRG), without human coding efforts. It achieved macro-averaged area under the receiver operating characteristic curve (AUC) scores of 0\u00b7871 (SD 0\u00b7011) on MS-DRG and 0\u00b7884 (0\u00b7003) on APR-DRG in fivefold cross-validation experiments on the first day of ICU admission. When extended to simulated patient populations to estimate average cost-reflecting weights, the model increased its accuracy over time and obtained absolute CMI error of 2\u00b740 (1\u00b707%) and 12\u00b779% (2\u00b731%), respectively on the first day. As the model could adapt to variations in admission time, cohort size, and requires no extra manual coding efforts, it shows potential to help estimating costs for active patients to support better operational decision-making in hospitals.<\/jats:p>","DOI":"10.1038\/s41746-021-00474-9","type":"journal-article","created":{"date-parts":[[2021,7,1]],"date-time":"2021-07-01T10:02:45Z","timestamp":1625133765000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":46,"title":["Early prediction of diagnostic-related groups and estimation of hospital cost by processing clinical 
notes"],"prefix":"10.1038","volume":"4","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7945-4165","authenticated-orcid":false,"given":"Jinghui","family":"Liu","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9256-1256","authenticated-orcid":false,"given":"Daniel","family":"Capurro","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6215-6954","authenticated-orcid":false,"given":"Anthony","family":"Nguyen","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8661-1544","authenticated-orcid":false,"given":"Karin","family":"Verspoor","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2021,7,1]]},"reference":[{"key":"474_CR1","doi-asserted-by":"crossref","unstructured":"Bredenkamp, C., Bales, S. & Kahur, K. Transition to Diagnosis-Related Group (DRG) Payments for Health: Lessons from Case Studies (The World Bank, 2019).","DOI":"10.1596\/978-1-4648-1521-8"},{"key":"474_CR2","doi-asserted-by":"publisher","unstructured":"Mihailovic, N., Kocic, S. & Jakovljevic, M. Review of diagnosis-related group-based financing of hospital care. Heal. Serv. Res. Manag. Epidemiol. https:\/\/doi.org\/10.1177\/2333392816647892 (2016).","DOI":"10.1177\/2333392816647892"},{"key":"474_CR3","doi-asserted-by":"publisher","first-page":"28","DOI":"10.1089\/pop.2013.0002","volume":"17","author":"CM Mendez","year":"2014","unstructured":"Mendez, C. M., Harrington, D. W., Christenson, P. & Spellberg, B. Impact of hospital variables on case mix index as a marker of disease severity. Popul. Health Manag. 17, 28\u201334 (2014).","journal-title":"Popul. Health Manag."},{"key":"474_CR4","doi-asserted-by":"crossref","unstructured":"Thompson, N. D., Edwards, J. 
R., Dudeck, M. A., Fridkin, S. K. & Magill, S. S. Evaluating the use of the case mix index for risk adjustment of healthcare-associated infection data: an illustration using clostridium Difficile infection data from the national healthcare safety network. Infect. Control Hosp. Epidemiol. 37, 19\u201325 (2016).","DOI":"10.1017\/ice.2015.252"},{"key":"474_CR5","first-page":"426","volume":"160","author":"K Quinn","year":"2014","unstructured":"Quinn, K. After the revolution: DRGs at age 30. Ann. Intern. Med. 160, 426\u2013429 (2014).","journal-title":"Ann. Intern. Med."},{"key":"474_CR6","unstructured":"Andrew, S., O\u2019Reilly, J., Ward, P. & Mason, A. in Diagnosis-related Groups in Europe: Moving towards Transparency, Efficiency and Quality in Hospitals (eds Busse, R., Geissler, A., Quentin, W. & Wiley, M.) Ch. 7 (McGraw-Hill Education, 2011)."},{"key":"474_CR7","doi-asserted-by":"publisher","first-page":"718","DOI":"10.1287\/ijoc.2015.0655","volume":"27","author":"D Gartner","year":"2015","unstructured":"Gartner, D., Kolisch, R., Neill, D. B. & Padman, R. Machine learning approaches for early DRG classification and resource allocation. INFORMS J. Comput. 27, 718\u2013734 (2015).","journal-title":"INFORMS J. Comput."},{"key":"474_CR8","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1038\/s41746-018-0029-1","volume":"1","author":"A Rajkomar","year":"2018","unstructured":"Rajkomar, A. et al. Scalable and accurate deep learning with electronic health records. npj Digit. Med. 1, 18 (2018).","journal-title":"npj Digit. Med."},{"key":"474_CR9","doi-asserted-by":"publisher","first-page":"1347","DOI":"10.1056\/NEJMra1814259","volume":"380","author":"A Rajkomar","year":"2019","unstructured":"Rajkomar, A., Dean, J. & Kohane, I. Machine learning in medicine. N. Engl. J. Med. 380, 1347\u20131358 (2019).","journal-title":"N. Engl. J. 
Med."},{"key":"474_CR10","doi-asserted-by":"publisher","first-page":"1419","DOI":"10.1093\/jamia\/ocy068","volume":"25","author":"C Xiao","year":"2018","unstructured":"Xiao, C., Choi, E. & Sun, J. Opportunities and challenges in developing deep learning models using electronic health records data: a systematic review. J. Am. Med. Inform. Assoc. 25, 1419\u20131428 (2018).","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"474_CR11","doi-asserted-by":"crossref","unstructured":"Johnson, A. E. W. et al. MIMIC-III, a freely accessible critical care database. Sci. Data 3, 160035 (2016).","DOI":"10.1038\/sdata.2016.35"},{"key":"474_CR12","doi-asserted-by":"crossref","unstructured":"Mullenbach, J., Wiegreffe, S., Duke, J., Sun, J. & Eisenstein, J. Explainable prediction of medical codes from clinical text. In Proc. of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (eds Walker, M., Ji, H. & Stent, A.) 1101\u20131111 (Association for Computational Linguistics, 2018).","DOI":"10.18653\/v1\/N18-1100"},{"key":"474_CR13","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"S Hochreiter","year":"1997","unstructured":"Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735\u20131780 (1997).","journal-title":"Neural Comput."},{"key":"474_CR14","unstructured":"World Health Organization (WHO). Global Spending on Health: A World in Transition 2019 (WHO, 2019)."},{"key":"474_CR15","doi-asserted-by":"publisher","first-page":"2233","DOI":"10.1016\/S0140-6736(19)30841-4","volume":"393","author":"AY Chang","year":"2019","unstructured":"Chang, A. Y. et al. Past, present, and future of global health financing: a review of development assistance, government, out-of-pocket, and other private spending on health for 195 countries, 1995-2050. 
Lancet 393, 2233\u20132260 (2019).","journal-title":"Lancet"},{"key":"474_CR16","doi-asserted-by":"publisher","first-page":"863","DOI":"10.1001\/jama.2020.0734","volume":"323","author":"JL Dieleman","year":"2020","unstructured":"Dieleman, J. L. et al. US Health Care spending by payer and health condition, 1996-2016. J. Am. Med. Assoc. 323, 863\u2013884 (2020).","journal-title":"J. Am. Med. Assoc."},{"key":"474_CR17","doi-asserted-by":"publisher","first-page":"1444","DOI":"10.1377\/hlthaff.2015.1553","volume":"35","author":"LC Baker","year":"2016","unstructured":"Baker, L. C., Bundorf, M. K., Devlin, A. M. & Kessler, D. P. Medicare advantage plans pay hospitals less than traditional medicare pays. Health Aff. 35, 1444\u20131451 (2016).","journal-title":"Health Aff."},{"key":"474_CR18","first-page":"102","volume":"94","author":"BC James","year":"2016","unstructured":"James, B. C. & Poulsen, G. P. The case for capitation. Harv. Bus. Rev. 94, 102\u2013111 (2016).","journal-title":"Harv. Bus. Rev."},{"key":"474_CR19","doi-asserted-by":"publisher","first-page":"131","DOI":"10.1001\/jama.2015.18161","volume":"315","author":"MJ Press","year":"2016","unstructured":"Press, M. J., Rajkumar, R. & Conway, P. H. Medicare\u2019s new bundled payments: design, strategy, and evolution. J. Am. Med. Assoc. 315, 131\u2013132 (2016).","journal-title":"J. Am. Med. Assoc."},{"key":"474_CR20","doi-asserted-by":"publisher","first-page":"32","DOI":"10.1016\/j.spinee.2019.04.024","volume":"20","author":"AT Malik","year":"2020","unstructured":"Malik, A. T., Phillips, F. M., Yu, E. & Khan, S. N. Are current DRG-based bundled payment models for lumbar fusions risk-adjusting adequately? An analysis of medicare beneficiaries. Spine J. 20, 32\u201340 (2020).","journal-title":"Spine J."},{"key":"474_CR21","first-page":"1312","volume":"2017","author":"MA Morid","year":"2017","unstructured":"Morid, M. A., Kawamoto, K., Ault, T., Dorius, J. & Abdelrahman, S. 
Supervised learning methods for predicting healthcare costs: systematic literature review and empirical evaluation. AMIA Annu. Symp. Proc. 2017, 1312\u20131321 (2017).","journal-title":"AMIA Annu. Symp. Proc."},{"key":"474_CR22","doi-asserted-by":"publisher","first-page":"103565","DOI":"10.1016\/j.jbi.2020.103565","volume":"111","author":"MA Morid","year":"2020","unstructured":"Morid, M. A., Sheng, O. R. L., Kawamoto, K. & Abdelrahman, S. Learning hidden patterns from patient multivariate time series data using convolutional neural networks: a case study of healthcare cost prediction. J. Biomed. Inform. 111, 103565 (2020).","journal-title":"J. Biomed. Inform."},{"key":"474_CR23","doi-asserted-by":"publisher","first-page":"204","DOI":"10.1001\/jamacardio.2016.3956","volume":"2","author":"JD Frizzell","year":"2017","unstructured":"Frizzell, J. D. et al. Prediction of 30-day all-cause readmissions in patients hospitalized for heart failure: comparison of machine learning and other statistical approaches. JAMA Cardiol. 2, 204\u2013209 (2017).","journal-title":"JAMA Cardiol."},{"key":"474_CR24","doi-asserted-by":"publisher","first-page":"148","DOI":"10.1038\/s41746-020-00354-8","volume":"3","author":"I Osawa","year":"2020","unstructured":"Osawa, I., Goto, T., Yamamoto, Y. & Tsugawa, Y. Machine-learning-based prediction models for high-need high-cost patients using nationwide clinical and claims data. npj Digit. Med. 3, 148 (2020).","journal-title":"npj Digit. Med."},{"key":"474_CR25","doi-asserted-by":"publisher","first-page":"837","DOI":"10.1093\/jac\/dkn275","volume":"62","author":"SP Kuster","year":"2008","unstructured":"Kuster, S. P. et al. Correlation between case mix index and antibiotic use in hospitals. J. Antimicrob. Chemother. 62, 837\u2013842 (2008).","journal-title":"J. Antimicrob. Chemother."},{"key":"474_CR26","unstructured":"Devlin, J., Chang, M. W., Lee, K. & Toutanova, K. 
BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (eds Burstein, J., Doran, C. & Solorio, T) 4171\u20134186 (Association for Computational Linguistics, 2019)."},{"key":"474_CR27","doi-asserted-by":"crossref","unstructured":"Alsentzer, E. et al. Publicly available clinical BERT embeddings. In Proc. 2nd Clinical Natural Language Processing Workshop (eds Rumshisky, A., Roberts, K., Bethard, S. & Naumann, T.) 72\u201378 (Association for Computational Linguistics, 2019).","DOI":"10.18653\/v1\/W19-1909"},{"key":"474_CR28","unstructured":"Jain, S. & Wallace, B. C. Attention is not explanation. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (eds Burstein, J., Doran, C. & Solorio, T) 3543\u20133556 (Association for Computational Linguistics, 2019)."},{"key":"474_CR29","doi-asserted-by":"crossref","unstructured":"Wiegreffe, S. & Pinter, Y. Attention is not not explanation. In Proc. 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (eds Inui, K., Jiang, J., Ng, V. & Wan, X.) 11\u201320 (Association for Computational Linguistics, 2019).","DOI":"10.18653\/v1\/D19-1002"},{"key":"474_CR30","unstructured":"Beltagy, I., Peters, M. E. & Cohan, A. Longformer: the long-document transformer. Preprint at arXiv:2004.05150 (2020)."},{"key":"474_CR31","doi-asserted-by":"crossref","unstructured":"Rios, A. & Kavuluru, R. Few-shot and zero-shot multi-label learning for structured label spaces. In Proc. 2018 Conference on Empirical Methods in Natural Language Processing (eds Riloff, E., Chiang, D., Hockenmaier, J. & Tsujii, J.) 
3132\u20133142 (Association for Computational Linguistics, 2018).","DOI":"10.18653\/v1\/D18-1352"},{"key":"474_CR32","doi-asserted-by":"publisher","DOI":"10.1038\/s41597-019-0055-0","volume":"6","author":"Y Zhang","year":"2019","unstructured":"Zhang, Y., Chen, Q., Yang, Z., Lin, H. & Lu, Z. BioWordVec, improving biomedical word embeddings with subword information and MeSH. Sci. Data 6, 52 (2019).","journal-title":"Sci. Data"},{"key":"474_CR33","doi-asserted-by":"crossref","unstructured":"Kim, Y. Convolutional neural networks for sentence classification. In Proc. 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (eds Moschitti, A., Pang, B. & Daelemans, W.) 1746\u20131751 (Association for Computational Linguistics, 2014).","DOI":"10.3115\/v1\/D14-1181"},{"key":"474_CR34","doi-asserted-by":"crossref","unstructured":"Wang, S. et al. MIMIC-Extract: a data extraction, preprocessing, and representation pipeline for MIMIC-III. In Proc. of the ACM Conference on Health, Inference, and Learning (CHIL'20). (ed. Ghassemi, M.) 
222\u2013235 (2020).","DOI":"10.1145\/3368555.3384469"}],"container-title":["npj Digital Medicine"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s41746-021-00474-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-021-00474-9","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-021-00474-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,3]],"date-time":"2022-12-03T18:55:08Z","timestamp":1670093708000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s41746-021-00474-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,1]]},"references-count":34,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2021,12]]}},"alternative-id":["474"],"URL":"https:\/\/doi.org\/10.1038\/s41746-021-00474-9","relation":{},"ISSN":["2398-6352"],"issn-type":[{"value":"2398-6352","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,7,1]]},"assertion":[{"value":"11 January 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 June 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 July 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"K.V. reports grants from National Health and Medical Research Council, grants from Australian Research Council, during the conduct of the study; personal fees from Pfizer, outside the submitted work. J.L, D.C., and A.N. 
have none to declare.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}},{"value":"Approval of data collection, processing, and release for the MIMIC-III database used in the study has been granted by the Institutional Review Boards of Beth Israel Deaconess Medical Center and the Massachusetts Institute of Technology<sup>11<\/sup>.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical approval"}}],"article-number":"103"}}