{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,26]],"date-time":"2026-01-26T01:51:37Z","timestamp":1769392297334,"version":"3.49.0"},"reference-count":47,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2022,11,9]],"date-time":"2022-11-09T00:00:00Z","timestamp":1667952000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Intell. Syst. Technol."],"published-print":{"date-parts":[[2023,2,28]]},"abstract":"<jats:p>Human coders assign standardized medical codes to clinical documents generated during patients\u2019 hospitalization, which is error prone and labor intensive. Automated medical coding approaches have been developed using machine learning methods, such as deep neural networks. Nevertheless, automated medical coding is still challenging because of complex code association, noise in lengthy documents, and the imbalanced class problem. We propose a novel neural network, called the Multitask Balanced and Recalibrated Neural Network, to solve these issues. Significantly, the multitask learning scheme shares the relationship knowledge between different coding branches to capture code association. A recalibrated aggregation module is developed by cascading convolutional blocks to extract high-level semantic features that mitigate the impact of noise in documents. Also, the cascaded structure of the recalibrated module can benefit learning from lengthy notes. To solve the imbalanced class problem, we deploy focal loss to redistribute the attention on low- and high-frequency medical codes. 
Experimental results show that our proposed model outperforms competitive baselines on a real-world clinical dataset called the Medical Information Mart for Intensive Care (MIMIC-III).<\/jats:p>","DOI":"10.1145\/3563041","type":"journal-article","created":{"date-parts":[[2022,9,8]],"date-time":"2022-09-08T12:21:26Z","timestamp":1662639686000},"page":"1-20","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Multitask Balanced and Recalibrated Network for Medical Code Prediction"],"prefix":"10.1145","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6724-0584","authenticated-orcid":false,"given":"Wei","family":"Sun","sequence":"first","affiliation":[{"name":"Aalto University, Espoo, Finland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3281-8002","authenticated-orcid":false,"given":"Shaoxiong","family":"Ji","sequence":"additional","affiliation":[{"name":"Aalto University, Espoo, Finland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3030-1280","authenticated-orcid":false,"given":"Erik","family":"Cambria","sequence":"additional","affiliation":[{"name":"Nanyang Technological University, Singapore, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7078-7927","authenticated-orcid":false,"given":"Pekka","family":"Marttinen","sequence":"additional","affiliation":[{"name":"Aalto University, Espoo, Finland"}]}],"member":"320","published-online":{"date-parts":[[2022,11,9]]},"reference":[{"key":"e_1_3_2_2_2","article-title":"Longformer: The long-document transformer","author":"Beltagy Iz","year":"2020","unstructured":"Iz Beltagy, Matthew E. Peters, and Arman Cohan. 2020. Longformer: The long-document transformer. 
arXiv preprint arXiv:2004.05150.","journal-title":"arXiv preprint arXiv:2004.05150"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-87479-9_26"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46672-9_5"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2014.08.091"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1613\/jair.953"},{"issue":"1","key":"e_1_3_2_7_2","first-page":"1","article-title":"Special issue on learning from imbalanced data sets","volume":"6","author":"Chawla Nitesh V.","year":"2004","unstructured":"Nitesh V. Chawla, Nathalie Japkowicz, and Aleksander Kotcz. 2004. Special issue on learning from imbalanced data sets. ACM Special Interest Group on Knowledge Discovery and Data Mining Explorations Newsletter 6, 1 (2004), 1\u20136.","journal-title":"ACM Special Interest Group on Knowledge Discovery and Data Mining Explorations Newsletter"},{"issue":"17","key":"e_1_3_2_8_2","first-page":"75","article-title":"A multitask bi-directional RNN model for named entity recognition on Chinese electronic medical records","volume":"19","author":"Chowdhury Shanta","year":"2018","unstructured":"Shanta Chowdhury, Xishuang Dong, Lijun Qian, Xiangfang Li, Yi Guan, Jinfeng Yang, and Qiubin Yu. 2018. A multitask bi-directional RNN model for named entity recognition on Chinese electronic medical records. BMC Bioinformatics 19, 17 (2018), 75\u201384.","journal-title":"BMC Bioinformatics"},{"key":"e_1_3_2_9_2","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. 
arXiv preprint arXiv:1810.04805 (2018)."},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2021.103728"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0174708"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00745"},{"key":"e_1_3_2_13_2","article-title":"Multitask learning from clinical text and acute physiological conditions differentially improve the prediction of mortality and diagnosis at the ICU","author":"Interian Yannet","year":"2020","unstructured":"Yannet Interian, Lara Reichmann, and Gilmer Valdes. 2020. Multitask learning from clinical text and acute physiological conditions differentially improve the prediction of mortality and diagnosis at the ICU. medRxiv.","journal-title":"medRxiv"},{"key":"e_1_3_2_14_2","unstructured":"Shaoxiong Ji, Erik Cambria, and Pekka Marttinen. 2020. Dilated convolutional attention network for medical code assignment from clinical text. arXiv preprint arXiv:2009.14578 (2020)."},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.compbiomed.2021.104998"},{"key":"e_1_3_2_16_2","unstructured":"Shaoxiong Ji, Shirui Pan, and Pekka Marttinen. 2020. Medical code assignment with gated convolution and note-code interaction. arXiv preprint arXiv:2010.06975 (2020)."},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1038\/sdata.2016.35"},{"key":"e_1_3_2_18_2","article-title":"Convolutional neural networks for sentence classification","author":"Kim Yoon","year":"2014","unstructured":"Yoon Kim. 2014. Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882.","journal-title":"arXiv preprint arXiv:1408.5882"},{"key":"e_1_3_2_19_2","article-title":"Adam: A method for stochastic optimization","author":"Kingma Diederik P.","year":"2014","unstructured":"Diederik P. Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980.","journal-title":"arXiv preprint arXiv:1412.6980"},{"key":"e_1_3_2_20_2","article-title":"Reformer: The efficient transformer","author":"Kitaev Nikita","year":"2020","unstructured":"Nikita Kitaev, \u0141ukasz Kaiser, and Anselm Levskaya. 2020. Reformer: The efficient transformer. arXiv preprint arXiv:2001.04451.","journal-title":"arXiv preprint arXiv:2001.04451"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijmedinf.2015.08.004"},{"issue":"1","key":"e_1_3_2_22_2","first-page":"46","article-title":"Mixture of expert agents for handling imbalanced data sets","volume":"1","author":"Kotsiantis S. B.","year":"2003","unstructured":"S. B. Kotsiantis and P. E. Pintelas. 2003. Mixture of expert agents for handling imbalanced data sets. Annals of Mathematics, Computing & Teleinformatics 1, 1 (2003), 46\u201355.","journal-title":"Annals of Mathematics, Computing & Teleinformatics"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1145\/243199.243276"},{"key":"e_1_3_2_24_2","article-title":"FNet: Mixing tokens with Fourier transforms","author":"Lee-Thorp James","year":"2021","unstructured":"James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, and Santiago Ontanon. 2021. FNet: Mixing tokens with Fourier transforms. arXiv preprint arXiv:2105.03824.","journal-title":"arXiv preprint arXiv:2105.03824"},{"key":"e_1_3_2_25_2","doi-asserted-by":"crossref","unstructured":"Fei Li and Hong Yu. 2020. ICD coding from clinical text using multi-filter residual convolutional neural network. Proceedings of the AAAI Conference on Artificial Intelligence 34, 5 (2020), 8180\u20138187.","DOI":"10.1609\/aaai.v34i05.6331"},{"key":"e_1_3_2_26_2","article-title":"A closer look at loss weighting in multi-task learning","author":"Lin Baijiong","year":"2021","unstructured":"Baijiong Lin, Feiyang Ye, and Yu Zhang. 2021. A closer look at loss weighting in multi-task learning. 
arXiv preprint arXiv:2111.10603.","journal-title":"arXiv preprint arXiv:2111.10603"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.324"},{"key":"e_1_3_2_28_2","article-title":"Multi-task deep neural networks for natural language understanding","author":"Liu Xiaodong","year":"2019","unstructured":"Xiaodong Liu, Pengcheng He, Weizhu Chen, and Jianfeng Gao. 2019. Multi-task deep neural networks for natural language understanding. arXiv preprint arXiv:1901.11504.","journal-title":"arXiv preprint arXiv:1901.11504"},{"key":"e_1_3_2_29_2","first-page":"2096","volume-title":"Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021","author":"Luo Junyu","year":"2021","unstructured":"Junyu Luo, Cao Xiao, Lucas Glass, Jimeng Sun, and Fenglong Ma. 2021. Fusion: Towards automated ICD coding via feature compression. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. ACL, Thailand, 2096\u20132101."},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.5555\/2002472.2002491"},{"key":"e_1_3_2_31_2","article-title":"Distributed representations of words and phrases and their compositionality","author":"Mikolov Tomas","year":"2013","unstructured":"Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Distributed representations of words and phrases and their compositionality. arXiv preprint arXiv:1310.4546.","journal-title":"arXiv preprint arXiv:1310.4546"},{"key":"e_1_3_2_32_2","doi-asserted-by":"crossref","unstructured":"James Mullenbach, Sarah Wiegreffe, Jon Duke, Jimeng Sun, and Jacob Eisenstein. 2018. Explainable prediction of medical codes from clinical text. 
arXiv preprint arXiv:1802.05695 (2018).","DOI":"10.18653\/v1\/N18-1100"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.1475-6773.2005.00444.x"},{"issue":"1","key":"e_1_3_2_34_2","first-page":"76","article-title":"The accuracy of ICD codes for cerebrovascular diseases in medical insurance claims","volume":"33","author":"Park Jong-Ku","year":"2000","unstructured":"Jong-Ku Park, Ki-Soon Kim, Tae-Yong Lee, Kang-Sook Lee, Duk-Hee Lee, Sun-Hee Lee, Sun-Ha Jee, Il Suh, Kwang-Wook Koh, So-Yeon Ryu, et\u00a0al. 2000. The accuracy of ICD codes for cerebrovascular diseases in medical insurance claims. Journal of Preventive Medicine and Public Health 33, 1 (2000), 76\u201382.","journal-title":"Journal of Preventive Medicine and Public Health"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.189"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1136\/amiajnl-2013-002159"},{"key":"e_1_3_2_37_2","first-page":"779","article-title":"Deep patient representation of clinical notes via multi-task learning for mortality prediction","volume":"2019","author":"Si Yuqi","year":"2019","unstructured":"Yuqi Si and Kirk Roberts. 2019. Deep patient representation of clinical notes via multi-task learning for mortality prediction. AMIA Summits on Translational Science Proceedings 2019 (2019), 779.","journal-title":"AMIA Summits on Translational Science Proceedings"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00429-015-1059-y"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-86514-6_23"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2011.10.019"},{"key":"e_1_3_2_41_2","unstructured":"Thanh Vu, Dat Quoc Nguyen, and Anthony Nguyen. 2020. A label attention model for ICD coding from clinical text. 
arXiv preprint arXiv:2007.06351 (2020)."},{"key":"e_1_3_2_42_2","article-title":"Linformer: Self-attention with linear complexity","author":"Wang Sinong","year":"2020","unstructured":"Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, and Hao Ma. 2020. Linformer: Self-attention with linear complexity. arXiv preprint arXiv:2006.04768.","journal-title":"arXiv preprint arXiv:2006.04768"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3357897"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N16-1174"},{"key":"e_1_3_2_45_2","article-title":"How transferable are features in deep neural networks?","author":"Yosinski Jason","year":"2014","unstructured":"Jason Yosinski, Jeff Clune, Yoshua Bengio, and Hod Lipson. 2014. How transferable are features in deep neural networks? arXiv preprint arXiv:1411.1792.","journal-title":"arXiv preprint arXiv:1411.1792"},{"key":"e_1_3_2_46_2","article-title":"Hierarchical BERT for medical document understanding","author":"Zhang Ning","year":"2022","unstructured":"Ning Zhang and Maciej Jankowski. 2022. Hierarchical BERT for medical document understanding. arXiv preprint arXiv:2204.09600.","journal-title":"arXiv preprint arXiv:2204.09600"},{"key":"e_1_3_2_47_2","first-page":"1","article-title":"A survey on multi-task learning","author":"Zhang Yu","year":"2017","unstructured":"Yu Zhang and Qiang Yang. 2017. A survey on multi-task learning. 
arXiv preprint arXiv:1707.08114 (2017), 1\u20131.","journal-title":"arXiv preprint arXiv:1707.08114"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.3301817"}],"container-title":["ACM Transactions on Intelligent Systems and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3563041","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3563041","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3563041","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:38:09Z","timestamp":1750178289000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3563041"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,9]]},"references-count":47,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2023,2,28]]}},"alternative-id":["10.1145\/3563041"],"URL":"https:\/\/doi.org\/10.1145\/3563041","relation":{},"ISSN":["2157-6904","2157-6912"],"issn-type":[{"value":"2157-6904","type":"print"},{"value":"2157-6912","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,11,9]]},"assertion":[{"value":"2022-04-10","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-08-30","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-11-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}