{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,31]],"date-time":"2026-01-31T17:48:33Z","timestamp":1769881713701,"version":"3.49.0"},"reference-count":25,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2024,12,11]],"date-time":"2024-12-11T00:00:00Z","timestamp":1733875200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001807","name":"Funda\u00e7\u00e3o de Amparo \u00e0 Pesquisa do Estado de S\u00e3o Paulo","doi-asserted-by":"crossref","award":["887.507037\/2020-00"],"award-info":[{"award-number":["887.507037\/2020-00"]}],"id":[{"id":"10.13039\/501100001807","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Coordination for the Improvement of Higher Education Personnel \u2013 Brazil","award":["88887.507037\/2020-00"],"award-info":[{"award-number":["88887.507037\/2020-00"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. Data and Information Quality"],"published-print":{"date-parts":[[2024,12,31]]},"abstract":"<jats:p>\n            Machine Learning (ML) models have the potential to support decision-making in healthcare by grasping complex patterns within data. However, decisions in this domain are sensitive and require active involvement of domain specialists with deep knowledge of the data. To address this task, clinicians need to understand how predictions are generated so they can provide feedback for model refinement. There is usually a gap in the communication between data scientists and domain specialists that needs to be addressed. Specifically, many ML studies are only concerned with presenting average accuracies over an entire dataset, losing valuable insights that can be obtained at a more fine-grained patient-level analysis of classification performance. In this article, we present a case study aimed at explaining the factors that contribute to specific predictions for individual patients. Our approach takes a data-centric perspective, focusing on the structure of the data and its correlation with ML model performance. We utilize the concept of\n            <jats:italic>Instance Hardness<\/jats:italic>\n            , which measures the level of difficulty an instance poses in being correctly classified. By selecting the hardest and easiest to classify instances, we analyze and contrast the distributions of specific input features and extract meta-features to describe each instance. Furthermore, we individually examine certain instances, offering valuable insights into why they offer challenges for classification, enabling a better understanding of both the successes and failures of the ML models. This opens up the possibility for discussions between data scientists and domain specialists, supporting collaborative decision-making.\n          <\/jats:p>","DOI":"10.1145\/3687267","type":"journal-article","created":{"date-parts":[[2024,9,13]],"date-time":"2024-09-13T07:59:24Z","timestamp":1726214364000},"page":"1-19","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Understanding the performance of machine learning models from data- to patient-level"],"prefix":"10.1145","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3631-156X","authenticated-orcid":false,"given":"Maria Gabriela","family":"Valeriano","sequence":"first","affiliation":[{"name":"Instituto Tecnol\u00f3gico de Aeron\u00e1utica, Sao Jose dos Campos, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8409-3747","authenticated-orcid":false,"given":"Ana","family":"Matran-Fernandez","sequence":"additional","affiliation":[{"name":"University of Essex, Colchester, United Kingdom of Great Britain and Northern Ireland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1122-0693","authenticated-orcid":false,"given":"Carlos","family":"Kiffer","sequence":"additional","affiliation":[{"name":"Universidade Federal de S\u00e3o Paulo, Sao Paulo, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6140-571X","authenticated-orcid":false,"given":"Ana Carolina","family":"Lorena","sequence":"additional","affiliation":[{"name":"Instituto Tecnol\u00f3gico de Aeron\u00e1utica, Sao Jose dos Campos, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,12,11]]},"reference":[{"key":"e_1_3_4_2_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-61380-8_33"},{"key":"e_1_3_4_3_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-50478-0_20"},{"key":"e_1_3_4_4_2","doi-asserted-by":"crossref","unstructured":"Angelos Chatzimparmpas Fernando V. Paulovich and Andreas Kerren. 2022. HardVis: Visual analytics to handle instance hardness using undersampling and oversampling techniques. In Computer graphics forum (Print) Vol. 42. John Wiley & Sons 135\u2013154.","DOI":"10.1111\/cgf.14726"},{"key":"e_1_3_4_5_2","doi-asserted-by":"crossref","unstructured":"Guang Chen DI Wu Wei Guo Yong Cao Da Huang Hongwu Wang Tao Wang Xiaoyun Zhang Huilong Chen Haijing Yu et\u00a0al. 2020. Clinical and immunological features of severe and moderate coronavirus disease 2019. The Journal of Clinical Investigation 130 5 (2020) 2620\u20132629.","DOI":"10.1172\/JCI137244"},{"key":"e_1_3_4_6_2","doi-asserted-by":"crossref","unstructured":"Alexander Decruyenaere Philippe Decruyenaere Patrick Peeters Frank Vermassen Tom Dhaene and Ivo Couckuyt. 2015. Prediction of delayed graft function after kidney transplantation: Comparison between logistic regression and machine learning methods. BMC Medical Informatics and Decision Making 15 (2015) 1\u201310.","DOI":"10.1186\/s12911-015-0206-y"},{"key":"e_1_3_4_7_2","doi-asserted-by":"crossref","unstructured":"Menglu Gao Qianying Wang Jianhao Wei Zhaoqin Zhu and Haicong Li. 2020. Severe Coronavirus disease 2019 pneumonia patients showed signs of aggravated renal impairment. Journal of Clinical Laboratory Analysis 34 10 (2020) e23535.","DOI":"10.1002\/jcla.23535"},{"key":"e_1_3_4_8_2","doi-asserted-by":"crossref","unstructured":"Andreas Holzinger. 2016. Interactive machine learning for health informatics: When do we need the human-in-the-loop?Brain Informatics 3 2 (2016) 119\u2013131.","DOI":"10.1007\/s40708-016-0042-6"},{"key":"e_1_3_4_9_2","doi-asserted-by":"crossref","unstructured":"Andrew Houston Georgina Cosma Phillipa Turner and Alexander Bennett. 2021. Predicting surgical outcomes for chronic exertional compartment syndrome using a machine learning framework with embedded trust by interrogation strategies. Scientific Reports 11 1 (2021) 1\u201315.","DOI":"10.1038\/s41598-021-03825-4"},{"key":"e_1_3_4_10_2","doi-asserted-by":"crossref","unstructured":"Grey Leonard Charles South Courtney Balentine Matthew Porembka John Mansour Sam Wang Adam Yopp Patricio Polanco Herbert Zeh and Mathew Augustine. 2022. Machine learning improves prediction over logistic regression on resected colon cancer patients. Journal of Surgical Research 275 (2022) 181\u2013193.","DOI":"10.1016\/j.jss.2022.01.012"},{"key":"e_1_3_4_11_2","unstructured":"Jing Li Yinghua Zhang Fang Wang Bing Liu Hui Li Guodong Tang Zhigang Chang Aihua Liu Chunyi Fu Jing Gao et\u00a0al. 2020. Sex differences in clinical findings among patients with coronavirus disease 2019 (COVID-19) and severe condition. MedRxiv (2020) 2020\u201302."},{"key":"e_1_3_4_12_2","unstructured":"Camila Castro Moreno Pedro Yuri Arbs Paiva Gustavo H. Nunes and Ana Carolina Lorena. 2021. Contrasting the profiles of easy and hard observations in a dataset. In Proc. NeurIPS DCAI Workshop."},{"key":"e_1_3_4_13_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11481-020-09974-z"},{"key":"e_1_3_4_14_2","doi-asserted-by":"publisher","DOI":"10.1002\/jmv.26300"},{"key":"e_1_3_4_15_2","first-page":"1","article-title":"Relating instance hardness to classification performance in a dataset: A visual approach","author":"Paiva Pedro Yuri Arbs","year":"2022","unstructured":"Pedro Yuri Arbs Paiva, Camila Castro Moreno, Kate Smith-Miles, Maria Gabriela Valeriano, and Ana Carolina Lorena. 2022. Relating instance hardness to classification performance in a dataset: A visual approach. Machine Learning (2022), 1\u201339.","journal-title":"Machine Learning"},{"key":"e_1_3_4_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/MCG.2022.3197957"},{"key":"e_1_3_4_17_2","unstructured":"Nabeel Seedat Jonathan Crabb\u00e9 Ioana Bica and Mihaela van der Schaar. 2022. Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data. Advances in Neural Information Processing Systems 35 (2022) 23660\u201323674."},{"key":"e_1_3_4_18_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0242400"},{"key":"e_1_3_4_19_2","doi-asserted-by":"publisher","DOI":"10.3126\/jngmc.v20i1.48347"},{"key":"e_1_3_4_20_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-013-5422-z"},{"key":"e_1_3_4_21_2","doi-asserted-by":"publisher","DOI":"10.2196\/16503"},{"issue":"1","key":"e_1_3_4_22_2","first-page":"1","article-title":"Lymphopenia predicts disease severity of COVID-19: A descriptive and predictive study","volume":"5","author":"Tan Li","year":"2020","unstructured":"Li Tan, Qi Wang, Duanyang Zhang, Jinya Ding, Qianchuan Huang, Yi-Quan Tang, Qiongshu Wang, and Hongming Miao. 2020. Lymphopenia predicts disease severity of COVID-19: A descriptive and predictive study. Signal Transduction and Targeted Therapy 5, 1 (2020), 1\u20133.","journal-title":"Signal Transduction and Targeted Therapy"},{"key":"e_1_3_4_23_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41577-020-0311-8"},{"key":"e_1_3_4_24_2","first-page":"359","volume-title":"Machine Learning for Healthcare Conference","author":"Tonekaboni Sana","year":"2019","unstructured":"Sana Tonekaboni, Shalmali Joshi, Melissa D. McCradden, and Anna Goldenberg. 2019. What clinicians want: Contextualizing explainable machine learning for clinical end use. In Machine Learning for Healthcare Conference. PMLR, 359\u2013380."},{"key":"e_1_3_4_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/CCGrid54584.2022.00115"},{"key":"e_1_3_4_26_2","unstructured":"Maria Gabriela Valeriano Carlos Roberto Veiga Kiffer and Ana Carolina Lorena. 2022. Supporting decision making in health scenarios with machine learning models. In Anais do Simposio Brasileiro de Pesquisa Operacional (Juiz de Fora). Anais eletronicos. Campinas Galoa."}],"container-title":["Journal of Data and Information Quality"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3687267","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3687267","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:58:02Z","timestamp":1750294682000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3687267"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,11]]},"references-count":25,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,12,31]]}},"alternative-id":["10.1145\/3687267"],"URL":"https:\/\/doi.org\/10.1145\/3687267","relation":{},"ISSN":["1936-1955","1936-1963"],"issn-type":[{"value":"1936-1955","type":"print"},{"value":"1936-1963","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,11]]},"assertion":[{"value":"2023-05-30","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-05-19","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-12-11","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}