{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T13:49:22Z","timestamp":1776347362536,"version":"3.51.2"},"reference-count":39,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2023,4,10]],"date-time":"2023-04-10T00:00:00Z","timestamp":1681084800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/pages\/standard-publication-reuse-rights"}],"funder":[{"DOI":"10.13039\/100000005","name":"Department of Defense","doi-asserted-by":"publisher","award":["W81XWH-19-1-0294"],"award-info":[{"award-number":["W81XWH-19-1-0294"]}],"id":[{"id":"10.13039\/100000005","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000050","name":"National Heart, Lung, and Blood Institute","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000050","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","award":["R01HL146398"],"award-info":[{"award-number":["R01HL146398"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000060","name":"National Institutes of Allergy and Infectious Diseases","doi-asserted-by":"crossref","award":["NIH R01AI148747-01"],"award-info":[{"award-number":["NIH R01AI148747-01"]}],"id":[{"id":"10.13039\/100000060","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100000936","name":"Gordon and Betty Moore Foundation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000936","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000968","name":"American Heart Association","doi-asserted-by":"publisher","award":["17IGMV33870001"],"award-info":[{"award-number":["17IGMV33870001"]}],"id":[{"id":"10.13039\/100000968","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,5,19]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Objective<\/jats:title>\n                    <jats:p>Deep learning (DL) has been applied in proofs of concept across biomedical imaging, including across modalities and medical specialties. Labeled data are critical to training and testing DL models, but human expert labelers are limited. In addition, DL traditionally requires copious training data, which is computationally expensive to process and iterate over. Consequently, it is useful to prioritize using those images that are most likely to improve a model\u2019s performance, a practice known as instance selection. The challenge is determining how best to prioritize. It is natural to prefer straightforward, robust, quantitative metrics as the basis for prioritization for instance selection. However, in current practice, such metrics are not tailored to, and almost never used for, image datasets.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Materials and Methods<\/jats:title>\n                    <jats:p>To address this problem, we introduce ENRICH\u2014Eliminate Noise and Redundancy for Imaging Challenges\u2014a customizable method that prioritizes images based on how much diversity each image adds to the training set.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>First, we show that medical datasets are special in that in general each image adds less diversity than in nonmedical datasets. Next, we demonstrate that ENRICH achieves nearly maximal performance on classification and segmentation tasks on several medical image datasets using only a fraction of the available images and without up-front data labeling. ENRICH outperforms random image selection, the negative control. Finally, we show that ENRICH can also be used to identify errors and outliers in imaging datasets.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusions<\/jats:title>\n                    <jats:p>ENRICH is a simple, computationally efficient method for prioritizing images for expert labeling and use in DL.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/jamia\/ocad055","type":"journal-article","created":{"date-parts":[[2023,3,15]],"date-time":"2023-03-15T19:49:25Z","timestamp":1678909765000},"page":"1079-1090","source":"Crossref","is-referenced-by-count":13,"title":["ENRICHing medical imaging training sets enables more efficient machine learning"],"prefix":"10.1093","volume":"30","author":[{"given":"Erin","family":"Chinn","sequence":"first","affiliation":[{"name":"Department of Medicine, Division of Cardiology, Department of Radiology, Bakar Computational Health Sciences Institute, University of California, San Francisco , San Francisco, California, USA"}]},{"given":"Rohit","family":"Arora","sequence":"additional","affiliation":[{"name":"Division of Clinical Pathology, Department of Pathology, Beth Israel Deaconess Medical Center , Boston, Massachusetts, USA"}]},{"given":"Ramy","family":"Arnaout","sequence":"additional","affiliation":[{"name":"Division of Clinical Pathology, Department of Pathology, Beth Israel Deaconess Medical Center , Boston, Massachusetts, USA"},{"name":"Division of Clinical Informatics, Department of Medicine, Beth Israel Deaconess Medical Center , Boston, Massachusetts, USA"}]},{"given":"Rima","family":"Arnaout","sequence":"additional","affiliation":[{"name":"Department of Medicine, Division of Cardiology, Department of Radiology, Bakar Computational Health Sciences Institute, University of California, San Francisco , San Francisco, California, USA"}]}],"member":"286","published-online":{"date-parts":[[2023,4,10]]},"reference":[{"key":"2024041000053450700_ocad055-B1","doi-asserted-by":"crossref","DOI":"10.1038\/s41746-017-0013-1","article-title":"Fast and accurate view classification of echocardiograms using deep learning","volume":"1","author":"Madani","year":"2018","journal-title":"NPJ Digit Med"},{"issue":"8","key":"2024041000053450700_ocad055-B2","doi-asserted-by":"crossref","first-page":"1915","DOI":"10.1002\/jum.15868","article-title":"Development and validation of a deep learning strategy for automated view classification of pediatric focused assessment with sonography for trauma","volume":"41","author":"Kornblith","year":"2022","journal-title":"J Ultrasound Med"},{"issue":"5","key":"2024041000053450700_ocad055-B3","doi-asserted-by":"crossref","first-page":"882","DOI":"10.1038\/s41591-021-01342-5","article-title":"An ensemble of neural networks provides expert-level prenatal detection of complex congenital heart disease","volume":"27","author":"Arnaout","year":"2021","journal-title":"Nat Med"},{"key":"2024041000053450700_ocad055-B4","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-33128-3","volume-title":". Deep Learning in Medical Image Analysis: Challenges and Applications","author":"Lee","year":"2020"},{"issue":"7639","key":"2024041000053450700_ocad055-B5","doi-asserted-by":"publisher","first-page":"115","DOI":"10.1038\/nature21056","article-title":"Dermatologist-level classification of skin cancer with deep neural networks","volume":"542","author":"Esteva","year":"2017","journal-title":"Nature"},{"issue":"22","key":"2024041000053450700_ocad055-B6","doi-asserted-by":"crossref","first-page":"2402","DOI":"10.1001\/jama.2016.17216","article-title":"Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs","volume":"316","author":"Gulshan","year":"2016","journal-title":"JAMA"},{"key":"2024041000053450700_ocad055-B7","first-page":"403","article-title":"Fetal pose estimation in volumetric MRI using a 3D convolution neural network","volume":"11767","author":"Xu","year":"2019","journal-title":"Med Image Comput Comput Assist Interv"},{"issue":"11","key":"2024041000053450700_ocad055-B8","doi-asserted-by":"crossref","first-page":"5648","DOI":"10.1002\/mp.14467","article-title":"Automatic contouring system for cervical cancer using convolutional neural networks","volume":"47","author":"Rhee","year":"2020","journal-title":"Med Phys"},{"issue":"23","key":"2024041000053450700_ocad055-B9","doi-asserted-by":"crossref","first-page":"235003","DOI":"10.1088\/1361-6560\/ab4e3e","article-title":"A dual-stream deep convolutional network for reducing metal streak artifacts in CT images","volume":"64","author":"Gjesteby","year":"2019","journal-title":"Phys Med Biol"},{"issue":"3","key":"2024041000053450700_ocad055-B10","doi-asserted-by":"crossref","first-page":"392","DOI":"10.1007\/s00247-020-04854-3","article-title":"DeepLiverNet: a deep transfer learning model for classifying liver stiffness using clinical and T2-weighted magnetic resonance imaging data in children and young adults","volume":"51","author":"Li","year":"2021","journal-title":"Pediatr Radiol"},{"issue":"1","key":"2024041000053450700_ocad055-B11","doi-asserted-by":"crossref","first-page":"100464","DOI":"10.1016\/j.adro.2020.04.023","article-title":"Automated contouring of contrast and noncontrast computed tomography liver images with fully convolutional networks","volume":"6","author":"Anderson","year":"2021","journal-title":"Adv Radiat Oncol"},{"key":"2024041000053450700_ocad055-B12","doi-asserted-by":"crossref","first-page":"101908","DOI":"10.1016\/j.media.2020.101908","article-title":"An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization","volume":"68","author":"Shen","year":"2021","journal-title":"Med Image Anal"},{"key":"2024041000053450700_ocad055-B13","first-page":"79","article-title":"Shortcomings of ventricle segmentation using deep convolutional networks","volume":"11038","author":"Shao","year":"2018","journal-title":"Underst Interpret Mach Learn Med Image Comput Appl (2018)"},{"issue":"5","key":"2024041000053450700_ocad055-B14","doi-asserted-by":"crossref","first-page":"e200007","DOI":"10.1148\/ryai.2020200007","article-title":"Accelerating prostate diffusion-weighted MRI using a guided denoising convolutional neural network: retrospective feasibility study","volume":"2","author":"Kaye","year":"2020","journal-title":"Radiol Artif Intell"},{"key":"2024041000053450700_ocad055-B15","first-page":"105750D","article-title":"Deep learning and texture-based semantic label fusion for brain tumor segmentation","volume":"2018","author":"Vidyaratne","year":"2018","journal-title":"Proc SPIE Int Soc Opt Eng"},{"issue":"16","key":"2024041000053450700_ocad055-B16","doi-asserted-by":"crossref","first-page":"1623","DOI":"10.1161\/CIRCULATIONAHA.118.034338","article-title":"Fully automated echocardiogram interpretation in clinical practice","volume":"138","author":"Zhang","year":"2018","journal-title":"Circulation"},{"key":"2024041000053450700_ocad055-B17","doi-asserted-by":"crossref","first-page":"e4239","DOI":"10.1002\/nbm.4239","article-title":"Rapid dealiasing of undersampled, non-Cartesian cardiac perfusion images using U-net","volume":"33","author":"Fan","year":"2020","journal-title":"NMR Biomed"},{"issue":"1","key":"2024041000053450700_ocad055-B18","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1148\/radiol.2015150921","article-title":"The U.S. Radiologist Workforce: an analysis of temporal and geographic variation by using large national datasets","volume":"279","author":"Rosenkrantz","year":"2016","journal-title":"Radiology"},{"key":"2024041000053450700_ocad055-B19","author":"WHO"},{"key":"2024041000053450700_ocad055-B20","author":"WHO","year":"2021"},{"key":"2024041000053450700_ocad055-B21"},{"key":"2024041000053450700_ocad055-B22"},{"key":"2024041000053450700_ocad055-B23","author":"Culbertson"},{"key":"2024041000053450700_ocad055-B24","author":"Jercich","year":"2021"},{"key":"2024041000053450700_ocad055-B25","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1007\/s10462-010-9165-y","article-title":"A review of instance selection methods","volume":"34","author":"Olvera-L\u00f3pez","year":"2010","journal-title":"Artif Intell Rev"},{"key":"2024041000053450700_ocad055-B26","doi-asserted-by":"publisher","first-page":"2372","author":"Joshi","DOI":"10.1109\/CVPR.2009.5206627"},{"key":"2024041000053450700_ocad055-B27","author":"Hoyer","year":"2021"},{"key":"2024041000053450700_ocad055-B28","author":"Mehta","year":"2022"},{"issue":"5","key":"2024041000053450700_ocad055-B29","doi-asserted-by":"crossref","first-page":"1122","DOI":"10.1016\/j.cell.2018.02.010","article-title":"Identifying medical diagnoses and treatable diseases by image-based deep learning","volume":"172","author":"Kermany","year":"2018","journal-title":"Cell"},{"key":"2024041000053450700_ocad055-B30","first-page":"215","author":"Coates","year":"2011"},{"key":"2024041000053450700_ocad055-B31","author":"Burgess","year":"2018"},{"key":"2024041000053450700_ocad055-B32","author":"Leinster"},{"key":"2024041000053450700_ocad055-B33","first-page":"55","article-title":"What do we mean by diversity? The path towards quantification","volume":"9","author":"Jost","year":"2019","journal-title":"M\u00e8t Sci Stud J Annu Rev"},{"key":"2024041000053450700_ocad055-B34","doi-asserted-by":"crossref","first-page":"11881","DOI":"10.1038\/ncomms11881","article-title":"Robust estimates of overall immune-repertoire diversity from high-throughput measurements on samples","volume":"7","author":"Kaplinsky","year":"2016","journal-title":"Nat Commun"},{"issue":"34","key":"2024041000053450700_ocad055-B35","doi-asserted-by":"crossref","first-page":"e2203505119","DOI":"10.1073\/pnas.2203505119","article-title":"Repertoire-scale measures of antigen binding","volume":"119","author":"Arora","year":"2022","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2024041000053450700_ocad055-B36","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1007\/BF00993277","article-title":"Improving generalization with active learning","volume":"15","author":"Cohn","year":"1994","journal-title":"Mach Learn"},{"key":"2024041000053450700_ocad055-B37","doi-asserted-by":"crossref","first-page":"2591","DOI":"10.1109\/TCSVT.2016.2589879","article-title":"Cost-effective active learning for deep image classification","volume":"27","author":"Wang","year":"2017","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"key":"2024041000053450700_ocad055-B38","author":"Fang","year":"2017"},{"key":"2024041000053450700_ocad055-B39","doi-asserted-by":"publisher","author":"Arora","year":"2020","DOI":"10.1101\/2020.06.18.159699"}],"container-title":["Journal of the American Medical Informatics Association"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/30\/6\/1079\/50374618\/ocad055.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/jamia\/article-pdf\/30\/6\/1079\/50374618\/ocad055.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,4,9]],"date-time":"2024-04-09T20:05:55Z","timestamp":1712693155000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/30\/6\/1079\/7111836"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,4,10]]},"references-count":39,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2023,4,10]]},"published-print":{"date-parts":[[2023,5,19]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocad055","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.05.22.21257645","asserted-by":"object"}]},"ISSN":["1067-5027","1527-974X"],"issn-type":[{"value":"1067-5027","type":"print"},{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,6,1]]},"published":{"date-parts":[[2023,4,10]]}}}