{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,6]],"date-time":"2026-04-06T16:50:14Z","timestamp":1775494214221,"version":"3.50.1"},"reference-count":61,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,11,18]],"date-time":"2022-11-18T00:00:00Z","timestamp":1668729600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,11,18]],"date-time":"2022-11-18T00:00:00Z","timestamp":1668729600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100000054","name":"U.S. Department of Health & Human Services | NIH | National Cancer Institute","doi-asserted-by":"publisher","award":["T32CA09168"],"award-info":[{"award-number":["T32CA09168"]}],"id":[{"id":"10.13039\/100000054","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000009","name":"Foundation for the National Institutes of Health","doi-asserted-by":"publisher","award":["R01 HD107493"],"award-info":[{"award-number":["R01 HD107493"]}],"id":[{"id":"10.13039\/100000009","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100004328","name":"Genentech","doi-asserted-by":"publisher","award":["R21 EY031883"],"award-info":[{"award-number":["R21 EY031883"]}],"id":[{"id":"10.13039\/100004328","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["npj Digit. Med."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The integration of artificial intelligence into clinical workflows requires reliable and robust models. Repeatability is a key attribute of model robustness. Ideal repeatable models output predictions without variation during independent tests carried out under similar conditions. 
However, slight variations, though not ideal, may be unavoidable and acceptable in practice. During model development and evaluation, much attention is given to classification performance while model repeatability is rarely assessed, leading to the development of models that are unusable in clinical practice. In this work, we evaluate the repeatability of four model types (binary classification, multi-class classification, ordinal classification, and regression) on images that were acquired from the same patient during the same visit. We study each model\u2019s performance on four medical image classification tasks from public and private datasets: knee osteoarthritis, cervical cancer screening, breast density estimation, and retinopathy of prematurity. Repeatability is measured and compared on ResNet and DenseNet architectures. Moreover, we assess the impact of sampling Monte Carlo dropout predictions at test time on classification performance and repeatability. Leveraging Monte Carlo predictions significantly increases repeatability, in particular at the class boundaries, for all tasks on the binary, multi-class, and ordinal models leading to an average reduction of the 95% limits of agreement by 16% points and of the class disagreement rate by 7% points. The classification accuracy improves in most settings along with the repeatability. Our results suggest that beyond about 20 Monte Carlo iterations, there is no further gain in repeatability. 
In addition to the higher test-retest agreement, Monte Carlo predictions are better calibrated which leads to output probabilities reflecting more accurately the true likelihood of being correctly classified.<\/jats:p>","DOI":"10.1038\/s41746-022-00709-3","type":"journal-article","created":{"date-parts":[[2022,11,18]],"date-time":"2022-11-18T08:29:44Z","timestamp":1668760184000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":73,"title":["Improving the repeatability of deep learning models with Monte Carlo dropout"],"prefix":"10.1038","volume":"5","author":[{"given":"Andreanne","family":"Lemay","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1881-7065","authenticated-orcid":false,"given":"Katharina","family":"Hoebel","sequence":"additional","affiliation":[]},{"given":"Christopher P.","family":"Bridge","sequence":"additional","affiliation":[]},{"given":"Brian","family":"Befano","sequence":"additional","affiliation":[]},{"given":"Silvia","family":"De Sanjos\u00e9","sequence":"additional","affiliation":[]},{"given":"Didem","family":"Egemen","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9866-2106","authenticated-orcid":false,"given":"Ana Cecilia","family":"Rodriguez","sequence":"additional","affiliation":[]},{"given":"Mark","family":"Schiffman","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7964-9475","authenticated-orcid":false,"given":"John Peter","family":"Campbell","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8906-9618","authenticated-orcid":false,"given":"Jayashree","family":"Kalpathy-Cramer","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,11,18]]},"reference":[{"key":"709_CR1","doi-asserted-by":"publisher","first-page":"211860","DOI":"10.1109\/ACCESS.2020.3039833","volume":"8","author":"SS 
Alahmari","year":"2020","unstructured":"Alahmari, S. S., Goldgof, D. B., Mouton, P. R. & Hall, L. O. Challenges for the repeatability of deep learning models. IEEE Access 8, 211860\u2013211868 (2020).","journal-title":"IEEE Access"},{"key":"709_CR2","doi-asserted-by":"publisher","first-page":"2346","DOI":"10.1007\/s00330-019-06589-8","volume":"30","author":"H Kim","year":"2020","unstructured":"Kim, H., Park, C. M. & Goo, J. M. Test-retest reproducibility of a deep learning\u2013based automatic detection algorithm for the chest radiograph. Eur Radiol. 30, 2346\u20132355 (2020).","journal-title":"Eur Radiol."},{"key":"709_CR3","unstructured":"Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. R. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580 (2012)."},{"key":"709_CR4","first-page":"1929","volume":"15","author":"N Srivastava","year":"2014","unstructured":"Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929\u20131958 (2014).","journal-title":"J. Mach. Learn. Res."},{"key":"709_CR5","unstructured":"Gal, Y. & Ghahramani, Z. Dropout as a bayesian approximation: Representing model uncertainty in deep learning. In international conference on machine learning, 1050\u20131059 (PMLR, 2016)."},{"key":"709_CR6","doi-asserted-by":"crossref","unstructured":"Camarasa, R. et al. Quantitative comparison of monte-carlo dropout uncertainty measures for multi-class segmentation. In Uncertainty for Safe Utilization of Machine Learning in Medical Imaging, and Graphs in Biomedical Image Analysis, 32\u201341 (Springer, 2020).","DOI":"10.1007\/978-3-030-60365-6_4"},{"key":"709_CR7","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41598-017-17876-z","volume":"7","author":"C Leibig","year":"2017","unstructured":"Leibig, C., Allken, V., Ayhan, M. S., Berens, P. 
& Wahl, S. Leveraging uncertainty information from deep neural networks for disease detection. Sci. Rep. 7, 1\u201314 (2017).","journal-title":"Sci. Rep."},{"key":"709_CR8","doi-asserted-by":"crossref","unstructured":"Combalia, M., Hueto, F., Puig, S., Malvehy, J. & Vilaplana, V. Uncertainty estimation in deep neural networks for dermoscopic image classification. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops, 744\u2013745 (2020).","DOI":"10.1109\/CVPRW50498.2020.00380"},{"key":"709_CR9","doi-asserted-by":"crossref","unstructured":"Singh, R. K., Gorantla, R., Allada, S. G. R., & Narra, P. SkiNet: A deep learning framework for skin lesion diagnosis with uncertainty estimation and explainability. Plos one, 17, e0276836 (2022).","DOI":"10.1371\/journal.pone.0276836"},{"key":"709_CR10","doi-asserted-by":"publisher","first-page":"379","DOI":"10.1007\/s00330-020-07065-4","volume":"31","author":"A Hiremath","year":"2021","unstructured":"Hiremath, A. et al. Test-retest repeatability of a deep learning architecture in detecting and segmenting clinically significant prostate cancer on apparent diffusion coefficient (adc) maps. Eur. Radiol. 31, 379\u2013391 (2021).","journal-title":"Eur. Radiol."},{"key":"709_CR11","doi-asserted-by":"publisher","first-page":"1471","DOI":"10.1002\/mrm.28022","volume":"83","author":"S Estrada","year":"2020","unstructured":"Estrada, S. et al. Fatsegnet: A fully automated deep learning pipeline for adipose tissue segmentation on abdominal dixon mri. Magn. Reson. Med. 83, 1471\u20131483 (2020).","journal-title":"Magn. Reson. Med."},{"key":"709_CR12","doi-asserted-by":"publisher","first-page":"115","DOI":"10.1016\/j.neuroimage.2017.07.059","volume":"163","author":"JH Cole","year":"2017","unstructured":"Cole, J. H. et al. Predicting brain age with deep learning from raw imaging data results in a reliable and heritable biomarker. 
NeuroImage 163, 115\u2013124 (2017).","journal-title":"NeuroImage"},{"key":"709_CR13","first-page":"e190199","volume":"3","author":"KV Hoebel","year":"2020","unstructured":"Hoebel, K. V. et al. Radiomics repeatability pitfalls in a scan-rescan mri study of glioblastoma. Radiol.: Artif. Intell. 3, e190199 (2020).","journal-title":"Radiol.: Artif. Intell."},{"key":"709_CR14","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41598-019-45766-z","volume":"9","author":"M Schwier","year":"2019","unstructured":"Schwier, M. et al. Repeatability of multiparametric prostate mri radiomics features. Sci. Rep. 9, 1\u201316 (2019).","journal-title":"Sci. Rep."},{"key":"709_CR15","doi-asserted-by":"publisher","first-page":"788","DOI":"10.1007\/s11307-016-0940-2","volume":"18","author":"FH van Velden","year":"2016","unstructured":"van Velden, F. H. et al. Repeatability of radiomic features in non-small-cell lung cancer [18 f] fdg-pet\/ct studies: impact of reconstruction and delineation. Mol. Imag. Biol. 18, 788\u2013795 (2016).","journal-title":"Mol. Imag. Biol."},{"key":"709_CR16","doi-asserted-by":"crossref","unstructured":"Mojtahed, A. et al. Repeatability and reproducibility of deep-learning-based liver volume and couinaud segment volume measurement tool. Abdominal Radiol. 1\u20139 (2021).","DOI":"10.1007\/s00261-021-03262-x"},{"key":"709_CR17","doi-asserted-by":"publisher","first-page":"2345","DOI":"10.1016\/j.ophtha.2016.07.020","volume":"123","author":"J Kalpathy-Cramer","year":"2016","unstructured":"Kalpathy-Cramer, J. et al. Plus Disease in Retinopathy of Prematurity: Improving Diagnosis by Ranking Disease Severity and Using Quantitative Image Analysis. Ophthalmology 123, 2345\u20132351 (2016).","journal-title":"Ophthalmology"},{"key":"709_CR18","unstructured":"Guo, C., Pleiss, G., Sun, Y. & Weinberger, K. Q.On calibration of modern neural networks. 
In International Conference on Machine Learning, 1321\u20131330 (PMLR, 2017)."},{"key":"709_CR19","unstructured":"Kuleshov, V., Fenner, N. & Ermon, S. Accurate uncertainties for deep learning using calibrated regression. In International Conference on Machine Learning, 2796\u20132804 (PMLR, 2018)."},{"key":"709_CR20","unstructured":"Laves, M.-H., Ihler, S., Fast, J. F., Kahrs, L. A. & Ortmaier, T. Well-calibrated regression uncertainty in medical imaging with deep learning. In Medical Imaging with Deep Learning, 393\u2013412 (PMLR, 2020)."},{"key":"709_CR21","doi-asserted-by":"publisher","first-page":"1836","DOI":"10.1093\/annonc\/mdy166","volume":"29","author":"HA Haenssle","year":"2018","unstructured":"Haenssle, H. A. et al. Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Ann. Oncol. 29, 1836\u20131842 (2018).","journal-title":"Ann. Oncol."},{"key":"709_CR22","doi-asserted-by":"publisher","first-page":"e1002686","DOI":"10.1371\/journal.pmed.1002686","volume":"15","author":"P Rajpurkar","year":"2018","unstructured":"Rajpurkar, P. et al. Deep learning for chest radiograph diagnosis: A retrospective comparison of the chexnext algorithm to practicing radiologists. PLoS Med. 15, e1002686 (2018).","journal-title":"PLoS Med."},{"key":"709_CR23","doi-asserted-by":"publisher","unstructured":"Bakas, S., Reyes, M., Jakab, A., Bauer, S., Rempfler, M., Crimi, A., Shinohara, R. T., et al. Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge. https:\/\/doi.org\/10.17863\/CAM.38755 (2018).","DOI":"10.17863\/CAM.38755"},{"key":"709_CR24","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41746-020-0255-1","volume":"3","author":"MD Li","year":"2020","unstructured":"Li, M. D. et al. 
Siamese neural networks for continuous disease severity evaluation and change detection in medical imaging. NPJ Digital Med. 3, 1\u20139 (2020).","journal-title":"NPJ Digital Med."},{"key":"709_CR25","doi-asserted-by":"publisher","first-page":"556","DOI":"10.1016\/j.acra.2010.12.015","volume":"18","author":"JJ Heine","year":"2011","unstructured":"Heine, J. J., Cao, K., Rollison, D. E., Tiffenberg, G. & Thomas, J. A. A quantitative description of the percentage of breast density measurement using full-field digital mammography. Acad. Radiol. 18, 556\u2013564 (2011).","journal-title":"Acad. Radiol."},{"key":"709_CR26","doi-asserted-by":"publisher","first-page":"2338","DOI":"10.1016\/j.ophtha.2016.07.026","volume":"123","author":"JP Campbell","year":"2016","unstructured":"Campbell, J. P. et al. Plus Disease in Retinopathy of Prematurity: A Continuous Spectrum of Vascular Abnormality as a Basis of Diagnostic Variability. Ophthalmology 123, 2338\u20132344 (2016).","journal-title":"Ophthalmology"},{"key":"709_CR27","first-page":"e190065","volume":"2","author":"KA Thomas","year":"2020","unstructured":"Thomas, K. A. et al. Automated classification of radiographic knee osteoarthritis severity using deep neural networks. Radiol.: Artif. Intell. 2, e190065 (2020).","journal-title":"Radiol.: Artif. Intell."},{"key":"709_CR28","doi-asserted-by":"publisher","first-page":"52","DOI":"10.1148\/radiol.2018180694","volume":"290","author":"CD Lehman","year":"2019","unstructured":"Lehman, C. D. et al. Mammographic breast density assessment using deep learning: clinical implementation. Radiology 290, 52\u201358 (2019).","journal-title":"Radiology"},{"key":"709_CR29","doi-asserted-by":"publisher","first-page":"803","DOI":"10.1001\/jamaophthalmol.2018.1934","volume":"136","author":"JM Brown","year":"2018","unstructured":"Brown, J. M. et al. Automated diagnosis of plus disease in retinopathy of prematurity using deep convolutional neural networks. JAMA Ophthalmol. 
136, 803\u2013810 (2018).","journal-title":"JAMA Ophthalmol."},{"key":"709_CR30","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41598-018-20132-7","volume":"8","author":"A Tiulpin","year":"2018","unstructured":"Tiulpin, A., Thevenot, J., Rahtu, E., Lehenkari, P. & Saarakkala, S. Automatic knee osteoarthritis diagnosis from plain radiographs: a deep learning-based approach. Sci. Rep. 8, 1\u201310 (2018).","journal-title":"Sci. Rep."},{"key":"709_CR31","doi-asserted-by":"publisher","first-page":"1323","DOI":"10.1136\/annrheumdis-2013-204763","volume":"73","author":"M Cross","year":"2014","unstructured":"Cross, M. et al. The global burden of hip and knee osteoarthritis: estimates from the global burden of disease 2010 study. Ann. Rheumatic Dis. 73, 1323\u20131330 (2014).","journal-title":"Ann. Rheumatic Dis."},{"key":"709_CR32","doi-asserted-by":"publisher","first-page":"494","DOI":"10.1136\/ard.16.4.494","volume":"16","author":"JH Kellgren","year":"1957","unstructured":"Kellgren, J. H. & Lawrence, J. Radiological assessment of osteo-arthrosis. Ann. Rheumatic Dis. 16, 494 (1957).","journal-title":"Ann. Rheumatic Dis."},{"key":"709_CR33","doi-asserted-by":"publisher","first-page":"e191","DOI":"10.1016\/S2214-109X(19)30482-6","volume":"8","author":"M Arbyn","year":"2020","unstructured":"Arbyn, M. et al. Estimates of incidence and mortality of cervical cancer in 2018: a worldwide analysis. Lancet Global Health 8, e191\u2013e203 (2020).","journal-title":"Lancet Global Health"},{"key":"709_CR34","doi-asserted-by":"publisher","first-page":"1340","DOI":"10.1056\/NEJMoa1917338","volume":"383","author":"J Lei","year":"2020","unstructured":"Lei, J. et al. HPV Vaccination and the Risk of Invasive Cervical Cancer. N. Engl. J. Med. 383, 1340\u20131348 (2020).","journal-title":"N. Engl. J. 
Med."},{"key":"709_CR35","doi-asserted-by":"publisher","first-page":"281","DOI":"10.5306\/wjco.v6.i6.281","volume":"6","author":"R Catarino","year":"2015","unstructured":"Catarino, R., Petignat, P., Dongui, G. & Vassilakos, P. Cervical cancer screening in developing countries at a crossroad: Emerging technologies and policy choices. World J. Clin. Oncol. 6, 281\u2013290 (2015).","journal-title":"World J. Clin. Oncol."},{"key":"709_CR36","doi-asserted-by":"publisher","first-page":"2416","DOI":"10.1002\/ijc.33029","volume":"147","author":"Z Xue","year":"2020","unstructured":"Xue, Z. et al. A demonstration of automated visual evaluation of cervical images taken with a smartphone camera. Int. J. Cancer 147, 2416\u20132423 (2020).","journal-title":"Int. J. Cancer"},{"key":"709_CR37","doi-asserted-by":"publisher","first-page":"923","DOI":"10.1093\/jnci\/djy225","volume":"111","author":"L Hu","year":"2019","unstructured":"Hu, L. et al. An Observational Study of Deep Learning and Automated Evaluation of Cervical Images for Cancer Screening. J. Natl. Cancer Instit. 111, 923\u2013932 (2019).","journal-title":"J. Natl. Cancer Instit."},{"key":"709_CR38","doi-asserted-by":"publisher","first-page":"75","DOI":"10.1590\/S1020-49892004000200002","volume":"15","author":"MC Bratti","year":"2004","unstructured":"Bratti, M. C. et al. Description of a seven-year prospective study of human papillomavirus infection and cervical neoplasia among 10 000 women in guanacaste, costa rica. Revista Panamericana de Salud P\u00fablica 15, 75\u201389 (2004).","journal-title":"Revista Panamericana de Salud P\u00fablica"},{"key":"709_CR39","doi-asserted-by":"publisher","first-page":"946","DOI":"10.5858\/2003-127-946-FTDFTA","volume":"127","author":"M Schiffman","year":"2003","unstructured":"Schiffman, M. & Solomon, D. Findings to date from the ascus-lsil triage study (alts). Arch. Pathol. Lab. Med. 127, 946\u2013949 (2003).","journal-title":"Arch. Pathol. Lab. 
Med."},{"key":"709_CR40","first-page":"7","volume":"69","author":"RL Siegel","year":"2019","unstructured":"Siegel, R. L., Miller, K. D. & Jemal, A. Cancer statistics, 2019. CA: A Cancer J. Clin. 69, 7\u201334 (2019).","journal-title":"CA: A Cancer J. Clin."},{"key":"709_CR41","doi-asserted-by":"crossref","unstructured":"Liberman, L. & Menell, J. H.Breast imaging reporting and data system (BI-RADS) https:\/\/pubmed.ncbi.nlm.nih.gov\/12117184\/. (2002).","DOI":"10.1016\/S0033-8389(01)00017-3"},{"key":"709_CR42","doi-asserted-by":"publisher","first-page":"670","DOI":"10.1093\/jnci\/87.9.670","volume":"87","author":"NF Boyd","year":"1995","unstructured":"Boyd, N. F. et al. Quantitative classification of mammographic densities and breast cancer risk: Results from the canadian national breast screening study. J. Natl. Cancer Instit. 87, 670\u2013675 (1995).","journal-title":"J. Natl. Cancer Instit."},{"key":"709_CR43","doi-asserted-by":"publisher","first-page":"2091","DOI":"10.1056\/NEJMoa1903986","volume":"381","author":"MF Bakker","year":"2019","unstructured":"Bakker, M. F. et al. Supplemental MRI Screening for Women with Extremely Dense Breast Tissue. N. Eng. J. Med. 381, 2091\u20132102 (2019).","journal-title":"N. Eng. J. Med."},{"key":"709_CR44","doi-asserted-by":"publisher","first-page":"1773","DOI":"10.1056\/NEJMoa052911","volume":"353","author":"ED Pisano","year":"2005","unstructured":"Pisano, E. D. et al. Diagnostic Performance of Digital versus Film Mammography for Breast-Cancer Screening. N. Eng. J. Med. 353, 1773\u20131783 (2005).","journal-title":"N. Eng. J. Med."},{"key":"709_CR45","unstructured":"IAPB, International Agency for the Prevention of Blindness. https:\/\/www.iapb.org:8443 (NA)."},{"key":"709_CR46","doi-asserted-by":"publisher","first-page":"991","DOI":"10.1001\/archopht.123.7.991","volume":"123","author":"GE Quinn","year":"2005","unstructured":"Quinn, G. E. 
The international classification of retinopathy of prematurity revisited: An international committee for the classification of retinopathy of prematurity. Arch. Ophthalmol. 123, 991\u2013999 (2005).","journal-title":"Arch. Ophthalmol."},{"key":"709_CR47","doi-asserted-by":"publisher","first-page":"875","DOI":"10.1001\/archopht.125.7.875","volume":"125","author":"MF Chiang","year":"2007","unstructured":"Chiang, M. F., Jiang, L., Gelman, R., Du, Y. E. & Flynn, J. T. Interexpert agreement of plus disease diagnosis in retinopathy of prematurity. Arch. Ophthalmol. 125, 875\u2013880 (2007).","journal-title":"Arch. Ophthalmol."},{"key":"709_CR48","doi-asserted-by":"publisher","first-page":"803","DOI":"10.1001\/jamaophthalmol.2018.1934","volume":"136","author":"JM Brown","year":"2018","unstructured":"Brown, J. M. et al. Automated diagnosis of plus disease in retinopathy of prematurity using deep convolutional neural networks. JAMA Ophthalmol. 136, 803\u2013810 (2018).","journal-title":"JAMA Ophthalmol."},{"key":"709_CR49","first-page":"1902","volume":"2014","author":"MC Ryan","year":"2014","unstructured":"Ryan, M. C. et al. Development and Evaluation of Reference Standards for Image-based Telemedicine Diagnosis and Clinical Research Studies in Ophthalmology. AMIA. Ann. Symp. Proc. 2014, 1902\u20131910 (2014).","journal-title":"AMIA. Ann. Symp. Proc."},{"key":"709_CR50","doi-asserted-by":"publisher","first-page":"651","DOI":"10.1001\/jamaophthalmol.2016.0611","volume":"134","author":"JP Campbell","year":"2016","unstructured":"Campbell, J. P. et al. Expert diagnosis of plus disease in retinopathy of prematurity from computer-based image analysis. JAMA Ophthalmol. 134, 651\u2013657 (2016).","journal-title":"JAMA Ophthalmol."},{"key":"709_CR51","doi-asserted-by":"crossref","unstructured":"Cao, W., Mirjalili, V., & Raschka, S. Rank consistent ordinal regression for neural networks with application to age estimation. 
Pattern Recognit Lett, 140, 325\u2013331 (2020).","DOI":"10.1016\/j.patrec.2020.11.008"},{"key":"709_CR52","doi-asserted-by":"publisher","unstructured":"Consortium, T. M. Project monai (2020). https:\/\/doi.org\/10.5281\/zenodo.4323059.","DOI":"10.5281\/zenodo.4323059"},{"key":"709_CR53","unstructured":"Paszke, A. et al. Pytorch: An imperative style, high-performance deep learning library. In Wallach, H. et al. (eds.) Advances in Neural Information Processing Systems 32, 8024-8035 (Curran Associates, Inc., 2019). http:\/\/papers.neurips.cc\/paper\/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf."},{"key":"709_CR54","unstructured":"L\u00e9vy, D. & Jain, A. Breast mass classification from mammograms using deep convolutional neural networks. In Proceedings of the 30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain (2016)."},{"key":"709_CR55","doi-asserted-by":"crossref","unstructured":"Siddiqi, R. Automated pneumonia diagnosis using a customized sequential convolutional neural network. In Proceedings of the 2019 3rd international conference on deep learning technologies, 64\u201370 (2019).","DOI":"10.1145\/3342999.3343001"},{"key":"709_CR56","doi-asserted-by":"publisher","first-page":"104005","DOI":"10.1088\/1361-6579\/aae304","volume":"39","author":"P Sodmann","year":"2018","unstructured":"Sodmann, P., Vollmer, M., Nath, N. & Kaderali, L. A convolutional neural network for ecg annotation as the basis for classification of cardiac rhythms. Physiol. Measure. 39, 104005 (2018).","journal-title":"Physiol. Measure."},{"key":"709_CR57","doi-asserted-by":"publisher","DOI":"10.1038\/s41746-020-0255-1","volume":"3","author":"MD Li","year":"2020","unstructured":"Li, M. D. et al. Siamese neural networks for continuous disease severity evaluation and change detection in medical imaging. npj Dig. Med. 3, 48 (2020).","journal-title":"npj Dig. 
Med."},{"key":"709_CR58","doi-asserted-by":"publisher","first-page":"1653","DOI":"10.1016\/j.jacr.2020.05.015","volume":"17","author":"K Chang","year":"2020","unstructured":"Chang, K. et al. Multi-Institutional Assessment and Crowdsourcing Evaluation of Deep Learning for Automated Classification of Breast Density. J. Am. College .Radiol. 17, 1653\u20131662 (2020).","journal-title":"J. Am. College .Radiol."},{"key":"709_CR59","unstructured":"Kingma, D., & Ba, J. Adam: A Method for Stochastic Optimization. International Conference on Learning Representations (2014)."},{"key":"709_CR60","doi-asserted-by":"crossref","unstructured":"Li, L., & Lin, H. T. Ordinal regression by extended binary classification. Advances in neural information processing systems 19 (2006).","DOI":"10.7551\/mitpress\/7503.003.0113"},{"key":"709_CR61","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1177\/096228029900800204","volume":"8","author":"JM Bland","year":"1999","unstructured":"Bland, J. M. & Altman, D. G. Measuring agreement in method comparison studies. Stat. Methods Med. Res. 8, 135\u2013160 (1999).","journal-title":"Stat. Methods Med. 
Res."}],"container-title":["npj Digital Medicine"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s41746-022-00709-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-022-00709-3","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-022-00709-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,12,1]],"date-time":"2023-12-01T10:30:20Z","timestamp":1701426620000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s41746-022-00709-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,18]]},"references-count":61,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,12]]}},"alternative-id":["709"],"URL":"https:\/\/doi.org\/10.1038\/s41746-022-00709-3","relation":{},"ISSN":["2398-6352"],"issn-type":[{"value":"2398-6352","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,11,18]]},"assertion":[{"value":"14 January 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 October 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 November 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"174"}}