{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,5]],"date-time":"2026-04-05T09:40:06Z","timestamp":1775382006979,"version":"3.50.1"},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2019,6,28]],"date-time":"2019-06-28T00:00:00Z","timestamp":1561680000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2019,6,28]],"date-time":"2019-06-28T00:00:00Z","timestamp":1561680000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["npj Digit. Med."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Machine learning (ML) and its parent technology trend, artificial intelligence (AI), are deriving novel insights from ever larger and more complex datasets. Efficient and accurate AI analytics require fastidious data science\u2014the careful curating of knowledge representations in databases, decomposition of data matrices to reduce dimensionality, and preprocessing of datasets to mitigate the confounding effects of messy (i.e., missing, redundant, and outlier) data. Messier, bigger and more dynamic medical datasets create the potential for ML computing systems querying databases to draw erroneous data inferences, portending real-world human health consequences. High-dimensional medical datasets can be static or dynamic. For example, principal component analysis (PCA) used within R computing packages can speed &amp; scale disease association analytics for deriving polygenic risk scores from static gene-expression microarrays. 
Robust PCA of <jats:italic>k<\/jats:italic>-dimensional subspace data accelerates image acquisition and reconstruction of dynamic 4-D magnetic resonance imaging studies, enhancing tracking of organ physiology, tissue relaxation parameters, and contrast agent effects. Unlike other data-dense business and scientific sectors, medical AI users must be aware that input data quality limitations can have health implications, potentially reducing analytic model accuracy for predicting clinical disease risks and patient outcomes. As AI technologies find more health applications, physicians should contribute their health domain expertize to rules-\/ML-based computer system development, inform input data provenance and recognize the importance of data preprocessing quality assurance <jats:italic>before<\/jats:italic> interpreting the clinical implications of intelligent machine outputs to patients.<\/jats:p>","DOI":"10.1038\/s41746-019-0138-5","type":"journal-article","created":{"date-parts":[[2019,6,28]],"date-time":"2019-06-28T10:03:14Z","timestamp":1561716194000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":53,"title":["The medical AI insurgency: what physicians must know about data to practice with intelligent machines"],"prefix":"10.1038","volume":"2","author":[{"given":"D. Douglas","family":"Miller","sequence":"first","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2019,6,28]]},"reference":[{"key":"138_CR1","doi-asserted-by":"crossref","unstructured":"Stupple, A., Singerman, D., Celi, L. A. The reproducibility crisis in the age of digital medicine. npj Digit. Med. Article 2. https:\/\/www.nature.com\/articles\/s41746-019-0079-z (2019). 
Accessed Feb 18, 2019.","DOI":"10.1038\/s41746-019-0079-z"},{"key":"138_CR2","doi-asserted-by":"publisher","first-page":"1274","DOI":"10.1109\/JPROC.2018.2853498","volume":"106","author":"N Vaswani","year":"2018","unstructured":"Vaswani, N., Chi, Y. & Bouwmans, T. Rethinking PCA for Modern Data Sets: theory, algorithms, and applications. Proc. IEEE 106, 1274\u20131276 (2018).","journal-title":"Proc. IEEE"},{"key":"138_CR3","unstructured":"Singh, T. Why Data Scientists are Crucial for AI Transformation, Forbes. https:\/\/www.forbes.com\/sites\/cognitiveworld\/2018\/09\/13\/why-data-scientists-are-crucial-for-ai-transformation\/amp\/ (2018). Accessed Nov 22, 2018."},{"key":"138_CR4","unstructured":"Ng, A. What Artificial Intelligence Can and Can\u2019t Do Right Now. Harvard Business Review. Nov 9, 2016."},{"key":"138_CR5","unstructured":"Simonite, T. How A. I. Can Keep Accelerating after Moore\u2019s Law. MIT Technology Review. https:\/\/www.technologyreview.com\/s\/607917\/how-ai-can-keep-accelerating-after-morres-law\/amp\/ (2017). Accessed Nov 21, 2018."},{"issue":"2","key":"138_CR6","doi-asserted-by":"publisher","first-page":"129","DOI":"10.1016\/j.amjmed.2017.10.035","volume":"131","author":"D. Douglas Miller","year":"2018","unstructured":"Miller, D. D., Brown, E. W. Artificial intelligence in medical practice: the question to the answer? Am. J. Med. 131, 129\u2013133 (2017)","journal-title":"The American Journal of Medicine"},{"key":"138_CR7","doi-asserted-by":"publisher","first-page":"436","DOI":"10.1038\/nature14539","volume":"521","author":"Y Le Cun","year":"2015","unstructured":"Le Cun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436\u2013444 (2015).","journal-title":"Nature"},{"key":"138_CR8","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1016\/j.cobeha.2018.12.010","volume":"29","author":"M Garnelo","year":"2019","unstructured":"Garnelo, M. & Shanahan, M. 
Reconciling deep learning with symbolic artificial intelligence: representing objects and relations. Curr. Opin. Behav. Sci. 29, 17\u201323 (2019). Accessed Feb 16.","journal-title":"Curr. Opin. Behav. Sci."},{"key":"138_CR9","unstructured":"Chollet, F. Posted @fchollet Apr 20, 2018."},{"key":"138_CR10","unstructured":"Goodfellow, I., et al. Generative Adversarial Nets. https:\/\/papers.nips.cc\/paper\/5423-generative-adversarial-nets.pdf (2014)."},{"key":"138_CR11","doi-asserted-by":"publisher","first-page":"1194","DOI":"10.1021\/acs.jcim.7b00690","volume":"58","author":"E Putin","year":"2018","unstructured":"Putin, E. et al. Reinforced adversarial neural computer for de novo molecular design. J. Chem. Inf. Model. 58, 1194\u20131204 (2018). Accessed Feb. 24, 2019.","journal-title":"J. Chem. Inf. Model."},{"key":"138_CR12","unstructured":"Giles, M. quoting LeCun, Y. Intelligent Machines. MIT Technology Review (2018)."},{"key":"138_CR13","unstructured":"Webb, A. The Big Nine: How the Tech Titans & Their Thinking Machines Could Warp Humanity. (Public Affairs, Hachette Book Group, New York, NY, 2019)."},{"key":"138_CR14","unstructured":"Knight, M. Taxonomy vs Ontology: Machine Learning Breakthroughs. DATAVERSITY. http:\/\/www.dataversity.net\/taxonomy-vs-ontology-machine-learning-breakthroughs\/ (2017). Accessed July 25, 2018."},{"key":"138_CR15","unstructured":"Press, G. Cleaning Big Data: Most Time-Consuming, Least Enjoyable Data Science Task, Survey Says. Forbes. https:\/\/www.forbes.com\/sites\/gilpress\/2016\/03\/23\/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says\/amp\/ (2016). Accessed Jan 18, 2019."},{"key":"#cr-split#-138_CR16.1","unstructured":"Krauthgamer, R., Lee, J. R. Navigating Nets: Simple Algorithms for Proximity Search. University of Washington. https:\/\/www.csun.edu\/~ctoth\/Handbook\/chap32.pdf (2004)"},{"key":"#cr-split#-138_CR16.2","unstructured":"Quoc, V.L., Ranzato, M. 
A., Monga, R., Devin, M., Chen, K., Corrado, G. S., Dean, J., Ng, A. Y. Building high-level features using large scale unsupervised learning. Comput. Sci. Mach. Learn. https:\/\/arxiv.org\/abs\/1112.6209 (2012). Accessed Nov 12, 2018."},{"key":"138_CR17","unstructured":"Li, K., Malik, J. Fast k-Nearest Neighbor Search via Dynamic Continuous Indexing. IEEE. https:\/\/arxiv.org\/abs\/1512.00442v3 (2016)"},{"key":"138_CR18","unstructured":"Wickham, H., Grolemund, G. Introduction: Chapter 1 in R for DataScience. (O\u2019Reilly Media, Inc., North Sebastopol, USA, 2017) https:\/\/r4ds.had.co.nz\/introduction.html. Accessed Feb 10, 2019."},{"key":"138_CR19","unstructured":"Freitas, A., Sales, J. E., Handschuh, S., Curry, E. How Hard is this Query? Measuring the Semantic Complexity of Schema-agnostic Queries. 2015. http:\/\/aclweb.org\/anthology\/W\/W15\/W15-0133.pdf."},{"key":"138_CR20","doi-asserted-by":"publisher","unstructured":"Oh, H., Yang, H., Yi, K. Learning a Strategy for Adapting a Program Analysis via Bayesian Optimization. Proc of ACM SIGPLAN Intl Conf on Object-Oriented Programming, Systems, Language and Applications. pp. 572\u2013588 (2015). https:\/\/doi.org\/10.1145\/2F2814270.2814309 Accessed Feb 22, 2019.","DOI":"10.1145\/2F2814270.2814309"},{"key":"138_CR21","unstructured":"Goodfellow, I., Bengio, Y., Courville, A. Deep Learning. (The MIT Press, Cambridge MA, London, UK, 2016). www.deeplearningbook.org."},{"key":"138_CR22","doi-asserted-by":"publisher","first-page":"1277","DOI":"10.1109\/JPROC.2018.2846730","volume":"106","author":"I Johnstone","year":"2018","unstructured":"Johnstone, I. & Paul, D. PCA in High Dimensions: An Orientation. Proc. IEEE 106, 1277\u20131292 (2018).","journal-title":"Proc. IEEE"},{"key":"138_CR23","doi-asserted-by":"publisher","first-page":"1427","DOI":"10.1109\/JPROC.2018.2853589","volume":"106","author":"T Bouwmans","year":"2018","unstructured":"Bouwmans, T., Javed, S., Zhang, H., Lin, Z. & Otazo, R. 
On the applications of robust PCA in image and video processing. Proc. IEEE 106, 1427\u20131457 (2018).","journal-title":"Proc. IEEE"},{"key":"138_CR24","doi-asserted-by":"publisher","first-page":"1359","DOI":"10.1109\/JPROC.2018.2844126","volume":"106","author":"N Vaswani","year":"2018","unstructured":"Vaswani, N. & Narayanamurthy, P. Static and dynamic robust PCA and matrix completion: a review. Proc. IEEE 106, 1359\u20131379 (2018).","journal-title":"Proc. IEEE"},{"key":"138_CR25","unstructured":"P. C. A. Whitening\u2014Unsupervised Feature Learning and Deep Learning (Stanford University Tutorial) http:\/\/ufldl.stanford.edu\/tutorial\/unsupervised\/PCAWhitening\/ Accessed Sep 2, 2018."},{"key":"138_CR26","doi-asserted-by":"crossref","unstructured":"Cand\u00e8s, E., Li, X., Ma, Y., Wright, J. Robust Principle Component Analysis. Int. J. ACM 58 ; Article no. 10 (2011).","DOI":"10.1145\/1970392.1970395"},{"key":"138_CR27","doi-asserted-by":"crossref","unstructured":"Violante, A. An Introduction to t-SNE with a Python Example. Towards Data Science. https:\/\/towardsdatascience.com\/an-introduction-to-t-sne-with-python-example-5a3a293108d1 (2018). Accessed Mar 20, 2019.","DOI":"10.1017\/9781108591942.001"},{"issue":"11","key":"138_CR28","doi-asserted-by":"publisher","first-page":"1272","DOI":"10.1016\/j.amjmed.2018.05.038","volume":"131","author":"D. Douglas Miller","year":"2018","unstructured":"Miller, D. D. The big health data\u2014intelligent machine paradox. Am. J. Med. https:\/\/doi.org\/10.1016\/j.amjmed.2018.05.038 (2018).","journal-title":"The American Journal of Medicine"},{"issue":"1","key":"138_CR29","doi-asserted-by":"publisher","first-page":"9","DOI":"10.2214\/AJR.18.19914","volume":"212","author":"D. Douglas Miller","year":"2019","unstructured":"Miller, D. D., Brown, E. W. How cognitive machines can augment medical imaging. Am. J. Roentgenol. 
https:\/\/doi.org\/10.2214\/AJR.18.19914 (2019).","journal-title":"American Journal of Roentgenology"},{"key":"138_CR30","doi-asserted-by":"crossref","unstructured":"Francisco, C., Laguna, P., Leif, S., Andreas, B. Principal component analysis in ECG Signal processing. J. Adv. Signal Process. https:\/\/www.researchgate.net\/publication\/26620236_Principal_Component_Analysis_in_ECG_Signal_Processing (2007). Accessed Jan 23, 2019.","DOI":"10.1155\/2007\/74580"},{"key":"138_CR31","unstructured":"Deepa, R., Shanmugam, A., Sivasenapathi, B. Reasoning of EEG waveform using revised principal component analysis. Biomed. Res. http:\/\/www.alliedacademies.org\/articles\/reasoning-of-eeg-waveform-using-revised-principal-component-analysis-rpca.html (2017). Accessed Jan 24, 2019."},{"key":"138_CR32","doi-asserted-by":"publisher","first-page":"2781","DOI":"10.1093\/bioinformatics\/bty185","volume":"34","author":"F Priv\u00e9","year":"2018","unstructured":"Priv\u00e9, F., Aschard, H., Ziyatdinov, A. & Blum, M. G. B. Efficient analysis of large-scale genome-wide data with two R packages: bigstatsr and bigsnpr. Bioinformatics 34, 2781\u20132787 (2018). Accessed Feb. 9, 2019.","journal-title":"Bioinformatics"},{"key":"138_CR33","unstructured":"Dai, P., Gwadry-Sridhar, F., Bauer, M., Borrie, M. A Hybrid Manifold Learning Algorithm for the Diagnosis and Prognostication of Alzheimer\u2019s disease. AMIA Annual Symp Proc pp: 475-483, 2015 https:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC4765614\/ Accessed Nov 23, 2018."},{"key":"138_CR34","doi-asserted-by":"crossref","unstructured":"Fu, Y., Wang, W., Wang, C. Image change detection method based on RPCA and low-rank decomposition. Proc. Chinese Control Conf. (CCC), 9412\u20139417 (2016).","DOI":"10.1109\/ChiCC.2016.7554851"},{"key":"138_CR35","unstructured":"Rajkomar, A., et al. Scalable and accurate deep learning with electronic health records. npj Digit. Med. https:\/\/www.nature.com\/articles\/s41746-018-0029-1 (2018). 
Accessed Mar 19, 2019."},{"key":"138_CR36","doi-asserted-by":"crossref","unstructured":"Char, D. S., Shah, N. H. & Magnus, D. Implementing machine learning in health care\u2014addressing ethical challenges. N. Engl. J. Med. 378, 981\u2013983 (2018). https:\/\/www.nejm.org\/doi\/full\/10.1056\/NEJMp1714229.","DOI":"10.1056\/NEJMp1714229"},{"issue":"3","key":"138_CR37","doi-asserted-by":"publisher","first-page":"1559","DOI":"10.13005\/bpj\/1266","volume":"10","author":"Jeba Shiney","year":"2017","unstructured":"Shiney, O. J., Singh, J. A., Shan, B. P. A Review on techniques for computer aided diagnosis of sift tissue markers for detection of downs syndrome in ultrasound fetal images. Biomed. Pharmacol. J. 10: 1559\u20131568. https:\/\/pdfs.semanticscholar.org\/486c\/ce1acc1b85cbde9752f914df1db279171e9a.pdf (2017). Accessed Mar 4, 2019.","journal-title":"Biomedical and Pharmacology Journal"},{"key":"138_CR38","unstructured":"Abrams, C. Google\u2019s effort to prevent blindness shows AI challenges. Wall Str. J. https:\/\/www.wsj.com\/amp\/articles\/googles-effort-to-prevent-blindness-hits-roadblock-11548504004 (2019). 
Accessed Mar 14, 2019."}],"container-title":["npj Digital Medicine"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s41746-019-0138-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-019-0138-5","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-019-0138-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,17]],"date-time":"2022-12-17T18:30:10Z","timestamp":1671301810000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s41746-019-0138-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,6,28]]},"references-count":39,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2019,12]]}},"alternative-id":["138"],"URL":"https:\/\/doi.org\/10.1038\/s41746-019-0138-5","relation":{},"ISSN":["2398-6352"],"issn-type":[{"value":"2398-6352","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,6,28]]},"assertion":[{"value":"8 April 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 June 2019","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 June 2019","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The author declares no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"62"}}