{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,17]],"date-time":"2026-06-17T18:03:02Z","timestamp":1781719382132,"version":"3.54.5"},"reference-count":64,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,7,17]],"date-time":"2020-07-17T00:00:00Z","timestamp":1594944000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,7,17]],"date-time":"2020-07-17T00:00:00Z","timestamp":1594944000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["npj Digit. Med."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Deriving disease subtypes from electronic health records (EHRs) can guide next-generation personalized medicine. However, challenges in summarizing and representing patient data prevent widespread practice of scalable EHR-based stratification analysis. Here we present an unsupervised framework based on deep learning to process heterogeneous EHRs and derive patient representations that can efficiently and effectively enable patient stratification at scale. We considered EHRs of 1,608,741 patients from a diverse hospital cohort comprising a total of 57,464 clinical concepts. We introduce a representation learning model based on word embeddings, convolutional neural networks, and autoencoders (i.e., ConvAE) to transform patient trajectories into low-dimensional latent vectors. We evaluated these representations as broadly enabling patient stratification by applying hierarchical clustering to different multi-disease and disease-specific patient cohorts. 
ConvAE significantly outperformed several baselines in a clustering task to identify patients with different complex conditions, with 2.61 entropy and 0.31 purity average scores. When applied to stratify patients within a certain condition, ConvAE led to various clinically relevant subtypes for different disorders, including type 2 diabetes, Parkinson\u2019s disease, and Alzheimer\u2019s disease, largely related to comorbidities, disease progression, and symptom severity. With these results, we demonstrate that ConvAE can generate patient representations that lead to clinically meaningful insights. This scalable framework can help better understand varying etiologies in heterogeneous sub-populations and unlock patterns for EHR-based research in the realm of personalized medicine.<\/jats:p>","DOI":"10.1038\/s41746-020-0301-z","type":"journal-article","created":{"date-parts":[[2020,7,17]],"date-time":"2020-07-17T14:23:54Z","timestamp":1594995834000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":192,"title":["Deep representation learning of electronic health records to unlock patient stratification at scale"],"prefix":"10.1038","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4212-4709","authenticated-orcid":false,"given":"Isotta","family":"Landi","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4515-8090","authenticated-orcid":false,"given":"Benjamin 
S.","family":"Glicksberg","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Hao-Chih","family":"Lee","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Sarah","family":"Cherng","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Giulia","family":"Landi","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Matteo","family":"Danieletto","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Joel T.","family":"Dudley","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5384-3605","authenticated-orcid":false,"given":"Cesare","family":"Furlanello","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7815-6000","authenticated-orcid":false,"given":"Riccardo","family":"Miotto","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2020,7,17]]},"reference":[{"key":"301_CR1","doi-asserted-by":"publisher","first-page":"395","DOI":"10.1038\/nrg3208","volume":"13","author":"PB Jensen","year":"2012","unstructured":"Jensen, P. B., Jensen, L. J. & Brunak, S. Mining electronic health records: towards better research applications and clinical care. Nat. Rev. Genet. 13, 395 (2012).","journal-title":"Nat. Rev. Genet."},{"key":"301_CR2","doi-asserted-by":"publisher","first-page":"45","DOI":"10.1038\/nrg3849","volume":"16","author":"GR Cutting","year":"2014","unstructured":"Cutting, G. R. Cystic fibrosis genetics: from molecular understanding to clinical application. Nat. Rev. Genet. 16, 45\u201356 (2014).","journal-title":"Nat. Rev. 
Genet."},{"key":"301_CR3","doi-asserted-by":"publisher","first-page":"838","DOI":"10.1038\/nbt.3587","volume":"34","author":"V Alexandrov","year":"2016","unstructured":"Alexandrov, V. et al. Large-scale phenome analysis defines a behavioral signature for Huntington\u2019s disease genotype in mice. Nat. Biotechnol. 34, 838\u201344 (2016).","journal-title":"Nat. Biotechnol."},{"key":"301_CR4","doi-asserted-by":"publisher","first-page":"591","DOI":"10.1002\/ana.20834","volume":"59","author":"JW Langston","year":"2006","unstructured":"Langston, J. W. The Parkinson\u2019s complex: Parkinsonism is just the tip of the iceberg. Ann. Neurol. 59, 591\u2013596 (2006).","journal-title":"Ann. Neurol."},{"key":"301_CR5","doi-asserted-by":"publisher","unstructured":"de Mel, S., Lim, S. H., Tung, M. L. & Chng, W. J. Implications of heterogeneity in multiple myeloma. BioMed Res. Int. 1\u201312, https:\/\/doi.org\/10.1155\/2014\/232546 (2014).","DOI":"10.1155\/2014\/232546"},{"key":"301_CR6","doi-asserted-by":"publisher","first-page":"1107","DOI":"10.1007\/s00125-019-4909-y","volume":"62","author":"ER Pearson","year":"2019","unstructured":"Pearson, E. R. Type 2 diabetes: a multifaceted disease. Diabetologia 62, 1107\u20131112 (2019).","journal-title":"Diabetologia"},{"key":"301_CR7","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1038\/nrd.2017.226","volume":"17","author":"SA Dugger","year":"2017","unstructured":"Dugger, S. A., Platt, A. & Goldstein, D. B. Drug development in the era of precision medicine. Nat. Rev. Drug Discov. 17, 183\u2013196 (2017).","journal-title":"Nat. Rev. Drug Discov."},{"key":"301_CR8","doi-asserted-by":"crossref","unstructured":"Baytas, I. M. et al. Patient subtyping via time-aware LSTM Networks. In Proc. 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (eds Matwin S, S., Yu, S. & Farooq, F.) 
65\u201374 (ACM, New York, 2017).","DOI":"10.1145\/3097983.3097997"},{"key":"301_CR9","doi-asserted-by":"publisher","first-page":"e54","DOI":"10.1542\/peds.2013-0819","volume":"133","author":"F Doshi-Velez","year":"2013","unstructured":"Doshi-Velez, F., Ge, Y. & Kohane, I. Comorbidity clusters in autism spectrum disorders: an electronic health record time-series analysis. Pediatrics 133, e54\u2013e63 (2013).","journal-title":"Pediatrics"},{"key":"301_CR10","doi-asserted-by":"publisher","first-page":"311ra174","DOI":"10.1126\/scitranslmed.aaa9364","volume":"7","author":"L Li","year":"2015","unstructured":"Li, L. et al. Identification of type 2 diabetes subgroups through topological analysis of patient similarity. Sci. Transl. Med. 7, 311ra174 (2015).","journal-title":"Sci. Transl. Med."},{"key":"301_CR11","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-018-37545-z","volume":"9","author":"X Zhang","year":"2019","unstructured":"Zhang, X. et al. Data-driven subtyping of Parkinson\u2019s disease using longitudinal clinical records: a cohort study. Scientific Rep. 9, 797 (2019).","journal-title":"Scientific Rep."},{"key":"301_CR12","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41746-018-0076-7","volume":"2","author":"D Chen","year":"2019","unstructured":"Chen, D. et al. Deep learning and alternative learning strategies for retrospective real-world clinical data. npj Dig. Med. 2, 1\u20135 (2019).","journal-title":"npj Dig. Med."},{"key":"301_CR13","doi-asserted-by":"publisher","first-page":"1798","DOI":"10.1109\/TPAMI.2013.50","volume":"35","author":"Y Bengio","year":"2013","unstructured":"Bengio, Y., Courville, A. & Vincent, P. Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1798\u20131828 (2013).","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"301_CR14","doi-asserted-by":"publisher","first-page":"436","DOI":"10.1038\/nature14539","volume":"521","author":"Y LeCun","year":"2015","unstructured":"LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436\u2013444 (2015).","journal-title":"Nature"},{"key":"301_CR15","doi-asserted-by":"publisher","first-page":"1236","DOI":"10.1093\/bib\/bbx044","volume":"19","author":"R Miotto","year":"2017","unstructured":"Miotto, R., Wang, F., Wang, S., Jiang, X. & Dudley, J. T. Deep learning for healthcare: review, opportunities and challenges. Brief Bioinform. 19, 1236\u20131246 (2017).","journal-title":"Brief Bioinform."},{"key":"301_CR16","doi-asserted-by":"publisher","first-page":"1419","DOI":"10.1093\/jamia\/ocy068","volume":"25","author":"C Xiao","year":"2018","unstructured":"Xiao, C., Choi, E. & Sun, J. Opportunities and challenges in developing deep learning models using electronic health records data: a systematic review. J. Am. Med. Inform. Assoc. 25, 1419\u20131428 (2018).","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"301_CR17","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0076295","volume":"8","author":"M Manchia","year":"2013","unstructured":"Manchia, M. et al. The impact of phenotypic and genetic heterogeneity on results of genome wide association studies of complex diseases. PLoS ONE 8, e76295 (2013).","journal-title":"PLoS ONE"},{"key":"301_CR18","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1146\/annurev-biodatasci-080917-013315","volume":"1","author":"JM Banda","year":"2018","unstructured":"Banda, J. M., Seneviratne, M., Hernandez-Boussard, T. & Shah, N. H. Advances in electronic phenotyping: from rule-based definitions to machine learning models. Annu. Rev. Biomed. Data Sci. 1, 53\u201368 (2018).","journal-title":"Annu. Rev. Biomed. 
Data Sci."},{"key":"301_CR19","doi-asserted-by":"publisher","first-page":"756","DOI":"10.1001\/jama.1980.03300340032015","volume":"243","author":"RA Cote","year":"1980","unstructured":"Cote, R. A. & Robboy, S. Progress in medical information management: systematized nomenclature of medicine (snomed). JAMA 243, 756\u2013762 (1980).","journal-title":"JAMA"},{"key":"301_CR20","doi-asserted-by":"publisher","DOI":"10.1038\/srep26094","volume":"6","author":"R Miotto","year":"2016","unstructured":"Miotto, R., Li, L., Kidd, B. A. & Dudley, J. T. Deep patient: an unsupervised representation to predict the future of patients from the electronic health records. Scientific Rep. 6, 26094 (2016).","journal-title":"Scientific Rep."},{"key":"301_CR21","doi-asserted-by":"publisher","first-page":"917","DOI":"10.1016\/j.patcog.2003.10.003","volume":"37","author":"ER Dougherty","year":"2004","unstructured":"Dougherty, E. R. & Brun, M. A probabilistic theory of clustering. Pattern Recogn. 37, 917\u2013925 (2004).","journal-title":"Pattern Recogn."},{"key":"301_CR22","doi-asserted-by":"publisher","unstructured":"Dalton, L. A., Benalc\u00e1zar, M. E. & Dougherty, E. R. Optimal clustering under uncertainty. PLoS ONE 13, https:\/\/doi.org\/10.1371\/journal.pone.0204627 (2018).","DOI":"10.1371\/journal.pone.0204627"},{"key":"301_CR23","doi-asserted-by":"publisher","first-page":"807","DOI":"10.1016\/j.patcog.2006.06.026","volume":"40","author":"M Brun","year":"2007","unstructured":"Brun, M. et al. Model-based evaluation of clustering validation measures. Pattern Recogn. 40, 807\u2013824 (2007).","journal-title":"Pattern Recogn."},{"key":"301_CR24","doi-asserted-by":"publisher","first-page":"461","DOI":"10.1007\/s10791-008-9066-8","volume":"12","author":"E Amig\u00f3","year":"2009","unstructured":"Amig\u00f3, E., Gonzalo, J., Artiles, J. & Verdejo, F. A comparison of extrinsic clustering evaluation metrics based on formal constraints. Inform. 
Retrieval 12, 461\u2013486 (2009).","journal-title":"Inform. Retrieval"},{"key":"301_CR25","doi-asserted-by":"publisher","unstructured":"McInnes, L., Healy, J., Saul N., & Grossberger, L. UMAP: Uniform Manifold Approximation and Projection for dimension reduction. J Open Source Softw 3, 861, https:\/\/doi.org\/10.21105\/joss.00861 (The Open Journal, 2018).","DOI":"10.21105\/joss.00861"},{"key":"301_CR26","unstructured":"Cowie, C. C., Casagrande, S. S. & Geiss, L. S. Prevalence and incidence of type 2 diabetes and prediabetes. In Diabetes in America 3rd edn (eds Barrett-Connor, E. et al.) 3\u20131 (National Institutes of Health, Bethesda, 2018)."},{"key":"301_CR27","doi-asserted-by":"publisher","first-page":"525","DOI":"10.1016\/S1474-4422(06)70471-9","volume":"5","author":"LML de Lau","year":"2006","unstructured":"de Lau, L. M. L. & Breteler, M. M. B. Epidemiology of Parkinson\u2019s disease. Lancet Neurol. 5, 525\u2013535 (2006).","journal-title":"Lancet Neurol."},{"key":"301_CR28","doi-asserted-by":"publisher","first-page":"111","DOI":"10.31887\/DCNS.2009.11.2\/cqiu","volume":"11","author":"C Qiu","year":"2009","unstructured":"Qiu, C., Kivipelto, M. & von Strauss, E. Epidemiology of Alzheimer\u2019s disease: occurrence, determinants, and strategies toward intervention. Dialog. Clin. Neurosci. 11, 111 (2009).","journal-title":"Dialog. Clin. Neurosci."},{"key":"301_CR29","doi-asserted-by":"crossref","unstructured":"Kazandjian, D. Multiple myeloma epidemiology and survival: a unique malignancy. In Seminars in Oncology, Vol. 43 (eds Ahn I. E. & Mailankody, S.) 676\u2013681 (Elsevier, 2016).","DOI":"10.1053\/j.seminoncol.2016.11.004"},{"key":"301_CR30","unstructured":"Cancer Stat Facts: Prostate Cancer. https:\/\/seer.cancer.gov\/statfacts\/html\/prost.html (2019)."},{"key":"301_CR31","unstructured":"Cancer Stat Facts: Female Breast Cancer. 
https:\/\/seer.cancer.gov\/statfacts\/html\/breast.html (2019)."},{"key":"301_CR32","doi-asserted-by":"crossref","first-page":"1175","DOI":"10.1002\/cphy.c100049","volume":"1","author":"V Vallon","year":"2011","unstructured":"Vallon, V. & Komers, R. Pathophysiology of the diabetic kidney. Compr. Physiol. 1, 1175\u20131232 (2011).","journal-title":"Compr. Physiol."},{"key":"301_CR33","doi-asserted-by":"publisher","first-page":"40","DOI":"10.1016\/j.critrevonc.2009.06.002","volume":"74","author":"L Malaguarnera","year":"2010","unstructured":"Malaguarnera, L., Cristaldi, E. & Malaguarnera, M. The role of immunity in elderly cancer. Crit. Rev. Oncol. Hematol. 74, 40\u201360 (2010).","journal-title":"Crit. Rev. Oncol. Hematol."},{"key":"301_CR34","doi-asserted-by":"publisher","first-page":"29","DOI":"10.1002\/(SICI)1096-9136(199701)14:1<29::AID-DIA300>3.0.CO;2-V","volume":"14","author":"M Delamaire","year":"1997","unstructured":"Delamaire, M. et al. Impaired leucocyte functions in diabetic patients. Diabetic Med. 14, 29\u201334 (1997).","journal-title":"Diabetic Med."},{"key":"301_CR35","doi-asserted-by":"publisher","first-page":"1100","DOI":"10.1001\/archneur.63.8.1100","volume":"63","author":"S Jain","year":"2006","unstructured":"Jain, S., Lo, S. E. & Louis, E. D. Common misdiagnosis of a common neurological disorder. Arch. Neurol. 63, 1100\u20131104 (2006).","journal-title":"Arch. Neurol."},{"key":"301_CR36","doi-asserted-by":"publisher","first-page":"1908","DOI":"10.1212\/01.WNL.0000144277.06917.CC","volume":"63","author":"G Alves","year":"2004","unstructured":"Alves, G., Wentzel-Larsen, T. & Larsen, J. P. Is fatigue an independent and persistent symptom in patients with Parkinson disease? Neurology 63, 1908\u20131911 (2004).","journal-title":"Neurology"},{"key":"301_CR37","doi-asserted-by":"publisher","first-page":"1712","DOI":"10.1002\/mds.27461","volume":"33","author":"M Siciliano","year":"2018","unstructured":"Siciliano, M. et al. 
Fatigue in Parkinson\u2019s disease: a systematic review and meta-analysis. Mov. Disord. 33, 1712\u20131723 (2018).","journal-title":"Mov. Disord."},{"key":"301_CR38","unstructured":"Alzheimer\u2019s association. Younger\/Early-Onset Alzheimer\u2019s. https:\/\/www.alz.org\/alzheimers-dementia\/what-is-alzheimers\/younger-early-onset (2019)."},{"key":"301_CR39","doi-asserted-by":"publisher","first-page":"1126","DOI":"10.1136\/jnnp-2012-304022","volume":"84","author":"H Manji","year":"2013","unstructured":"Manji, H., J\u00e4ger, H. R. & Winston, A. HIV, dementia and antiretroviral drugs: 30 years of an epidemic. J. Neurol. Neurosurg. Psychiatry 84, 1126\u20131137 (2013).","journal-title":"J. Neurol. Neurosurg. Psychiatry"},{"key":"301_CR40","doi-asserted-by":"publisher","first-page":"1475","DOI":"10.1001\/jama.288.12.1475","volume":"288","author":"CG Lyketsos","year":"2002","unstructured":"Lyketsos, C. G. et al. Prevalence of neuropsychiatric symptoms in dementia and mild cognitive impairment. JAMA 288, 1475\u20131483 (2002).","journal-title":"JAMA"},{"key":"301_CR41","doi-asserted-by":"publisher","first-page":"710","DOI":"10.1016\/j.jalz.2014.10.008","volume":"11","author":"HM Snyder","year":"2015","unstructured":"Snyder, H. M. et al. Vascular contributions to cognitive impairment and dementia including Alzheimer\u2019s disease. Alzheimers Dement. 11, 710\u2013717 (2015).","journal-title":"Alzheimers Dement."},{"key":"301_CR42","first-page":"CD001190","volume":"6","author":"JS Birks","year":"2018","unstructured":"Birks, J. S. & Harvey, R. J. Donepezil for dementia due to Alzheimer\u2019s disease. Cochrane Database Syst. Rev. 6, CD001190 (2018).","journal-title":"Cochrane Database Syst. Rev."},{"key":"301_CR43","doi-asserted-by":"publisher","DOI":"10.1038\/srep35333","volume":"6","author":"MV Lombardo","year":"2016","unstructured":"Lombardo, M. V. et al. Unsupervised data-driven stratification of mentalizing heterogeneity in autism. Scientific Rep. 
6, 35333 (2016).","journal-title":"Scientific Rep."},{"key":"301_CR44","doi-asserted-by":"publisher","first-page":"29","DOI":"10.1016\/j.ijmedinf.2019.05.006","volume":"129","author":"E Stevens","year":"2019","unstructured":"Stevens, E. et al. Identification and analysis of behavioral phenotypes in autism spectrum disorder via unsupervised machine learning. Int. J. Med. Inform. 129, 29\u201336 (2019).","journal-title":"Int. J. Med. Inform."},{"key":"301_CR45","unstructured":"Choi, E., Bahadori, M. & Sun, J. Doctor AI: predicting clinical events via recurrent neural networks. In Proc. Machine Learning for Healthcare, Vol. 56 (eds Doshi-Velez, F. et al.) (PMLR, 2016)."},{"key":"301_CR46","doi-asserted-by":"crossref","unstructured":"Pham, T., Tran, T., Phung, D. & Venkatesh, S. DeepCare: A deep dynamic memory model for predictive medicine. In Advances in Knowledge Discovery and Data Mining (eds Bailey, J. et al.) 30\u201341 (Springer International Publishing, 2016).","DOI":"10.1007\/978-3-319-31750-2_3"},{"key":"301_CR47","doi-asserted-by":"publisher","DOI":"10.1038\/s41746-018-0029-1","volume":"1","author":"A Rajkomar","year":"2018","unstructured":"Rajkomar, A. et al. Scalable and accurate deep learning with electronic health records. npj Dig. Med. 1, 18 (2018).","journal-title":"npj Dig. Med."},{"key":"301_CR48","doi-asserted-by":"publisher","first-page":"168","DOI":"10.1016\/j.jbi.2016.10.007","volume":"64","author":"BK Beaulieu-Jones","year":"2016","unstructured":"Beaulieu-Jones, B. K. et al. Semi-supervised learning of the electronic health record for phenotype stratification. J. Biomed. Inform. 64, 168\u2013178 (2016).","journal-title":"J. Biomed. Inform."},{"key":"301_CR49","doi-asserted-by":"publisher","first-page":"22","DOI":"10.1109\/JBHI.2016.2633963","volume":"21","author":"P Nguyen","year":"2017","unstructured":"Nguyen, P., Tran, T., Wickramasinghe, N. & Venkatesh, S. Deepr: a convolutional net for medical records. IEEE J. Biomed. Health Inform. 
21, 22\u201330 (2017).","journal-title":"IEEE J. Biomed. Health Inform."},{"key":"301_CR50","doi-asserted-by":"publisher","first-page":"219","DOI":"10.1109\/TNB.2018.2837622","volume":"17","author":"Q Suo","year":"2018","unstructured":"Suo, Q. et al. Deep patient similarity learning for personalized healthcare. IEEE Trans. NanoBiosci. 17, 219\u2013227 (2018).","journal-title":"IEEE Trans. NanoBiosci."},{"key":"301_CR51","doi-asserted-by":"publisher","first-page":"e20","DOI":"10.1093\/jamia\/ocv130","volume":"23","author":"W Wei","year":"2015","unstructured":"Wei, W. et al. Combining billing codes, clinical notes, and medications from electronic health records provides superior phenotyping performance. J. Am. Med. Inform. Assoc. 23, e20\u2013e27 (2015).","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"301_CR52","doi-asserted-by":"publisher","first-page":"1046","DOI":"10.1093\/jamia\/ocv202","volume":"23","author":"JC Kirby","year":"2016","unstructured":"Kirby, J. C. et al. Phekb: a catalog and workflow for creating electronic phenotype algorithms for transportability. J. Am. Med. Inform. Assoc. 23, 1046\u20131052 (2016).","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"301_CR53","doi-asserted-by":"publisher","first-page":"731","DOI":"10.1093\/jamia\/ocw011","volume":"23","author":"Y Halpern","year":"2016","unstructured":"Halpern, Y., Horng, S., Choi, Y. & Sontag, D. Electronic medical record phenotyping using the anchor and learn framework. J. Am. Med. Inform. Assoc. 23, 731\u2013740 (2016).","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"301_CR54","doi-asserted-by":"publisher","unstructured":"Glicksberg, B. S. et al. Automated disease cohort selection using word embeddings from Electronic Health Records. In Biocomputing 2018 (eds Altman, R. B. et al.) 
145\u2013156, https:\/\/doi.org\/10.1142\/9789813235533_0014 (World Scientific, 2017).","DOI":"10.1142\/9789813235533_0014"},{"key":"301_CR55","first-page":"993","volume":"3","author":"D Blei","year":"2003","unstructured":"Blei, D., Ng, A. & Jordan, M. Latent dirichlet allocation. J. Mach. Learn. Res. 3, 993\u20131022 (2003).","journal-title":"J. Mach. Learn. Res."},{"key":"301_CR56","unstructured":"Mikolov, T., Chen, K., Corrado, G. & Dean, J. Efficient estimation of word representations in vector space. Preprint at https:\/\/arxiv.org\/abs\/1301.3781 (2013)."},{"key":"301_CR57","unstructured":"Jonquet, C., Shah, N. H. & Musen, M. A. The open biomedical annotator. In AMIA Summits on Translational Science Proceedings (ed American Medical Informatics Association) 56\u201360 (American Medical Informatics Association, Bethesda, MD, 2009)."},{"key":"301_CR58","doi-asserted-by":"publisher","DOI":"10.1186\/2041-1480-3-S1-S5","volume":"17","author":"P Lependu","year":"2012","unstructured":"Lependu, P., Iyer, S. V., Fairon, C. & Shah, N. H. Annotation analysis for testing drug safety signals using unstructured clinical notes. J. Biomed. Seman. 17, s5 (2012).","journal-title":"J. Biomed. Seman."},{"key":"301_CR59","unstructured":"Choi, Y., Chiu, C. Y. I. & Sontag, D. Learning low-dimensional representations of medical concepts. In AMIA Summits on Translational Science Proceedings (ed American Medical Informatics Association) 41\u201350 (American Medical Informatics Association, Bethesda, MD, 2016)."},{"key":"301_CR60","doi-asserted-by":"crossref","unstructured":"Zhu, Z. et al. Measuring patient similarities via a deep architecture with medical concept embedding. In 2016 IEEE 16th International Conference on Data Mining (eds Bonchi, E. et al.) 749\u2013758 (IEEE, 2016).","DOI":"10.1109\/ICDM.2016.0086"},{"key":"301_CR61","doi-asserted-by":"crossref","unstructured":"Suo, Q. et al. Personalized disease prediction using a CNN-based similarity learning method. 
In 2017 IEEE International Conference on Bioinformatics and Biomedicine (eds Hu, X. et al.) 811\u2013816 (IEEE, 2017).","DOI":"10.1109\/BIBM.2017.8217759"},{"key":"301_CR62","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825\u20132830 (2011).","journal-title":"J. Mach. Learn. Res."},{"key":"301_CR63","unstructured":"Paszke, A. et al. Automatic differentiation in pytorch. In (eds Wiltschko, A., van Merri\u00ebnboer, B. & Lamblin, P.) NeurIPS Autodiff Workshop, https:\/\/autodiff-workshop.github.io\/ (2017)."},{"key":"301_CR64","unstructured":"Kingma, D. & Adam, J. B. Adam: a method for stochastic optimization. In Proc. 3rd International Conference on Learning Representations (eds Bengio, Y. & LeCun, Y.) 1\u201315, https:\/\/dblp.org\/db\/conf\/iclr\/iclr2015 (2015)."}],"container-title":["npj Digital Medicine"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s41746-020-0301-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-020-0301-z","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-020-0301-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,7]],"date-time":"2022-12-07T02:24:46Z","timestamp":1670379886000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s41746-020-0301-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,7,17]]},"references-count":64,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2020,12]]}},"alternative-id":["301"],"URL":"https:\/\/doi.org\/10.1038\/s41746-020-0301-z","relation":{},"ISSN":["2398-6352"],"issn-type":[{"value":"2398-6352","ty
pe":"electronic"}],"subject":[],"published":{"date-parts":[[2020,7,17]]},"assertion":[{"value":"9 April 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 June 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 July 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"96"}}