{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T08:02:20Z","timestamp":1777104140351,"version":"3.51.4"},"reference-count":30,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,6,6]],"date-time":"2023-06-06T00:00:00Z","timestamp":1686009600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,6,6]],"date-time":"2023-06-06T00:00:00Z","timestamp":1686009600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["npj Digit. Med."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The electrocardiogram (ECG) is a ubiquitous diagnostic modality. Convolutional neural networks (CNNs) applied towards ECG analysis require large sample sizes, and transfer learning approaches for biomedical problems may result in suboptimal performance when pre-training is done on natural images. We leveraged masked image modeling to create a vision-based transformer model, HeartBEiT, for electrocardiogram waveform analysis. We pre-trained this model on 8.5 million ECGs and then compared performance vs. standard CNN architectures for diagnosis of hypertrophic cardiomyopathy, low left ventricular ejection fraction and ST elevation myocardial infarction using differing training sample sizes and independent validation datasets. We find that HeartBEiT has significantly higher performance at lower sample sizes compared to other models. We also find that HeartBEiT improves explainability of diagnosis by highlighting biologically relevant regions of the EKG vs. standard CNNs. Domain specific pre-trained transformer models may exceed the classification performance of models trained on natural images especially in very low data regimes. The combination of the architecture and such pre-training allows for more accurate, granular explainability of model predictions.<\/jats:p>","DOI":"10.1038\/s41746-023-00840-9","type":"journal-article","created":{"date-parts":[[2023,6,6]],"date-time":"2023-06-06T17:04:18Z","timestamp":1686071058000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":94,"title":["A foundational vision transformer improves diagnostic performance for electrocardiograms"],"prefix":"10.1038","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3343-744X","authenticated-orcid":false,"given":"Akhil","family":"Vaid","sequence":"first","affiliation":[]},{"given":"Joy","family":"Jiang","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1525-8541","authenticated-orcid":false,"given":"Ashwin","family":"Sawant","sequence":"additional","affiliation":[]},{"given":"Stamatios","family":"Lerakis","sequence":"additional","affiliation":[]},{"given":"Edgar","family":"Argulian","sequence":"additional","affiliation":[]},{"given":"Yuri","family":"Ahuja","sequence":"additional","affiliation":[]},{"given":"Joshua","family":"Lampert","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8135-6858","authenticated-orcid":false,"given":"Alexander","family":"Charney","sequence":"additional","affiliation":[]},{"given":"Hayit","family":"Greenspan","sequence":"additional","affiliation":[]},{"given":"Jagat","family":"Narula","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4515-8090","authenticated-orcid":false,"given":"Benjamin","family":"Glicksberg","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6319-4314","authenticated-orcid":false,"given":"Girish N","family":"Nadkarni","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,6,6]]},"reference":[{"key":"840_CR1","doi-asserted-by":"publisher","first-page":"S98","DOI":"10.1016\/0022-0736(88)90068-4","volume":"21","author":"E Drazen","year":"1988","unstructured":"Drazen, E., Mann, N., Borun, R., Laks, M. & Bersen, A. Survey of computer-assisted electrocardiography in the United States. J. Electrocardiol. 21, S98\u2013S104 (1988).","journal-title":"J. Electrocardiol."},{"key":"840_CR2","doi-asserted-by":"publisher","first-page":"1017","DOI":"10.2215\/CJN.16481221","volume":"17","author":"A Vaid","year":"2022","unstructured":"Vaid, A. et al. Automated Determination of Left Ventricular Function Using Electrocardiogram Data in Patients on Maintenance Hemodialysis. Clin. J. Am. Soc. Nephrol. 17, 1017\u20131025 (2022).","journal-title":"Clin. J. Am. Soc. Nephrol."},{"key":"840_CR3","first-page":"395","volume":"15","author":"A Vaid","year":"2022","unstructured":"Vaid, A. et al. Using deep-learning algorithms to simultaneously identify right and left ventricular dysfunction from the electrocardiogram. Cardiovasc. Imaging 15, 395\u2013410 (2022).","journal-title":"Cardiovasc. Imaging"},{"key":"840_CR4","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1038\/s43856-023-00240-w","volume":"3","author":"A Vaid","year":"2023","unstructured":"Vaid, A. et al. Multi-center retrospective cohort study applying deep learning to electrocardiograms to identify left heart valvular dysfunction. Commun. Med. 3, 24 (2023).","journal-title":"Commun. Med."},{"key":"840_CR5","doi-asserted-by":"publisher","first-page":"S61","DOI":"10.1016\/j.jelectrocard.2019.08.008","volume":"57","author":"A Minchol\u00e9","year":"2019","unstructured":"Minchol\u00e9, A., Camps, J., Lyon, A. & Rodr\u00edguez, B. Machine learning in the electrocardiogram. J. Electrocardiol. 57, S61\u2013S64 (2019).","journal-title":"J. Electrocardiol."},{"key":"840_CR6","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-021-97118-5","volume":"11","author":"S Aziz","year":"2021","unstructured":"Aziz, S., Ahmed, S. & Alouini, M.-S. ECG-based machine-learning algorithms for heartbeat classification. Sci. Rep. 11, 18738 (2021).","journal-title":"Sci. Rep."},{"key":"840_CR7","doi-asserted-by":"publisher","first-page":"103801","DOI":"10.1016\/j.compbiomed.2020.103801","volume":"122","author":"S Hong","year":"2020","unstructured":"Hong, S., Zhou, Y., Shang, J., Xiao, C. & Sun, J. Opportunities and challenges of deep learning methods for electrocardiogram data: A systematic review. Computers Biol. Med. 122, 103801 (2020).","journal-title":"Computers Biol. Med."},{"key":"840_CR8","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1162\/neco.1992.4.1.1","volume":"4","author":"S Geman","year":"1992","unstructured":"Geman, S., Bienenstock, E. & Doursat, R. Neural networks and the bias\/variance dilemma. Neural Comput. 4, 1\u201358 (1992).","journal-title":"Neural Comput."},{"key":"840_CR9","doi-asserted-by":"publisher","DOI":"10.1186\/s40537-021-00444-8","volume":"8","author":"L Alzubaidi","year":"2021","unstructured":"Alzubaidi, L. et al. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 8, 53 (2021).","journal-title":"J. Big Data"},{"key":"840_CR10","doi-asserted-by":"publisher","first-page":"354","DOI":"10.1016\/j.patcog.2017.10.013","volume":"77","author":"J Gu","year":"2018","unstructured":"Gu, J. et al. Recent advances in convolutional neural networks. Pattern Recognit. 77, 354\u2013377 (2018).","journal-title":"Pattern Recognit."},{"key":"840_CR11","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-021-84374-8","volume":"11","author":"K Weimann","year":"2021","unstructured":"Weimann, K. & Conrad, T. O. F. Transfer learning for ECG classification. Sci. Rep. 11, 5251 (2021).","journal-title":"Sci. Rep."},{"key":"840_CR12","doi-asserted-by":"publisher","DOI":"10.1186\/s40537-016-0043-6","volume":"3","author":"K Weiss","year":"2016","unstructured":"Weiss, K., Khoshgoftaar, T. M. & Wang, D. A survey of transfer learning. J. Big Data 3, 9 (2016).","journal-title":"J. Big Data"},{"key":"840_CR13","unstructured":"Deng, J. et al. In 2009 IEEE conference on computer vision and pattern recognition. 248\u2013255 (Ieee)."},{"key":"840_CR14","doi-asserted-by":"publisher","first-page":"19","DOI":"10.4018\/IJSSCI.2018100102","volume":"10","author":"AD Gavrilov","year":"2018","unstructured":"Gavrilov, A. D., Jordache, A., Vasdani, M. & Deng, J. Preventing model overfitting and underfitting in convolutional neural networks. Int. J. Softw. Sci. Comput. Intell. (IJSSCI) 10, 19\u201328 (2018).","journal-title":"Int. J. Softw. Sci. Comput. Intell. (IJSSCI)"},{"key":"840_CR15","unstructured":"Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems Vol. 30 (eds Guyon, I. et al.) (Curran Associates, Inc, 2017). https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2017\/file\/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf."},{"key":"840_CR16","doi-asserted-by":"crossref","unstructured":"Khan, S. et al. Transformers in vision: A survey. ACM Computing Surveys (CSUR) 54, 1\u201341 (2022).","DOI":"10.1145\/3505244"},{"key":"840_CR17","unstructured":"Wolf, T. et al. In Proceedings of the 2020 conference on empirical methods in natural language processing: system demonstrations. 38\u201345."},{"key":"840_CR18","unstructured":"Kalyan, K. S., Rajasekharan, A. & Sangeetha, S. Ammus: A survey of transformer-based pretrained models in natural language processing. Preprint at https:\/\/arxiv.org\/abs\/2108.05542 (2021)."},{"key":"840_CR19","unstructured":"Liu, Z. et al. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 10012\u201310022."},{"key":"840_CR20","unstructured":"Dosovitskiy, A. et al. An image is worth 16x16 words: Transformers for image recognition at scale. Preprint at https:\/\/arxiv.org\/abs\/2010.11929 (2020)."},{"key":"840_CR21","unstructured":"Bao, H., Dong, L. & Wei, F. Beit: Bert pre-training of image transformers. Preprint at https:\/\/arxiv.org\/abs\/2106.08254 (2021)."},{"key":"840_CR22","first-page":"12116","volume":"34","author":"M Raghu","year":"2021","unstructured":"Raghu, M., Unterthiner, T., Kornblith, S., Zhang, C. & Dosovitskiy, A. Do vision transformers see like convolutional neural networks? Adv. Neural Inf. Process. Syst. 34, 12116\u201312128 (2021).","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"840_CR23","doi-asserted-by":"publisher","unstructured":"Shahani, L. S1Q3T3 pattern leading to early diagnosis of pulmonary embolism. BMJ Case Rep. 2012 https:\/\/doi.org\/10.1136\/bcr-2012-006569 (2012).","DOI":"10.1136\/bcr-2012-006569"},{"key":"840_CR24","doi-asserted-by":"publisher","first-page":"252","DOI":"10.1109\/34.75512","volume":"13","author":"SJ Raudys","year":"1991","unstructured":"Raudys, S. J. & Jain, A. K. Small sample size effects in statistical pattern recognition: Recommendations for practitioners. IEEE Trans. Pattern Anal. Mach. Intell. 13, 252\u2013264 (1991).","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"840_CR25","first-page":"1929","volume":"15","author":"N Srivastava","year":"2014","unstructured":"Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929\u20131958 (2014).","journal-title":"J. Mach. Learn. Res."},{"key":"840_CR26","doi-asserted-by":"publisher","first-page":"387","DOI":"10.1016\/j.cardfail.2021.01.022","volume":"27","author":"B Bozkurt","year":"2021","unstructured":"Bozkurt, B. et al. Universal definition and classification of heart failure: a report of the heart failure society of America, heart failure association of the European society of cardiology, Japanese heart failure society and writing committee of the universal definition of heart failure. J. Card. Fail. 27, 387\u2013413 (2021).","journal-title":"J. Card. Fail."},{"key":"840_CR27","doi-asserted-by":"crossref","unstructured":"Webster, J. J. & Kit, C. In COLING 1992 volume 4: The 14th international conference on computational linguistics.","DOI":"10.3115\/992424.992434"},{"key":"840_CR28","doi-asserted-by":"crossref","unstructured":"Ghazvininejad, M., Levy, O., Liu, Y. & Zettlemoyer, L. Mask-Predict: Parallel Decoding of Conditional Masked Language Models. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) 6112\u20136121. https:\/\/arxiv.org\/abs\/1904.09324 (Association for Computational Linguistics, Hong Kong, China, 2019).","DOI":"10.18653\/v1\/D19-1633"},{"key":"840_CR29","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1023\/A:1026543900054","volume":"40","author":"Y Rubner","year":"2000","unstructured":"Rubner, Y., Tomasi, C. & Guibas, L. J. The Earth Mover\u2019s Distance as a Metric for Image Retrieval. Int. J. Computer Vis. 40, 99\u2013121 (2000).","journal-title":"Int. J. Computer Vis."},{"key":"840_CR30","unstructured":"Selvaraju, R. R. et al. In Proceedings of the IEEE international conference on computer vision. 618\u2013626."}],"container-title":["npj Digital Medicine"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s41746-023-00840-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-023-00840-9","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41746-023-00840-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,21]],"date-time":"2024-10-21T21:13:49Z","timestamp":1729545229000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s41746-023-00840-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,6]]},"references-count":30,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["840"],"URL":"https:\/\/doi.org\/10.1038\/s41746-023-00840-9","relation":{},"ISSN":["2398-6352"],"issn-type":[{"value":"2398-6352","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,6,6]]},"assertion":[{"value":"13 January 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 May 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 June 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Dr. Nadkarni reports consultancy agreements with AstraZeneca, BioVie, GLG Consulting, Pensieve Health, Reata, Renalytix, Siemens Healthineers, and Variant Bio; research funding from Goldfinch Bio and Renalytix; honoraria from AstraZeneca, BioVie, Lexicon, Daiichi Sankyo, Meanrini Health and Reata; patents or royalties with Renalytix; owns equity and stock options in Pensieve Health and Renalytix as a scientific cofounder; owns equity in Verici Dx; has received financial compensation as a scientific board member and advisor to Renalytix; serves on the advisory board of Neurona Health; and serves in an advisory or leadership role for Pensieve Health and Renalytix. All other authors have reported that they have no relationships relevant to the contents of this paper to disclose.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"108"}}