{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,16]],"date-time":"2026-01-16T12:15:18Z","timestamp":1768565718384,"version":"3.49.0"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1009594","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2021,11,23]],"date-time":"2021-11-23T00:00:00Z","timestamp":1637625600000}}],"reference-count":38,"publisher":"Public Library of Science (PLoS)","issue":"11","license":[{"start":{"date-parts":[[2021,11,11]],"date-time":"2021-11-11T00:00:00Z","timestamp":1636588800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000025","name":"National Institute of Mental Health","doi-asserted-by":"publisher","award":["MH110185"],"award-info":[{"award-number":["MH110185"]}],"id":[{"id":"10.13039\/100000025","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000065","name":"National Institute of Neurological Disorders and Stroke","doi-asserted-by":"publisher","award":["NS021328"],"award-info":[{"award-number":["NS021328"]}],"id":[{"id":"10.13039\/100000065","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000025","name":"National Institute of Mental Health","doi-asserted-by":"publisher","award":["MH108592"],"award-info":[{"award-number":["MH108592"]}],"id":[{"id":"10.13039\/100000025","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000052","name":"NIH Office of the Director","doi-asserted-by":"publisher","award":["OD010944"],"award-info":[{"award-number":["OD010944"]}],"id":[{"id":"10.13039\/100000052","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>The growing number of next-generation sequencing (NGS) data presents a unique opportunity to study the combined impact of 
mitochondrial and nuclear-encoded genetic variation in complex disease. Mitochondrial DNA variants, and in particular heteroplasmic variants, are critical for determining human disease severity. While there are approaches for obtaining mitochondrial DNA variants from NGS data, these tools do not account for the unique characteristics of mitochondrial genetics and can be inaccurate even for homoplasmic variants. We introduce MitoScape, a novel big-data software package for extracting mitochondrial DNA sequences from NGS. MitoScape departs from other algorithms by using machine learning to model the unique characteristics of mitochondrial genetics. We also employ a novel approach of using rho-zero (mitochondrial DNA-depleted) data to model nuclear-encoded mitochondrial sequences. We showed that MitoScape produces accurate heteroplasmy estimates using gold-standard mitochondrial DNA data. We provide a comprehensive comparison of the most common tools for obtaining mtDNA variants from NGS and showed that MitoScape had superior performance to the compared tools in every statistical category we compared, including false positives and false negatives. By applying MitoScape to common disease examples, we illustrate how MitoScape facilitates important heteroplasmy-disease association discoveries by expanding upon a reported association between hypertrophic cardiomyopathy and mitochondrial haplogroup T in men (adjusted p-value = 0.003). 
The improved accuracy of mitochondrial DNA variants produced by MitoScape will be instrumental in diagnosing disease in the context of personalized medicine and clinical diagnostics.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1009594","type":"journal-article","created":{"date-parts":[[2021,11,11]],"date-time":"2021-11-11T18:40:30Z","timestamp":1636656030000},"page":"e1009594","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":18,"title":["MitoScape: A big-data, machine-learning platform for obtaining mitochondrial DNA from next-generation sequencing data"],"prefix":"10.1371","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2478-5864","authenticated-orcid":true,"given":"Larry N.","family":"Singh","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2653-5009","authenticated-orcid":true,"given":"Brian","family":"Ennis","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7861-0197","authenticated-orcid":true,"given":"Bryn","family":"Loneragan","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5743-0795","authenticated-orcid":true,"given":"Noah L.","family":"Tsao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1528-5964","authenticated-orcid":true,"given":"M. 
Isabel G.","family":"Lopez Sanchez","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jianping","family":"Li","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Patrick","family":"Acheampong","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Oanh","family":"Tran","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ian A.","family":"Trounce","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2455-9525","authenticated-orcid":true,"given":"Yuankun","family":"Zhu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Prasanth","family":"Potluri","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"name":"Regeneron Genetics Center","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9207-6955","authenticated-orcid":true,"given":"Beverly S.","family":"Emanuel","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9245-9876","authenticated-orcid":true,"given":"Daniel J.","family":"Rader","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1368-2453","authenticated-orcid":true,"given":"Zoltan","family":"Arany","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8009-1632","authenticated-orcid":true,"given":"Scott 
M.","family":"Damrauer","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0436-4189","authenticated-orcid":true,"given":"Adam C.","family":"Resnick","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stewart A.","family":"Anderson","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7480-8278","authenticated-orcid":true,"given":"Douglas C.","family":"Wallace","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"340","published-online":{"date-parts":[[2021,11,11]]},"reference":[{"key":"pcbi.1009594.ref001","doi-asserted-by":"crossref","first-page":"1380","DOI":"10.1038\/gim.2017.107","article-title":"Patient care standards for primary mitochondrial disease: a consensus statement from the Mitochondrial Medicine Society","volume":"19","author":"S Parikh","year":"2017","journal-title":"Genetics in Medicine"},{"key":"pcbi.1009594.ref002","doi-asserted-by":"crossref","first-page":"4598","DOI":"10.1167\/iovs.18-25085","article-title":"Mitochondrial DNA Variation and Disease Susceptibility in Primary Open-Angle Glaucoma","volume":"59","author":"LN Singh","year":"2018","journal-title":"Invest Ophthalmol Vis Sci"},{"key":"pcbi.1009594.ref003","doi-asserted-by":"crossref","first-page":"1161","DOI":"10.1001\/jamapsychiatry.2017.2604","article-title":"Association Between Mitochondrial DNA Haplogroup Variation and Autism Spectrum Disorders.","volume":"74","author":"D Chalkia","year":"2017","journal-title":"JAMA Psychiatry"},{"key":"pcbi.1009594.ref004","doi-asserted-by":"crossref","first-page":"1642","DOI":"10.1038\/s41588-018-0264-z","article-title":"Mitochondrial genetic medicine","volume":"50","author":"DC Wallace","year":"2018","journal-title":"Nature 
Genetics"},{"key":"pcbi.1009594.ref005","doi-asserted-by":"crossref","first-page":"342","DOI":"10.1038\/s41588-019-0557-x","article-title":"Comprehensive molecular characterization of mitochondrial genomes in human cancers","volume":"52","author":"Y Yuan","year":"2020","journal-title":"Nat Genet"},{"key":"pcbi.1009594.ref006","doi-asserted-by":"crossref","first-page":"878","DOI":"10.1038\/nrg3275","article-title":"Human mitochondrial DNA: roles of inherited and somatic mutations","volume":"13","author":"EA Schon","year":"2012","journal-title":"Nat Rev Genet"},{"key":"pcbi.1009594.ref007","doi-asserted-by":"crossref","first-page":"C258","DOI":"10.1152\/ajpcell.00224.2020","article-title":"Decoding SARS-CoV-2 hijacking of host mitochondria in COVID-19 pathogenesis","volume":"319","author":"KK Singh","year":"2020","journal-title":"American Journal of Physiology-Cell Physiology"},{"key":"pcbi.1009594.ref008","doi-asserted-by":"crossref","first-page":"498","DOI":"10.1016\/j.cmet.2020.07.015","article-title":"Elevated Glucose Levels Favor SARS-CoV-2 Infection and Monocyte Response through a HIF-1\u03b1\/Glycolysis-Dependent Axis","volume":"32","author":"AC Codo","year":"2020","journal-title":"Cell Metabolism"},{"key":"pcbi.1009594.ref009","doi-asserted-by":"crossref","first-page":"1185","DOI":"10.1016\/j.cell.2020.11.002","article-title":"Comprehensive Multi-omics Analysis Reveals Mitochondrial Stress as a Central Biological Hub for Spaceflight Impact","volume":"183","author":"WA da Silveira","year":"2020","journal-title":"Cell"},{"key":"pcbi.1009594.ref010","doi-asserted-by":"crossref","first-page":"612369","DOI":"10.1155\/2013\/612369","article-title":"Mitochondria and cancer: past, present, and future","volume":"2013","author":"ML Verschoor","year":"2013","journal-title":"Biomed Res Int"},{"key":"pcbi.1009594.ref011","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1146\/annurev-genom-091416-035426","article-title":"Recent Advances in Mitochondrial 
Disease.","volume":"18","author":"L Craven","year":"2017","journal-title":"Annu Rev Genomics Hum Genet"},{"key":"pcbi.1009594.ref012","first-page":"1","article-title":"Extreme heterogeneity of human mitochondrial DNA from organelles to populations","author":"JB Stewart","year":"2020","journal-title":"Nature Reviews Genetics"},{"key":"pcbi.1009594.ref013","doi-asserted-by":"crossref","first-page":"9073","DOI":"10.1093\/nar\/gks424","article-title":"Mammalian NUMT insertion is non-random","volume":"40","author":"J Tsuji","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1009594.ref014","doi-asserted-by":"crossref","first-page":"946","DOI":"10.1016\/j.mito.2011.08.009","article-title":"Nuclear insertions of mitochondrial origin: Database updating and usefulness in cancer studies","volume":"11","author":"A Ramos","year":"2011","journal-title":"Mitochondrion"},{"key":"pcbi.1009594.ref015","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1016\/j.semcancer.2017.05.001","article-title":"Mitochondrial Determinants of Cancer Health Disparities","volume":"47","author":"AR Choudhury","year":"2017","journal-title":"Semin Cancer Biol"},{"key":"pcbi.1009594.ref016","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1016\/j.semcancer.2017.05.003","article-title":"Numtogenesis as a mechanism for development of cancer","volume":"47","author":"KK Singh","year":"2017","journal-title":"Semin Cancer Biol"},{"key":"pcbi.1009594.ref017","doi-asserted-by":"crossref","first-page":"12640","DOI":"10.1093\/nar\/gku1038","article-title":"The genomic landscape of polymorphic human nuclear mitochondrial insertions","volume":"42","author":"G Dayama","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1009594.ref018","doi-asserted-by":"crossref","first-page":"3115","DOI":"10.1093\/bioinformatics\/btu483","article-title":"MToolBox: a highly automated pipeline for heteroplasmy annotation and prioritization analysis of human mitochondrial variants in 
high-throughput sequencing","volume":"30","author":"C Calabrese","year":"2014","journal-title":"Bioinformatics"},{"key":"pcbi.1009594.ref019","doi-asserted-by":"crossref","first-page":"e1000834","DOI":"10.1371\/journal.pgen.1000834","article-title":"Molecular Poltergeists: Mitochondrial DNA Copies (numts) in Sequenced Nuclear Genomes.","volume":"6","author":"E Hazkani-Covo","year":"2010","journal-title":"PLOS Genetics."},{"key":"pcbi.1009594.ref020","doi-asserted-by":"crossref","first-page":"e137","DOI":"10.1093\/nar\/gks499","article-title":"Fidelity of capture-enrichment for mtDNA genome sequencing: influence of NUMTs","volume":"40","author":"M Li","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1009594.ref021","doi-asserted-by":"crossref","first-page":"W64","DOI":"10.1093\/nar\/gkw247","article-title":"mtDNA-Server: next-generation sequencing data analysis of human mitochondrial DNA in the cloud","volume":"44","author":"H Weissensteiner","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1009594.ref022","doi-asserted-by":"crossref","first-page":"1740","DOI":"10.1038\/s41467-020-15336-3","article-title":"Nuclear-mitochondrial DNA segments resemble paternally inherited mitochondrial DNA in humans","volume":"11","author":"W Wei","year":"2020","journal-title":"Nature Communications"},{"key":"pcbi.1009594.ref023","doi-asserted-by":"crossref","first-page":"924","DOI":"10.1016\/j.mito.2011.08.005","article-title":"MitoBamAnnotator: A web-based tool for detecting and annotating heteroplasmy in human mitochondrial DNA sequences","volume":"11","author":"I Zhidkov","year":"2011","journal-title":"Mitochondrion"},{"key":"pcbi.1009594.ref024","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1038\/13779","article-title":"Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA","volume":"23","author":"RM Andrews","year":"1999","journal-title":"Nat 
Genet"},{"key":"pcbi.1009594.ref025","doi-asserted-by":"crossref","first-page":"484","DOI":"10.1016\/S0076-6879(96)64044-0","article-title":"Assessment of mitochondrial oxidative phosphorylation in patient muscle biopsies, lymphoblasts, and transmitochondrial cell lines","volume":"264","author":"IA Trounce","year":"1996","journal-title":"Methods Enzymol"},{"key":"pcbi.1009594.ref026","first-page":"587","volume-title":"The Elements of Statistical Learning: Data Mining, Inference, and Prediction","author":"T Hastie","year":"2016","edition":"2"},{"key":"pcbi.1009594.ref027","doi-asserted-by":"crossref","first-page":"873","DOI":"10.1093\/bioinformatics\/btq057","article-title":"Fast and SNP-tolerant detection of complex variants and splicing in short reads","volume":"26","author":"TD Wu","year":"2010","journal-title":"Bioinformatics"},{"key":"pcbi.1009594.ref028","volume-title":"ADAM: Genomics Formats and Processing Patterns for Cloud Scale Computing","author":"M Massie","year":"2013"},{"key":"pcbi.1009594.ref029","doi-asserted-by":"crossref","unstructured":"Nothaft FA, Massie M, Danford T, Zhang Z, Laserson U, Yeksigian C, et al. Rethinking Data-Intensive Science Using Scalable Analytics Systems. Proceedings of the 2015 International Conference on Management of Data (SIGMOD \u201815). 
ACM; 2015.","DOI":"10.1145\/2723372.2742787"},{"key":"pcbi.1009594.ref030","doi-asserted-by":"crossref","first-page":"388","DOI":"10.1038\/gim.2012.144","article-title":"Comprehensive next-generation sequence analyses of the entire mitochondrial genome reveal new insights into the molecular diagnosis of mitochondrial DNA disorders","volume":"15","author":"H Cui","year":"2013","journal-title":"Genet Med"},{"key":"pcbi.1009594.ref031","doi-asserted-by":"crossref","first-page":"627","DOI":"10.1176\/appi.ajp.2013.13070864","article-title":"Psychiatric disorders from childhood to adulthood in 22q11.2 deletion syndrome: results from the International Consortium on Brain and Behavior in 22q11.2 Deletion Syndrome.","volume":"171","author":"M Schneider","year":"2014","journal-title":"Am J Psychiatry."},{"key":"pcbi.1009594.ref032","article-title":"Calling Somatic SNVs and Indels with Mutect2.","volume":"861054","author":"D Benjamin","year":"2019","journal-title":"bioRxiv"},{"key":"pcbi.1009594.ref033","doi-asserted-by":"crossref","first-page":"540","DOI":"10.1002\/humu.22974","article-title":"MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease","volume":"37","author":"L Shen","year":"2016","journal-title":"Hum Mutat"},{"key":"pcbi.1009594.ref034","article-title":"Specifications of the ACMG\/AMP standards and guidelines for mitochondrial DNA variant interpretation","author":"EM McCormick","year":"2020","journal-title":"Hum Mutat"},{"key":"pcbi.1009594.ref035","doi-asserted-by":"crossref","first-page":"655","DOI":"10.1056\/NEJMra1710575","article-title":"Clinical Course and Management of Hypertrophic Cardiomyopathy.","volume":"379","author":"BJ Maron","year":"2018","journal-title":"New England Journal of Medicine"},{"key":"pcbi.1009594.ref036","doi-asserted-by":"crossref","first-page":"202","DOI":"10.1016\/j.ijcard.2005.09.008","article-title":"Mitochondrial DNA haplogroups in Spanish 
patients with hypertrophic cardiomyopathy","volume":"112","author":"MG Castro","year":"2006","journal-title":"International Journal of Cardiology"},{"key":"pcbi.1009594.ref037","doi-asserted-by":"crossref","first-page":"e71904","DOI":"10.1371\/journal.pone.0071904","article-title":"Mitochondrial Haplogroups Modify the Risk of Developing Hypertrophic Cardiomyopathy in a Danish Population.","volume":"8","author":"CM Hagen","year":"2013","journal-title":"PLoS ONE."},{"key":"pcbi.1009594.ref038","article-title":"Mitochondrial DNA copy number in human disease: the more the better?","author":"R Filograna","journal-title":"FEBS Letters"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1009594","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2021,11,23]],"date-time":"2021-11-23T00:00:00Z","timestamp":1637625600000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1009594","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,11,23]],"date-time":"2021-11-23T20:46:34Z","timestamp":1637700394000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1009594"}},"subtitle":[],"editor":[{"given":"Manja","family":"Marz","sequence":"first","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2021,11,11]]},"references-count":38,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2021,11,11]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1009594","relation":{"new_version":[{"id-type":"doi","id":"10.1371\/journal.pcbi.1009594","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,11,11]]}}}