{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:51Z","timestamp":1772138091497,"version":"3.50.1"},"reference-count":41,"publisher":"Oxford University Press (OUP)","issue":"8","license":[{"start":{"date-parts":[[2018,9,6]],"date-time":"2018-09-06T00:00:00Z","timestamp":1536192000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"National Institute of Health","award":["U01CA198933"],"award-info":[{"award-number":["U01CA198933"]}]},{"name":"National Institute of Health","award":["HG002585"],"award-info":[{"award-number":["HG002585"]}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,4,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Quality control plays a major role in the analysis of ancient DNA (aDNA). One key step in this quality control is assessment of DNA damage: aDNA contains unique signatures of DNA damage that distinguish it from modern DNA, and so analyses of damage patterns can help confirm that DNA sequences obtained are from endogenous aDNA rather than from modern contamination. Predominant signatures of DNA damage include a high frequency of cytosine to thymine substitutions (C-to-T) at the ends of fragments, and elevated rates of purines (A &amp; G) before the 5\u2032 strand-breaks. 
Existing QC procedures help assess damage by simply plotting for each sample, the C-to-T mismatch rate along the read and the composition of bases before the 5\u2032 strand-breaks. Here we present a more flexible and comprehensive model-based approach to infer and visualize damage patterns in aDNA, implemented in an R package aRchaic. This approach is based on a \u2018grade of membership\u2019 model (also known as \u2018admixture\u2019 or \u2018topic\u2019 model) in which each sample has an estimated grade of membership in each of K damage profiles that are estimated from the data.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We illustrate aRchaic on data from several aDNA studies and modern individuals from 1000 Genomes Project Consortium (2012). Here, aRchaic clearly distinguishes modern from ancient samples irrespective of DNA extraction, lab and sequencing protocols. Additionally, through an in-silico contamination experiment, we show that the aRchaic grades of membership reflect relative levels of exogenous modern contamination. 
Together, the outputs of aRchaic provide a concise visual summary of DNA damage patterns, as well as other processes generating mismatches in the data.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>aRchaic is available for download from https:\/\/www.github.com\/kkdey\/aRchaic.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty779","type":"journal-article","created":{"date-parts":[[2018,9,4]],"date-time":"2018-09-04T08:21:56Z","timestamp":1536049316000},"page":"1292-1298","source":"Crossref","is-referenced-by-count":11,"title":["Inference and visualization of DNA damage patterns using a grade of membership model"],"prefix":"10.1093","volume":"35","author":[{"given":"Hussein","family":"Al-Asadi","sequence":"first","affiliation":[{"name":"Committee on Evolutionary Biology, University of Chicago, Chicago, IL, USA"},{"name":"Department of Statistics, University of Chicago, Chicago, IL, USA"}]},{"given":"Kushal K","family":"Dey","sequence":"additional","affiliation":[{"name":"Department of Statistics, University of Chicago, Chicago, IL, USA"}]},{"given":"John","family":"Novembre","sequence":"additional","affiliation":[{"name":"Committee on Evolutionary Biology, University of Chicago, Chicago, IL, USA"},{"name":"Department of Human Genetics, University of Chicago, Chicago, IL, USA"}]},{"given":"Matthew","family":"Stephens","sequence":"additional","affiliation":[{"name":"Department of Statistics, University of Chicago, Chicago, IL, USA"},{"name":"Department of Human Genetics, University of Chicago, Chicago, IL, 
USA"}]}],"member":"286","published-online":{"date-parts":[[2018,9,6]]},"reference":[{"key":"2023012810013595500_bty779-B1","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1038\/nature11632","article-title":"An Integrated Map of Genetic Variation from 1,092 Human Genomes","volume":"491","year":"2012","journal-title":"Nature"},{"key":"2023012810013595500_bty779-B2","doi-asserted-by":"crossref","first-page":"1655","DOI":"10.1101\/gr.094052.109","article-title":"Fast model-based estimation of ancestry in unrelated individuals","volume":"19","author":"Alexander","year":"2009","journal-title":"Genome Res"},{"key":"2023012810013595500_bty779-B3","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1038\/nature14507","article-title":"Population Genomics of Bronze Age Eurasia","volume":"522","author":"Allentoft","year":"2015","journal-title":"Nature"},{"key":"2023012810013595500_bty779-B4","first-page":"993","article-title":"Latent dirichlet allocation","volume":"3","author":"Blei","year":"2003","journal-title":"J. Mach. Learn. Res"},{"key":"2023012810013595500_bty779-B5","doi-asserted-by":"crossref","first-page":"14616","DOI":"10.1073\/pnas.0704665104","article-title":"Patterns of damage in genomic DNA sequences from a neandertal","volume":"104","author":"Briggs","year":"2007","journal-title":"Proc. Natl. Acad. Sci. 
USA"},{"key":"2023012810013595500_bty779-B6","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1126\/science.296.5566.261b","article-title":"A human genome diversity cell line panel","volume":"296","author":"Cann","year":"2002","journal-title":"Science"},{"key":"2023012810013595500_bty779-B7","author":"Dey","year":"2017"},{"key":"2023012810013595500_bty779-B8","doi-asserted-by":"crossref","first-page":"e1006599.","DOI":"10.1371\/journal.pgen.1006599","article-title":"Visualizing the structure of RNA-seq expression data using grade of membership models","volume":"13","author":"Dey","year":"2017","journal-title":"PLoS Genet"},{"key":"2023012810013595500_bty779-B9","doi-asserted-by":"crossref","first-page":"560.","DOI":"10.1038\/287560a0","article-title":"Mutagenic deamination of cytosine residues in DNA","volume":"287","author":"Duncan","year":"1980","journal-title":"Nature"},{"key":"2023012810013595500_bty779-B10","volume-title":"Latent Class Representation of the Grade of Membership Model","author":"Erosheva","year":"2006"},{"key":"2023012810013595500_bty779-B11","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1038\/nature13810","article-title":"Genome sequence of a 45,000-year-old modern human from Western Siberia","volume":"514","author":"Fu","year":"2014","journal-title":"Nature"},{"key":"2023012810013595500_bty779-B12","doi-asserted-by":"crossref","first-page":"200.","DOI":"10.1038\/nature17993","article-title":"The genetic history of ice age Europe","volume":"534","author":"Fu","year":"2016","journal-title":"Nature"},{"key":"2023012810013595500_bty779-B13","doi-asserted-by":"crossref","first-page":"5257.","DOI":"10.1038\/ncomms6257","article-title":"Genome flux and stasis in a five millennium transect of European Prehistory","volume":"5","author":"Gamba","year":"2014","journal-title":"Nat. 
Commun"},{"key":"2023012810013595500_bty779-B14","doi-asserted-by":"crossref","first-page":"2153","DOI":"10.1093\/bioinformatics\/btr347","article-title":"mapDamage: testing for damage patterns in ancient DNA sequences","volume":"27","author":"Ginolhac","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012810013595500_bty779-B15","first-page":"725","article-title":"A codon-based model of nucleotide substitution for protein-coding DNA sequences","volume":"11","author":"Goldman","year":"1994","journal-title":"Mol. Biol. Evol"},{"key":"2023012810013595500_bty779-B16","doi-asserted-by":"crossref","first-page":"1682","DOI":"10.1093\/bioinformatics\/btt193","article-title":"mapDamage2.0: fast approximate bayesian estimates of ancient DNA damage parameters","volume":"29","author":"J\u00f3nsson","year":"2013","journal-title":"Bioinformatics"},{"key":"2023012810013595500_bty779-B17","doi-asserted-by":"crossref","first-page":"356","DOI":"10.1186\/s12859-014-0356-4","article-title":"ANGSD: analysis of next generation sequencing data","volume":"15","author":"Korneliussen","year":"2014","journal-title":"BMC Bioinformatics"},{"key":"2023012810013595500_bty779-B18","first-page":"1","article-title":"A Quasi-Newton acceleration of the EM algorithm","volume":"5","author":"Lange","year":"1995","journal-title":"Stat. Sin"},{"key":"2023012810013595500_bty779-B19","doi-asserted-by":"crossref","first-page":"419.","DOI":"10.1038\/nature19310","article-title":"Genomic insights into the origin of farming in the ancient near east","volume":"536","author":"Lazaridis","year":"2016","journal-title":"Nature"},{"key":"2023012810013595500_bty779-B20","doi-asserted-by":"crossref","first-page":"13175.","DOI":"10.1038\/ncomms13175","article-title":"A time transect of exomes from a native american population before and after European Contact","volume":"7","author":"Lindo","year":"2016","journal-title":"Nat. 
Commun"},{"key":"2023012810013595500_bty779-B21","doi-asserted-by":"crossref","first-page":"368.","DOI":"10.1038\/nature24476","article-title":"Parallel palaeogenomic transects reveal complex genetic history of early European farmers","volume":"551","author":"Lipson","year":"2017","journal-title":"Nature"},{"key":"2023012810013595500_bty779-B22","doi-asserted-by":"crossref","first-page":"998","DOI":"10.1093\/molbev\/msm015","article-title":"More on contamination: the use of asymmetric molecular behavior to identify authentic ancient human DNA","volume":"24","author":"Malmstr\u00f6m","year":"2007","journal-title":"Mol. Biol. Evol"},{"key":"2023012810013595500_bty779-B23","doi-asserted-by":"crossref","first-page":"499","DOI":"10.1038\/nature16152","article-title":"Genome-wide patterns of selection in 230 ancient Eurasians","volume":"528","author":"Mathieson","year":"2015","journal-title":"Nature"},{"key":"2023012810013595500_bty779-B24","first-page":"197","volume-title":"Nature","author":"Mathieson","year":"2018"},{"key":"2023012810013595500_bty779-B25","doi-asserted-by":"crossref","first-page":"222","DOI":"10.1126\/science.1224344","article-title":"A high-coverage genome sequence from an Archaic Denisovan individual","volume":"338","author":"Meyer","year":"2012","journal-title":"Science"},{"key":"2023012810013595500_bty779-B26","doi-asserted-by":"crossref","first-page":"403.","DOI":"10.1038\/nature12788","article-title":"A mitochondrial genome sequence of a Hominin from Sima De Los Huesos","volume":"505","author":"Meyer","year":"2014","journal-title":"Nature"},{"key":"2023012810013595500_bty779-B27","doi-asserted-by":"crossref","first-page":"190","DOI":"10.1038\/nature25738","article-title":"The Beaker phenomenon and the genomic transformation of northwest 
Europe","volume":"555","author":"Olalde","year":"2018","journal-title":"Nature"},{"key":"2023012810013595500_bty779-B28","doi-asserted-by":"crossref","first-page":"945","DOI":"10.1093\/genetics\/155.2.945","article-title":"Inference of population structure using multilocus genotype data","volume":"155","author":"Pritchard","year":"2000","journal-title":"Genetics"},{"key":"2023012810013595500_bty779-B29","doi-asserted-by":"crossref","first-page":"43.","DOI":"10.1038\/nature12886","article-title":"The complete genome sequence of a neandertal from the Altai Mountains","volume":"505","author":"Pr\u00fcfer","year":"2014","journal-title":"Nature"},{"key":"2023012810013595500_bty779-B30","doi-asserted-by":"crossref","first-page":"e1005972.","DOI":"10.1371\/journal.pgen.1005972","article-title":"Joint estimation of contamination, error and demography for nuclear DNA from ancient humans","volume":"12","author":"Racimo","year":"2016","journal-title":"PLoS Genet"},{"key":"2023012810013595500_bty779-B31","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1126\/science.1211177","article-title":"An aboriginal Australian genome reveals separate human dispersals into Asia","volume":"334","author":"Rasmussen","year":"2011","journal-title":"Science"},{"key":"2023012810013595500_bty779-B32","doi-asserted-by":"crossref","first-page":"224.","DOI":"10.1186\/s13059-015-0776-0","article-title":"Schmutzi: estimation of contamination and endogenous mitochondrial consensus calling for ancient DNA","volume":"16","author":"Renaud","year":"2015","journal-title":"Genome Biol"},{"key":"2023012810013595500_bty779-B33","doi-asserted-by":"crossref","first-page":"20130624.","DOI":"10.1098\/rstb.2013.0624","article-title":"Partial uracil\u2013DNA\u2013glycosylase treatment for screening of ancient DNA","volume":"370","author":"Rohland","year":"2015","journal-title":"Phil. Trans. R. Soc. 
B"},{"key":"2023012810013595500_bty779-B34","doi-asserted-by":"crossref","first-page":"2381","DOI":"10.1126\/science.1078311","article-title":"Genetic structure of human populations","volume":"298","author":"Rosenberg","year":"2002","journal-title":"Science"},{"key":"2023012810013595500_bty779-B35","doi-asserted-by":"crossref","first-page":"e34131.","DOI":"10.1371\/journal.pone.0034131","article-title":"Temporal patterns of nucleotide misincorporations and DNA fragmentation in ancient DNA","volume":"7","author":"Sawyer","year":"2012","journal-title":"PloS One"},{"key":"2023012810013595500_bty779-B36","doi-asserted-by":"crossref","first-page":"1236573.","DOI":"10.1126\/science.1236573","article-title":"A paleogenomic perspective on evolution and gene function: new insights from ancient DNA","volume":"343","author":"Shapiro","year":"2014","journal-title":"Science"},{"key":"2023012810013595500_bty779-B37","doi-asserted-by":"crossref","first-page":"972","DOI":"10.1093\/nar\/22.6.972","article-title":"The rate of hydrolytic deamination of 5-methylcytosine in double-stranded DNA","volume":"22","author":"Shen","year":"1994","journal-title":"Nucleic Acids Res"},{"key":"2023012810013595500_bty779-B38","doi-asserted-by":"crossref","first-page":"e1005657.","DOI":"10.1371\/journal.pgen.1005657","article-title":"A simple model-based approach to inferring and visualizing cancer mutation signatures","volume":"11","author":"Shiraishi","year":"2015","journal-title":"PLoS Genet"},{"key":"2023012810013595500_bty779-B39","doi-asserted-by":"crossref","first-page":"747","DOI":"10.1126\/science.1253448","article-title":"Genomic diversity and admixture differs for stone-age Scandinavian foragers and farmers","volume":"344","author":"Skoglund","year":"2014","journal-title":"Science"},{"key":"2023012810013595500_bty779-B40","doi-asserted-by":"crossref","first-page":"2229","DOI":"10.1073\/pnas.1318934111","article-title":"Separating endogenous ancient DNA from modern day contamination in a 
Siberian Neandertal","volume":"111","author":"Skoglund","year":"2014","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023012810013595500_bty779-B41","first-page":"1184","author":"Taddy","year":"2012"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/8\/1292\/48941558\/bioinformatics_35_8_1292.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/8\/1292\/48941558\/bioinformatics_35_8_1292.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,28]],"date-time":"2023-01-28T05:11:49Z","timestamp":1674882709000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/8\/1292\/5091332"}},"subtitle":[],"editor":[{"given":"John","family":"Hancock","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2018,9,6]]},"references-count":41,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2019,4,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty779","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/327684","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,4,15]]},"published":{"date-parts":[[2018,9,6]]}}}