{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T14:53:18Z","timestamp":1767970398398,"version":"3.49.0"},"reference-count":37,"publisher":"Oxford University Press (OUP)","issue":"Supplement_1","license":[{"start":{"date-parts":[[2025,7,15]],"date-time":"2025-07-15T00:00:00Z","timestamp":1752537600000},"content-version":"vor","delay-in-days":14,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Copy number variants (CNVs) are pivotal in driving phenotypic variation that facilitates species adaptation. They are significant contributors to various disorders, making ancient genomes crucial for uncovering the genetic origins of disease susceptibility across populations. However, detecting CNVs in ancient DNA (aDNA) samples poses substantial challenges due to several factors: (i) aDNA is often highly degraded; (ii) contamination from microbial DNA and DNA from closely related species introduces additional noise into sequencing data; and finally, (iii) the typically low coverage of aDNA renders accurate CNV detection particularly difficult. Conventional CNV calling algorithms, which are optimized for high-coverage read-depth signals, underperform under such conditions.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>To address these limitations, we introduce LYCEUM, the first machine learning-based CNV caller for aDNA. To overcome challenges related to data quality and scarcity, we employ a two-step training strategy. 
First, the model is pre-trained on whole genome sequencing data from the 1000 Genomes Project, teaching it CNV-calling capabilities similar to conventional methods. Next, the model is fine-tuned using high-confidence CNV calls derived from only a few existing high-coverage aDNA samples. During this stage, the model adapts to making CNV calls based on the downsampled read depth signals of the same aDNA samples. LYCEUM achieves accurate detection of CNVs even in typically low-coverage ancient genomes. We also observe that the segmental deletion calls made by LYCEUM show correlation with the demographic history of the samples and exhibit patterns of negative selection in line with natural selection.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>LYCEUM is available at https:\/\/github.com\/ciceklab\/LYCEUM.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf244","type":"journal-article","created":{"date-parts":[[2025,7,15]],"date-time":"2025-07-15T13:02:25Z","timestamp":1752584545000},"page":"i285-i293","source":"Crossref","is-referenced-by-count":1,"title":["LYCEUM: learning to call copy number variants on low-coverage ancient genomes"],"prefix":"10.1093","volume":"41","author":[{"given":"Mehmet Alper","family":"Y\u0131lmaz","sequence":"first","affiliation":[{"name":"Department of Computer Engineering, Bilkent University , Ankara 06800, T\u00fcrkiye"}]},{"given":"Ahmet Arda","family":"Ceylan","sequence":"additional","affiliation":[{"name":"Department of Computer Engineering, Bilkent University , Ankara 06800, T\u00fcrkiye"}]},{"given":"Gun","family":"Kaynar","sequence":"additional","affiliation":[{"name":"Computational Biology Department, School of Computer Science, Carnegie Mellon University , Pittsburgh, PA 06800,","place":["United 
States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8613-6619","authenticated-orcid":false,"given":"A Erc\u00fcment","family":"\u00c7i\u00e7ek","sequence":"additional","affiliation":[{"name":"Department of Computer Engineering, Bilkent University , Ankara 06800, T\u00fcrkiye"}]}],"member":"286","published-online":{"date-parts":[[2025,7,15]]},"reference":[{"key":"2025071509022097600_btaf244-B1","doi-asserted-by":"crossref","first-page":"974","DOI":"10.1101\/gr.114876.110","article-title":"CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVS from family and population genome sequencing","volume":"21","author":"Abyzov","year":"2011","journal-title":"Genome Res"},{"key":"2025071509022097600_btaf244-B2","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1038\/s41586-023-06862-3","article-title":"100 ancient genomes show repeated population turnovers in neolithic Denmark","volume":"625","author":"Allentoft","year":"2024","journal-title":"Nature"},{"key":"2025071509022097600_btaf244-B3","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1038\/s41586-023-06865-0","article-title":"Population genomics of post-glacial Western eurasia","volume":"625","author":"Allentoft","year":"2024","journal-title":"Nature"},{"key":"2025071509022097600_btaf244-B4","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1186\/s13073-023-01265-5","article-title":"Rare copy-number variants as modulators of common disease susceptibility","volume":"16","author":"Auwerx","year":"2024","journal-title":"Genome Med"},{"key":"2025071509022097600_btaf244-B5","first-page":"1","article-title":"Comprehensive genome analysis and variant detection at scale using dragen","author":"Behera","year":"2024","journal-title":"Nat Biotechnol"},{"key":"2025071509022097600_btaf244-B6","doi-asserted-by":"publisher","first-page":"423","DOI":"10.1093\/bioinformatics\/btr670","article-title":"Control-FREEC: A tool for assessing copy number and allelic content using 
next-generation sequencing data","volume":"28","author":"Boeva","year":"2011","journal-title":"Bioinformatics"},{"key":"2025071509022097600_btaf244-B7","doi-asserted-by":"crossref","first-page":"14616","DOI":"10.1073\/pnas.0704665104","article-title":"Patterns of damage in genomic DNA sequences from a neandertal","volume":"104","author":"Briggs","year":"2007","journal-title":"Proc Natl Acad Sci USA"},{"key":"2025071509022097600_btaf244-B8","doi-asserted-by":"crossref","first-page":"3426","DOI":"10.1016\/j.cell.2022.08.004","article-title":"High-coverage whole-genome sequencing of the expanded 1000 genomes project cohort including 602 trios","volume":"185","author":"Byrska-Bishop","year":"2022","journal-title":"Cell"},{"key":"2025071509022097600_btaf244-B9","doi-asserted-by":"crossref","first-page":"608","DOI":"10.1038\/nature13907","article-title":"Resolving the complexity of the human genome using single-molecule sequencing","volume":"517","author":"Chaisson","year":"2015","journal-title":"Nature"},{"key":"2025071509022097600_btaf244-B10","doi-asserted-by":"crossref","first-page":"S30","DOI":"10.1038\/ng2042","article-title":"The population genetics of structural variation","volume":"39","author":"Conrad","year":"2007","journal-title":"Nat Genet"},{"key":"2025071509022097600_btaf244-B11","doi-asserted-by":"crossref","first-page":"S22","DOI":"10.1038\/ng2054","article-title":"Mutational and selective effects on copy-number variants in the human genome","volume":"39","author":"Cooper","year":"2007","journal-title":"Nat Genet"},{"key":"2025071509022097600_btaf244-B12","doi-asserted-by":"crossref","first-page":"giab008","DOI":"10.1093\/gigascience\/giab008","article-title":"Twelve years of SAMtools and BCFtools","volume":"10","author":"Danecek","year":"2021","journal-title":"Gigascience"},{"key":"2025071509022097600_btaf244-B13","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/ng1933","article-title":"Mutations in the gene encoding the synaptic 
scaffolding protein shank3 are associated with autism spectrum disorders","volume":"39","author":"Durand","year":"2006","journal-title":"Nat Genet"},{"key":"2025071509022097600_btaf244-B14","doi-asserted-by":"crossref","first-page":"8007","DOI":"10.1038\/s41467-024-52027-9","article-title":"Impact and characterization of serial structural variations across humans and great apes","volume":"15","author":"H\u00f6ps","year":"2024","journal-title":"Nat Commun"},{"key":"2025071509022097600_btaf244-B15","doi-asserted-by":"crossref","first-page":"5118","DOI":"10.1038\/s41467-021-25435-4","article-title":"Evidence for opposing selective forces operating on human-specific duplicated TCAF genes in Neanderthals and humans","volume":"12","author":"Hsieh","year":"2021","journal-title":"Nat Commun"},{"key":"2025071509022097600_btaf244-B16","author":"Kingma","year":"2015"},{"key":"2025071509022097600_btaf244-B17","doi-asserted-by":"crossref","first-page":"1223","DOI":"10.1016\/j.cell.2012.02.039","article-title":"CNVs: harbingers of a rare variant revolution in psychiatric genetics","volume":"148","author":"Malhotra","year":"2012","journal-title":"Cell"},{"key":"2025071509022097600_btaf244-B18","doi-asserted-by":"crossref","first-page":"132","DOI":"10.1038\/s41467-023-44116-y","article-title":"ECOLE: learning to call copy number variants on whole exome sequencing data","volume":"15","author":"Mandiracioglu","year":"2024","journal-title":"Nat Commun"},{"key":"2025071509022097600_btaf244-B19","doi-asserted-by":"crossref","first-page":"1297","DOI":"10.1101\/gr.107524.110","article-title":"The genome analysis toolkit: a mapreduce framework for analyzing next-generation DNA sequencing data","volume":"20","author":"McKenna","year":"2010","journal-title":"Genome Res"},{"key":"2025071509022097600_btaf244-B20","doi-asserted-by":"crossref","first-page":"561","DOI":"10.1016\/j.tree.2020.03.002","article-title":"A roadmap for understanding the evolutionary significance of structural genomic 
variation","volume":"35","author":"M\u00e9rot","year":"2020","journal-title":"Trends Ecol Evol"},{"key":"2025071509022097600_btaf244-B21","doi-asserted-by":"crossref","first-page":"828","DOI":"10.1093\/bioinformatics\/btz660","article-title":"A likelihood method for estimating present-day human contamination in ancient male samples using low-depth x-chromosome data","volume":"36","author":"Moreno-Mayar","year":"2020","journal-title":"Bioinformatics"},{"key":"2025071509022097600_btaf244-B22","doi-asserted-by":"crossref","first-page":"1469","DOI":"10.1093\/bioinformatics\/btu828","article-title":"Varsim: a high-fidelity simulation and validation framework for high-throughput genome sequencing with cancer applications","volume":"31","author":"Mu","year":"2015","journal-title":"Bioinformatics"},{"key":"2025071509022097600_btaf244-B23","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s43586-020-00011-0","article-title":"Ancient DNA analysis","volume":"1","author":"Orlando","year":"2021","journal-title":"Nat Rev Methods Primers"},{"key":"2025071509022097600_btaf244-B24","doi-asserted-by":"crossref","first-page":"1170","DOI":"10.1101\/gr.274845.120","article-title":"Polishing copy number variant calls on exome sequencing data via deep learning","volume":"32","author":"\u00d6zden","year":"2022","journal-title":"Genome Res"},{"key":"2025071509022097600_btaf244-B25","doi-asserted-by":"crossref","first-page":"339","DOI":"10.1038\/s41586-018-0030-5","article-title":"Genome evolution across 1,011 Saccharomyces cerevisiae isolates","volume":"556","author":"Peter","year":"2018","journal-title":"Nature"},{"key":"2025071509022097600_btaf244-B26","doi-asserted-by":"crossref","first-page":"e2000081","DOI":"10.1002\/bies.202000081","article-title":"Present-day DNA contamination in ancient DNA 
datasets","volume":"42","author":"Peyr\u00e9gne","year":"2020","journal-title":"Bioessays"},{"key":"2025071509022097600_btaf244-B27","doi-asserted-by":"crossref","first-page":"10473","DOI":"10.1038\/s41467-024-53087-7","article-title":"Survindel2: improving copy number variant calling from next-generation sequencing using hidden split reads","volume":"15","author":"Rajaby","year":"2024","journal-title":"Nat Commun"},{"key":"2025071509022097600_btaf244-B28","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1038\/s41592-018-0001-7","article-title":"Accurate detection of complex structural variations using single-molecule sequencing","volume":"15","author":"Sedlazeck","year":"2018","journal-title":"Nat Methods"},{"key":"2025071509022097600_btaf244-B29","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1186\/s12864-021-07686-z","article-title":"Evaluation of tools for identifying large copy number variations from ultra-low-coverage whole-genome sequencing data","volume":"22","author":"Smolander","year":"2021","journal-title":"BMC Genomics"},{"key":"2025071509022097600_btaf244-B30","doi-asserted-by":"crossref","first-page":"3660","DOI":"10.1038\/s41467-023-39202-0","article-title":"Imputation of ancient human genomes","volume":"14","author":"Sousa da Mota","year":"2023","journal-title":"Nat Commun"},{"key":"2025071509022097600_btaf244-B31","doi-asserted-by":"crossref","first-page":"e1010788","DOI":"10.1371\/journal.pcbi.1010788","article-title":"Conga: copy number variation genotyping in ancient genomes and low-coverage sequencing data","volume":"18","author":"S\u00f6ylev","year":"2022","journal-title":"PLoS Comput Biol"},{"key":"2025071509022097600_btaf244-B32","doi-asserted-by":"crossref","first-page":"2443","DOI":"10.1111\/mec.16435","article-title":"The population genetics of adaptation through copy number variation in a fungal plant pathogen","volume":"32","author":"Stalder","year":"2023","journal-title":"Mol 
Ecol"},{"key":"2025071509022097600_btaf244-B33","doi-asserted-by":"crossref","first-page":"232","DOI":"10.1038\/nature07229","article-title":"Large recurrent microdeletions associated with schizophrenia","volume":"455","author":"Stefansson","year":"2008","journal-title":"Nature"},{"key":"2025071509022097600_btaf244-B34","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1038\/nature07239","article-title":"Rare chromosomal deletions and duplications increase risk of schizophrenia","volume":"455","author":"The International Schizophrenia Consortium,","year":"2008","journal-title":"Nature"},{"key":"2025071509022097600_btaf244-B35","doi-asserted-by":"crossref","first-page":"eabj6965","DOI":"10.1126\/science.abj6965","article-title":"Segmental duplications and their variation in a complete human genome","volume":"376","author":"Vollger","year":"2022","journal-title":"Science"},{"key":"2025071509022097600_btaf244-B36","doi-asserted-by":"crossref","first-page":"1203","DOI":"10.1111\/mec.15066","article-title":"Going beyond SNPs: the role of structural genomic variants in adaptive evolution and species diversification","volume":"28","author":"Wellenreuther","year":"2019","journal-title":"Mol Ecol"},{"key":"2025071509022097600_btaf244-B37","doi-asserted-by":"crossref","first-page":"451","DOI":"10.1146\/annurev.genom.9.081307.164217","article-title":"Copy number variation in human health, disease, and evolution","volume":"10","author":"Zhang","year":"2009","journal-title":"Annu Rev Genomics Hum 
Genet"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/Supplement_1\/i285\/63745472\/btaf244.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/Supplement_1\/i285\/63745472\/btaf244.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,15]],"date-time":"2025-07-15T13:02:29Z","timestamp":1752584549000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/41\/Supplement_1\/i285\/8199375"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,1]]},"references-count":37,"journal-issue":{"issue":"Supplement_1","published-print":{"date-parts":[[2025,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf244","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,7]]},"published":{"date-parts":[[2025,7,1]]}}}