{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:15Z","timestamp":1772138055249,"version":"3.50.1"},"reference-count":41,"publisher":"Oxford University Press (OUP)","issue":"18","license":[{"start":{"date-parts":[[2022,7,28]],"date-time":"2022-07-28T00:00:00Z","timestamp":1658966400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100003006","name":"ETH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100003006","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,9,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Several recently developed single-cell DNA sequencing technologies enable whole-genome sequencing of thousands of cells. However, the ultra-low coverage of the sequenced data (&lt;0.05\u00d7 per cell) mostly limits their usage to the identification of copy number alterations in multi-megabase segments. Many tumors are not copy number-driven, and thus single-nucleotide variant (SNV)-based subclone detection may contribute to a more comprehensive view on intra-tumor heterogeneity. Due to the low coverage of the data, the identification of SNVs is only possible when superimposing the sequenced genomes of hundreds of genetically similar cells. Thus, we have developed a new approach to efficiently cluster tumor cells based on a Bayesian filtering approach of relevant loci and exploiting read overlap and phasing.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We developed Single Cell Data Tumor Clusterer (SECEDO, lat. 
\u2018to separate\u2019), a new method to cluster tumor cells based solely on SNVs, inferred on ultra-low coverage single-cell DNA sequencing data. We applied SECEDO to a synthetic dataset simulating 7250 cells and eight tumor subclones from a single patient and were able to accurately reconstruct the clonal composition, detecting 92.11% of the somatic SNVs, with the smallest clusters representing only 6.9% of the total population. When applied to five real single-cell sequencing datasets from a breast cancer patient, each consisting of \u22482000 cells, SECEDO was able to recover the major clonal composition in each dataset at the original coverage of 0.03\u00d7, achieving an Adjusted Rand Index (ARI) score of \u22480.6. The current state-of-the-art SNV-based clustering method achieved an ARI score of \u22480, even after merging cells to create higher coverage data (factor 10 increase), and was only able to match SECEDO's performance when pooling data from all five datasets, in addition to artificially increasing the sequencing coverage by a factor of 7. Variant calling on the resulting clusters recovered more than twice as many SNVs as would have been detected if calling on all cells together. Further, the allelic ratio of the called SNVs on each subcluster was more than double relative to the allelic ratio of the SNVs called without clustering, thus demonstrating that calling variants on subclones, in addition to both increasing sensitivity of SNV detection and attaching SNVs to subclones, significantly increases the confidence of the called variants.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>SECEDO is implemented in C++ and is publicly available at https:\/\/github.com\/ratschlab\/secedo. 
Instructions to download the data and the evaluation code to reproduce the findings in this paper are available at: https:\/\/github.com\/ratschlab\/secedo-evaluation. The code and data of the submitted version are archived at: https:\/\/doi.org\/10.5281\/zenodo.6516955.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btac510","type":"journal-article","created":{"date-parts":[[2022,7,28]],"date-time":"2022-07-28T09:46:41Z","timestamp":1659001601000},"page":"4293-4300","source":"Crossref","is-referenced-by-count":10,"title":["SECEDO: SNV-based subclone detection using ultra-low coverage single-cell DNA sequencing"],"prefix":"10.1093","volume":"38","author":[{"given":"Hana","family":"Rozho\u0148ov\u00e1","sequence":"first","affiliation":[{"name":"Biomedical Informatics Group, Department of Computer Science, ETH Zurich , Zurich, Switzerland"},{"name":"Institute of Integrative Biology, ETH Zurich , Zurich, Switzerland"},{"name":"Swiss Institute of Bioinformatics , Lausanne, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Daniel","family":"Danciu","sequence":"additional","affiliation":[{"name":"Biomedical Informatics Group, Department of Computer Science, ETH Zurich , Zurich, Switzerland"},{"name":"Biomedical Informatics Research, University Hospital Zurich , Zurich, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2478-9512","authenticated-orcid":false,"given":"Stefan","family":"Stark","sequence":"additional","affiliation":[{"name":"Biomedical Informatics Group, Department of Computer Science, ETH Zurich , Zurich, Switzerland"},{"name":"Swiss Institute of Bioinformatics , Lausanne, Switzerland"},{"name":"Biomedical Informatics 
Research, University Hospital Zurich , Zurich, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5486-8532","authenticated-orcid":false,"given":"Gunnar","family":"R\u00e4tsch","sequence":"additional","affiliation":[{"name":"Biomedical Informatics Group, Department of Computer Science, ETH Zurich , Zurich, Switzerland"},{"name":"Swiss Institute of Bioinformatics , Lausanne, Switzerland"},{"name":"Biomedical Informatics Research, University Hospital Zurich , Zurich, Switzerland"},{"name":"Department of Biology, ETH Zurich , Zurich, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3411-0692","authenticated-orcid":false,"given":"Andr\u00e9","family":"Kahles","sequence":"additional","affiliation":[{"name":"Biomedical Informatics Group, Department of Computer Science, ETH Zurich , Zurich, Switzerland"},{"name":"Swiss Institute of Bioinformatics , Lausanne, Switzerland"},{"name":"Biomedical Informatics Research, University Hospital Zurich , Zurich, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1936-298X","authenticated-orcid":false,"given":"Kjong-Van","family":"Lehmann","sequence":"additional","affiliation":[{"name":"Biomedical Informatics Group, Department of Computer Science, ETH Zurich , Zurich, Switzerland"},{"name":"Swiss Institute of Bioinformatics , Lausanne, Switzerland"},{"name":"Biomedical Informatics Research, University Hospital Zurich , Zurich, Switzerland"},{"name":"Cancer Research Center Cologne Essen, University Hospital Cologne , Cologne, Germany"},{"name":"Joint Research Center for Computational Biomedicine, University Hospital RWTH Aachen , Aachen, 
Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2022,7,28]]},"reference":[{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1038\/nature12477","article-title":"Signatures of mutational processes in human cancer","volume":"500","author":"Alexandrov","year":"2013","journal-title":"Nature"},{"key":"2023041408240285300_","author":"Arthur","year":"2006"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"749","DOI":"10.1038\/s41588-019-0366-2","article-title":"Linked-read analysis identifies mutations in single-cell DNA-sequencing data","volume":"51","author":"Bohrson","year":"2019","journal-title":"Nat. Genet"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"553","DOI":"10.1534\/genetics.113.154500","article-title":"A novel approach to estimating heterozygosity from low-coverage genome sequence","volume":"195","author":"Bryc","year":"2013","journal-title":"Genetics"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1038\/nbt.2514","article-title":"Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples","volume":"31","author":"Cibulskis","year":"2013","journal-title":"Nat. Biotechnol"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","article-title":"Maximum likelihood from incomplete data via the EM algorithm","volume":"39","author":"Dempster","year":"1977","journal-title":"J. R. Stat. Soc. 
B Methodol"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"2239","DOI":"10.1016\/j.cell.2021.03.009","article-title":"Characterizing genetic intra-tumor heterogeneity across 2,658 human cancer genomes","volume":"184","author":"Dentro","year":"2021","journal-title":"Cell"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"491","DOI":"10.1038\/nmeth.4227","article-title":"Accurate identification of single-nucleotide variants in whole-genome-amplified single cells","volume":"14","author":"Dong","year":"2017","journal-title":"Nat. Methods"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"496","DOI":"10.1038\/s41467-019-14256-1","article-title":"Single-cell analysis reveals new evolutionary complexity in uveal melanoma","volume":"11","author":"Durante","year":"2020","journal-title":"Nat. Commun"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1038\/nrg.2015.16","article-title":"Single-cell genome sequencing: current state of the science","volume":"17","author":"Gawad","year":"2016","journal-title":"Nat. Rev. Genet"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1038\/nature25480","article-title":"The landscape of genomic alterations across childhood cancers","volume":"555","author":"Gr\u00f6bner","year":"2018","journal-title":"Nature"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"2877","DOI":"10.3389\/fonc.2021.700568","article-title":"Somatic copy number alterations in human cancers: an analysis of publicly available data from the cancer genome atlas","volume":"11","author":"Harbers","year":"2021","journal-title":"Front. 
Oncol"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1186\/s13059-019-1673-8","article-title":"Conbase: a software for unsupervised discovery of clonal somatic mutations in single cells through read phasing","volume":"20","author":"H\u00e5rd","year":"2019","journal-title":"Genome Biol"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"593","DOI":"10.1093\/bioinformatics\/btr708","article-title":"ART: a next-generation sequencing read simulator","volume":"28","author":"Huang","year":"2011","journal-title":"Bioinformatics"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1007\/BF01908075","article-title":"Comparing partitions","volume":"2","author":"Hubert","year":"1985","journal-title":"J. Classif"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"2283","DOI":"10.1093\/bioinformatics\/btp373","article-title":"VarScan: variant detection in massively parallel sequencing of individual and pooled samples","volume":"25","author":"Koboldt","year":"2009","journal-title":"Bioinformatics"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1016\/j.bbcan.2017.02.001","article-title":"Advances in understanding tumour evolution through single-cell sequencing","volume":"1867","author":"Kuipers","year":"2017","journal-title":"Biochim. Biophys. Acta Rev. Cancer"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"6744","DOI":"10.1038\/s41467-021-26938-w","article-title":"Accurate and scalable variant calling from single cell DNA sequencing data with ProSolo","volume":"12","author":"L\u00e4hnemann","year":"2021","journal-title":"Nat. 
Commun"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"1207","DOI":"10.1016\/j.cell.2019.10.026","article-title":"Clonal decomposition and DNA replication states defined by scaled single-cell genome sequencing","volume":"179","author":"Laks","year":"2019","journal-title":"Cell"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/nmeth.1923","article-title":"Fast gapped-read alignment with Bowtie 2","volume":"9","author":"Langmead","year":"2012","journal-title":"Nat. Methods"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1038\/nature12213","article-title":"Mutational heterogeneity in cancer and the search for new cancer-associated genes","volume":"499","author":"Lawrence","year":"2013","journal-title":"Nature"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"495","DOI":"10.1038\/nature12912","article-title":"Discovery and saturation analysis of cancer genes across 21 tumour types","volume":"505","author":"Lawrence","year":"2014","journal-title":"Nature"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"2987","DOI":"10.1093\/bioinformatics\/btr509","article-title":"A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data","volume":"27","author":"Li","year":"2011","journal-title":"Bioinformatics"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"1851","DOI":"10.1101\/gr.078212.108","article-title":"Mapping short DNA sequencing reads and calling variants using mapping quality scores","volume":"18","author":"Li","year":"2008","journal-title":"Genome Res"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1109\/TIT.1982.1056489","article-title":"Least squares quantization in PCM","volume":"28","author":"Lloyd","year":"1982","journal-title":"IEEE Trans. Inform. 
Theory"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"3908","DOI":"10.1038\/s41467-019-11857-8","article-title":"Identification of somatic mutations in single cell DNA-seq using a spatial model of allelic imbalance","volume":"10","author":"Luquette","year":"2019","journal-title":"Nat. Commun"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"371","DOI":"10.1038\/nature25795","article-title":"Pan-cancer genome and transcriptome analyses of 1,699 paediatric leukaemias and solid tumours","volume":"555","author":"Ma","year":"2018","journal-title":"Nature"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"222","DOI":"10.1126\/science.1224344","article-title":"A high-coverage genome sequence from an archaic Denisovan individual","volume":"338","author":"Meyer","year":"2012","journal-title":"Science"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"1469","DOI":"10.1093\/bioinformatics\/btu828","article-title":"VarSim: a high-fidelity simulation and validation framework for high-throughput genome sequencing with cancer applications","volume":"31","author":"Mu","year":"2014","journal-title":"Bioinformatics"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"i186","DOI":"10.1093\/bioinformatics\/btaa449","article-title":"Identifying tumor clones in sparse single-cell mutation data","volume":"36 (Suppl. 1)","author":"Myers","year":"2020","journal-title":"Bioinformatics"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1038\/nature09807","article-title":"Tumour evolution inferred by single-cell sequencing","volume":"472","author":"Navin","year":"2011","journal-title":"Nature"},{"key":"2023041408240285300_","author":"Ng","year":"2001"},{"key":"2023041408240285300_","first-page":"1082","article-title":"Communities in networks","volume":"56","author":"Porter","year":"2009","journal-title":"Not. Am. Math. 
Soc"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"308","DOI":"10.1093\/nar\/29.1.308","article-title":"dbSNP: the NCBI database of genetic variation","volume":"29","author":"Sherry","year":"2001","journal-title":"Nucleic Acids Res"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"5144","DOI":"10.1038\/s41467-018-07627-7","article-title":"Single-cell mutation identification via phylogenetic inference","volume":"9","author":"Singer","year":"2018","journal-title":"Nat. Commun"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"719","DOI":"10.1038\/nature07943","article-title":"The cancer genome","volume":"458","author":"Stratton","year":"2009","journal-title":"Nature"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"D941","DOI":"10.1093\/nar\/gky1015","article-title":"COSMIC: the catalogue of somatic mutations in cancer","volume":"47","author":"Tate","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1038\/s42003-020-1044-8","article-title":"Single-cell sequencing of genomic DNA resolves sub-clonal heterogeneity in a melanoma cell line","volume":"3","author":"Velazquez-Villarreal","year":"2020","journal-title":"Commun. Biol"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1038\/s41587-020-0661-6","article-title":"Characterizing allele- and haplotype-specific copy numbers in single cells with CHISEL","volume":"39","author":"Zaccaria","year":"2021","journal-title":"Nat. Biotechnol"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"1134","DOI":"10.1038\/ng.2760","article-title":"Pan-cancer patterns of somatic copy number alteration","volume":"45","author":"Zack","year":"2013","journal-title":"Nat. 
Genet"},{"key":"2023041408240285300_","doi-asserted-by":"crossref","first-page":"505","DOI":"10.1038\/nmeth.3835","article-title":"Monovar: single-nucleotide variant detection in single cells","volume":"13","author":"Zafar","year":"2016","journal-title":"Nat. Methods"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btac510\/45226758\/btac510.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/18\/4293\/49885067\/btac510.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/18\/4293\/49885067\/btac510.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,29]],"date-time":"2024-09-29T22:48:09Z","timestamp":1727650089000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/38\/18\/4293\/6651099"}},"subtitle":[],"editor":[{"given":"Anthony","family":"Mathelier","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2022,7,28]]},"references-count":41,"journal-issue":{"issue":"18","published-print":{"date-parts":[[2022,9,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btac510","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.11.08.467510","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,9,15]]},"published":{"date-parts":[[2022,7,28]]}}}