{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T22:42:49Z","timestamp":1776379369243,"version":"3.51.2"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1009119","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2021,7,9]],"date-time":"2021-07-09T00:00:00Z","timestamp":1625788800000}}],"reference-count":40,"publisher":"Public Library of Science (PLoS)","issue":"6","license":[{"start":{"date-parts":[[2021,6,28]],"date-time":"2021-06-28T00:00:00Z","timestamp":1624838400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"NIH","award":["R01 grant"],"award-info":[{"award-number":["R01 grant"]}]},{"name":"BRCA Foundation","award":["Gift Funding"],"award-info":[{"award-number":["Gift Funding"]}]},{"name":"BRCA Foundation","award":["Young Investigator Award"],"award-info":[{"award-number":["Young Investigator Award"]}]},{"DOI":"10.13039\/501100002954","name":"Universit\u00e0 degli Studi di Milano-Bicocca","doi-asserted-by":"publisher","award":["Starting Grant"],"award-info":[{"award-number":["Starting Grant"]}],"id":[{"id":"10.13039\/501100002954","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002954","name":"Universit\u00e0 degli Studi di Milano-Bicocca","doi-asserted-by":"crossref","award":["Premio Giovani Talenti dell'Universit\u00e0 degli Studi di Milano-Bicocca."],"award-info":[{"award-number":["Premio Giovani Talenti dell'Universit\u00e0 degli Studi di Milano-Bicocca."]}],"id":[{"id":"10.13039\/501100002954","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>Cancer is the result of mutagenic processes that can be inferred from tumor genomes by analyzing rate spectra of point mutations, or \u201cmutational signatures\u201d. Here we present SparseSignatures, a novel framework to extract signatures from somatic point mutation data. Our approach incorporates a user-specified background signature, employs regularization to reduce noise in non-background signatures, uses cross-validation to identify the number of signatures, and is scalable to large datasets. We show that SparseSignatures outperforms current state-of-the-art methods on simulated data using a variety of standard metrics. We then apply SparseSignatures to whole genome sequences of pancreatic and breast tumors, discovering well-differentiated signatures that are linked to known mutagenic mechanisms and are strongly associated with patient clinical features.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1009119","type":"journal-article","created":{"date-parts":[[2021,6,28]],"date-time":"2021-06-28T13:39:59Z","timestamp":1624887599000},"page":"e1009119","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":32,"title":["De novo mutational signature discovery in tumor genomes using SparseSignatures"],"prefix":"10.1371","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5827-0826","authenticated-orcid":true,"given":"Avantika","family":"Lal","sequence":"first","affiliation":[]},{"given":"Keli","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Robert","family":"Tibshirani","sequence":"additional","affiliation":[]},{"given":"Arend","family":"Sidow","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6087-2666","authenticated-orcid":true,"given":"Daniele","family":"Ramazzotti","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2021,6,28]]},"reference":[{"key":"pcbi.1009119.ref001","doi-asserted-by":"crossref","first-page":"1546","DOI":"10.1126\/science.1235122","article-title":"Cancer genome landscapes","volume":"339","author":"B Vogelstein","year":"2013","journal-title":"Science"},{"key":"pcbi.1009119.ref002","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1038\/nature12477","article-title":"Signatures of mutational processes in human cancer","volume":"500","author":"LB Alexandrov","year":"2013","journal-title":"Nature"},{"key":"pcbi.1009119.ref003","doi-asserted-by":"crossref","first-page":"3924","DOI":"10.1038\/s41388-018-0245-9","article-title":"APOBEC3B and APOBEC mutational signature as potential predictive markers for immunotherapy response in non-small cell lung cancer","volume":"37","author":"S Wang","year":"2018","journal-title":"Oncogene"},{"key":"pcbi.1009119.ref004","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1016\/j.celrep.2012.12.008","article-title":"Deciphering signatures of mutational processes operative in human cancer","volume":"3","author":"LB Alexandrov","year":"2013","journal-title":"Cell Rep"},{"key":"pcbi.1009119.ref005","doi-asserted-by":"crossref","first-page":"3673","DOI":"10.1093\/bioinformatics\/btv408","article-title":"SomaticSignatures: inferring mutational signatures from single-nucleotide variants","volume":"31","author":"JS Gehring","year":"2015","journal-title":"Bioinformatics"},{"key":"pcbi.1009119.ref006","doi-asserted-by":"crossref","first-page":"e1005657","DOI":"10.1371\/journal.pgen.1005657","article-title":"A Simple Model-Based Approach to Inferring and Visualizing Cancer Mutation Signatures","volume":"11","author":"Y Shiraishi","year":"2015","journal-title":"PLoS Genet"},{"key":"pcbi.1009119.ref007","doi-asserted-by":"crossref","first-page":"2997","DOI":"10.1038\/ncomms3997","article-title":"Heterogeneity of genomic evolution and mutational profiles in multiple myeloma","volume":"5","author":"N Bolli","year":"2014","journal-title":"Nat Commun"},{"key":"pcbi.1009119.ref008","doi-asserted-by":"crossref","first-page":"505","DOI":"10.1038\/ng.3252","article-title":"Exome sequencing of hepatocellular carcinomas identifies new mutational signatures and potential therapeutic targets","volume":"47","author":"K Schulze","year":"2015","journal-title":"Nat Genet"},{"key":"pcbi.1009119.ref009","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1038\/nature17676","article-title":"Landscape of somatic mutations in 560 breast cancer whole-genome sequences","volume":"534","author":"S Nik-Zainal","year":"2016","journal-title":"Nature"},{"key":"pcbi.1009119.ref010","doi-asserted-by":"crossref","first-page":"1402","DOI":"10.1038\/ng.3441","article-title":"Clock-like mutational processes in human somatic cells","volume":"47","author":"LB Alexandrov","year":"2015","journal-title":"Nat Genet"},{"key":"pcbi.1009119.ref011","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1038\/s41586-020-1943-3","article-title":"The repertoire of mutational signatures in human cancer","volume":"578","author":"LB Alexandrov","year":"2020","journal-title":"Nature"},{"key":"pcbi.1009119.ref012","doi-asserted-by":"crossref","first-page":"618","DOI":"10.1126\/science.aag0299","article-title":"Mutational signatures associated with tobacco smoking in human cancer","volume":"354","author":"LB Alexandrov","year":"2016","journal-title":"Science"},{"key":"pcbi.1009119.ref013","doi-asserted-by":"crossref","first-page":"585","DOI":"10.1038\/nrg3729","article-title":"Mechanisms underlying mutational signatures in human cancers","author":"T Helleday","year":"2014","journal-title":"Nature Reviews Genetics"},{"key":"pcbi.1009119.ref014","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression Shrinkage and Selection Via the Lasso","author":"R Tibshirani","year":"1996","journal-title":"Journal of the Royal Statistical Society: Series B (Methodological)."},{"key":"pcbi.1009119.ref015","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1109\/TPAMI.2006.60","article-title":"Nonsmooth nonnegative matrix factorization (nsNMF)","volume":"28","author":"A Pascual-Montano","year":"2006","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"pcbi.1009119.ref016","doi-asserted-by":"crossref","first-page":"1495","DOI":"10.1093\/bioinformatics\/btm134","article-title":"Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis","author":"H Kim","year":"2007","journal-title":"Bioinformatics"},{"key":"pcbi.1009119.ref017","first-page":"036541","article-title":"Mutation signatures reveal biological processes in human cancer","author":"KR Covington","year":"2016","journal-title":"Cold Spring Harbor Laboratory"},{"key":"pcbi.1009119.ref018","doi-asserted-by":"crossref","first-page":"W514","DOI":"10.1093\/nar\/gkx367","article-title":"Exploring background mutational processes to decipher cancer genetic heterogeneity","volume":"45","author":"A Goncearenco","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1009119.ref019","doi-asserted-by":"crossref","first-page":"1282","DOI":"10.1016\/j.cell.2019.02.012","article-title":"Characterizing Mutational Signatures in Human Cancer Cell Lines Reveals Episodic APOBEC Mutagenesis","volume":"176","author":"M Petljak","year":"2019","journal-title":"Cell"},{"key":"pcbi.1009119.ref020","doi-asserted-by":"crossref","first-page":"911","DOI":"10.1126\/science.aau3879","article-title":"Somatic mutant clones colonize the human esophagus with age","author":"I Martincorena","year":"2018","journal-title":"Science"},{"key":"pcbi.1009119.ref021","first-page":"2020.11.25.398172","article-title":"The mutational landscape of human somatic and germline cells","author":"L Moore","year":"2020","journal-title":"bioRxiv"},{"key":"pcbi.1009119.ref022","doi-asserted-by":"crossref","first-page":"260","DOI":"10.1038\/nature19768","article-title":"Tissue-specific mutation accumulation in human adult stem cells during life","volume":"538","author":"F Blokzijl","year":"2016","journal-title":"Nature"},{"key":"pcbi.1009119.ref023","first-page":"2021.01.09.426041","article-title":"Signatures of Mutational Processes in Human DNA Evolution","author":"H Hamidi","year":"2021","journal-title":"bioRxiv"},{"key":"pcbi.1009119.ref024","first-page":"126","article-title":"Timing, rates and spectra of human germline mutation","author":"UK10K Consortium","year":"2016","journal-title":"Nature Genetics"},{"key":"pcbi.1009119.ref025","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.apnum.2014.05.009","article-title":"Non-negative Matrix Factorization under equality constraints\u2014a study of industrial source identification","volume":"85","author":"A Limem","year":"2014","journal-title":"Appl Numer Math"},{"key":"pcbi.1009119.ref026","author":"K Gori","journal-title":"sigfit: flexible Bayesian inference of mutational signatures"},{"key":"pcbi.1009119.ref027","first-page":"2287","article-title":"Spectral Regularization Algorithms for Learning Large Incomplete Matrices","volume":"11","author":"R Mazumder","year":"2010","journal-title":"J Mach Learn Res."},{"key":"pcbi.1009119.ref028","first-page":"1592","article-title":"Automatic Relevance Determination in Nonnegative Matrix Factorization with the \/spl beta\/-Divergence","author":"VYF Tan","year":"2013","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"pcbi.1009119.ref029","doi-asserted-by":"crossref","first-page":"1189","DOI":"10.1038\/s41588-020-0692-4","article-title":"The mutational signature profile of known and suspected human carcinogens in mice","volume":"52","author":"L Riva","year":"2020","journal-title":"Nat Genet"},{"key":"pcbi.1009119.ref030","first-page":"564","article-title":"Bi-cross-validation of the SVD and the nonnegative matrix factorization","volume":"3","author":"AB Owen","year":"2009","journal-title":"Ann Appl Stat"},{"key":"pcbi.1009119.ref031","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1093\/bioinformatics\/btw572","article-title":"signeR: an empirical Bayesian approach to mutational signature discovery","volume":"33","author":"RA Rosales","year":"2017","journal-title":"Bioinformatics"},{"key":"pcbi.1009119.ref032","doi-asserted-by":"crossref","first-page":"4453","DOI":"10.1038\/s41467-018-06921-8","article-title":"Multi-omic tumor data reveal diversity of molecular mechanisms that correlate with survival.","volume":"9","author":"D Ramazzotti","year":"2018","journal-title":"Nat Commun"},{"key":"pcbi.1009119.ref033","doi-asserted-by":"crossref","first-page":"414","DOI":"10.1038\/nmeth.4207","article-title":"Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning","volume":"14","author":"B Wang","year":"2017","journal-title":"Nat Methods"},{"key":"pcbi.1009119.ref034","doi-asserted-by":"crossref","DOI":"10.1002\/pmic.201700232","article-title":"SIMLR: A tool for large-scale genomic analyses by Multi-kernel LeaRning","volume":"18","author":"B Wang","year":"2018","journal-title":"Proteomics"},{"key":"pcbi.1009119.ref035","doi-asserted-by":"crossref","first-page":"514","DOI":"10.1038\/ng1103","article-title":"NISC Comparative Sequencing Program, Green ED. Transcription-associated mutational asymmetry in mammalian evolution","volume":"33","author":"P Green","year":"2003","journal-title":"Nat Genet"},{"key":"pcbi.1009119.ref036","doi-asserted-by":"crossref","first-page":"4164","DOI":"10.1073\/pnas.0308531101","article-title":"Metagenes and molecular pattern discovery using matrix factorization","volume":"101","author":"J-P Brunet","year":"2004","journal-title":"Proc Natl Acad Sci U S A"},{"key":"pcbi.1009119.ref037","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v033.i01","article-title":"Regularization Paths for Generalized Linear Models via Coordinate Descent","volume":"33","author":"J Friedman","year":"2010","journal-title":"J Stat Softw"},{"key":"pcbi.1009119.ref038","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1186\/s13059-016-0893-4","article-title":"DeconstructSigs: delineating mutational processes in single tumors distinguishes DNA repair deficiencies and patterns of carcinoma evolution","volume":"17","author":"R Rosenthal","year":"2016","journal-title":"Genome Biol"},{"key":"pcbi.1009119.ref039","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1186\/1471-2105-11-367","article-title":"A flexible R package for nonnegative matrix factorization","volume":"11","author":"R Gaujoux","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"pcbi.1009119.ref040","doi-asserted-by":"crossref","first-page":"478","DOI":"10.1038\/ng.2591","article-title":"Exome and whole-genome sequencing of esophageal adenocarcinoma identifies recurrent driver events and mutational complexity","volume":"45","author":"AM Dulak","year":"2013","journal-title":"Nat Genet"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1009119","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2021,7,9]],"date-time":"2021-07-09T00:00:00Z","timestamp":1625788800000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1009119","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,2]],"date-time":"2024-09-02T16:59:56Z","timestamp":1725296396000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1009119"}},"subtitle":[],"editor":[{"given":"Daniel","family":"Huebschmann","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,6,28]]},"references-count":40,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2021,6,28]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1009119","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/384834","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,6,28]]}}}