{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T22:42:46Z","timestamp":1776379366234,"version":"3.51.2"},"reference-count":31,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2016,9,1]],"date-time":"2016-09-01T00:00:00Z","timestamp":1472688000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Mutational signatures can be used to understand cancer origins and provide a unique opportunity to group tumor types that share the same origins and result from similar processes. These signatures have been identified from high throughput sequencing data generated from cancer genomes by using non-negative matrix factorisation (NMF) techniques. Current methods based on optimization techniques are strongly sensitive to initial conditions due to high dimensionality and nonconvexity of the NMF paradigm. In this context, an important question consists in the determination of the actual number of signatures that best represent the data. The extraction of mutational signatures from high-throughput data still remains a daunting task.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Here we present a new method for the statistical estimation of mutational signatures based on an empirical Bayesian treatment of the NMF model. While requiring minimal intervention from the user, our method addresses the determination of the number of signatures directly as a model selection problem. 
In addition, we introduce two new concepts of significant clinical relevance for evaluating the mutational profile. The advantages brought by our approach are shown by the analysis of real and synthetic data. The latter is used to compare our approach against two alternative methods mostly used in the literature and with the same NMF parametrization as the one considered here. Our approach is robust to initial conditions and more accurate than competing alternatives. It also estimates the correct number of signatures even when other methods fail. Results on real data agree well with current knowledge.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and Implementation<\/jats:title>\n                  <jats:p>signeR is implemented in R and C++, and is available as an R package at http:\/\/bioconductor.org\/packages\/signeR.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btw572","type":"journal-article","created":{"date-parts":[[2016,9,3]],"date-time":"2016-09-03T00:19:48Z","timestamp":1472861988000},"page":"8-16","source":"Crossref","is-referenced-by-count":112,"title":["signeR: an empirical Bayesian approach to mutational signature discovery"],"prefix":"10.1093","volume":"33","author":[{"given":"Rafael A","family":"Rosales","sequence":"first","affiliation":[{"name":"Departamento de Computa\u00e7\u00e3o e Matem\u00e1tica, Universidade de S\u00e3o Paulo, Ribeir\u00e3o Preto, SP, Brazil"}]},{"given":"Rodrigo D","family":"Drummond","sequence":"additional","affiliation":[{"name":"Laboratory of Bioinformatics and Computational Biology, A. C. 
Camargo Cancer Center, S\u00e3o Paulo, SP, Brazil"}]},{"given":"Renan","family":"Valieris","sequence":"additional","affiliation":[{"name":"Laboratory of Bioinformatics and Computational Biology, A. C. Camargo Cancer Center, S\u00e3o Paulo, SP, Brazil"}]},{"given":"Emmanuel","family":"Dias-Neto","sequence":"additional","affiliation":[{"name":"Laboratory of Medical Genomics, A. C. Camargo Cancer Center, S\u00e3o Paulo, SP, Brazil"},{"name":"Laboratory of Neurosciences (LIM27), Department and Institute of Psychiatry, Faculty of Medicine, University of S\u00e3o Paulo, S\u00e3o Paulo, SP, Brazil"}]},{"given":"Israel T","family":"da Silva","sequence":"additional","affiliation":[{"name":"Laboratory of Bioinformatics and Computational Biology, A. C. Camargo Cancer Center, S\u00e3o Paulo, SP, Brazil"},{"name":"Laboratory of Molecular Immunology, The Rockefeller University, New York, NY, USA"}]}],"member":"286","published-online":{"date-parts":[[2016,9,1]]},"reference":[{"key":"2023020204303589200_btw572-B1","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1016\/j.celrep.2012.12.008","article-title":"Deciphering signatures of mutational processes operative in human cancer","volume":"3","author":"Alexandrov","year":"2013","journal-title":"Cell Rep"},{"key":"2023020204303589200_btw572-B2","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1016\/j.gde.2013.11.014","article-title":"Mutational signatures: the patterns of somatic mutations hidden in cancer genomes","volume":"24","author":"Alexandrov","year":"2014","journal-title":"Curr. Opin. Genet. 
Dev"},{"key":"2023020204303589200_btw572-B3","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1038\/nature12477","article-title":"Signatures of mutational processes in human cancer","volume":"500","author":"Alexandrov","year":"2013","journal-title":"Nature"},{"key":"2023020204303589200_btw572-B4","first-page":"1705","article-title":"Clustering with Bregman divergences","volume":"6","author":"Banerjee","year":"2005","journal-title":"J. Mach. Learn. Res"},{"key":"2023020204303589200_btw572-B5","doi-asserted-by":"crossref","first-page":"202","DOI":"10.1038\/nature13480","article-title":"Comprehensive molecular characterization of gastric adenocarcinoma","volume":"513","author":"Bass","year":"2014","journal-title":"Nature"},{"key":"2023020204303589200_btw572-B6","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1016\/j.csda.2006.11.006","article-title":"Algorithms and applications for approximate nonnegative matrix factorization","volume":"52","author":"Berry","year":"2007","journal-title":"Comput. Stat. Data Anal"},{"key":"2023020204303589200_btw572-B7","doi-asserted-by":"crossref","first-page":"1350","DOI":"10.1016\/j.patcog.2007.09.010","article-title":"Svd based initialization: a head start for nonnegative matrix factorization","volume":"41","author":"Boutsidis","year":"2008","journal-title":"Pattern Recogn"},{"key":"2023020204303589200_btw572-B8","doi-asserted-by":"crossref","first-page":"4164","DOI":"10.1073\/pnas.0308531101","article-title":"Metagenes and molecular pattern discovery using matrix factorization","volume":"101","author":"Brunet","year":"2004","journal-title":"Proc. Natl. Acad. Sci. 
USA"},{"key":"2023020204303589200_btw572-B9","doi-asserted-by":"crossref","first-page":"485","DOI":"10.1093\/biostatistics\/2.4.485","article-title":"Empirical Bayes Gibbs sampling","volume":"2","author":"Casella","year":"2001","journal-title":"Biostatistics"},{"key":"2023020204303589200_btw572-B10","first-page":"4:1","article-title":"Bayesian inference for nonnegative matrix factorisation models","volume":"2009","author":"Cemgil","year":"2009","journal-title":"Intell. Neurosci"},{"key":"2023020204303589200_btw572-B11","doi-asserted-by":"crossref","first-page":"1313","DOI":"10.1080\/01621459.1995.10476635","article-title":"Marginal likelihood from the Gibbs output","volume":"90","author":"Chib","year":"1995","journal-title":"J. Am. Statist. Assoc"},{"key":"2023020204303589200_btw572-B12","doi-asserted-by":"crossref","first-page":"398","DOI":"10.1101\/gr.125567.111","article-title":"Mutual exclusivity analysis identifies oncogenic network modules","volume":"22","author":"Ciriello","year":"2012","journal-title":"Genome Res"},{"key":"2023020204303589200_btw572-B13","doi-asserted-by":"crossref","first-page":"537","DOI":"10.2217\/bmt.13.59","article-title":"Roles of p53 and p16 in triple-negative breast cancer","volume":"2","author":"Dang","year":"2013","journal-title":"Breast Cancer Manag"},{"key":"2023020204303589200_btw572-B14","author":"F\u00e9votte","year":"2009"},{"key":"2023020204303589200_btw572-B15","doi-asserted-by":"crossref","first-page":"R39+","DOI":"10.1186\/gb-2013-14-4-r39","article-title":"EMu: probabilistic inference of mutational processes and their localization in the cancer genome","volume":"14","author":"Fischer","year":"2013","journal-title":"Genome Biol"},{"key":"2023020204303589200_btw572-B16","doi-asserted-by":"crossref","first-page":"1220","DOI":"10.1214\/aos\/1059655912","article-title":"Convergence of the Monte Carlo expectation maximization for curved exponential families","volume":"31","author":"Fort","year":"2003","journal-title":"Ann. 
Statist"},{"key":"2023020204303589200_btw572-B17","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1186\/1471-2105-11-367","article-title":"A flexible R package for nonnegative matrix factorization","volume":"11","author":"Gaujoux","year":"2010","journal-title":"BMC Bioinf"},{"key":"2023020204303589200_btw572-B18","first-page":"147","article-title":"Conjugate likelihood distributions","volume":"20","author":"George","year":"1993","journal-title":"Scand. J. Stat"},{"key":"2023020204303589200_btw572-B19","doi-asserted-by":"crossref","first-page":"585","DOI":"10.1038\/nrg3729","article-title":"Mechanisms underlying mutational signatures in human cancers","volume":"15","author":"Helleday","year":"2014","journal-title":"Nat. Rev. Genet"},{"key":"2023020204303589200_btw572-B20","doi-asserted-by":"crossref","first-page":"2684","DOI":"10.1093\/bioinformatics\/btn526","article-title":"Position-dependent motif characterization using non-negative matrix factorization","volume":"24","author":"Hutchins","year":"2008","journal-title":"Bioinformatics"},{"key":"2023020204303589200_btw572-B21","doi-asserted-by":"crossref","first-page":"773","DOI":"10.1080\/01621459.1995.10476572","article-title":"Bayes factors","volume":"90","author":"Kass","year":"1995","journal-title":"J. Am. Statist. Assoc"},{"key":"2023020204303589200_btw572-B22","first-page":"556","volume-title":"Advances in Neural Information Processing Systems 13","author":"Lee","year":"2001"},{"key":"2023020204303589200_btw572-B23","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1186\/bcr417","article-title":"Distinct functions of BRCA1 and BRCA2 in double-strand break repair","volume":"4","author":"Liu","year":"2002","journal-title":"Breast Cancer Res"},{"key":"2023020204303589200_btw572-B24","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1038\/nrc.2015.21","article-title":"BRCAness revisited","volume":"16","author":"Lord","year":"2016","journal-title":"Nat. Rev. 
Cancer"},{"key":"2023020204303589200_btw572-B25","doi-asserted-by":"crossref","first-page":"2959","DOI":"10.3892\/ol.2016.4337","article-title":"Lauren classification and individualized chemotherapy in gastric cancer","volume":"11","author":"Ma","year":"2016","journal-title":"Oncol. Lett"},{"key":"2023020204303589200_btw572-B26","doi-asserted-by":"crossref","first-page":"65","DOI":"10.2307\/1268384","article-title":"Bayesian analysis of the two-parameter gamma distribution","volume":"22","author":"Miller","year":"1980","journal-title":"Technometrics"},{"key":"2023020204303589200_btw572-B27","doi-asserted-by":"crossref","first-page":"979","DOI":"10.1016\/j.cell.2012.04.024","article-title":"Mutational processes molding the genomes of 21 breast cancers","volume":"149","author":"Nik-Zainal","year":"2012","journal-title":"Cell"},{"key":"2023020204303589200_btw572-B28","doi-asserted-by":"crossref","first-page":"786","DOI":"10.1038\/nrc3816","article-title":"Hypermutation in human cancer genomes: footprints and mechanisms","volume":"14","author":"Roberts","year":"2014","journal-title":"Nat. Rev. Cancer"},{"key":"2023020204303589200_btw572-B29","doi-asserted-by":"crossref","first-page":"31.","DOI":"10.1186\/s13059-016-0893-4","article-title":"deconstructSigs: delineating mutational processes in single tumors distinguishes DNA repair deficiencies and patterns of carcinoma evolution","volume":"17","author":"Rosenthal","year":"2016","journal-title":"Genome Biol"},{"key":"2023020204303589200_btw572-B30","first-page":"540","volume-title":"Independent Component Analysis and Signal Separation, vol. 
5441, of Lecture Notes in Computer Science","author":"Schmidt","year":"2009"},{"key":"2023020204303589200_btw572-B31","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pgen.1005657","article-title":"A simple model-based approach to inferring and visualizing cancer mutation signatures","volume":"11","author":"Shiraishi","year":"2015","journal-title":"PLoS Genet"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/1\/8\/49037248\/bioinformatics_33_1_8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/1\/8\/49037248\/bioinformatics_33_1_8.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T04:31:25Z","timestamp":1675312285000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/1\/8\/2525683"}},"subtitle":[],"editor":[{"given":"Alfonso","family":"Valencia","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2016,9,1]]},"references-count":31,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2017,1,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btw572","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2017,1,1]]},"published":{"date-parts":[[2016,9,1]]}}}