{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T23:58:55Z","timestamp":1740182335784,"version":"3.37.3"},"reference-count":19,"publisher":"IOP Publishing","issue":"1","license":[{"start":{"date-parts":[[2020,12,24]],"date-time":"2020-12-24T00:00:00Z","timestamp":1608768000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,12,24]],"date-time":"2020-12-24T00:00:00Z","timestamp":1608768000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/iopscience.iop.org\/info\/page\/text-and-data-mining"}],"funder":[{"name":"Koret-UC Berkeley-Tel Aviv University Initiative in Computational Biology and Bioinformatics"},{"name":"Edmond J. Safra Center for Bioinformatics at Tel-Aviv University"}],"content-domain":{"domain":["iopscience.iop.org"],"crossmark-restriction":false},"short-container-title":["Mach. Learn.: Sci. Technol."],"published-print":{"date-parts":[[2021,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Non-negative matrix factorization (NMF) is a popular method for finding a low rank approximation of a matrix, thereby revealing the latent components behind it. In genomics, NMF is widely used to interpret mutation data and derive the underlying mutational processes and their activities. A key challenge in the use of NMF is determining the number of components, or rank of the factorization. Here we propose a novel method, CV2K, to choose this number automatically from data that is based on a detailed cross validation procedure combined with a parsimony consideration. We apply our method for mutational signature analysis and demonstrate its utility on both simulated and real data sets. 
In comparison to previous approaches, some of which involve human assessment, CV2K leads to improved predictions across a wide range of data sets.<\/jats:p>","DOI":"10.1088\/2632-2153\/abc60a","type":"journal-article","created":{"date-parts":[[2020,10,29]],"date-time":"2020-10-29T22:31:31Z","timestamp":1604010691000},"page":"015013","update-policy":"https:\/\/doi.org\/10.1088\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["An automated approach for determining the number of components in non-negative matrix factorization with application to mutational signature learning"],"prefix":"10.1088","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1913-8918","authenticated-orcid":false,"given":"Gal","family":"Gilad","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Itay","family":"Sason","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Roded","family":"Sharan","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"266","published-online":{"date-parts":[[2020,12,24]]},"reference":[{"key":"mlstabc60abib1","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1038\/44565","article-title":"Learning the parts of objects by non-negative matrix factorization","volume":"401","author":"Lee","year":"1999","journal-title":"Nature"},{"key":"mlstabc60abib2","doi-asserted-by":"publisher","first-page":"4164","DOI":"10.1073\/pnas.0308531101","article-title":"Metagenes and molecular pattern discovery using matrix factorization","volume":"101","author":"Brunet","year":"2004","journal-title":"Proc. Natl. Acad. Sci. 
USA"},{"key":"mlstabc60abib3","doi-asserted-by":"publisher","first-page":"2684","DOI":"10.1093\/bioinformatics\/btn526","article-title":"Position-dependent motif characterization using non-negative matrix factorization","volume":"24","author":"Hutchins","year":"2008","journal-title":"Bioinformatics"},{"key":"mlstabc60abib4","doi-asserted-by":"publisher","first-page":"1592","DOI":"10.1109\/TPAMI.2012.240","article-title":"Automatic relevance determination in nonnegative matrix factorization with the \u03b2-divergence","volume":"35","author":"Tan","year":"2013","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"mlstabc60abib5","doi-asserted-by":"publisher","first-page":"1241","DOI":"10.1007\/s00216-007-1790-1","article-title":"Cross-validation of component models: A critical look at current methods","volume":"390","author":"Bro","year":"2008","journal-title":"Anal. Bioanal. Chem."},{"key":"mlstabc60abib6","doi-asserted-by":"publisher","first-page":"564","DOI":"10.1214\/08-AOAS227","article-title":"Bi-cross-validation of the SVD and the nonnegative matrix factorization","volume":"3","author":"Owen","year":"06 2009","journal-title":"Ann. Appl. Stat."},{"key":"mlstabc60abib7","first-page":"5","article-title":"Le biplot\u2014outil d\u2019exploration de donn\u00e9es multidimensionnelles","volume":"143","author":"Gabriel","year":"2002","journal-title":"J. Soc. Fran\u00e7aise Stat."},{"key":"mlstabc60abib8","doi-asserted-by":"publisher","first-page":"397","DOI":"10.1080\/00401706.1978.10489693","article-title":"Cross-validatory estimation of the number of components in factor and principal components models","volume":"20","author":"Wold","year":"1978","journal-title":"Technometrics"},{"key":"mlstabc60abib9","doi-asserted-by":"publisher","first-page":"01","DOI":"10.1.1.185.1337","article-title":"Rank Selection in Low-rank Matrix Approximations: A Study of Cross-Validation for NMFs","volume":"1","author":"Kanagal","year":"2010","journal-title":"Proc. 
Conf. Adv. Neural Inf. Process"},{"key":"mlstabc60abib10","doi-asserted-by":"publisher","first-page":"12","DOI":"10.1186\/s12859-019-3312-5","article-title":"Optimization and expansion of non-negative matrix factorization","volume":"21","author":"Lin","year":"2020","journal-title":"BMC Bioinform."},{"key":"mlstabc60abib11","doi-asserted-by":"publisher","first-page":"246","DOI":"10.1016\/j.celrep.2012.12.008","article-title":"Deciphering signatures of mutational processes operative in human cancer","volume":"3","author":"Alexandrov","year":"2013","journal-title":"Cell Rep"},{"key":"mlstabc60abib12","doi-asserted-by":"publisher","first-page":"D777","DOI":"10.1038\/s41467-018-04002-4","article-title":"Distinct mutational signatures characterize concurrent loss of polymerase proofreading and mismatch repair","volume":"9","author":"Haradhvala","year":"05 2018","journal-title":"Nat. Commun."},{"key":"mlstabc60abib13","doi-asserted-by":"publisher","first-page":"1364","DOI":"10.1137\/070709967","article-title":"On the complexity of nonnegative matrix factorization","volume":"20","author":"Vavasis","year":"2010","journal-title":"SIAM J. 
Optim."},{"key":"mlstabc60abib14","doi-asserted-by":"publisher","first-page":"pp 307","DOI":"10.1145\/2339530.2339582","article-title":"Fast Bregman divergence NMF using Taylor expansion and coordinate descent","author":"Li","year":"2012"},{"year":"1995","author":"Lawson","key":"mlstabc60abib15","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611971217"},{"key":"mlstabc60abib16","doi-asserted-by":"publisher","first-page":"94","DOI":"10.1038\/s41586-020-1943-3","article-title":"The repertoire of mutational signatures in human cancer","volume":"578","author":"Alexandrov","year":"02 2020","journal-title":"Nature"},{"key":"mlstabc60abib17","doi-asserted-by":"publisher","first-page":"47","DOI":"10.1038\/nature17676","article-title":"Landscape of somatic mutations in 560 breast cancer whole-genome sequences","volume":"534","author":"Nik-Zainal","year":"2016","journal-title":"Nature"},{"key":"mlstabc60abib18","doi-asserted-by":"publisher","first-page":"A68","DOI":"10.5114\/wo.2014.47136","article-title":"The cancer genome atlas (TCGA): An immeasurable source of knowledge","volume":"19","author":"Tomczak","year":"2015","journal-title":"Contemp. Oncol. 
(Pozn)"},{"key":"mlstabc60abib19","doi-asserted-by":"publisher","first-page":"D777","DOI":"10.1093\/nar\/gkw1121","article-title":"COSMIC: somatic cancer genetics at high-resolution","volume":"45","author":"Forbes","year":"11 2016","journal-title":"Nucleic Acids Res."}],"container-title":["Machine Learning: Science and Technology"],"original-title":[],"link":[{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc60a","content-type":"text\/html","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc60a\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc60a","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc60a\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc60a\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc60a","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc60a\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc60a\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc60a\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"similarity-checking"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc60a\/pdf","co
ntent-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc60a\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc60a","content-type":"text\/html","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,1,22]],"date-time":"2022-01-22T01:43:24Z","timestamp":1642815804000},"score":1,"resource":{"primary":{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc60a"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,12,24]]},"references-count":19,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2020,12,24]]},"published-print":{"date-parts":[[2021,3,1]]}},"URL":"https:\/\/doi.org\/10.1088\/2632-2153\/abc60a","relation":{},"ISSN":["2632-2153"],"issn-type":[{"type":"electronic","value":"2632-2153"}],"subject":[],"published":{"date-parts":[[2020,12,24]]},"assertion":[{"value":"An automated approach for determining the number of components in non-negative matrix factorization with application to mutational signature learning","name":"article_title","label":"Article Title"},{"value":"Machine Learning: Science and Technology","name":"journal_title","label":"Journal Title"},{"value":"paper","name":"article_type","label":"Article Type"},{"value":"\u00a9 2020 The Author(s). 
Published by IOP Publishing Ltd","name":"copyright_information","label":"Copyright Information"},{"value":"2020-07-02","name":"date_received","label":"Date Received","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2020-10-29","name":"date_accepted","label":"Date Accepted","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2020-12-24","name":"date_epub","label":"Online publication date","group":{"name":"publication_dates","label":"Publication dates"}}]}}