{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,22]],"date-time":"2026-01-22T21:32:07Z","timestamp":1769117527392,"version":"3.49.0"},"reference-count":43,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,5,8]],"date-time":"2023-05-08T00:00:00Z","timestamp":1683504000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,5,8]],"date-time":"2023-05-08T00:00:00Z","timestamp":1683504000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>The spectrum of mutations in a collection of cancer genomes can be described by a mixture of a few mutational signatures. The mutational signatures can be found using non-negative matrix factorization (NMF). To extract the mutational signatures we have to assume a distribution for the observed mutational counts and a number of mutational signatures. In most applications, the mutational counts are assumed to be Poisson distributed, and the rank is chosen by comparing the fit of several models with the same underlying distribution and different values for the rank using classical model selection procedures. However, the counts are often overdispersed, and thus the Negative Binomial distribution is more appropriate.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>We propose a Negative Binomial NMF with a patient specific dispersion parameter to capture the variation across patients and derive the corresponding update rules for parameter estimation. 
We also introduce a novel model selection procedure inspired by cross-validation to determine the number of signatures. Using simulations, we study the influence of the distributional assumption on our method together with other classical model selection procedures. We also present a simulation study with a method comparison where we show that state-of-the-art methods are highly overestimating the number of signatures when overdispersion is present. We apply our proposed analysis on a wide range of simulated data and on two real data sets from breast and prostate cancer patients. On the real data we describe a residual analysis to investigate and validate the model choice.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>With our results on simulated and real data we show that our model selection procedure is more robust at determining the correct number of signatures under model misspecification. We also show that our model selection procedure is more accurate than the available methods in the literature for finding the true number of signatures. Lastly, the residual analysis clearly emphasizes the overdispersion in the mutational count data. 
The code for our model selection procedure and Negative Binomial NMF is available in the R package SigMoS and can be found at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/MartaPelizzola\/SigMoS\">https:\/\/github.com\/MartaPelizzola\/SigMoS<\/jats:ext-link>.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12859-023-05304-1","type":"journal-article","created":{"date-parts":[[2023,5,8]],"date-time":"2023-05-08T10:02:19Z","timestamp":1683540139000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Model selection and robust inference of mutational signatures using Negative Binomial non-negative matrix factorization"],"prefix":"10.1186","volume":"24","author":[{"given":"Marta","family":"Pelizzola","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ragnhild","family":"Laursen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Asger","family":"Hobolth","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2023,5,8]]},"reference":[{"issue":"1","key":"5304_CR1","doi-asserted-by":"publisher","DOI":"10.1371\/JOURNAL.PGEN.1007108","volume":"14","author":"RA Risques","year":"2018","unstructured":"Risques RA, Kennedy SR. Aging and the rise of somatic cancer-associated mutations in normal tissues. PLoS Genet. 2018;14(1): e1007108. https:\/\/doi.org\/10.1371\/JOURNAL.PGEN.1007108.","journal-title":"PLoS Genet"},{"issue":"1","key":"5304_CR2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41598-017-15008-1","volume":"7","author":"A Shibai","year":"2017","unstructured":"Shibai A, Takahashi Y, Ishizawa Y, Motooka D, Nakamura S, Ying B-W, Tsuru S. Mutation accumulation under UV radiation in Escherichia coli. Sci Rep. 
2017;7(1):1\u201312. https:\/\/doi.org\/10.1038\/s41598-017-15008-1.","journal-title":"Sci Rep"},{"issue":"6312","key":"5304_CR3","doi-asserted-by":"publisher","first-page":"618","DOI":"10.1126\/SCIENCE.AAG0299","volume":"354","author":"LB Alexandrov","year":"2016","unstructured":"Alexandrov LB, Ju YS, Haase K, Van Loo P, Martincorena I, Nik-Zainal S, Totoki Y, Fujimoto A, Nakagawa H, Shibata T, Campbell PJ, Vineis P, Phillips DH, Stratton MR. Mutational signatures associated with tobacco smoking in human cancer. Science. 2016;354(6312):618\u201322. https:\/\/doi.org\/10.1126\/SCIENCE.AAG0299.","journal-title":"Science"},{"issue":"7793","key":"5304_CR4","doi-asserted-by":"publisher","first-page":"94","DOI":"10.1038\/s41586-020-1943-3","volume":"578","author":"LB Alexandrov","year":"2020","unstructured":"Alexandrov LB, Kim J, Haradhvala NJ, Huang MN, Tian Ng AW, Wu Y, Boot A, Covington KR, Gordenin DA, Bergstrom EN, Islam SMA, Lopez-Bigas N, Klimczak LJ, McPherson JR, Morganella S, Sabarinathan R, Wheeler DA, Mustonen V, Getz G, Rozen SG, Stratton MR. The repertoire of mutational signatures in human cancer. Nature. 2020;578(7793):94\u2013101. https:\/\/doi.org\/10.1038\/s41586-020-1943-3.","journal-title":"Nature"},{"issue":"D1","key":"5304_CR5","doi-asserted-by":"publisher","first-page":"941","DOI":"10.1093\/NAR\/GKY1015","volume":"47","author":"JG Tate","year":"2019","unstructured":"Tate JG, Bamford S, Jubb HC, Sondka Z, Beare DM, Bindal N, Boutselakis H, Cole CG, Creatore C, Dawson E, Fish P, Harsha B, Hathaway C, Jupe SC, Kok CY, Noble K, Ponting L, Ramshaw CC, Rye CE, Speedy HE, Stefancsik R, Thompson SL, Wang S, Ward S, Campbell PJ, Forbes SA. COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res. 2019;47(D1):941\u20137. 
https:\/\/doi.org\/10.1093\/NAR\/GKY1015.","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"5304_CR6","doi-asserted-by":"publisher","first-page":"246","DOI":"10.1016\/j.celrep.2012.12.008","volume":"3","author":"LB Alexandrov","year":"2013","unstructured":"Alexandrov LB, Nik-Zainal S, Wedge DC, Campbell PJ, Stratton MR. Deciphering signatures of mutational processes operative in human cancer. Cell Rep. 2013;3(1):246\u2013259.","journal-title":"Cell Rep"},{"issue":"5","key":"5304_CR7","doi-asserted-by":"publisher","first-page":"979","DOI":"10.1016\/j.cell.2012.04.024","volume":"149","author":"S Nik-Zainal","year":"2012","unstructured":"...Nik-Zainal S, Alexandrov LB, Wedge DC, Van Loo P, Greenman CD, Raine K, Jones D, Hinton J, Marshall J, Stebbings LA, Menzies A, Martin S, Leung K, Chen L, Leroy C, Ramakrishna M, Rance R, Lau KW, Mudie LJ, Varela I, McBride DJ, Bignell GR, Cooke SL, Shlien A, Gamble J, Whitmore I, Maddison M, Tarpey PS, Davies HR, Papaemmanuil E, Stephens PJ, McLaren S, Butler AP, Teague JW, J\u00f6nsson G, Garber JE, Silver D, Miron P, Fatima A, Boyault S, Langerod A, Tutt A, Martens JWM, Aparicio SAJR, Borg \u00c5, Salomon AV, Thomas G, Borresen-Dale AL, Richardson AL, Neuberger MS, Futreal PA, Campbell PJ, Stratton MR. Mutational processes molding the genomes of 21 breast cancers. Cell. 2012;149(5):979\u201393. https:\/\/doi.org\/10.1016\/j.cell.2012.04.024.","journal-title":"Cell"},{"issue":"6","key":"5304_CR8","doi-asserted-by":"publisher","first-page":"1009119","DOI":"10.1371\/JOURNAL.PCBI.1009119","volume":"17","author":"A Lal","year":"2021","unstructured":"Lal A, Liu K, Tibshirani R, Sidow A, Ramazzotti D. De novo mutational signature discovery in tumor genomes using SparseSignatures. PLoS Comput Biol. 2021;17(6):1009119. 
https:\/\/doi.org\/10.1371\/JOURNAL.PCBI.1009119.","journal-title":"PLoS Comput Biol"},{"issue":"1","key":"5304_CR9","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1093\/bib\/bbx082","volume":"20","author":"A Baez-Ortega","year":"2017","unstructured":"Baez-Ortega A, Gori K. Computational approaches for discovery of mutational signatures in cancer. Brief Bioinform. 2017;20(1):77\u201388. https:\/\/doi.org\/10.1093\/bib\/bbx082.","journal-title":"Brief Bioinform"},{"issue":"9","key":"5304_CR10","doi-asserted-by":"publisher","first-page":"0221235","DOI":"10.1371\/journal.pone.0221235","volume":"14","author":"H Omichessan","year":"2019","unstructured":"Omichessan H, Severi G, Perduca V. Computational tools to detect signatures of mutational processes in DNA from tumours: a review and empirical comparison of performance. PLoS ONE. 2019;14(9):0221235. https:\/\/doi.org\/10.1371\/journal.pone.0221235.","journal-title":"PLoS ONE"},{"issue":"7463","key":"5304_CR11","doi-asserted-by":"publisher","first-page":"415","DOI":"10.1038\/nature12477","volume":"500","author":"LB Alexandrov","year":"2013","unstructured":"Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SA, Behjati S, Biankin AV, Bignell GR, Bolli N, Borg A, B\u00f8rresen-Dale A-L, et al. Signatures of mutational processes in human cancer. Nature. 2013;500(7463):415\u201321.","journal-title":"Nature"},{"issue":"4","key":"5304_CR12","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/gb-2013-14-4-r39","volume":"14","author":"A Fischer","year":"2013","unstructured":"Fischer A, Illingworth CJR, Campbell PJ, Mustonen V. EMu: Probabilistic inference of mutational processes and their localization in the cancer genome. Genome Biol. 2013;14(4):1\u201310. 
https:\/\/doi.org\/10.1186\/gb-2013-14-4-r39.","journal-title":"Genome Biol"},{"issue":"1","key":"5304_CR13","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1093\/bioinformatics\/btw572","volume":"33","author":"RA Rosales","year":"2017","unstructured":"Rosales RA, Drummond RD, Valieris R, Dias-Neto E, Da Silva IT. signeR: an empirical Bayesian approach to mutational signature discovery. Bioinformatics. 2017;33(1):8\u201316. https:\/\/doi.org\/10.1093\/bioinformatics\/btw572.","journal-title":"Bioinformatics"},{"issue":"6755","key":"5304_CR14","doi-asserted-by":"publisher","first-page":"788","DOI":"10.1038\/44565","volume":"401","author":"DD Lee","year":"1999","unstructured":"Lee DD, Seung HS. Learning the parts of objects by non-negative matrix factorization. Nature. 1999;401(6755):788\u201391. https:\/\/doi.org\/10.1038\/44565.","journal-title":"Nature"},{"issue":"2","key":"5304_CR15","doi-asserted-by":"publisher","first-page":"176","DOI":"10.2307\/3001850","volume":"9","author":"CI Bliss","year":"1953","unstructured":"Bliss CI, Fisher RA. Fitting the negative binomial distribution to biological data. Biometrics. 1953;9(2):176. https:\/\/doi.org\/10.2307\/3001850.","journal-title":"Biometrics"},{"issue":"5","key":"5304_CR16","doi-asserted-by":"publisher","first-page":"1029","DOI":"10.1016\/J.CELL.2017.09.042","volume":"171","author":"I Martincorena","year":"2017","unstructured":"Martincorena I, Raine K, Gerstung M, Dawson K, Haase K, Van Loo P, Davies H, Stratton M, Campbell P. Universal patterns of selection in cancer and somatic tissues. Cell. 2017;171(5):1029\u20131041.e21. https:\/\/doi.org\/10.1016\/J.CELL.2017.09.042.","journal-title":"Cell"},{"issue":"1","key":"5304_CR17","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/S12859-020-03758-1","volume":"21","author":"J Zhang","year":"2020","unstructured":"Zhang J, Liu J, McGillivray P, Yi C, Lochovsky L, Lee D, Gerstein M. 
NIMBus: a negative binomial regression based integrative method for mutation burden analysis. BMC Bioinform. 2020;21(1):1\u201325. https:\/\/doi.org\/10.1186\/S12859-020-03758-1.","journal-title":"BMC Bioinform"},{"key":"5304_CR18","doi-asserted-by":"publisher","first-page":"815","DOI":"10.1109\/LSP.2020.2991613","volume":"27","author":"O Gouvert","year":"2020","unstructured":"Gouvert O, Oberlin T, Fevotte C. Negative binomial matrix factorization. IEEE Signal Process Lett. 2020;27:815\u20139. https:\/\/doi.org\/10.1109\/LSP.2020.2991613.","journal-title":"IEEE Signal Process Lett"},{"key":"5304_CR19","doi-asserted-by":"publisher","unstructured":"Gori K, Baez-Ortega A. sigfit: flexible Bayesian inference of mutational signatures; 2018. https:\/\/doi.org\/10.1101\/372896","DOI":"10.1101\/372896"},{"issue":"Suppl-1","key":"5304_CR20","doi-asserted-by":"publisher","first-page":"154","DOI":"10.1093\/BIOINFORMATICS\/BTAA473","volume":"36","author":"X Lyu","year":"2020","unstructured":"Lyu X, Garret J, R\u00e4tsch G, Lehmann KV. Mutational signature learning with supervised negative binomial non-negative matrix factorization. Bioinformatics. 2020;36(Suppl-1):154\u201360. https:\/\/doi.org\/10.1093\/BIOINFORMATICS\/BTAA473.","journal-title":"Bioinformatics"},{"issue":"1","key":"5304_CR21","doi-asserted-by":"publisher","first-page":"3628","DOI":"10.1038\/s41467-021-23551-9","volume":"12","author":"H V\u00f6hringer","year":"2021","unstructured":"V\u00f6hringer H, Hoeck AV, Cuppen E, Gerstung M. Learning mutational signatures and their multidimensional genomic properties with TensorSignatures. Nat Commun. 2021;12(1):3628. https:\/\/doi.org\/10.1038\/s41467-021-23551-9.","journal-title":"Nat Commun"},{"issue":"3","key":"5304_CR22","doi-asserted-by":"publisher","first-page":"793","DOI":"10.1162\/NECO.2008.04-08-771","volume":"21","author":"C F\u00e9votte","year":"2009","unstructured":"F\u00e9votte C, Bertin N, Durrieu J. 
Nonnegative matrix factorization with the Itakura\u2013Saito divergence: with application to music analysis. Neural Comput. 2009;21(3):793\u2013830. https:\/\/doi.org\/10.1162\/NECO.2008.04-08-771.","journal-title":"Neural Comput"},{"issue":"11","key":"5304_CR23","doi-asserted-by":"publisher","DOI":"10.1016\/j.xgen.2022.100179","volume":"2","author":"SMA Islam","year":"2022","unstructured":"Islam SMA, D\u00edaz-Gay M, Wu Y, Barnes M, Vangara R, Bergstrom EN, He Y, Vella M, Wang J, Teague JW, Clapham P, Moody S, Senkin S, Li YR, Riva L, Zhang T, Gruber AJ, Steele CD, Otlu B, Khandekar A, Abbasi A, Humphreys L, Syulyukina N, Brady SW, Alexandrov BS, Pillay N, Zhang J, Adams DJ, Martincorena I, Wedge DC, Landi MT, Brennan P, Stratton MR, Rozen SG, Alexandrov LB. Uncovering novel mutational signatures by de novo extraction with SigProfilerExtractor. Cell Genomics. 2022;2(11): 100179. https:\/\/doi.org\/10.1016\/j.xgen.2022.100179.","journal-title":"Cell Genomics"},{"issue":"1","key":"5304_CR24","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-019-1836-7","volume":"20","author":"A Taylor-Weiner","year":"2019","unstructured":"Taylor-Weiner A, Aguet F, Haradhvala NJ, Gosai S, Anand S, Kim J, Ardlie K, Allen EMV, Getz G. Scaling computational genomics to millions of individuals with GPUs. Genome Biol. 2019;20(1):1\u20135. https:\/\/doi.org\/10.1186\/s13059-019-1836-7.","journal-title":"Genome Biol"},{"issue":"7793","key":"5304_CR25","doi-asserted-by":"publisher","first-page":"82","DOI":"10.1038\/s41586-020-1969-6","volume":"578","author":"PJ Campbell","year":"2020","unstructured":"Campbell PJ. Pan-cancer analysis of whole genomes. Nature. 2020;578(7793):82\u201393. https:\/\/doi.org\/10.1038\/s41586-020-1969-6.","journal-title":"Nature"},{"issue":"4","key":"5304_CR26","doi-asserted-by":"publisher","first-page":"351","DOI":"10.1080\/00401706.1993.10485350","volume":"35","author":"RD Cook","year":"1993","unstructured":"Cook RD. 
Exploring partial residual plots. Technometrics. 1993;35(4):351\u201362. https:\/\/doi.org\/10.1080\/00401706.1993.10485350.","journal-title":"Technometrics"},{"key":"5304_CR27","doi-asserted-by":"publisher","unstructured":"Miles J. Residual plot. 2014. https:\/\/doi.org\/10.1002\/9781118445112.stat06619.","DOI":"10.1002\/9781118445112.stat06619"},{"issue":"6591","key":"5304_CR28","doi-asserted-by":"publisher","first-page":"9283","DOI":"10.1126\/science.abl9283","volume":"376","author":"A Degasperi","year":"2022","unstructured":"Degasperi A, Zou X, Amarante TD, Martinez-Martinez A, Koh GCC, Dias JML, Heskin L, Chmelova L, Rinaldi G, Wang VYW, Nanda AS, Bernstein A, Momen SE, Young J, Perez-Gil D, Memari Y, Badja C, Shooter S, Czarnecki J, Brown MA, Davies HR, Nik-Zainal S, Ambrose JC, Arumugam P, Bevers R, Bleda M, Boardman-Pretty F, Boustred CR, Brittain H, Caulfield MJ, Chan GC, Fowler T, Giess A, Hamblin A, Henderson S, Hubbard TJP, Jackson R, Jones LJ, Kasperaviciute D, Kayikci M, Kousathanas A, Lahnstein L, Leigh SEA, Leong IUS, Lopez FJ, Maleady-Crowe F, McEntagart M, Minneci F, Moutsianas L, Mueller M, Murugaesu N, Need AC, O\u2019Donovan P, Odhams CA, Patch C, Perez-Gil D, Pereira MB, Pullinger J, Rahim T, Rendon A, Rogers T, Savage K, Sawant K, Scott RH, Siddiq A, Sieghart A, Smith SC, Sosinsky A, Stuckey A, Tanguy M, Tavares ALT, Thomas ERA, Thompson SR, Tucci A, Welland MJ, Williams E, Witkowska K, Wood SM. Substitution mutational signatures in whole-genome sequenced cancers in the UK population. Science. 2022;376(6591):9283. 
https:\/\/doi.org\/10.1126\/science.abl9283.","journal-title":"Science"},{"key":"5304_CR29","doi-asserted-by":"publisher","unstructured":"Nik-Zainal S, Davies H, Staaf J, Ramakrishna M, Glodzik D, Zou X, Martincorena I, Alexandrov LB, Martin S, Wedge DC, Loo PV, Ju YS, Smid M, Brinkman AB, Morganella S, Aure MR, Lingj\u00e6rde OC, Langer\u00f8d A, Ringn\u00e9r M, Ahn S-M, Boyault S, Brock JE, Broeks A, Butler A, Desmedt C, Dirix L, Dronov S, Fatima A, Foekens JA, Gerstung M, Hooijer GKJ, Jang SJ, Jones DR, Kim H-Y, King TA, Krishnamurthy S, Lee HJ, Lee J-Y, Li Y, McLaren S, Menzies A, Mustonen V, O\u2019Meara S, Pauport\u00e9 I, Pivot X, Purdie CA, Raine K, Ramakrishnan K, Rodr\u00edguez-Gonz\u00e1lez FG, Romieu G, Sieuwerts AM, Simpson PT, Shepherd R, Stebbings L, Stefansson OA, Teague J, Tommasi S, Treilleux I, den Eynden GGV, Vermeulen P, Vincent-Salomon A, Yates L, Caldas C, van\u2019t Veer L, Tutt A, Knappskog S, Tan BKT, Jonkers J, Borg \u00c5, Ueno NT, Sotiriou C, Viari A, Futreal PA, Campbell PJ, Span PN, Laere SV, Lakhani SR, Eyfjord JE, Thompson AM, Birney E, Stunnenberg HG, van\u00a0de Vijver MJ, Martens JWM, B\u00f8rresen-Dale A-L, Richardson AL, Kong G, Thomas G, Stratton MR. Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature 2016;534(7605):47\u201354. https:\/\/doi.org\/10.1038\/nature17676","DOI":"10.1038\/nature17676"},{"issue":"4","key":"5304_CR30","doi-asserted-by":"publisher","first-page":"1009309","DOI":"10.1371\/journal.pcbi.1009309","volume":"18","author":"D Lee","year":"2022","unstructured":"Lee D, Wang D, Yang XR, Shi J, Landi MT, Zhu B. SUITOR: selecting the number of mutational signatures through cross-validation. PLoS Comput Biol. 2022;18(4):1009309. 
https:\/\/doi.org\/10.1371\/journal.pcbi.1009309.","journal-title":"PLoS Comput Biol"},{"issue":"27","key":"5304_CR31","doi-asserted-by":"publisher","first-page":"5031","DOI":"10.1038\/s41388-020-1343-z","volume":"39","author":"G Pei","year":"2020","unstructured":"Pei G, Hu R, Dai Y, Zhao Z, Jia P. Decoding whole-genome mutational signatures in 37 human pan-cancers by denoising sparse autoencoder neural network. Oncogene. 2020;39(27):5031\u201341. https:\/\/doi.org\/10.1038\/s41388-020-1343-z.","journal-title":"Oncogene"},{"issue":"9","key":"5304_CR32","doi-asserted-by":"publisher","first-page":"2421","DOI":"10.1162\/NECO_a_00168","volume":"23","author":"C F\u00e9votte","year":"2011","unstructured":"F\u00e9votte C, Idier J. Algorithms for nonnegative matrix factorization with the $$\\beta$$-divergence. Neural Comput. 2011;23(9):2421\u201356 arXiv:1010.1763.","journal-title":"Neural Comput"},{"key":"5304_CR33","doi-asserted-by":"crossref","unstructured":"Li L, Lebanon G, Park H. Fast Bregman divergence NMF using Taylor expansion and coordinate descent. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining. 2012.","DOI":"10.1145\/2339530.2339582"},{"issue":"11","key":"5304_CR34","doi-asserted-by":"publisher","first-page":"1160","DOI":"10.1038\/NG.3101","volume":"46","author":"N Weinhold","year":"2014","unstructured":"Weinhold N, Jacobsen A, Schultz N, Sander C, Lee W. Genome-wide analysis of noncoding regulatory mutations in cancer. Nat Genet. 2014;46(11):1160\u20135. https:\/\/doi.org\/10.1038\/NG.3101.","journal-title":"Nat Genet"},{"issue":"17","key":"5304_CR35","doi-asserted-by":"publisher","first-page":"8123","DOI":"10.1093\/NAR\/GKV803","volume":"43","author":"L Lochovsky","year":"2015","unstructured":"Lochovsky L, Zhang J, Fu Y, Khurana E, Gerstein M. LARVA: an integrative framework for large-scale analysis of recurrent variants in noncoding annotations. Nucleic Acids Res. 2015;43(17):8123\u201334. 
https:\/\/doi.org\/10.1093\/NAR\/GKV803.","journal-title":"Nucleic Acids Res"},{"issue":"7457","key":"5304_CR36","doi-asserted-by":"publisher","first-page":"214","DOI":"10.1038\/nature12213","volume":"499","author":"MS Lawrence","year":"2013","unstructured":"Lawrence MS, Stojanov P, Polak P, Kryukov GV, Cibulskis K, Sivachenko A, Carter SL, Stewart C, Mermel CH, Roberts SA, Kiezun A, Hammerman PS, McKenna A, Drier Y, Zou L, Ramos AH, Pugh TJ, Stransky N, Helman E, Kim J, Sougnez C, Ambrogio L, Nickerson E, Shefler E, Cort\u00e9s ML, Auclair D, Saksena G, Voet D, Noble M, Dicara D, Lin P, Lichtenstein L, Heiman DI, Fennell T, Imielinski M, Hernandez B, Hodis E, Baca S, Dulak AM, Lohr J, Landau DA, Wu CJ, Melendez-Zajgla J, Hidalgo-Miranda A, Koren A, McCarroll SA, Mora J, Lee RS, Crompton B, Onofrio R, Parkin M, Winckler W, Ardlie K, Gabriel SB, Roberts CWM, Biegel JA, Stegmaier K, Bass AJ, Garraway LA, Meyerson M, Golub TR, Gordenin DA, Sunyaev S, Lander ES, Getz G. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature. 2013;499(7457):214\u20138. https:\/\/doi.org\/10.1038\/nature12213.","journal-title":"Nature"},{"issue":"1","key":"5304_CR37","doi-asserted-by":"publisher","first-page":"39","DOI":"10.12732\/ijpam.v98i1.5","volume":"98","author":"K Teerapabolarn","year":"2015","unstructured":"Teerapabolarn K. Negative Binomial approximation to the Beta Binomial distribution. Int J Pure Appl Math. 2015;98(1):39\u201343. https:\/\/doi.org\/10.12732\/ijpam.v98i1.5.","journal-title":"Int J Pure Appl Math"},{"issue":"1","key":"5304_CR38","doi-asserted-by":"publisher","first-page":"257","DOI":"10.1137\/20M1378971","volume":"43","author":"R Laursen","year":"2022","unstructured":"Laursen R, Hobolth A. A sampling algorithm to compute the set of feasible solutions for non-negative matrix factorization with an arbitrary rank. SIAM J Matrix Anal Appl. 
2022;43(1):257\u201373.","journal-title":"SIAM J Matrix Anal Appl"},{"key":"5304_CR39","doi-asserted-by":"publisher","first-page":"72","DOI":"10.1016\/J.PATREC.2018.09.003","volume":"116","author":"A Gupta","year":"2018","unstructured":"Gupta A, Datta S, Das S. Fast automatic estimation of the number of clusters from the minimum inter-center distance for k-means clustering. Pattern Recogn Lett. 2018;116:72\u20139. https:\/\/doi.org\/10.1016\/J.PATREC.2018.09.003.","journal-title":"Pattern Recogn Lett"},{"issue":"2","key":"5304_CR40","doi-asserted-by":"publisher","first-page":"945","DOI":"10.1093\/genetics\/155.2.945","volume":"155","author":"JK Pritchard","year":"2000","unstructured":"Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000;155(2):945\u201359.","journal-title":"Genetics"},{"issue":"4","key":"5304_CR41","doi-asserted-by":"publisher","first-page":"1827","DOI":"10.1534\/genetics.115.180992","volume":"203","author":"R Verity","year":"2016","unstructured":"Verity R, Nichols RA. Estimating the number of subpopulations (K) in structured populations. Genetics. 2016;203(4):1827\u201339. https:\/\/doi.org\/10.1534\/genetics.115.180992.","journal-title":"Genetics"},{"issue":"6","key":"5304_CR42","doi-asserted-by":"publisher","first-page":"997","DOI":"10.1007\/S11222-013-9416-2","volume":"24","author":"A Gelman","year":"2013","unstructured":"Gelman A, Hwang J, Vehtari A. Understanding predictive information criteria for Bayesian models. Stat Comput. 2013;24(6):997\u20131016. https:\/\/doi.org\/10.1007\/S11222-013-9416-2.","journal-title":"Stat Comput"},{"issue":"2","key":"5304_CR43","first-page":"183","volume":"59","author":"Y Luo","year":"2017","unstructured":"Luo Y, Al-Harbi K. Performances of LOO and WAIC as IRT model selection methods. Psychol Test Assess Model. 
2017;59(2):183\u2013205.","journal-title":"Psychol Test Assess Model"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-023-05304-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-023-05304-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-023-05304-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,8]],"date-time":"2023-05-08T10:03:05Z","timestamp":1683540185000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-023-05304-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,8]]},"references-count":43,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["5304"],"URL":"https:\/\/doi.org\/10.1186\/s12859-023-05304-1","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,8]]},"assertion":[{"value":"11 January 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 April 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 May 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not 
applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"187"}}