{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,8]],"date-time":"2026-01-08T20:58:20Z","timestamp":1767905900301,"version":"3.49.0"},"reference-count":27,"publisher":"Oxford University Press (OUP)","issue":"15","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2011,8,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Over the past decade, several biclustering approaches have been published in the field of gene expression data analysis. Despite of huge diversity regarding the mathematical concepts of the different biclustering methods, many of them can be related to the singular value decomposition (SVD). Recently, a sparse SVD approach (SSVD) has been proposed to reveal biclusters in gene expression data. In this article, we propose to incorporate stability selection to improve this method. Stability selection is a subsampling-based variable selection that allows to control Type I error rates. The here proposed S4VD algorithm incorporates this subsampling approach to find stable biclusters, and to estimate the selection probabilities of genes and samples to belong to the biclusters.<\/jats:p><jats:p>Results: So far, the S4VD method is the first biclustering approach that takes the cluster stability regarding perturbations of the data into account. Application of the S4VD algorithm to a lung cancer microarray dataset revealed biclusters that correspond to coregulated genes associated with cancer subtypes. Marker genes for different lung cancer subtypes showed high selection probabilities to belong to the corresponding biclusters. Moreover, the genes associated with the biclusters belong to significantly enriched cancer-related Gene Ontology categories. In a simulation study, the S4VD algorithm outperformed the SSVD algorithm and two other SVD-related biclustering methods in recovering artificial biclusters and in being robust to noisy data.<\/jats:p><jats:p>Availability: R-Code of the S4VD algorithm as well as a documentation can be found at http:\/\/s4vd.r-forge.r-project.org\/.<\/jats:p><jats:p>Contact: \u00a0m.sill@dkfz.de<\/jats:p><jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btr322","type":"journal-article","created":{"date-parts":[[2011,6,3]],"date-time":"2011-06-03T11:53:17Z","timestamp":1307101997000},"page":"2089-2097","source":"Crossref","is-referenced-by-count":64,"title":["Robust biclustering by sparse singular value decomposition incorporating stability selection"],"prefix":"10.1093","volume":"27","author":[{"given":"Martin","family":"Sill","sequence":"first","affiliation":[{"name":"1 Division of Biostatistics, DKFZ, 69120 Heidelberg and 2Working Group Computational Statistics, LMU, 80539 M\u00fcnchen, Germany"}]},{"given":"Sebastian","family":"Kaiser","sequence":"additional","affiliation":[{"name":"1 Division of Biostatistics, DKFZ, 69120 Heidelberg and 2Working Group Computational Statistics, LMU, 80539 M\u00fcnchen, Germany"}]},{"given":"Axel","family":"Benner","sequence":"additional","affiliation":[{"name":"1 Division of Biostatistics, DKFZ, 69120 Heidelberg and 2Working Group Computational Statistics, LMU, 80539 M\u00fcnchen, Germany"}]},{"given":"Annette","family":"Kopp-Schneider","sequence":"additional","affiliation":[{"name":"1 Division of Biostatistics, DKFZ, 69120 Heidelberg and 2Working Group Computational Statistics, LMU, 80539 M\u00fcnchen, Germany"}]}],"member":"286","published-online":{"date-parts":[[2011,6,2]]},"reference":[{"key":"2023012511533422500_B1","doi-asserted-by":"crossref","first-page":"1600","DOI":"10.1093\/bioinformatics\/btl140","article-title":"Improved scoring of functional groups from gene expression data by decorrelating GO graph structure","volume":"22","author":"Alexa","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012511533422500_B2","doi-asserted-by":"crossref","first-page":"373","DOI":"10.1089\/10665270360688075","article-title":"Discovering local structure in gene expression data: the order-preserving submatrix problem","volume":"10","author":"Ben-Dor","year":"2003","journal-title":"J. Comput. Biol."},{"issue":"3 Pt 1","key":"2023012511533422500_B3","doi-asserted-by":"crossref","first-page":"031902","DOI":"10.1103\/PhysRevE.67.031902","article-title":"Iterative signature algorithm for the analysis of large-scale gene expression data","volume":"67","author":"Bergmann","year":"2003","journal-title":"Phys. Rev. E. Stat. Nonlin. Soft. Matter Phys."},{"key":"2023012511533422500_B4","doi-asserted-by":"crossref","first-page":"2795","DOI":"10.1093\/bioinformatics\/btp526","article-title":"Bi-correlation clustering algorithm for determining a set of co-regulated genes","volume":"25","author":"Bhattacharya","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012511533422500_B5","doi-asserted-by":"crossref","first-page":"13790","DOI":"10.1073\/pnas.191502998","article-title":"Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses","volume":"98","author":"Bhattacharjee","year":"2001","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012511533422500_B6","doi-asserted-by":"crossref","first-page":"2964","DOI":"10.1016\/j.cor.2007.01.005","article-title":"Biclustering in data mining","volume":"35","author":"Busygin","year":"2008","journal-title":"Comput. Oper. Res."},{"key":"2023012511533422500_B7","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1186\/1471-2105-7-78","article-title":"Biclustering of gene expression data by non-smooth non-negative matrix factorization","volume":"7","author":"Carmona-Saez","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023012511533422500_B8","first-page":"93","article-title":"Biclustering of expression data","volume":"8","author":"Cheng","year":"2000","journal-title":"Proc. Int. Conf. Intell. Syst. Mol. Biol."},{"key":"2023012511533422500_B9","doi-asserted-by":"crossref","first-page":"1376","DOI":"10.1093\/bioinformatics\/btq130","article-title":"Modular analysis of gene expression data with r","volume":"26","author":"Csardi","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012511533422500_B10","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1214\/ss\/1056397487","article-title":"Multiple hypothesis testing in microarray experiments","volume":"18","author":"Dudoit","year":"2003","journal-title":"Stat. Sci."},{"key":"2023012511533422500_B11","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/BF02288367","article-title":"The approximation of one matrix by another of lower rank","volume":"1","author":"Eckart","year":"1936","journal-title":"Psychometrika"},{"key":"2023012511533422500_B12","doi-asserted-by":"crossref","first-page":"12079","DOI":"10.1073\/pnas.210134797","article-title":"Coupled two-way clustering analysis of gene microarray data","volume":"97","author":"Getz","year":"2000","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012511533422500_B13","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1080\/01621459.1972.10481214","article-title":"Direct clustering of a data matrix","volume":"67","author":"Hartigan","year":"1972","journal-title":"J. Am. Stat. Assoc."},{"key":"2023012511533422500_B14","doi-asserted-by":"crossref","first-page":"1520","DOI":"10.1093\/bioinformatics\/btq227","article-title":"Fabia: factor analysis for bicluster acquisition","volume":"26","author":"Hochreiter","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012511533422500_B15","first-page":"61","article-title":"Plaid models for gene expression data","volume":"12","author":"Lazzeroni","year":"2000","journal-title":"Stat. Sin."},{"key":"2023012511533422500_B16","doi-asserted-by":"crossref","first-page":"1087","DOI":"10.1111\/j.1541-0420.2010.01392.x","article-title":"Biclustering via sparse singular value decomposition","volume":"66","author":"Lee","year":"2010","journal-title":"Biometrics"},{"key":"2023012511533422500_B17","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1109\/TCBB.2004.2","article-title":"Biclustering algorithms for biological data analysis: a survey","volume":"1","author":"Madeira","year":"2004","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinformatics"},{"key":"2023012511533422500_B18","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1111\/j.1467-9868.2010.00740.x","article-title":"Stability selection","volume":"72","author":"Meinshausen","year":"2010","journal-title":"J. R. Stat. Soc. Ser. B Stat. Methodol."},{"key":"2023012511533422500_B19","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/A:1023949509487","article-title":"Consensus clustering: a resampling-based method for class discovery and visualization of gene expression microarray data","volume":"52","author":"Monti","year":"2003","journal-title":"Mach. Learn."},{"key":"2023012511533422500_B20","doi-asserted-by":"crossref","first-page":"1122","DOI":"10.1093\/bioinformatics\/btl060","article-title":"A systematic comparison and evaluation of biclustering methods for gene expression data","volume":"22","author":"Prelic","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012511533422500_B21","first-page":"780","article-title":"Methods to bicluster validation and comparison in microarray data","volume-title":"Proceedings of the 8th International Conference on Intelligent Data Engineering and Automated Learning","author":"Santamar\u00eda","year":"2007"},{"key":"2023012511533422500_B22","doi-asserted-by":"crossref","first-page":"1540","DOI":"10.1093\/bioinformatics\/btl117","article-title":"Pvclust: an r package for assessing the uncertainty in hierarchical clustering","volume":"22","author":"Suzuki","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012511533422500_B23","doi-asserted-by":"crossref","first-page":"2981","DOI":"10.1073\/pnas.0308661100","article-title":"Revealing modularity and organization in the yeast molecular network by integrated analysis of highly heterogeneous genomewide data","volume":"101","author":"Tanay","year":"2004","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012511533422500_B24","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. Ser. B Methodol."},{"key":"2023012511533422500_B25","doi-asserted-by":"crossref","first-page":"316","DOI":"10.1109\/TCBB.2005.49","article-title":"Biclustering models for structured microarray data","volume":"2","author":"Turner","year":"2005","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinform."},{"key":"2023012511533422500_B26","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1191\/0962280204sm373ra","article-title":"Two-mode clustering methods: a structured overview","volume":"13","author":"Van Mechelen","year":"2004","journal-title":"Stat. Methods Med. Res."},{"key":"2023012511533422500_B27","doi-asserted-by":"crossref","first-page":"1418","DOI":"10.1198\/016214506000000735","article-title":"The adaptive lasso and its oracle properties","volume":"101","author":"Zou","year":"2006","journal-title":"J. Am. Stat. Assoc."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/15\/2089\/48866640\/bioinformatics_27_15_2089.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/15\/2089\/48866640\/bioinformatics_27_15_2089.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,5]],"date-time":"2025-03-05T23:15:13Z","timestamp":1741216513000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/27\/15\/2089\/400713"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,6,2]]},"references-count":27,"journal-issue":{"issue":"15","published-print":{"date-parts":[[2011,8,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btr322","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2011,8,1]]},"published":{"date-parts":[[2011,6,2]]}}}