{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T20:34:18Z","timestamp":1761597258617,"version":"3.38.0"},"reference-count":21,"publisher":"Springer Science and Business Media LLC","issue":"S1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2011,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>With the availability of large scale expression compendia it is now possible to view own findings in the light of what is already available and retrieve genes with an expression profile similar to a set of genes of interest (<jats:italic>i.e.<\/jats:italic>, a query or seed set) for a subset of conditions. To that end, a query-based strategy is needed that maximally exploits the coexpression behaviour of the seed genes to guide the biclustering, but that at the same time is robust against the presence of noisy genes in the seed set as seed genes are often assumed, but not guaranteed to be coexpressed in the queried compendium. Therefore, we developed<jats:italic>Pro<\/jats:italic>Bic, a query-based biclustering strategy based on Probabilistic Relational Models (PRMs) that exploits the use of prior distributions to extract the information contained within the seed set.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We applied<jats:italic>Pro<\/jats:italic>Bic on a large scale<jats:italic>Escherichia coli<\/jats:italic>compendium to extend partially described regulons with potentially novel members. We compared<jats:italic>Pro<\/jats:italic>Bic's performance with previously published query-based biclustering algorithms, namely ISA and QDB, from the perspective of bicluster expression quality, robustness of the outcome against noisy seed sets and biological relevance.<\/jats:p><jats:p>This comparison learns that<jats:italic>Pro<\/jats:italic>Bic is able to retrieve biologically relevant, high quality biclusters that retain their seed genes and that it is particularly strong in handling noisy seeds.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p><jats:italic>Pro<\/jats:italic>Bic is a query-based biclustering algorithm developed in a flexible framework, designed to detect biologically relevant, high quality biclusters that retain relevant seed genes even in the presence of noise or when dealing with low quality seed sets.<\/jats:p><\/jats:sec>","DOI":"10.1186\/1471-2105-12-s1-s37","type":"journal-article","created":{"date-parts":[[2011,2,18]],"date-time":"2011-02-18T20:07:34Z","timestamp":1298059654000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Query-based biclustering of gene expression data using Probabilistic Relational Models"],"prefix":"10.1186","volume":"12","author":[{"given":"Hui","family":"Zhao","sequence":"first","affiliation":[]},{"given":"Lore","family":"Cloots","sequence":"additional","affiliation":[]},{"given":"Tim","family":"Van den Bulcke","sequence":"additional","affiliation":[]},{"given":"Yan","family":"Wu","sequence":"additional","affiliation":[]},{"given":"Riet","family":"De Smet","sequence":"additional","affiliation":[]},{"given":"Valerie","family":"Storms","sequence":"additional","affiliation":[]},{"given":"Pieter","family":"Meysman","sequence":"additional","affiliation":[]},{"given":"Kristof","family":"Engelen","sequence":"additional","affiliation":[]},{"given":"Kathleen","family":"Marchal","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2011,2,15]]},"reference":[{"key":"4395_CR1","doi-asserted-by":"publisher","first-page":"525","DOI":"10.2174\/138920208786847935","volume":"9","author":"AC Fierro","year":"2008","unstructured":"Fierro AC, Vandenbussche F, Engelen K, Van de Peer Y, Marchal K: Meta Analysis of Gene Expression Data within and Across Species. Curr Genomics 2008, 9: 525\u2013534. 10.2174\/138920208786847935","journal-title":"Curr Genomics"},{"key":"4395_CR2","doi-asserted-by":"crossref","first-page":"1828","DOI":"10.1101\/gr.1125403","volume":"13","author":"AB Owen","year":"2003","unstructured":"Owen AB, Stuart J, Mach K, Villeneuve AM, Kim S: A gene recommender algorithm to identify coexpressed genes in C. elegans. Genome Res 2003, 13: 1828\u20131837.","journal-title":"Genome Res"},{"issue":"3 Pt 1","key":"4395_CR3","doi-asserted-by":"publisher","first-page":"031902","DOI":"10.1103\/PhysRevE.67.031902","volume":"67","author":"S Bergmann","year":"2003","unstructured":"Bergmann S, Ihmels J, Barkai N: Iterative signature algorithm for the analysis of large-scale gene expression data. Phys Res E Stat Nonlin Soft Matter Phys 2003, 67(3 Pt 1):031902. 10.1103\/PhysRevE.67.031902","journal-title":"Phys Res E Stat Nonlin Soft Matter Phys"},{"key":"4395_CR4","doi-asserted-by":"publisher","first-page":"W596","DOI":"10.1093\/nar\/gki469","volume":"33","author":"CJ Wu","year":"2005","unstructured":"Wu CJ, Kasif S: GEMS: a web server for biclustering analysis of biclustering data. Nucleic Acids Res 2005, 33: W596-W599. 10.1093\/nar\/gki469","journal-title":"Nucleic Acids Res"},{"key":"4395_CR5","doi-asserted-by":"publisher","first-page":"2573","DOI":"10.1093\/bioinformatics\/btm387","volume":"23","author":"T Dhollander","year":"2007","unstructured":"Dhollander T, Sheng Q, Lemmens K, De Moor B, Marchal K, Moreau Y: Query-driven module discovery in microarray data. Bioinformatics 2007, 23: 2573\u20132580. 10.1093\/bioinformatics\/btm387","journal-title":"Bioinformatics"},{"key":"4395_CR6","doi-asserted-by":"publisher","first-page":"2692","DOI":"10.1093\/bioinformatics\/btm403","volume":"23","author":"MA Hibbs","year":"2007","unstructured":"Hibbs MA, Hess DC, Myers CL, Huttenhower C, Li K, Troyanskaya OG: Exploring the functional landscape of gene expression: directed search of large microarray compendia. Bioinformatics 2007, 23: 2692\u20132699. 10.1093\/bioinformatics\/btm403","journal-title":"Bioinformatics"},{"key":"4395_CR7","first-page":"580","volume-title":"Proceedings of the Fifteenth National Conference on Artificial Intelligence: 26-30 July 1998; Madison","author":"D Koller","year":"1998","unstructured":"Koller D, Pfeffer A: Probabilistic frame-based systems. Proceedings of the Fifteenth National Conference on Artificial Intelligence: 26\u201330 July 1998; Madison 1998, 580\u2013587."},{"key":"4395_CR8","first-page":"1300","volume-title":"International Joint Conference on Artificial Intelligence: 31 July \u2013 6 August 1999; Stockholm","author":"N Friedman","year":"1999","unstructured":"Friedman N, Getoor L, Koller D, Pfeffer A: Learning probabilistic relational models. International Joint Conference on Artificial Intelligence: 31 July \u2013 6 August 1999; Stockholm 1999, 1300\u20131309."},{"key":"4395_CR9","first-page":"170","volume-title":"Proceedings of the 18th International Conference on Machine Learning: 2001; San Francisco","author":"L Getoor","year":"2001","unstructured":"Getoor L, Friedman N, Koller D, Taskar B: Learning probabilistic models of relational structure. Proceedings of the 18th International Conference on Machine Learning: 2001; San Francisco 2001, 170\u2013177."},{"key":"4395_CR10","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1109\/TCBB.2004.2","volume":"1","author":"SC Madeira","year":"2004","unstructured":"Madeira SC, Oliveira AL: Biclustering algorithms for biological data analysis: a survey. IEEE\/ACM Trans Comput Biol Bioinform 2004, 1: 24\u201345. 10.1109\/TCBB.2004.2","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"4395_CR11","volume-title":"PhD thesis","author":"T Van den Bulcke","year":"2009","unstructured":"Van den Bulcke T: Robust algorithms for inferring regulatory networks based on gene expression measurements and biological prior information. PhD thesis. Katholieke Universiteit Leuven, Faculty of Engineering; 2009."},{"key":"4395_CR12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","volume":"39","author":"AP Dempster","year":"1977","unstructured":"Dempster AP, Laird NM, Rubin DB: Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society series B 1977, 39: 1\u201338.","journal-title":"Journal of the Royal Statistical Society series B"},{"key":"4395_CR13","doi-asserted-by":"publisher","first-page":"1122","DOI":"10.1093\/bioinformatics\/btl060","volume":"22","author":"A Preli\u0107","year":"2006","unstructured":"Preli\u0107 A, Bleuler S, Zimmermann P, Wille A, B\u00fchlmann P, Gruissem W, Hennig L, Thiele L, Zitzler E: A systematic comparison and evaluation of biclustering methods for gene expression data. Bioinformatics 2006, 22: 1122\u20131129. 10.1093\/bioinformatics\/btl060","journal-title":"Bioinformatics"},{"key":"4395_CR14","doi-asserted-by":"publisher","first-page":"R27","DOI":"10.1186\/gb-2009-10-3-r27","volume":"10","author":"K Lemmens","year":"2009","unstructured":"Lemmens K, De Bie T, Dhollander T, De Keersmaecker SC, Thijs IM, Schoofs G, De Weerdt A, De Moor B, Vanderleyden J, Collado-Vides J, Engelen K, Marchal K: DISTILLER: a data integration framework to reveal condition dependency of complex regulons in Escherichia coli. Genome Biol 2009, 10: R27. 10.1186\/gb-2009-10-3-r27","journal-title":"Genome Biol"},{"key":"4395_CR15","doi-asserted-by":"publisher","first-page":"D120","DOI":"10.1093\/nar\/gkm994","volume":"36","author":"S Gama-Castro","year":"2008","unstructured":"Gama-Castro S, Jimenez-Jacinto V, Peralta-Gil M, Santos-Zavaleta A, Penaloza-Spinola MI, Contreras-Moreira B, Segura-Salazar J, Muniz-Rascado L, Martinez-Flores I, Salgado H, Bonavides-Martinez C, Abreu-Goodger C, Rodriguez-Penagos C, Miranda-Rios J, Morett E, Merino E, Huerta AM, Trevino-Quintanilla L, Collado-Vides J: RegulonDB: gene regulation model of Escherichia coli K-12 beyond transcription, active (experimental) annotated promoters and Textpresso navigation. Nucleic Acids Res 2008, 36: D120\u2013124. 10.1093\/nar\/gkm994","journal-title":"Nucleic Acids Res"},{"key":"4395_CR16","unstructured":"ISA matlab package[http:\/\/www2.unil.ch\/cbg\/index.php?title=ISA]"},{"key":"4395_CR17","unstructured":"QDB source code[http:\/\/homes.esat.kuleuven.be\/_tdhollan\/Supplementary_Information_Dhollander_2007\/index.html]"},{"key":"4395_CR18","doi-asserted-by":"publisher","first-page":"D464","DOI":"10.1093\/nar\/gkn751","volume":"37","author":"IM Keseler","year":"2009","unstructured":"Keseler IM, Bonavides-Mart\u00ednez C, Collado-Vides J, Gama-Castro S, Gunsalus RP, Johnson DA, Krummenacker M, Nolan LM, Paley S, Paulsen IT, Peralta-Gil M, Santos-Zavaleta A, Shearer AG, Karp PD: EcoCyc: a comprehensive view of Escherichia coli biology. Nucleic Acids Res 2009, 37: D464-D470. 10.1093\/nar\/gkn751","journal-title":"Nucleic Acids Res"},{"issue":"4","key":"4395_CR19","doi-asserted-by":"publisher","first-page":"1372","DOI":"10.1093\/nar\/gkh299","volume":"32","author":"MC Frith","year":"2004","unstructured":"Frith MC, Fu Y, Yu L, Chen J-F, Hansen U, Weng Z: Detection of functional DNA motifs via statistical over-representation. Nucleic Acids Res 2004, 32(4):1372\u201381. 10.1093\/nar\/gkh299","journal-title":"Nucleic Acids Res"},{"key":"4395_CR20","unstructured":"NCBI (NC_000913) Escherichia coli str. K-12 substr. MG1655 chromosome, complete genome[http:\/\/www.ncbi.nlm.nih.gov\/nuccore\/49175990]"},{"key":"4395_CR21","doi-asserted-by":"publisher","first-page":"401","DOI":"10.1093\/bioinformatics\/btl633","volume":"23","author":"I Rivals","year":"2007","unstructured":"Rivals I, Personnaz L, Taing L, Potier MC: Enrichment or depletion of a GO category within a class of genes: which test? Bioinformatics 2007, 23: 401\u2013407. 10.1093\/bioinformatics\/btl633","journal-title":"Bioinformatics"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-12-S1-S37.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,2]],"date-time":"2025-03-02T20:00:27Z","timestamp":1740945627000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-12-S1-S37"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,2,15]]},"references-count":21,"journal-issue":{"issue":"S1","published-print":{"date-parts":[[2011,12]]}},"alternative-id":["4395"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-12-s1-s37","relation":{},"ISSN":["1471-2105"],"issn-type":[{"type":"electronic","value":"1471-2105"}],"subject":[],"published":{"date-parts":[[2011,2,15]]},"assertion":[{"value":"15 February 2011","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"S37"}}