{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,30]],"date-time":"2026-03-30T21:25:16Z","timestamp":1774905916693,"version":"3.50.1"},"reference-count":30,"publisher":"Oxford University Press (OUP)","issue":"20","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":3324,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2007,10,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: The increasing availability of gene expression microarray technology has resulted in the publication of thousands of microarray gene expression datasets investigating various biological conditions. This vast repository is still underutilized due to the lack of methods for fast, accurate exploration of the entire compendium.<\/jats:p><jats:p>Results: We have collected Saccharomyces cerevisiae gene expression microarray data containing roughly 2400 experimental conditions. We analyzed the functional coverage of this collection and we designed a context-sensitive search algorithm for rapid exploration of the compendium. A researcher using our system provides a small set of query genes to establish a biological search context; based on this query, we weight each dataset's relevance to the context, and within these weighted datasets we identify additional genes that are co-expressed with the query set. Our method exhibits an average increase in accuracy of 273% compared to previous mega-clustering approaches when recapitulating known biology. Further, we find that our search paradigm identifies novel biological predictions that can be verified through further experimentation. 
Our methodology provides the ability for biological researchers to explore the totality of existing microarray data in a manner useful for drawing conclusions and formulating hypotheses, which we believe is invaluable for the research community.<\/jats:p><jats:p>Availability: Our query-driven search engine, called SPELL, is available at http:\/\/function.princeton.edu\/SPELL<\/jats:p><jats:p>Contact: ogt@genomics.princeton.edu<\/jats:p><jats:p>Supplementary information: Several additional data files, figures and discussions are available at http:\/\/function.princeton.edu\/SPELL\/supplement<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm403","type":"journal-article","created":{"date-parts":[[2007,8,28]],"date-time":"2007-08-28T00:13:35Z","timestamp":1188260015000},"page":"2692-2699","source":"Crossref","is-referenced-by-count":238,"title":["Exploring the functional landscape of gene expression: directed search of large microarray compendia"],"prefix":"10.1093","volume":"23","author":[{"given":"Matthew A.","family":"Hibbs","sequence":"first","affiliation":[{"name":"1 Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory and 2Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ, USA"}]},{"given":"David C.","family":"Hess","sequence":"additional","affiliation":[{"name":"1 Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory and 2Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ, USA"}]},{"given":"Chad L.","family":"Myers","sequence":"additional","affiliation":[{"name":"1 Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory and 2Department of Computer Science, Princeton University, 35 
Olden Street, Princeton, NJ, USA"}]},{"given":"Curtis","family":"Huttenhower","sequence":"additional","affiliation":[{"name":"1 Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory and 2Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ, USA"}]},{"given":"Kai","family":"Li","sequence":"additional","affiliation":[{"name":"1 Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory and 2Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ, USA"}]},{"given":"Olga G.","family":"Troyanskaya","sequence":"additional","affiliation":[{"name":"1 Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory and 2Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ, USA"}]}],"member":"286","published-online":{"date-parts":[[2007,8,27]]},"reference":[{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"10101","DOI":"10.1073\/pnas.97.18.10101","article-title":"Singular value decomposition for genome-wide expression data processing and modeling","volume":"97","author":"Alter","year":"2000","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology. The Gene Ontology Consortium","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat. Genet"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"R2","DOI":"10.1186\/gb-2002-4-1-r2","article-title":"A gene-expression program reflecting the innate immune response of cultured intestinal epithelial cells to infection by Listeria monocytogenes","volume":"4","author":"Baldwin","year":"2003","journal-title":"Genome Biol"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1093\/nar\/gkg091","article-title":"ArrayExpress\u2013a public repository for microarray gene expression data at the EBI","volume":"31","author":"Brazma","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1016\/j.gde.2005.01.003","article-title":"Chromatin remodeling complexes: strength in diversity, precision through specialization","volume":"15","author":"Cairns","year":"2005","journal-title":"Curr. Opin. Genet. Dev"},{"key":"2023041105592913900_","first-page":"93","article-title":"Biclustering of expression data","volume":"8","author":"Cheng","year":"2000","journal-title":"Proc. Int. Conf. Intell. Syst. Mol. 
Biol"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1093\/nar\/26.1.73","article-title":"SGD: Saccharomyces Genome Database","volume":"26","author":"Cherry","year":"1998","journal-title":"Nucleic Acids Res"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1093\/nar\/30.1.207","article-title":"Gene Expression Omnibus: NCBI gene expression and hybridization array data repository","volume":"30","author":"Edgar","year":"2002","journal-title":"Nucleic Acids Res"},{"key":"2023041105592913900_","first-page":"507","article-title":"Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population","volume":"10","author":"Fisher","year":"1915","journal-title":"Biometrika"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"4241","DOI":"10.1091\/mbc.11.12.4241","article-title":"Genomic expression programs in the response of yeast cells to environmental changes","volume":"11","author":"Gasch","year":"2000","journal-title":"Mol. Biol. Cell"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1038\/nature00935","article-title":"Functional profiling of the Saccharomyces cerevisiae genome","volume":"418","author":"Giaever","year":"2002","journal-title":"Nature"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"370","DOI":"10.1038\/ng941","article-title":"Revealing modular organization in the yeast transcriptional network","volume":"31","author":"Ihmels","year":"2002","journal-title":"Nat. Genet"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"7696","DOI":"10.1128\/MCB.25.17.7696-7710.2005","article-title":"Immunoisolation of the yeast Golgi subcompartments and characterization of a novel membrane protein, Svp26, discovered in the Sed5-containing compartments","volume":"25","author":"Inadome","year":"2005","journal-title":"Mol. Cell. 
Biol"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"76","DOI":"10.1093\/nar\/30.1.76","article-title":"yMGV: helping biologists with yeast microarray data mining","volume":"30","author":"Le Crom","year":"2002","journal-title":"Nucleic Acids Res"},{"key":"2023041105592913900_","first-page":"39","article-title":"A Linear Time Biclustering Algorithm for Time Series Gene Expression Data","author":"Madeira","year":"2005"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1109\/TCBB.2004.2","article-title":"Biclustering algorithms for biological data analysis: a survey","volume":"1","author":"Madeira","year":"2004","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinform"},{"key":"2023041105592913900_","volume-title":"Engineering Statistics","author":"Montgomery","year":"2001"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1186\/1471-2164-7-187","article-title":"Finding function: evaluation methods for functional genomic data","volume":"7","author":"Myers","year":"2006","journal-title":"BMC Genomics"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"1828","DOI":"10.1101\/gr.1125403","article-title":"A gene recommender algorithm to identify coexpressed genes in C. elegans","volume":"13","author":"Owen","year":"2003","journal-title":"Genome Res"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1038\/82539","article-title":"The core meiotic transcriptome in budding yeasts","volume":"26","author":"Primig","year":"2000","journal-title":"Nat. Genet"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"4089","DOI":"10.1091\/mbc.e04-04-0306","article-title":"Nutritional homeostasis in batch and steady-state culture of yeast","volume":"15","author":"Saldanha","year":"2004","journal-title":"Mol. Biol. 
Cell"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"541","DOI":"10.1038\/35020123","article-title":"A chromatin remodelling complex involved in transcription and DNA processing","volume":"406","author":"Shen","year":"2000","journal-title":"Nature"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"152","DOI":"10.1093\/nar\/29.1.152","article-title":"The Stanford Microarray Database","volume":"29","author":"Sherlock","year":"2001","journal-title":"Nucleic Acids Res"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"S136","DOI":"10.1093\/bioinformatics\/18.suppl_1.S136","article-title":"Discovering statistically significant biclusters in gene expression data","volume":"18","author":"Tanay","year":"2002","journal-title":"Bioinformatics"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1093\/bioinformatics\/17.6.520","article-title":"Missing value estimation methods for DNA microarrays","volume":"17","author":"Troyanskaya","year":"2001","journal-title":"Bioinformatics"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"859","DOI":"10.1038\/nature02062","article-title":"Targets of the cyclin-dependent kinase Cdk1","volume":"425","author":"Ubersax","year":"2003","journal-title":"Nature"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"1011","DOI":"10.4161\/cc.4.8.1887","article-title":"ATP-dependent chromatin remodeling and DNA double-strand break repair","volume":"4","author":"van Attikum","year":"2005","journal-title":"Cell Cycle"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1007\/0-306-47815-3_5","article-title":"Singular value decomposition and principal component analysis","volume-title":"A Practical Approach to Microarray Data 
Analysis","author":"Wall","year":"2003"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"592","DOI":"10.1016\/j.yexcr.2003.12.008","article-title":"Saccharomyces cerevisiae CSM1 gene encoding a protein influencing chromosome segregation in meiosis I interacts with elements of the DNA replication complex","volume":"294","author":"Wysocka","year":"2004","journal-title":"Exp. Cell Res"},{"key":"2023041105592913900_","doi-asserted-by":"crossref","first-page":"699","DOI":"10.1016\/j.sbi.2003.10.003","article-title":"SET domains and histone methylation","volume":"13","author":"Xiao","year":"2003","journal-title":"Curr. Opin. Struct. Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/20\/2692\/49818591\/bioinformatics_23_20_2692.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/20\/2692\/49818591\/bioinformatics_23_20_2692.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,13]],"date-time":"2023-05-13T22:54:50Z","timestamp":1684018490000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/23\/20\/2692\/229926"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,8,27]]},"references-count":30,"journal-issue":{"issue":"20","published-print":{"date-parts":[[2007,10,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm403","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2007,10,15]]},"published":{"date-parts":[[2007,8,27]]}}}