{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T12:06:48Z","timestamp":1767960408303,"version":"3.49.0"},"reference-count":23,"publisher":"Oxford University Press (OUP)","issue":"7","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2005,4,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: With the emergence of genome-wide expression profiling data sets, the guilt by association (GBA) principle has been a cornerstone for deriving gene functional interpretations in silico. Given the limited success of traditional methods for producing clusters of genes with great amounts of functional similarity, new data-mining algorithms are required to fully exploit the potential of high-throughput genomic approaches.<\/jats:p><jats:p>Results: Ontology-based pattern identification (OPI) is a novel data-mining algorithm that systematically identifies expression patterns that best represent existing knowledge of gene function. Instead of relying on a universal threshold of expression similarity to define functionally related groups of genes, OPI finds the optimal analysis settings that yield gene expression patterns and gene lists that best predict gene function using the principle of GBA. We applied OPI to a publicly available gene expression data set on the life cycle of the malarial parasite Plasmodium falciparum and systematically annotated genes for 320 functional categories based on current Gene Ontology annotations. An ontology-based hierarchical tree of the 320 categories provided a systems-wide biological view of this important malarial parasite.<\/jats:p><jats:p>Availability: A web accessible P. 
falciparum e-annotation database containing the results of this study can be accessed online at http:\/\/carrier.gnf.org\/publications\/OPI<\/jats:p><jats:p>Contact: zhou@gnf.org<\/jats:p>","DOI":"10.1093\/bioinformatics\/bti111","type":"journal-article","created":{"date-parts":[[2004,11,6]],"date-time":"2004-11-06T01:14:14Z","timestamp":1099703654000},"page":"1237-1245","source":"Crossref","is-referenced-by-count":56,"title":["<i>In silico<\/i> gene function prediction using ontology-based pattern identification"],"prefix":"10.1093","volume":"21","author":[{"given":"Yingyao","family":"Zhou","sequence":"first","affiliation":[]},{"given":"Jason A.","family":"Young","sequence":"additional","affiliation":[]},{"given":"Andrey","family":"Santrosyan","sequence":"additional","affiliation":[]},{"given":"Kaisheng","family":"Chen","sequence":"additional","affiliation":[]},{"given":"S. Frank","family":"Yan","sequence":"additional","affiliation":[]},{"given":"Elizabeth A.","family":"Winzeler","sequence":"additional","affiliation":[]}],"member":"286","published-online":{"date-parts":[[2004,11,5]]},"reference":[{"key":"2023013107281727700_B1","doi-asserted-by":"crossref","unstructured":"Allocco, D.J., Kohane, I.S., Butte, A.J. 2004. Quantifying the relationship between co-expression, co-regulation and gene function. BMC Bioinformatics 5, 18","DOI":"10.1186\/1471-2105-5-18"},{"key":"2023013107281727700_B2","unstructured":"Benjamini, Y. and Hochberg, Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Statist. Soc. B 57, 289\u2013300"},{"key":"2023013107281727700_B3","doi-asserted-by":"crossref","unstructured":"Breitling, R., Amtmann, A., Herzyk, P. 2004. Iterative Group Analysis (iGA): a simple tool to enhance sensitivity and facilitate interpretation of microarray experiments. 
BMC Bioinformatics 5, 34","DOI":"10.1186\/1471-2105-5-34"},{"key":"2023013107281727700_B4","doi-asserted-by":"crossref","unstructured":"Brown, M.P., Grundy, W.N., Lin, D., Cristianini, N., Sugnet, C.W., Furey, T.S., Ares, M., Jr., Haussler, D. 2000. Knowledge-based analysis of microarray gene expression data by using support vector machine. Proc. Natl Acad. Sci. USA 97, 262\u2013267","DOI":"10.1073\/pnas.97.1.262"},{"key":"2023013107281727700_B5","doi-asserted-by":"crossref","unstructured":"Eisen, M.B., Spellman, P.T., Brown, P.O., Botstein, D. 1998. Cluster analysis and display of genome-wide expression patterns. Proc. Natl Acad. Sci. USA 95, 14863\u201314868","DOI":"10.1073\/pnas.95.25.14863"},{"key":"2023013107281727700_B6","doi-asserted-by":"crossref","unstructured":"Geller, S.C., Gregg, J.P., Hagerman, P., Rocke, D.M. 2003. Transformation and normalization of oligonucleotide microarray data. Bioinformatics 19, 1817\u20131823","DOI":"10.1093\/bioinformatics\/btg245"},{"key":"2023013107281727700_B7","unstructured":"Herrero, J., Diaz-Uriarte, R., Dopazo, J. 2003. Gene expression data preprocessing. Bioinformatics 19, 655\u2013656"},{"key":"2023013107281727700_B8","doi-asserted-by":"crossref","unstructured":"Le Roch, K.G., Zhou, Y., Batalov, S., Winzeler, E.A. 2002. Monitoring the chromosome 2 intraerythrocytic transcriptome of Plasmodium falciparum using oligonucleotide arrays. Am. J. Trop. Med. Hyg. 67, 233\u2013243","DOI":"10.4269\/ajtmh.2002.67.233"},{"key":"2023013107281727700_B9","doi-asserted-by":"crossref","unstructured":"Le Roch, K.G., Zhou, Y., Blair, P.L., Grainger, M., Moch, J.K., Haynes, J.D., De La Vega, P., Holder, A.A., Batalov, S., Carucci, D.J., Winzeler, E.A. 2003. Discovery of gene function by expression profiling of the malaria parasite life cycle. 
Science 301, 1503\u20131508","DOI":"10.1126\/science.1087025"},{"key":"2023013107281727700_B10","doi-asserted-by":"crossref","unstructured":"Mootha, V.K., Lindgren, C.M., Eriksson, K.F., Subramanian, A., Sihag, S., Lehar, J., Puigserver, P., Carlsson, E., Ridderstrale, M., Laurila, E., et al. 2003. PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat. Genet. 34, 267\u2013273","DOI":"10.1038\/ng1180"},{"key":"2023013107281727700_B11","doi-asserted-by":"crossref","unstructured":"Pan, W., Lin, J., Le, C.T. 2002. Model-based cluster analysis of microarray gene-expression data. Genome Biol. 3","DOI":"10.1186\/gb-2002-3-2-research0009"},{"key":"2023013107281727700_B12","unstructured":"Quackenbush, J. 2003. Genomics. Microarrays\u2014guilt by association. Science 302, 240\u2013241"},{"key":"2023013107281727700_B13","unstructured":"Storey, J.D. 2002. A direct approach to false discovery rate. J. R. Statist. Soc. B 64, 479\u2013498"},{"key":"2023013107281727700_B14","doi-asserted-by":"crossref","unstructured":"Tamayo, P., Slonim, D., Mesirov, J., Zhu, Q., Kitareewan, S., Dmitrovsky, E., Lander, E.S., Golub, T.R. 1999. Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation. Proc. Natl Acad. Sci. USA 96, 2907\u20132912","DOI":"10.1073\/pnas.96.6.2907"},{"key":"2023013107281727700_B15","unstructured":"Tavazoie, S., Hughes, J., Campbell, M., Cho, R., Church, G. 1999. Systematic determination of genetic network architecture. Nat. Genet. 22, 281\u2013285"},{"key":"2023013107281727700_B16","unstructured":"The Gene Ontology Consortium. 2000. Gene Ontology: tool for the unification of biology. Nat. Genet. 25, 25\u201329"},{"key":"2023013107281727700_B17","doi-asserted-by":"crossref","unstructured":"Toronen, P. 2004. Selection of informative clusters from hierarchical cluster tree with gene classes. 
BMC Bioinformatics 5, 32","DOI":"10.1186\/1471-2105-5-32"},{"key":"2023013107281727700_B18","doi-asserted-by":"crossref","unstructured":"Troyanskaya, O.G., Dolinski, K., Owen, A.B., Altman, R.B., Botstein, D. 2003. A Bayesian framework for combining heterogeneous data sources for gene function prediction (in Saccharomyces cerevisiae). Proc. Natl Acad. Sci. 100, 8348\u20138353","DOI":"10.1073\/pnas.0832373100"},{"key":"2023013107281727700_B19","doi-asserted-by":"crossref","unstructured":"Walker, M.G., Volkmuth, W., Sprinzak, E., Hodgson, D., Klingler, T. 1999. Prediction of gene function by genome-scale expression analysis: prostate cancer-associated genes. Genome Res. 9, 1198\u2013203","DOI":"10.1101\/gr.9.12.1198"},{"key":"2023013107281727700_B20","doi-asserted-by":"crossref","unstructured":"Wu, L.F., Hughes, T.R., Davierwala, A.P., Robinson, M.D., Stoughton, R., Altschuler, S.J. 2002. Large-scale prediction of Saccharomyces cerevisiae gene function using overlapping transcriptional clusters. Nat. Genet. 31, 255\u2013265","DOI":"10.1038\/ng906"},{"key":"2023013107281727700_B21","unstructured":"Yeung, K.Y., Haynor, D.R., Ruzzo, W.L. 2001. Validating clustering for gene expression data. Bioinformatics 17, 309\u2013318"},{"key":"2023013107281727700_B22","unstructured":"Zar, J.H. 1999. Biostatistical Analysis, 4th edn. Prentice Hall, NJ, pp. 523"},{"key":"2023013107281727700_B23","doi-asserted-by":"crossref","unstructured":"Zhou, Y. and Abagyan, R. 2002. Match-only integral distribution (MOID) algorithm for high-density oligonucleotide array analysis. 
BMC Bioinformatics 3, 3","DOI":"10.1186\/1471-2105-3-3"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/7\/1237\/48966765\/bioinformatics_21_7_1237.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/7\/1237\/48966765\/bioinformatics_21_7_1237.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,12,19]],"date-time":"2024-12-19T18:30:59Z","timestamp":1734633059000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/21\/7\/1237\/268914"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2004,11,5]]},"references-count":23,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2005,4,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bti111","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2005,4,1]]},"published":{"date-parts":[[2004,11,5]]}}}