{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T15:35:20Z","timestamp":1776353720117,"version":"3.51.2"},"reference-count":34,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2006,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Expression array data are used to predict biological functions of uncharacterized genes by comparing their expression profiles to those of characterized genes. While biologically plausible, this is both statistically and computationally challenging. Typical approaches are computationally expensive and ignore correlations among expression profiles and functional categories.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We propose a factor analysis model (FAM) for functional genomics and give a two-step algorithm, using genome-wide expression data for yeast and a subset of Gene-Ontology Biological Process functional annotations. We show that the predictive performance of our method is comparable to the current best approach while our total computation time was faster by a factor of 4000. We discuss the unique challenges in performance evaluation of algorithms used for genome-wide functional genomics. 
Finally, we discuss extensions to our method that can incorporate the inherent correlation structure of the functional categories to further improve predictive performance.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusion<\/jats:title><jats:p>Our factor analysis model is a computationally efficient technique for functional genomics and provides a clear and unified statistical framework with potential for incorporating important gene ontology information to improve predictions.<\/jats:p><\/jats:sec>","DOI":"10.1186\/1471-2105-7-216","type":"journal-article","created":{"date-parts":[[2006,4,21]],"date-time":"2006-04-21T18:20:33Z","timestamp":1145643633000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":22,"title":["A factor analysis model for functional genomics"],"prefix":"10.1186","volume":"7","author":[{"given":"Rafal","family":"Kustra","sequence":"first","affiliation":[]},{"given":"Romy","family":"Shioda","sequence":"additional","affiliation":[]},{"given":"Mu","family":"Zhu","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2006,4,21]]},"reference":[{"key":"955_CR1","doi-asserted-by":"publisher","first-page":"255","DOI":"10.1038\/ng906","volume":"31","author":"L Wu","year":"2002","unstructured":"Wu L, Hughes T, Davierwala A, Robinson M, Stoughton R, Altschuler S: Large Scale prediction of Saccharomyces cerevisiae gene function using overlapping transcriptional clusters. Nature Genetics 2002, 31: 255\u2013260. 10.1038\/ng906","journal-title":"Nature Genetics"},{"issue":"20","key":"955_CR2","doi-asserted-by":"publisher","first-page":"12783","DOI":"10.1073\/pnas.192159399","volume":"99","author":"X Zhou","year":"2002","unstructured":"Zhou X, Kao MC, Wong W: Transitive functional annotation by shortest-path analysis of gene-expression data. Proceedings of the National Academy of Sciences 2002, 99(20):12783\u201388. 
10.1073\/pnas.192159399","journal-title":"Proceedings of the National Academy of Sciences"},{"key":"955_CR3","doi-asserted-by":"crossref","unstructured":"Zhang W, Morris Q, Chang R, Shai O, Bakowski M, Mitsakakis N, Mohammad N, Robinson M, Zirngibl R, Somogyi E, Laurin N, Eftekharpour E, Sat E, Grigull J, Pan Q, Peng W, Krogan N, Greenblatt J, Fehlings M, van derKooy D, Aubin J, Bruneau B, Rossant J, Blencowe B, Frey B, Hughes T: The functional landscape of mouse gene expression. Journal of Biology 2004., 3(21):","DOI":"10.1186\/jbiol16"},{"key":"955_CR4","unstructured":"the Gene Ontology[http:\/\/www.geneontology.org]"},{"key":"955_CR5","doi-asserted-by":"publisher","first-page":"83","DOI":"10.1038\/47048","volume":"402","author":"E Marcotte","year":"1999","unstructured":"Marcotte E, Pellegrini M, Thompson M, Yeates T, Eisenberg D: A combined algorithm for genome-wide prediction of protein function. Nature 1999, 402: 83\u201386. 10.1038\/47048","journal-title":"Nature"},{"issue":"8","key":"955_CR6","doi-asserted-by":"publisher","first-page":"1644","DOI":"10.1093\/bioinformatics\/bti103","volume":"21","author":"P Kemmeren","year":"2005","unstructured":"Kemmeren P, Kockelkorn T, Bijma T, Donders R, Holstege F: Predicting gene function through systematic analysis and quality assessment of high-throughput data. Bioinformatics 2005, 21(8):1644\u20131652. 10.1093\/bioinformatics\/bti103","journal-title":"Bioinformatics"},{"issue":"21","key":"955_CR7","doi-asserted-by":"publisher","first-page":"6414","DOI":"10.1093\/nar\/gkh978","volume":"32","author":"Y Chen","year":"2004","unstructured":"Chen Y, Xu D: Global protein function annotation through mining genome-scale data in yeast Saccharomyces cerevisiae . Nucleic Acids Research 2004, 32(21):6414\u20136424. 
10.1093\/nar\/gkh978","journal-title":"Nucleic Acids Research"},{"issue":"16","key":"955_CR8","doi-asserted-by":"publisher","first-page":"2626","DOI":"10.1093\/bioinformatics\/bth294","volume":"20","author":"G Lanckriet","year":"2004","unstructured":"Lanckriet G, De Brie T, Cristianini N, Jordan M, Noble W: A statistical framework for genomic data fusion. Bioinformatics 2004, 20(16):2626\u20132635. 10.1093\/bioinformatics\/bth294","journal-title":"Bioinformatics"},{"key":"955_CR9","first-page":"D258","volume":"31","author":"M Harris","year":"2004","unstructured":"Harris M, Clark J, Ireland A, Lomax J, Ashburner M, Foulger R, Eilbeck K, Lewis S, Marshall B, Mungall C, Richter J, Rubin G, Blake J, Bult C, Dolan M, Drabkin H, Eppig J, Hill D, Ni L, Ringwald M, Balakrishnan R, Cherry J, Christie K, Costanzo M, Dwight S, Engel S, Fisk D, Hirschman J, Hong E, Nash R, Sethuraman A, Theesfeld C, Botstein D, Dolinski K, Feierbach B, Berardini T, Mundodi S, Rhee S, Apweiler R, Barrell D, Camon E, Dimmer E, Lee V, Chisholm R, Gaudet P, Kibbe W, Kishore R, Schwarz E, Sternberg P, Gwinn M, Hannick L, Wortman J, Berriman M, Wood V, de la Cruz N, Tonellato P, Jaiswal P, Seigfried T, White R: The Gene Ontology (GO) database and informatics resource. Nucleic Acids Research 2004, 31: D258\u201361.","journal-title":"Nucleic Acids Research"},{"key":"955_CR10","doi-asserted-by":"publisher","first-page":"273","DOI":"10.1016\/0031-3203(76)90047-9","volume":"8","author":"G McLachlan","year":"1976","unstructured":"McLachlan G: Further results on the effect of intraclass correlation among training samples in discriminant analysis. Pattern Recognition 1976, 8: 273\u2013275. 
10.1016\/0031-3203(76)90047-9","journal-title":"Pattern Recognition"},{"key":"955_CR11","doi-asserted-by":"publisher","first-page":"351","DOI":"10.1016\/0031-3203(80)90011-4","volume":"12","author":"J Tubbs","year":"1980","unstructured":"Tubbs J: Effect of autocorrelated training samples on Bayes's probabilities of misclassification. Pattern Recognition 1980, 12: 351\u2013354. 10.1016\/0031-3203(80)90011-4","journal-title":"Pattern Recognition"},{"key":"955_CR12","volume-title":"Multivariate Analysis","author":"KV Mardia","year":"1979","unstructured":"Mardia KV, Kent JT, Bibby JM: Multivariate Analysis. London, Great Britain: Academic Press; 1979."},{"issue":"2","key":"955_CR13","doi-asserted-by":"publisher","first-page":"143","DOI":"10.1109\/TCBB.2005.29","volume":"2","author":"S Rogers","year":"2005","unstructured":"Rogers S, Girolami M, Campbell C, Breitling R: The Latent Process Decomposition of cDNA Microarray Data Sets. ACM\/IEEE Transactions on Computational Biology and Bioinformatics 2005, 2(2):143\u2013156. 10.1109\/TCBB.2005.29","journal-title":"ACM\/IEEE Transactions on Computational Biology and Bioinformatics"},{"issue":"9","key":"955_CR14","doi-asserted-by":"publisher","first-page":"991","DOI":"10.1038\/ng1630","volume":"37","author":"B Frey","year":"2005","unstructured":"Frey B, Mohammad N, Morris Q, Zhan W, Robinson M, Mnaimneh S, Chang R, Pan Q, Sat E, Rossant J, Bruneau B, Aubin J, Blencowe B, Hughes T: Genome-wide analysis of mouse transcript using exon microarrays and factor graphs. Nature Genetics 2005, 37(9):991\u2013997. 10.1038\/ng1630","journal-title":"Nature Genetics"},{"issue":"6","key":"955_CR15","doi-asserted-by":"publisher","first-page":"520","DOI":"10.1093\/bioinformatics\/17.6.520","volume":"17","author":"O Troyanskaya","year":"2001","unstructured":"Troyanskaya O, Cantor M, Sherlock G, Eisen M, Brown P, Botstein D: Imputing Missing Data for Gene Expression Arrays. Bioinformatics 2001, 17(6):520\u201325. 
10.1093\/bioinformatics\/17.6.520","journal-title":"Bioinformatics"},{"key":"955_CR16","unstructured":"Kernel Machines[http:\/\/www.kernel-machines.org]"},{"key":"955_CR17","volume-title":"Gist: Support Vector Machine and Kernel Principal Components Analysis Software Toolkit","author":"WS Noble","year":"2002","unstructured":"Noble WS, Pavlidis P: Gist: Support Vector Machine and Kernel Principal Components Analysis Software Toolkit.Columbia University; 2002. [http:\/\/microarray.genomecenter.columbia.edu\/gist\/]"},{"key":"955_CR18","doi-asserted-by":"crossref","DOI":"10.1093\/oso\/9780198509844.001.0001","volume-title":"The Statistical Evaluation of Medical Tests for Classification and Prediction","author":"MS Pepe","year":"2003","unstructured":"Pepe MS: The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford University Press; 2003."},{"key":"955_CR19","doi-asserted-by":"publisher","first-page":"29","DOI":"10.1148\/radiology.143.1.7063747","volume":"143","author":"JA Hanley","year":"1982","unstructured":"Hanley JA, McNeil BJ: The Meaning and Use of the Area Under an ROC curve. Radiology 1982, 143: 29\u201336.","journal-title":"Radiology"},{"key":"955_CR20","volume-title":"Advances in Neural Information Processing Systems 16","author":"C Cortes","year":"2004","unstructured":"Cortes C, Mohri M: AUC Optimization vs. Error Rate Minimization. In Advances in Neural Information Processing Systems 16. Edited by: Thrun S, Saul L, Sch\u00f6lkopf B. Cambridge, MA: MIT Press; 2004."},{"key":"955_CR21","volume-title":"R Foundation for Statistical Computing, Vienna, Austria","author":"R Development Core Team","year":"2004","unstructured":"R Development Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria 2004. 
[http:\/\/www.r-project.org] [ISBN 3-900051-07-0]"},{"key":"955_CR22","first-page":"296","volume-title":"Proceedings of 15th International Conference on Machine Learning, San Francisco","author":"D Lin","year":"1998","unstructured":"Lin D: An information-theoretic definition of similarity. In Proceedings of 15th International Conference on Machine Learning, San Francisco. Morgan Kaufmann; 1998:296\u2013304."},{"key":"955_CR23","first-page":"448","volume-title":"Proceedings of the 14th International Joint Conference on Artificial Intelligence","author":"P Resnik","year":"1995","unstructured":"Resnik P: Using information content to evaluate semantic similarity in a taxonomy. Proceedings of the 14th International Joint Conference on Artificial Intelligence 1995, 448\u2013453."},{"issue":"4","key":"955_CR24","doi-asserted-by":"publisher","first-page":"825","DOI":"10.1016\/S0165-1684(02)00475-9","volume":"83","author":"N Bolshakova","year":"2003","unstructured":"Bolshakova N, Azuaje F: Cluster validation techniques for genome expression data. Signal Process 2003, 83(4):825\u2013833. 10.1016\/S0165-1684(02)00475-9","journal-title":"Signal Process"},{"key":"955_CR25","volume-title":"Proceedings of the International Conference on Research in Computational Linguistics, Taiwan","author":"J Jiang","year":"1998","unstructured":"Jiang J, Conrath D: Semantic similarity based on corpus statistics and lexical taxonomy. Proceedings of the International Conference on Research in Computational Linguistics, Taiwan 1998."},{"key":"955_CR26","volume-title":"Tech Rep DI\/FCUL TR 03-29","author":"F Couto","year":"2003","unstructured":"Couto F, Silva M, Coutinho P: Implementation of a Functional Semantic Similarity Measure between Gene-Products. Tech Rep DI\/FCUL TR 03\u201329 Department of Informatics, University of Lisbon; 2003. 
[http:\/\/www.di.fc.ul.pt\/tech-reports]"},{"issue":"2","key":"955_CR27","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1007\/BF02289343","volume":"34","author":"K J\u00f6reskog","year":"1969","unstructured":"J\u00f6reskog K: A General Approach to confirmatory maximum likelihood factor analysis. Psychometrika 1969, 34(2):183\u2013202. 10.1007\/BF02289343","journal-title":"Psychometrika"},{"key":"955_CR28","volume-title":"Optimization Over Integers","author":"D Bertsimas","year":"2005","unstructured":"Bertsimas D, Weismantel R: Optimization Over Integers. Belmont, MA: Dynamic Ideas; 2005."},{"key":"955_CR29","volume-title":"Nonlinear Programming: Theory and Algorithms","author":"M Bazaraa","year":"1993","unstructured":"Bazaraa M, Sherali HD, Shetty CM: Nonlinear Programming: Theory and Algorithms. New York: John Wiley and Sons; 1993."},{"key":"955_CR30","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4615-4381-7","volume-title":"Handbook of Semidefinite Programming","author":"H Wolkowicz","year":"2000","unstructured":"Wolkowicz H, Saigal R, Vandenberghe L: Handbook of Semidefinite Programming. Norwell, MA: Kluwer Academic Press; 2000."},{"key":"955_CR31","unstructured":"Computational INfrastructure for Operations Research[http:\/\/www.coin-or.org]"},{"key":"955_CR32","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-21606-5","volume-title":"The Elements of Statistical Learning: Data-Mining, Inference and Prediction","author":"TJ Hastie","year":"2001","unstructured":"Hastie TJ, Tibshirani RJ, Friedman JH: The Elements of Statistical Learning: Data-Mining, Inference and Prediction. Springer-Verlag; 2001."},{"issue":"457","key":"955_CR33","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1198\/016214502753479248","volume":"97","author":"S Dudoit","year":"2002","unstructured":"Dudoit S, Fridlyand J, Speed T: Comparison of discrimination methods for the classification of tumors using gene expression data. 
Journal of the American Statistical Association 2002, 97(457):77\u201388. 10.1198\/016214502753479248","journal-title":"Journal of the American Statistical Association"},{"issue":"9","key":"955_CR34","doi-asserted-by":"publisher","first-page":"5116","DOI":"10.1073\/pnas.091062498","volume":"98","author":"V Tusher","year":"2001","unstructured":"Tusher V, Tibshirani R, Chu G: Significance analysis of microarrays applied to the ionizing radiation response. Proceedings of the National Academy of Sciences 2001, 98(9):5116\u20135121. 10.1073\/pnas.091062498","journal-title":"Proceedings of the National Academy of Sciences"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-7-216.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,3]],"date-time":"2024-02-03T22:16:13Z","timestamp":1706998573000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-7-216"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,4,21]]},"references-count":34,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2006,12]]}},"alternative-id":["955"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-7-216","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2006,4,21]]},"assertion":[{"value":"13 October 2005","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 April 2006","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 April 2006","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"216"}}