{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T15:57:02Z","timestamp":1761580622072},"reference-count":26,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2008,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>In the current climate of high-throughput computational biology, the inference of a protein's function from related measurements, such as protein-protein interaction relations, has become a canonical task. Most existing technologies pursue this task as a classification problem, on a term-by-term basis, for each term in a database, such as the Gene Ontology (GO) database, a popular rigorous vocabulary for biological functions. However, ontology structures are essentially hierarchies, with certain top to bottom annotation rules which protein function predictions should in principle follow. Currently, the most common approach to imposing these hierarchical constraints on network-based classifiers is through the use of transitive closure to predictions.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We propose a probabilistic framework to integrate information in relational data, in the form of a protein-protein interaction network, and a hierarchically structured database of terms, in the form of the GO database, for the purpose of protein function prediction. At the heart of our framework is a factorization of local neighborhood information in the protein-protein interaction network across successive ancestral terms in the GO hierarchy. We introduce a classifier within this framework, with computationally efficient implementation, that produces GO-term predictions that naturally obey a hierarchical 'true-path' consistency from root to leaves, without the need for further post-processing.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>A cross-validation study, using data from the yeast <jats:italic>Saccharomyces cerevisiae<\/jats:italic>, shows our method offers substantial improvements over both standard 'guilt-by-association' (i.e., Nearest-Neighbor) and more refined Markov random field methods, whether in their original form or when post-processed to artificially impose 'true-path' consistency. Further analysis of the results indicates that these improvements are associated with increased predictive capabilities (i.e., increased positive predictive value), and that this increase is consistent uniformly with GO-term depth. Additional <jats:italic>in silico<\/jats:italic> validation on a collection of new annotations recently added to GO confirms the advantages suggested by the cross-validation study. Taken as a whole, our results show that a hierarchical approach to network-based protein function prediction, that exploits the ontological structure of protein annotation databases in a principled manner, can offer substantial advantages over the successive application of 'flat' network-based methods.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-9-350","type":"journal-article","created":{"date-parts":[[2008,8,23]],"date-time":"2008-08-23T06:13:21Z","timestamp":1219472001000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":31,"title":["Integration of relational and hierarchical network information for protein function prediction"],"prefix":"10.1186","volume":"9","author":[{"given":"Xiaoyu","family":"Jiang","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Naoki","family":"Nariai","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Martin","family":"Steffen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Simon","family":"Kasif","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Eric D","family":"Kolaczyk","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2008,8,22]]},"reference":[{"key":"2335_CR1","doi-asserted-by":"publisher","first-page":"1474","DOI":"10.1038\/nbt1206-1474","volume":"24","author":"TM Murali","year":"2006","unstructured":"Murali TM, Wu CJ, Kasif S: The art of gene function prediction. Nature Biotechnology 2006, 24: 1474\u20131475. 10.1038\/nbt1206-1474","journal-title":"Nature Biotechnology"},{"key":"2335_CR2","doi-asserted-by":"publisher","first-page":"D138","DOI":"10.1093\/nar\/gkh121","volume":"32","author":"A Bateman","year":"2004","unstructured":"Bateman A, Coin L, Durbin R, Finn RD, Hollich V, Griffiths-Jones S, Khanna A, Marshall M, Moxon S, Sonnhammer EL, Studholme DJ, Yeats C, Eddy SR: The Pfam protein families database. Nucleic Acids Res 2004, 32: D138\u201341. [Database issue]. 10.1093\/nar\/gkh121","journal-title":"Nucleic Acids Res"},{"issue":"17","key":"2335_CR3","doi-asserted-by":"publisher","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","volume":"25","author":"SF Altschul","year":"1997","unstructured":"Altschul SF, Madden TL, Schaffer A, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25(17):3389\u2013402. 10.1093\/nar\/25.17.3389","journal-title":"Nucleic Acids Res"},{"key":"2335_CR4","doi-asserted-by":"publisher","first-page":"i197","DOI":"10.1093\/bioinformatics\/btg1026","volume":"19","author":"S Letovsky","year":"2003","unstructured":"Letovsky S, Kasif S: Predicting protein function from protein\/protein interaction data: a probabilistic approach. Bioinformatics 2003, 19: i197-i204. 10.1093\/bioinformatics\/btg1026","journal-title":"Bioinformatics"},{"key":"2335_CR5","doi-asserted-by":"publisher","first-page":"8348","DOI":"10.1073\/pnas.0832373100","volume":"100","author":"OG Troyanskaya","year":"2003","unstructured":"Troyanskaya OG, Dolinski K, Owen AB, Altman RB, Botstein D: A bayesian framework for combining heterogeneous data sources for gene function prediction (in Saccharomyces cerevisiae ). Proc Natl Acad Sci USA 2003, 100: 8348\u20138353. 10.1073\/pnas.0832373100","journal-title":"Proc Natl Acad Sci USA"},{"key":"2335_CR6","doi-asserted-by":"publisher","first-page":"1555","DOI":"10.1126\/science.1099511","volume":"306","author":"I Lee","year":"2004","unstructured":"Lee I, Date SV, Adai AT, Marcotte EM: A probabilistic functional network of yeast genes. Science 2004, 306: 1555\u20131558. 10.1126\/science.1099511","journal-title":"Science"},{"issue":"3","key":"2335_CR7","doi-asserted-by":"publisher","first-page":"e337","DOI":"10.1371\/journal.pone.0000337","volume":"2","author":"N Nariai","year":"2007","unstructured":"Nariai N, Kolaczyk ED, Kasif S: Probabilistic protein function prediction from heterogeneous genome-wide data. PLoS ONE 2007, 2(3):e337. 10.1371\/journal.pone.0000337","journal-title":"PLoS ONE"},{"issue":"9","key":"2335_CR8","doi-asserted-by":"publisher","first-page":"1464","DOI":"10.1093\/bioinformatics\/bth088","volume":"20","author":"T Beissbarth","year":"2004","unstructured":"Beissbarth T, Speed TP: GOstat: find statistically overrepresented Gene Ontologies within a group of genes. Bioinformatics 2004, 20(9):1464\u20135. 10.1093\/bioinformatics\/bth088","journal-title":"Bioinformatics"},{"issue":"18","key":"2335_CR9","doi-asserted-by":"publisher","first-page":"3710","DOI":"10.1093\/bioinformatics\/bth456","volume":"20","author":"EI Boyle","year":"2004","unstructured":"Boyle EI, Weng S, Gollub J, Jin H, Botstein D, Cherry JM, Sherlock G: GO::TermFinder-open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes. Bioinformatics 2004, 20(18):3710\u20135. 10.1093\/bioinformatics\/bth456","journal-title":"Bioinformatics"},{"issue":"12","key":"2335_CR10","doi-asserted-by":"publisher","first-page":"R101","DOI":"10.1186\/gb-2004-5-12-r101","volume":"5","author":"D Martin","year":"2004","unstructured":"Martin D, Brun C, Remy E, Mouren P, Thieffry D, Jacq B: GOToolBox: functional analysis of gene datasets based on Gene Ontology. Genome Biol 2004, 5(12):R101. 10.1186\/gb-2004-5-12-r101","journal-title":"Genome Biol"},{"key":"2335_CR11","doi-asserted-by":"publisher","first-page":"2626","DOI":"10.1093\/bioinformatics\/bth294","volume":"20","author":"GRG Lanckriet","year":"2004","unstructured":"Lanckriet GRG, Bie TD, Cristianini N, Jordan MI, Noble WS: A statistical framework for genomic data fusion. Bioinformatics 2004, 20: 2626\u20132635. 10.1093\/bioinformatics\/bth294","journal-title":"Bioinformatics"},{"key":"2335_CR12","doi-asserted-by":"publisher","first-page":"463","DOI":"10.1089\/1066527041410346","volume":"11","author":"M Deng","year":"2004","unstructured":"Deng M, Chen T, Sun F: An integrated analysis of protein function prediction. Journal of Computational Biology 2004, 11: 463\u2013475. 10.1089\/1066527041410346","journal-title":"Journal of Computational Biology"},{"key":"2335_CR13","doi-asserted-by":"publisher","first-page":"830","DOI":"10.1093\/bioinformatics\/btk048","volume":"22","author":"Z Barutcuoglu","year":"2006","unstructured":"Barutcuoglu Z, Schapire RE, Troyanskaya OG: Hierarchical multi-label prediction of gene function. Bioinformatics 2006, 22: 830\u2013836. 10.1093\/bioinformatics\/btk048","journal-title":"Bioinformatics"},{"key":"2335_CR14","volume-title":"IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","author":"R Eisner","year":"2005","unstructured":"Eisner R, Poulin B, Szafron D, Lu P, Greiner R: Improving protein function prediction using the hierarchical structure of the Gene Ontology. IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology 2005."},{"key":"2335_CR15","volume-title":"proceedings of the 14th International Conference on Machine Learning (ICML)","author":"D Koller","year":"1997","unstructured":"Koller D, Sahami M: Hierarchically classifying documents using very few words. proceedings of the 14th International Conference on Machine Learning (ICML) 1997., 223:"},{"key":"2335_CR16","doi-asserted-by":"publisher","first-page":"448","DOI":"10.1186\/1471-2105-7-448","volume":"7","author":"B Shahbaba","year":"2006","unstructured":"Shahbaba B, Neal M: Gene function classification using Bayesian models with hierarchy-based priors. BMC Bioinformatics 2006, 7: 448. 10.1186\/1471-2105-7-448","journal-title":"BMC Bioinformatics"},{"key":"2335_CR17","volume-title":"Probabilistic Modeling and Machine Learning in Structural and Systems Biology (PMSB)","author":"H Blockeel","year":"2006","unstructured":"Blockeel H, Schietgat L, Struyf J, Clare ADS: Hierarchical multilabel classification trees for gene function prediction. Probabilistic Modeling and Machine Learning in Structural and Systems Biology (PMSB) 2006."},{"issue":"3","key":"2335_CR18","doi-asserted-by":"publisher","first-page":"462","DOI":"10.1109\/TIT.1968.1054142","volume":"IT-14","author":"CK Chow","year":"1968","unstructured":"Chow CK, Liu CN: Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory 1968, IT-14(3):462\u2013467. 10.1109\/TIT.1968.1054142","journal-title":"IEEE Transactions on Information Theory"},{"key":"2335_CR19","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1023\/A:1009778005914","volume":"1","author":"JH Friedman","year":"1997","unstructured":"Friedman JH: On bias, variance, 0\/1-loss, and the curse-of-dimensionality. Data Mining and Knowledge Discovery 1997, 1: 55\u201377. 10.1023\/A:1009778005914","journal-title":"Data Mining and Knowledge Discovery"},{"key":"2335_CR20","doi-asserted-by":"publisher","first-page":"12579","DOI":"10.1073\/pnas.2132527100","volume":"100","author":"MP Samanta","year":"2003","unstructured":"Samanta MP, Liang S: Predicting protein functions from redundancies in large-scale protein interaction networks. PNAS 2003, 100: 12579\u201312583. 10.1073\/pnas.2132527100","journal-title":"PNAS"},{"key":"2335_CR21","doi-asserted-by":"publisher","first-page":"R6","DOI":"10.1186\/gb-2003-5-1-r6","volume":"5","author":"C Brun","year":"2003","unstructured":"Brun C, Chevenet F, Martin D, Wojcik J, Guenoche A, Jacq B: Functional classification of proteins for the prediction of cellular function from a protein-protein interaction network. Genome Biology 2003, 5: R6. 10.1186\/gb-2003-5-1-r6","journal-title":"Genome Biology"},{"issue":"13","key":"2335_CR22","doi-asserted-by":"publisher","first-page":"1623","DOI":"10.1093\/bioinformatics\/btl145","volume":"22","author":"HN Chua","year":"2006","unstructured":"Chua HN, Sung WK, L W: Exploiting indirect neighbours and topological weight to predict protein function from protein-protein interactions. Bioinformatics 2006, 22(13):1623\u20131630. 10.1093\/bioinformatics\/btl145","journal-title":"Bioinformatics"},{"key":"2335_CR23","doi-asserted-by":"publisher","first-page":"S8","DOI":"10.1186\/1471-2105-8-S4-S8","volume":"8","author":"HN Chua","year":"2007","unstructured":"Chua HN, Sung WK, L W: Using indirect protein interactions for the prediction of Gene Ontology functions. BMC Bioinformatics 2007, 8: S8. 10.1186\/1471-2105-8-S4-S8","journal-title":"BMC Bioinformatics"},{"key":"2335_CR24","doi-asserted-by":"publisher","first-page":"i302","DOI":"10.1093\/bioinformatics\/bti1054","volume":"21","author":"E Navieva","year":"2005","unstructured":"Navieva E, Jin K, Agarwal A, Chazelle B, Singh M: Whole-proteome prediction of protein function via graph-theoretic analysis of interaction maps. Bioinformatics 2005, 21: i302-i310. 10.1093\/bioinformatics\/bti1054","journal-title":"Bioinformatics"},{"key":"2335_CR25","first-page":"48737","volume":"NRC","author":"S Kiritchenko","year":"2006","unstructured":"Kiritchenko S, Famili F, Matwin S, Nock R: Learning and evaluation in the presence of class hierarchies: application to text categorization. Proceedings of the 19th Canadian Conference on Artificial Intelligence 2006, NRC: 48737.","journal-title":"Proceedings of the 19th Canadian Conference on Artificial Intelligence"},{"key":"2335_CR26","first-page":"48050","volume":"NRC","author":"S Kiritchenko","year":"2004","unstructured":"Kiritchenko S, Matwin S, Famili AF: Hierarchical text categorization as a tool of associating genes with gene ontology codes. Proceedings of the 2nd European Workshop on Data Mining and Text Mining in Bioinformatics 2004, NRC: 48050.","journal-title":"Proceedings of the 2nd European Workshop on Data Mining and Text Mining in Bioinformatics"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-9-350.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T11:01:04Z","timestamp":1630494064000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-9-350"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,8,22]]},"references-count":26,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2008,12]]}},"alternative-id":["2335"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-9-350","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2008,8,22]]},"assertion":[{"value":"14 March 2008","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 August 2008","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 August 2008","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"350"}}