{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T06:44:50Z","timestamp":1776408290293,"version":"3.51.2"},"reference-count":26,"publisher":"Oxford University Press (OUP)","issue":"9","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2013,5,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Computational biologists have demonstrated the utility of using machine learning methods to predict protein function from an integration of multiple genome-wide data types. Yet, even the best performing function prediction algorithms rely on heuristics for important components of the algorithm, such as choosing negative examples (proteins without a given function) or determining key parameters. The improper choice of negative examples, in particular, can hamper the accuracy of protein function prediction.<\/jats:p>\n               <jats:p>Results: We present a novel approach for choosing negative examples, using a parameterizable Bayesian prior computed from all observed annotation data, which also generates priors used during function prediction. We incorporate this new method into the GeneMANIA function prediction algorithm and demonstrate improved accuracy of our algorithm over current top-performing function prediction methods on the yeast and mouse proteomes across all metrics tested.<\/jats:p>\n               <jats:p>Availability: Code and Data are available at: http:\/\/bonneaulab.bio.nyu.edu\/funcprop.html<\/jats:p>\n               <jats:p>Contact: shasha@courant.nyu.edu or bonneau@cs.nyu.edu<\/jats:p>\n               <jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btt110","type":"journal-article","created":{"date-parts":[[2013,3,20]],"date-time":"2013-03-20T01:40:36Z","timestamp":1363743636000},"page":"1190-1198","source":"Crossref","is-referenced-by-count":32,"title":["Parametric Bayesian priors and better choice of negative examples improve protein function prediction"],"prefix":"10.1093","volume":"29","author":[{"given":"Noah","family":"Youngs","sequence":"first","affiliation":[{"name":"1 Department of Computer Science and 2Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Duncan","family":"Penfold-Brown","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science and 2Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kevin","family":"Drew","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science and 2Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dennis","family":"Shasha","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science and 2Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Richard","family":"Bonneau","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science and 2Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA"},{"name":"1 Department of Computer Science and 2Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2013,3,19]]},"reference":[{"key":"2023012810374101800_btt110-B1","doi-asserted-by":"crossref","first-page":"1981","DOI":"10.1101\/gr.121475.111","article-title":"The Proteome Folding Project: proteome-scale prediction of structure and function","volume":"21","author":"Drew","year":"2011","journal-title":"Genome Res."},{"key":"2023012810374101800_btt110-B2","doi-asserted-by":"crossref","first-page":"1875","DOI":"10.1093\/bioinformatics\/btg352","article-title":"Learning to predict protein- protein interactions","volume":"19","author":"Gomez","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012810374101800_btt110-B3","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1111\/j.1749-6632.2011.06383.x","article-title":"Accurate evaluation and analysis of functional genomics data and methods","volume":"1260","author":"Greene","year":"2012","journal-title":"Ann. NY Acad. Sci."},{"key":"2023012810374101800_btt110-B4","doi-asserted-by":"crossref","first-page":"S3","DOI":"10.1186\/gb-2008-9-s1-s3","article-title":"Predicting gene function in a hierarchical context with an ensemble of classifiers","volume":"9","author":"Guan","year":"2008","journal-title":"Genome Biol."},{"key":"2023012810374101800_btt110-B5","doi-asserted-by":"crossref","first-page":"2404","DOI":"10.1093\/bioinformatics\/btp397","article-title":"The impact of incomplete knowledge on evaluation: an experimental benchmark for protein function prediction","volume":"25","author":"Huttenhower","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012810374101800_btt110-B6","doi-asserted-by":"crossref","first-page":"S5","DOI":"10.1186\/gb-2008-9-s1-s5","article-title":"Inferring mouse gene functions from genomic-scale data using a combined functional network\/classification strategy","volume":"9","author":"Kim","year":"2008","journal-title":"Genome Biol."},{"key":"2023012810374101800_btt110-B7","first-page":"S5","article-title":"Predicting gene function from patterns of annotation","volume":"9","author":"King","year":"2003","journal-title":"Genome Res."},{"key":"2023012810374101800_btt110-B8","first-page":"896","article-title":"Diffusion Kernel-based logistic regression models for protein function prediction","volume":"13","author":"Lee","year":"2006","journal-title":"OMICS"},{"key":"2023012810374101800_btt110-B9","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1093\/bioinformatics\/bth491","article-title":"Predicting protein functions with message passing algorithms","volume":"21","author":"Leone","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012810374101800_btt110-B10","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1038\/47048","article-title":"A combined algorithm for genome-wide prediction of protein function","volume":"402","author":"Marcotte","year":"1999","journal-title":"Nature"},{"key":"2023012810374101800_btt110-B11","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1186\/gb-2008-9-s1-s4","article-title":"GeneMANIA: a real-time multiple association network integration algorithm for predicting gene function","volume":"9","author":"Mostafavi","year":"2008","journal-title":"Genome Biol."},{"key":"2023012810374101800_btt110-B12","article-title":"Using the gene ontology hierarchy when predicting gene function","volume-title":"Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence","author":"Mostafavi","year":"2009"},{"key":"2023012810374101800_btt110-B13","doi-asserted-by":"crossref","first-page":"1759","DOI":"10.1093\/bioinformatics\/btq262","article-title":"Fast integration of heterogeneous data sources for predicting gene function with limited annotation","volume":"26","author":"Mostafavi","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012810374101800_btt110-B14","doi-asserted-by":"crossref","first-page":"S6","DOI":"10.1186\/gb-2008-9-s1-s6","article-title":"Consistent probabilistic outputs for protein function prediction","volume":"9","author":"Obozinski","year":"2008","journal-title":"Genome Biol."},{"key":"2023012810374101800_btt110-B15","doi-asserted-by":"crossref","first-page":"14","DOI":"10.12688\/f1000research.1-14.v1","article-title":"Progress and challenges in the computational prediction of gene function using networks","volume":"1","author":"Pavlidis","year":"2012","journal-title":"F1000 Res."},{"key":"2023012810374101800_btt110-B16","doi-asserted-by":"crossref","first-page":"S2","DOI":"10.1186\/gb-2008-9-s1-s2","article-title":"A critical assessment of Mus musculus gene function prediction using integrated genomic evidence","volume":"9","author":"Pe\u00f1a-Castillo","year":"2008","journal-title":"Genome Biol."},{"key":"2023012810374101800_btt110-B17","doi-asserted-by":"crossref","first-page":"431","DOI":"10.1093\/bioinformatics\/btq675","article-title":"Cytoscape 2.8: new features for data integration and network visualization","volume":"27","author":"Smoot","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012810374101800_btt110-B18","doi-asserted-by":"crossref","first-page":"D535","DOI":"10.1093\/nar\/gkj109","article-title":"BioGRID: a general repository for interaction datasets","volume":"34","author":"Stark","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023012810374101800_btt110-B19","doi-asserted-by":"crossref","first-page":"4185","DOI":"10.1002\/nme.1620372405","article-title":"Successive conjugate gradient methods for structural analysis with multiple load cases","volume":"37","author":"Suarjana","year":"1994","journal-title":"Int. J. Num. Methods Eng."},{"key":"2023012810374101800_btt110-B20","doi-asserted-by":"crossref","first-page":"S8","DOI":"10.1186\/gb-2008-9-s1-s8","article-title":"An en masse phenotype and function prediction system for Mus musculus","volume":"9","author":"Tasan","year":"2008","journal-title":"Genome Biol."},{"key":"2023012810374101800_btt110-B21","doi-asserted-by":"crossref","first-page":"8348","DOI":"10.1073\/pnas.0832373100","article-title":"A Bayesian framework for combining heterogeneous data sources for gene function prediction (in Saccharomyces cerevisiae)","volume":"100","author":"Troyanskaya","year":"2003","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023012810374101800_btt110-B22","doi-asserted-by":"crossref","first-page":"ii59","DOI":"10.1093\/bioinformatics\/bti1110","article-title":"Fast protein classification with multiple networks","volume":"21","author":"Tsuda","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012810374101800_btt110-B23","first-page":"531","article-title":"Random forest similarity for protein-protein interaction prediction from multiple sources","author":"Qi","year":"2008","journal-title":"Pac. Symp. Biocomput."},{"key":"2023012810374101800_btt110-B24","doi-asserted-by":"crossref","first-page":"254","DOI":"10.1504\/IJCBDD.2008.021418","article-title":"An integrated probabilistic approach for gene function prediction using multiple sources of high-throughput data","volume":"1","author":"Zhang","year":"2008","journal-title":"Int. J. Comput. Biol. Drug Des."},{"key":"2023012810374101800_btt110-B25","first-page":"321","article-title":"Learning with local and global consistency","volume":"16","author":"Zhou","year":"2004","journal-title":"Adv. Neural Inf. Process Syst."},{"key":"2023012810374101800_btt110-B26","article-title":"Semi-supervised learning using Gaussian fields and harmonic functions","volume-title":"Proceedings of the Twentieth International Conference on Machine Learning","author":"Zhu","year":"2003"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/29\/9\/1190\/48897396\/bioinformatics_29_9_1190.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/29\/9\/1190\/48897396\/bioinformatics_29_9_1190.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,28]],"date-time":"2023-01-28T12:22:14Z","timestamp":1674908534000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/29\/9\/1190\/218296"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,3,19]]},"references-count":26,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2013,5,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btt110","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2013,5,1]]},"published":{"date-parts":[[2013,3,19]]}}}