{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T06:17:18Z","timestamp":1772173038485,"version":"3.50.1"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1010075","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2022,5,25]],"date-time":"2022-05-25T00:00:00Z","timestamp":1653436800000}}],"reference-count":29,"publisher":"Public Library of Science (PLoS)","issue":"5","license":[{"start":{"date-parts":[[2022,5,13]],"date-time":"2022-05-13T00:00:00Z","timestamp":1652400000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001711","name":"schweizerischer nationalfonds zur f\u00f6rderung der wissenschaftlichen forschung","doi-asserted-by":"publisher","award":["PP00P3_170664"],"award-info":[{"award-number":["PP00P3_170664"]}],"id":[{"id":"10.13039\/501100001711","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001711","name":"Schweizerischer Nationalfonds zur F\u00f6rderung der Wissenschaftlichen Forschung","doi-asserted-by":"publisher","award":["PP00P3_202669"],"award-info":[{"award-number":["PP00P3_202669"]}],"id":[{"id":"10.13039\/501100001711","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>Characterising gene function for the ever-increasing number and diversity of species with annotated genomes relies almost entirely on computational prediction methods. These software are also numerous and diverse, each with different strengths and weaknesses as revealed through community benchmarking efforts. Meta-predictors that assess consensus and conflict from individual algorithms should deliver enhanced functional annotations. To exploit the benefits of meta-approaches, we developed CrowdGO, an open-source consensus-based Gene Ontology (GO) term meta-predictor that employs machine learning models with GO term semantic similarities and information contents. By re-evaluating each gene-term annotation, a consensus dataset is produced with high-scoring confident annotations and low-scoring rejected annotations. Applying CrowdGO to results from a deep learning-based, a sequence similarity-based, and two protein domain-based methods, delivers consensus annotations with improved precision and recall. Furthermore, using standard evaluation measures CrowdGO performance matches that of the community\u2019s best performing individual methods. CrowdGO therefore offers a model-informed approach to leverage strengths of individual predictors and produce comprehensive and accurate gene functional annotations.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1010075","type":"journal-article","created":{"date-parts":[[2022,5,13]],"date-time":"2022-05-13T13:41:28Z","timestamp":1652449288000},"page":"e1010075","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":5,"title":["CrowdGO: Machine learning and semantic similarity guided consensus Gene Ontology annotation"],"prefix":"10.1371","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5657-4762","authenticated-orcid":true,"given":"Maarten J. M. F.","family":"Reijnders","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4199-9052","authenticated-orcid":true,"given":"Robert M.","family":"Waterhouse","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2022,5,13]]},"reference":[{"key":"pcbi.1010075.ref001","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene Ontology: tool for the unification of biology","volume":"25","author":"M Ashburner","year":"2000","journal-title":"Nat Genet"},{"key":"pcbi.1010075.ref002","doi-asserted-by":"crossref","first-page":"D330","DOI":"10.1093\/nar\/gky1055","article-title":"The Gene Ontology Resource: 20 years and still GOing strong","volume":"47","author":"The Gene Ontology Consortium","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010075.ref003","doi-asserted-by":"crossref","first-page":"400","DOI":"10.3389\/fgene.2020.00400","article-title":"A Literature Review of Gene Function Prediction by Modeling Gene Ontology","volume":"11","author":"Y Zhao","year":"2020","journal-title":"Front Genet"},{"key":"pcbi.1010075.ref004","doi-asserted-by":"crossref","first-page":"1264","DOI":"10.3390\/genes11111264","article-title":"Automatic Gene Function Prediction in the 2020\u2019s","volume":"11","author":"S Makrodimitris","year":"2020","journal-title":"Genes"},{"key":"pcbi.1010075.ref005","doi-asserted-by":"crossref","first-page":"S5","DOI":"10.1186\/1471-2105-14-S3-S5","article-title":"Protein function prediction using domain families","volume":"14","author":"R Rentzsch","year":"2013","journal-title":"BMC Bioinformatics"},{"key":"pcbi.1010075.ref006","doi-asserted-by":"crossref","first-page":"1544","DOI":"10.1093\/bioinformatics\/btu851","article-title":"PANNZER: high-throughput functional annotation of uncharacterized proteins in an error-prone environment","volume":"31","author":"P Koskinen","year":"2015","journal-title":"Bioinformatics"},{"key":"pcbi.1010075.ref007","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1016\/j.ymeth.2015.08.021","article-title":"Enhancing protein function prediction with taxonomic constraints\u2013The Argot2.5 web server","volume":"93","author":"E Lavezzo","year":"2016","journal-title":"Methods"},{"key":"pcbi.1010075.ref008","doi-asserted-by":"crossref","first-page":"D351","DOI":"10.1093\/nar\/gky1100","article-title":"InterPro in 2019: improving coverage, classification and access to protein sequence annotations","volume":"47","author":"AL Mitchell","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010075.ref009","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1038\/nmeth.3213","article-title":"The I-TASSER Suite: protein structure and function prediction","volume":"12","author":"J Yang","year":"2015","journal-title":"Nat Methods"},{"key":"pcbi.1010075.ref010","doi-asserted-by":"crossref","first-page":"660","DOI":"10.1093\/bioinformatics\/btx624","article-title":"DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier","volume":"34","author":"M Kulmanov","year":"2018","journal-title":"Bioinforma Oxf Engl"},{"key":"pcbi.1010075.ref011","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1093\/bioinformatics\/btz595","article-title":"DeepGOPlus: improved protein function prediction from sequence. Cowen L, editor","volume":"36","author":"M Kulmanov","year":"2020","journal-title":"Bioinformatics"},{"key":"pcbi.1010075.ref012","doi-asserted-by":"crossref","first-page":"244","DOI":"10.1186\/s13059-019-1835-8","article-title":"The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens","volume":"20","author":"N Zhou","year":"2019","journal-title":"Genome Biol"},{"key":"pcbi.1010075.ref013","doi-asserted-by":"crossref","first-page":"D506","DOI":"10.1093\/nar\/gky1049","article-title":"UniProt: a worldwide hub of protein knowledge","volume":"47","author":"The UniProt Consortium","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010075.ref014","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1006\/jcss.1997.1504","article-title":"A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting","volume":"55","author":"Y Freund","year":"1997","journal-title":"J Comput Syst Sci"},{"key":"pcbi.1010075.ref015","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1007\/978-1-4939-3743-1_12","volume-title":"The Gene Ontology Handbook","author":"C. Pesquita","year":"2017"},{"key":"pcbi.1010075.ref016","first-page":"296","article-title":"An information-theoretic definition of similarity","author":"D. Lin","year":"1998","journal-title":"Proc 15th Int Conf Mach Learn."},{"key":"pcbi.1010075.ref017","doi-asserted-by":"crossref","first-page":"e12931","DOI":"10.7717\/peerj.12931","article-title":"Wei2GO: weighted sequence similarity-based protein function prediction","volume":"10","author":"MJMF Reijnders","year":"2022","journal-title":"PeerJ"},{"key":"pcbi.1010075.ref018","doi-asserted-by":"crossref","first-page":"1236","DOI":"10.1093\/bioinformatics\/btu031","article-title":"InterProScan 5: genome-scale protein function classification","volume":"30","author":"P Jones","year":"2014","journal-title":"Bioinformatics"},{"key":"pcbi.1010075.ref019","doi-asserted-by":"crossref","first-page":"400","DOI":"10.1186\/s12859-019-2988-x","article-title":"FunFam protein families improve residue level molecular function prediction","volume":"20","author":"L Scheibenreif","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"pcbi.1010075.ref020","first-page":"2825","article-title":"Scikit-learn: Machine Learning in Python","volume":"12","author":"F Pedregosa","year":"2011","journal-title":"J Mach Learn Res"},{"key":"pcbi.1010075.ref021","doi-asserted-by":"crossref","first-page":"2465","DOI":"10.1093\/bioinformatics\/bty130","article-title":"GOLabeler: improving sequence-based large-scale protein function prediction by learning to rank. Wren J, editor","volume":"34","author":"R You","year":"2018","journal-title":"Bioinformatics"},{"key":"pcbi.1010075.ref022","doi-asserted-by":"crossref","first-page":"W373","DOI":"10.1093\/nar\/gkz375","article-title":"INGA 2.0: improving protein function prediction for the dark proteome","volume":"47","author":"D Piovesan","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010075.ref023","doi-asserted-by":"crossref","first-page":"753","DOI":"10.1093\/bioinformatics\/bty704","article-title":"Phylo-PFP: improved automated protein function prediction using phylogenetic distance of distantly related sequences. Schwartz R, editor","volume":"35","author":"A Jain","year":"2019","journal-title":"Bioinformatics"},{"key":"pcbi.1010075.ref024","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.ymeth.2015.08.009","article-title":"GoFDR: A sequence alignment based method for predicting protein functions","volume":"93","author":"Q Gong","year":"2016","journal-title":"Methods"},{"key":"pcbi.1010075.ref025","doi-asserted-by":"crossref","first-page":"2520","DOI":"10.1093\/bioinformatics\/bts480","article-title":"Snakemake\u2014a scalable bioinformatics workflow engine","volume":"28","author":"J K\u00f6ster","year":"2012","journal-title":"Bioinformatics"},{"key":"pcbi.1010075.ref026","doi-asserted-by":"crossref","DOI":"10.1093\/database\/bar068","article-title":"Manual GO annotation of predictive protein signatures: the InterPro approach to GO curation","volume":"2012","author":"S Burge","year":"2012","journal-title":"Database"},{"key":"pcbi.1010075.ref027","doi-asserted-by":"crossref","first-page":"785","DOI":"10.1145\/2939672.2939785","volume-title":"Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"T Chen","year":"2016"},{"key":"pcbi.1010075.ref028","first-page":"886","article-title":"Gene Ontology semantic similarity tools: survey on features and challenges for biological knowledge discovery","volume":"18","author":"GK Mazandu","year":"2016","journal-title":"Brief Bioinform"},{"key":"pcbi.1010075.ref029","article-title":"Semantic similarity and machine learning with ontologies","author":"M Kulmanov","year":"2020","journal-title":"Brief Bioinform"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1010075","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2022,5,25]],"date-time":"2022-05-25T00:00:00Z","timestamp":1653436800000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1010075","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,5]],"date-time":"2023-02-05T00:39:10Z","timestamp":1675557550000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1010075"}},"subtitle":[],"editor":[{"given":"Jacquelyn S.","family":"Fetrow","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2022,5,13]]},"references-count":29,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2022,5,13]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1010075","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/731596","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,5,13]]}}}