{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,14]],"date-time":"2026-02-14T10:00:17Z","timestamp":1771063217973,"version":"3.50.1"},"reference-count":21,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2026,1,7]],"date-time":"2026-01-07T00:00:00Z","timestamp":1767744000000},"content-version":"vor","delay-in-days":6,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"European Union\u2019s Horizon 2020 Marie Sk\u0142odowska-Curie Actions"},{"name":"United Kingdom Research and Innovation (UKRI) Biotechnology and Biological Sciences Research Council","award":["UKRI746:24BBR"],"award-info":[{"award-number":["UKRI746:24BBR"]}]},{"name":"United Kingdom Medical Research Council","award":["MR\/W024233\/1"],"award-info":[{"award-number":["MR\/W024233\/1"]}]},{"DOI":"10.13039\/100010269","name":"Wellcome Trust","doi-asserted-by":"publisher","award":["310300\/Z\/24\/Z & 218236\/Z\/19\/Z"],"award-info":[{"award-number":["310300\/Z\/24\/Z & 218236\/Z\/19\/Z"]}],"id":[{"id":"10.13039\/100010269","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100013060","name":"European Molecular Biology Laboratory","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100013060","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026,1,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>The exponential growth of non-coding RNA research\u2014with over 230\u2009000 papers published since 2000\u2014has created an urgent knowledge management crisis in molecular biology. Despite their crucial regulatory roles, microRNAs (miRNAs) face a significant curation bottleneck, with only 1400 articles manually curated to the Gene Ontology (GO) knowledgebase over a decade. This highlights the critical need for automated systems that can accelerate biocuration while maintaining high-quality standards.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We present GOFlowLLM, an automated curation pipeline powered by reasoning-enabled Large Language Models (LLMs) that follows established GO curation flowcharts to extract and structure miRNA-mediated gene silencing data at scale. When evaluated on existing curation, GOFlowLLM selects the correct GO term in 90% of cases, with curators agreeing with 95% of the system\u2019s reasoning steps and 90% of the evidence selected. Applied to 6996 previously uncurated articles using the Qwen QwQ-32B model, our system identified 2538 new candidate GO annotations on 1785 articles in just 58\u2009hours\u2014potentially doubling the available miRNA GO curation. Manual review shows curators agreed with the selected term in 87% of cases, the model\u2019s reasoning in 92% of cases, and the extracted evidence in 93%. The integration of reasoning traces provides transparent justification for annotations that can be reviewed by human curators, addressing a key challenge in adopting AI for scientific curation.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>GOFlowLLM is implemented as an automated pipeline that follows expert-designed reasoning frameworks to maintain curation quality. The system is available on GitHub: https:\/\/github.com\/RNAcentral\/GO_Flow_LLM.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf683","type":"journal-article","created":{"date-parts":[[2026,1,7]],"date-time":"2026-01-07T04:08:21Z","timestamp":1767758901000},"source":"Crossref","is-referenced-by-count":1,"title":["GOFlowLLM\u2014curating miRNA literature with large language models and flowcharts"],"prefix":"10.1093","volume":"42","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8297-0953","authenticated-orcid":false,"given":"Andrew","family":"Green","sequence":"first","affiliation":[{"name":"European Bioinformatics Institute , Cambridge, CB10 1SD,","place":["United Kingdom"]}]},{"given":"Nancy","family":"Ontiveros-Palacios","sequence":"additional","affiliation":[{"name":"European Bioinformatics Institute , Cambridge, CB10 1SD,","place":["United Kingdom"]}]},{"given":"Isaac","family":"Jandalala","sequence":"additional","affiliation":[{"name":"European Bioinformatics Institute , Cambridge, CB10 1SD,","place":["United Kingdom"]}]},{"given":"Simona","family":"Panni","sequence":"additional","affiliation":[{"name":"Biology, Ecology and Earth Science, University of Calabria , Rende, 87036,","place":["Italy"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6330-7526","authenticated-orcid":false,"given":"Valerie","family":"Wood","sequence":"additional","affiliation":[{"name":"Department of Biochemistry, University of Cambridge , Cambridge, CB2 1GA,","place":["United Kingdom"]}]},{"given":"Giulia","family":"Antonazzo","sequence":"additional","affiliation":[{"name":"Department of Physiology, Development and Neuroscience, University of Cambridge , Cambridge, CB2 3DY,","place":["United Kingdom"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3212-6364","authenticated-orcid":false,"given":"Helen","family":"Attrill","sequence":"additional","affiliation":[{"name":"Department of Physiology, Development and Neuroscience, University of Cambridge , Cambridge, CB2 3DY,","place":["United Kingdom"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6982-4660","authenticated-orcid":false,"given":"Alex","family":"Bateman","sequence":"additional","affiliation":[{"name":"European Bioinformatics Institute , Cambridge, CB10 1SD,","place":["United Kingdom"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6497-2883","authenticated-orcid":false,"given":"Blake","family":"Sweeney","sequence":"additional","affiliation":[{"name":"European Bioinformatics Institute , Cambridge, CB10 1SD,","place":["United Kingdom"]}]}],"member":"286","published-online":{"date-parts":[[2026,1,6]]},"reference":[{"key":"2026011114011140700_btaf683-B1","author":"Aggarwal","year":"2025"},{"key":"2026011114011140700_btaf683-B2","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1080\/15476286.2024.2408523","article-title":"Representation of non-coding RNA-mediated regulation of gene expression using the gene ontology","volume":"21","author":"Antonazzo","year":"2024","journal-title":"RNA Biol"},{"key":"2026011114011140700_btaf683-B3","doi-asserted-by":"crossref","first-page":"btae104","DOI":"10.1093\/bioinformatics\/btae104","article-title":"Structured prompt interrogation and recursive extraction of semantics (SPIRES): a method for populating knowledge bases using zero-shot learning","volume":"40","author":"Caufield","year":"2024","journal-title":"Bioinformatics"},{"key":"2026011114011140700_btaf683-B4","author":"Caufield","year":"2024"},{"key":"2026011114011140700_btaf683-B5","doi-asserted-by":"crossref","first-page":"D116","DOI":"10.1093\/nar\/gkae1094","article-title":"MirGeneDB 3.0: improved taxonomic sampling, uniform nomenclature of novel conserved microRNA families and updated covariance models","volume":"53","author":"Clarke","year":"2025","journal-title":"Nucleic Acids Res"},{"key":"2026011114011140700_btaf683-B6","doi-asserted-by":"crossref","first-page":"D147","DOI":"10.1093\/nar\/gkae1072","article-title":"miRTarBase 2025: updates to the collection of experimentally validated microRNA-target interactions","volume":"53","author":"Cui","year":"2025","journal-title":"Nucleic Acids Res"},{"key":"2026011114011140700_btaf683-B7","doi-asserted-by":"crossref","first-page":"baac062","DOI":"10.1093\/database\/baac062","article-title":"A roadmap for the functional annotation of protein families: a community perspective","volume":"2022","author":"de Cr\u00e9cy-Lagard","year":"2022","journal-title":"Database"},{"key":"2026011114011140700_btaf683-B8","doi-asserted-by":"crossref","first-page":"W783","DOI":"10.1093\/nar\/gki470","article-title":"GoPubMed: exploring PubMed with the gene ontology","volume":"33","author":"Doms","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2026011114011140700_btaf683-B9","doi-asserted-by":"crossref","first-page":"baaf006","DOI":"10.1093\/database\/baaf006","article-title":"LitSumm: large language models for literature summarization of noncoding RNAs","volume":"2025","author":"Green","year":"2025","journal-title":"Database"},{"key":"2026011114011140700_btaf683-B10","doi-asserted-by":"crossref","first-page":"1005","DOI":"10.1261\/rna.065565.118","article-title":"Expanding the horizons of microRNA bioinformatics","volume":"24","author":"Huntley","year":"2018","journal-title":"RNA"},{"key":"2026011114011140700_btaf683-B11","doi-asserted-by":"crossref","first-page":"667","DOI":"10.1261\/rna.055301.115","article-title":"Guidelines for the functional annotation of microRNAs using the gene ontology","volume":"22","author":"Huntley","year":"2016","journal-title":"RNA"},{"key":"2026011114011140700_btaf683-B12","doi-asserted-by":"crossref","first-page":"D326","DOI":"10.1093\/nar\/gkab997","article-title":"RNAInter v4.0: RNA interactome repository with redefined confidence scoring system and improved accessibility","volume":"50","author":"Kang","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2026011114011140700_btaf683-B13","doi-asserted-by":"crossref","first-page":"D155","DOI":"10.1093\/nar\/gky1141","article-title":"miRBase: from microRNA sequences to function","volume":"47","author":"Kozomara","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2026011114011140700_btaf683-B14","doi-asserted-by":"crossref","first-page":"e309","DOI":"10.1371\/journal.pbio.0020309","article-title":"Textpresso: an ontology-based information retrieval and extraction system for biological literature","volume":"2","author":"M\u00fcller","year":"2004","journal-title":"PLoS Biol"},{"key":"2026011114011140700_btaf683-B15","author":"Niyonkuru","year":"2024"},{"key":"2026011114011140700_btaf683-B16","doi-asserted-by":"crossref","first-page":"iyad211","DOI":"10.1093\/genetics\/iyad211","article-title":"FlyBase: updates to the Drosophila genes and genomes database","volume":"227","author":"\u00d6zt\u00fcrk-\u00c7olak","year":"2024","journal-title":"Genetics"},{"key":"2026011114011140700_btaf683-B17","doi-asserted-by":"crossref","first-page":"baad066","DOI":"10.1093\/database\/baad066","article-title":"The landscape of microRNA interaction annotation: analysis of three rare disorders as a case study","volume":"2023","author":"Panni","year":"2023","journal-title":"Database"},{"key":"2026011114011140700_btaf683-B18","author":"Qwen Team","year":"2025"},{"key":"2026011114011140700_btaf683-B19","doi-asserted-by":"crossref","first-page":"btae756","DOI":"10.1093\/bioinformatics\/btae756","article-title":"FuncFetch: an LLM-assisted workflow enables mining thousands of enzyme\u2013substrate interactions from published manuscripts","volume":"41","author":"Smith","year":"2024","journal-title":"Bioinformatics"},{"key":"2026011114011140700_btaf683-B20","doi-asserted-by":"crossref","first-page":"iyad031","DOI":"10.1093\/genetics\/iyad031","article-title":"The gene ontology knowledgebase in 2023","volume":"224","author":"The Gene Ontology Consortium","year":"2023","journal-title":"Genetics"},{"key":"2026011114011140700_btaf683-B21","doi-asserted-by":"crossref","first-page":"btaf113","DOI":"10.1093\/bioinformatics\/btaf113","article-title":"Lit-OTAR framework for extracting biological evidences from literature","volume":"41","author":"Tirunagari","year":"2025","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf683\/66285579\/btaf683.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/1\/btaf683\/66285579\/btaf683.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/1\/btaf683\/66285579\/btaf683.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,11]],"date-time":"2026-01-11T19:01:20Z","timestamp":1768158080000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf683\/8415910"}},"subtitle":[],"editor":[{"suffix":"Dr.","given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2026,1]]},"references-count":21,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2026,1,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf683","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2026,1]]},"published":{"date-parts":[[2026,1]]},"article-number":"btaf683"}}