{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,15]],"date-time":"2026-05-15T15:53:38Z","timestamp":1778860418417,"version":"3.51.4"},"reference-count":50,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2025,7,10]],"date-time":"2025-07-10T00:00:00Z","timestamp":1752105600000},"content-version":"vor","delay-in-days":9,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"Self-supporting Program of Guangzhou National Laboratory","award":["SRPG22007"],"award-info":[{"award-number":["SRPG22007"]}]},{"name":"Guangzhou National Laboratory","award":["QNPG23-12"],"award-info":[{"award-number":["QNPG23-12"]}]},{"name":"Major Project of Guangzhou National Laboratory","award":["GZNL2025C01013"],"award-info":[{"award-number":["GZNL2025C01013"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,7,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Advancements in high-throughput sequencing technologies and artificial intelligence (AI) offer unprecedented opportunities for groundbreaking discoveries in bioinformatics research. However, the challenges of exponential growth of omics data and the rapid development of AI technologies require automated big biological data analysis capability and interdisciplinary knowledge-driven scientific insight. Here, we propose a data-intelligence-intensive bioinformatics copilot (Bio-Copilot) system that synergizes AI capabilities with human researchers to facilitate hypothesis-free exploratory research and inspire novel scientific insights in large-scale omics studies. Bio-Copilot forms high-quality intensive intelligence through close collaboration between multiple agents, driven by large language models (LLMs), and human researchers. 
To augment the capabilities of Bio-Copilot, this study devises an agent group management strategy, an effective human\u2013agent interaction mechanism, a shared interdisciplinary knowledge database, and continuous learning strategies for the agents. We comprehensively compare Bio-Copilot against GPT-4o and several leading AI agents across diverse bioinformatics tasks, using a broad range of evaluation metrics. Bio-Copilot achieves overall state-of-the-art performance across all tasks, while showcasing exceptional task completeness. Furthermore, on application to constructing a large-scale human lung cell atlas, Bio-Copilot not only reproduces the intricate data integration process detailed in a seminal study but also introduces a recursive, multilevel annotation strategy to capture the continuous nature of cellular states and uncovers the characteristics of rare cell types, highlighting its potential to unravel hidden complexities in biological systems. Beyond the technical achievements, this study also underscores the profound implications of integrating AI capabilities with expert knowledge in accelerating impactful biological discoveries and exploring uncharted territories.<\/jats:p>","DOI":"10.1093\/bib\/bbaf312","type":"journal-article","created":{"date-parts":[[2025,7,10]],"date-time":"2025-07-10T22:04:33Z","timestamp":1752185073000},"source":"Crossref","is-referenced-by-count":6,"title":["A data-intelligence-intensive bioinformatics copilot system for large-scale omics research and scientific insights"],"prefix":"10.1093","volume":"26","author":[{"given":"Yang","family":"Liu","sequence":"first","affiliation":[{"name":"Guangzhou National Laboratory , No. 
9 XingDaoHuanBei Road, Guangzhou International Bio Island, Guangzhou 510005 , Guangdong Province,","place":["China"]},{"name":"Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, GMU-GIBH Joint School of Life Sciences, Guangzhou Medical University , Guangzhou 511436 ,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rongbo","family":"Shen","sequence":"additional","affiliation":[{"name":"Guangzhou National Laboratory , No. 9 XingDaoHuanBei Road, Guangzhou International Bio Island, Guangzhou 510005 , Guangdong Province,","place":["China"]},{"name":"Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, GMU-GIBH Joint School of Life Sciences, Guangzhou Medical University , Guangzhou 511436 ,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lu","family":"Zhou","sequence":"additional","affiliation":[{"name":"Guangzhou National Laboratory , No. 9 XingDaoHuanBei Road, Guangzhou International Bio Island, Guangzhou 510005 , Guangdong Province,","place":["China"]},{"name":"Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, GMU-GIBH Joint School of Life Sciences, Guangzhou Medical University , Guangzhou 511436 ,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qingyu","family":"Xiao","sequence":"additional","affiliation":[{"name":"Guangzhou National Laboratory , No. 9 XingDaoHuanBei Road, Guangzhou International Bio Island, Guangzhou 510005 , Guangdong Province,","place":["China"]},{"name":"Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, GMU-GIBH Joint School of Life Sciences, Guangzhou Medical University , Guangzhou 511436 ,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiao","family":"Yuan","sequence":"additional","affiliation":[{"name":"Guangzhou National Laboratory , No. 
9 XingDaoHuanBei Road, Guangzhou International Bio Island, Guangzhou 510005 , Guangdong Province,","place":["China"]},{"name":"Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, GMU-GIBH Joint School of Life Sciences, Guangzhou Medical University , Guangzhou 511436 ,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yixue","family":"Li","sequence":"additional","affiliation":[{"name":"Guangzhou National Laboratory , No. 9 XingDaoHuanBei Road, Guangzhou International Bio Island, Guangzhou 510005 , Guangdong Province,","place":["China"]},{"name":"Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, GMU-GIBH Joint School of Life Sciences, Guangzhou Medical University , Guangzhou 511436 ,","place":["China"]},{"name":"Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences , Shanghai 200030 ,","place":["China"]},{"name":"Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of the Chinese Academy of Sciences , Hangzhou 310024 ,","place":["China"]},{"name":"School of Life Sciences and Biotechnology, Shanghai Jiao Tong University , Shanghai 200240 ,","place":["China"]},{"name":"Collaborative Innovation Center for Genetics and Development, Fudan University , Shanghai 200433 ,","place":["China"]},{"name":"Shanghai Institute for Biomedical and Pharmaceutical Technologies , Shanghai 200032 ,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2025,7,10]]},"reference":[{"key":"2025071107013106700_ref1","doi-asserted-by":"publisher","first-page":"305","DOI":"10.1038\/ng1109","article-title":"Bioinformatics in the post-sequence era","volume":"33","author":"Kanehisa","year":"2003","journal-title":"Nat 
Genet"},{"key":"2025071107013106700_ref2","doi-asserted-by":"publisher","first-page":"1981","DOI":"10.1093\/bib\/bby063","article-title":"A brief history of bioinformatics","volume":"20","author":"Gauthier","year":"2019","journal-title":"Brief Bioinform"},{"key":"2025071107013106700_ref3","doi-asserted-by":"publisher","first-page":"640","DOI":"10.1038\/msb.2012.61","article-title":"High-throughput sequencing for biology and medicine","volume":"9","author":"Soon","year":"2013","journal-title":"Mol Syst Biol"},{"key":"2025071107013106700_ref4","doi-asserted-by":"publisher","first-page":"640","DOI":"10.1038\/nbt.3880","article-title":"Single-cell genome sequencing at ultra-high-throughput with microfluidic droplet barcoding","volume":"35","author":"Lan","year":"2017","journal-title":"Nat Biotechnol"},{"key":"2025071107013106700_ref5","doi-asserted-by":"publisher","first-page":"1135","DOI":"10.1038\/nbt1486","article-title":"Next-generation DNA sequencing","volume":"26","author":"Shendure","year":"2008","journal-title":"Nat Biotechnol"},{"key":"2025071107013106700_ref6","doi-asserted-by":"publisher","first-page":"311","DOI":"10.1038\/nmeth0411-311","article-title":"Single-cell genomics","volume":"8","author":"Kalisky","year":"2011","journal-title":"Nat Methods"},{"key":"2025071107013106700_ref7","doi-asserted-by":"publisher","first-page":"175","DOI":"10.1038\/nrg.2015.16","article-title":"Single-cell genome sequencing: current state of the science","volume":"17","author":"Gawad","year":"2016","journal-title":"Nat Rev Genet"},{"key":"2025071107013106700_ref8","doi-asserted-by":"publisher","first-page":"331","DOI":"10.1038\/nature21350","article-title":"Scaling single-cell genomics from phenomenology to mechanism","volume":"541","author":"Tanay","year":"2017","journal-title":"Nature"},{"key":"2025071107013106700_ref9","doi-asserted-by":"publisher","first-page":"22","DOI":"10.1038\/nmeth.2764","article-title":"Entering the era of single-cell transcriptomics in biology and 
medicine","volume":"11","author":"Sandberg","year":"2014","journal-title":"Nat Methods"},{"key":"2025071107013106700_ref10","doi-asserted-by":"publisher","first-page":"133","DOI":"10.1038\/nrg3833","article-title":"Computational and analytical challenges in single-cell transcriptomics","volume":"16","author":"Stegle","year":"2015","journal-title":"Nat Rev Genet"},{"key":"2025071107013106700_ref11","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1038\/s41592-018-0273-y","article-title":"Single-cell proteomics","volume":"16","author":"Doerr","year":"2019","journal-title":"Nat Methods"},{"key":"2025071107013106700_ref12","doi-asserted-by":"publisher","first-page":"809","DOI":"10.1038\/s41592-019-0540-6","article-title":"A dream of single-cell proteomics","volume":"16","author":"Marx","year":"2019","journal-title":"Nat Methods"},{"key":"2025071107013106700_ref13","doi-asserted-by":"publisher","first-page":"3341","DOI":"10.1038\/s41467-021-23667-y","article-title":"Quantitative single-cell proteomics as a tool to characterize cellular hierarchies","volume":"12","author":"Schoof","year":"2021","journal-title":"Nat Commun"},{"key":"2025071107013106700_ref14","doi-asserted-by":"publisher","first-page":"1243259","DOI":"10.1126\/science.1243259","article-title":"Single-cell metabolomics: analytical and biological perspectives","volume":"342","author":"Zenobi","year":"2013","journal-title":"Science"},{"key":"2025071107013106700_ref15","doi-asserted-by":"publisher","first-page":"1452","DOI":"10.1038\/s41592-021-01333-x","article-title":"Single-cell metabolomics hits its stride","volume":"18","author":"Seydel","year":"2021","journal-title":"Nat Methods"},{"key":"2025071107013106700_ref16","doi-asserted-by":"publisher","first-page":"D20","DOI":"10.1093\/nar\/gkv1352","article-title":"The European Bioinformatics Institute in 2016: data growth and integration","volume":"44","author":"Cook","year":"2016","journal-title":"Nucleic Acids 
Res"},{"key":"2025071107013106700_ref17","doi-asserted-by":"publisher","first-page":"286","DOI":"10.1093\/bib\/bbw114","article-title":"Genome, transcriptome and proteome: the rise of omics data and their integration in biomedical sciences","volume":"19","author":"Manzoni","year":"2018","journal-title":"Brief Bioinform"},{"key":"2025071107013106700_ref18","doi-asserted-by":"publisher","first-page":"669","DOI":"10.1038\/nrm3187","article-title":"A decade of molecular cell biology: achievements and challenges","volume":"12","author":"Akhtar","year":"2011","journal-title":"Nat Rev Mol Cell Biol"},{"key":"2025071107013106700_ref19","doi-asserted-by":"publisher","first-page":"301","DOI":"10.1056\/NEJMp1006304","article-title":"The path to personalized medicine","volume":"363","author":"Hamburg","year":"2010","journal-title":"N Engl J Med"},{"key":"2025071107013106700_ref20","doi-asserted-by":"publisher","first-page":"297","DOI":"10.1038\/nm.2323","article-title":"Cancer genomics: from discovery science to personalized medicine","volume":"17","author":"Chin","year":"2011","journal-title":"Nat Med"},{"key":"2025071107013106700_ref21","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1038\/s41591-018-0300-7","article-title":"High-performance medicine: the convergence of human and artificial intelligence","volume":"25","author":"Topol","year":"2019","journal-title":"Nat Med"},{"key":"2025071107013106700_ref22","doi-asserted-by":"publisher","first-page":"747","DOI":"10.1038\/s41568-021-00399-1","article-title":"Artificial intelligence in cancer research, diagnosis and therapy","volume":"21","author":"Elemento","year":"2021","journal-title":"Nat Rev Cancer"},{"key":"2025071107013106700_ref23","doi-asserted-by":"publisher","first-page":"1202","DOI":"10.1038\/s41587-021-00895-7","article-title":"Computational principles and challenges in single-cell data integration","volume":"39","author":"Argelaguet","year":"2021","journal-title":"Nat 
Biotechnol"},{"key":"2025071107013106700_ref24","doi-asserted-by":"publisher","first-page":"851","DOI":"10.1093\/bib\/bbw068","article-title":"Deep learning in bioinformatics","volume":"18","author":"Min","year":"2017","journal-title":"Brief Bioinform"},{"key":"2025071107013106700_ref25","doi-asserted-by":"publisher","first-page":"829","DOI":"10.1038\/nbt.4233","article-title":"Deep learning in biomedicine","volume":"36","author":"Wainberg","year":"2018","journal-title":"Nat Biotechnol"},{"key":"2025071107013106700_ref26","doi-asserted-by":"publisher","first-page":"2063","DOI":"10.1109\/TNNLS.2018.2790388","article-title":"Applications of deep learning and reinforcement learning to biological data","volume":"29","author":"Mahmud","year":"2018","journal-title":"IEEE Trans Neural Netw Learn Syst"},{"key":"2025071107013106700_ref27","doi-asserted-by":"publisher","first-page":"686","DOI":"10.1126\/science.1193147","article-title":"Evidence for a collective intelligence factor in the performance of human groups","volume":"330","author":"Woolley","year":"2010","journal-title":"Science"},{"key":"2025071107013106700_ref28","doi-asserted-by":"publisher","first-page":"2318","DOI":"10.1109\/TKDE.2017.2720168","article-title":"Theory-guided data science: a new paradigm for scientific discovery from data","volume":"29","author":"Karpatne","year":"2017","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"2025071107013106700_ref29","doi-asserted-by":"publisher","first-page":"16","DOI":"10.1016\/j.nbt.2023.02.001","article-title":"AI for life: trends in artificial intelligence for biotechnology","volume":"74","author":"Holzinger","year":"2023","journal-title":"N Biotechnol"},{"key":"2025071107013106700_ref30","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1038\/s41586-024-07146-0","article-title":"Artificial intelligence and illusions of understanding in scientific 
research","volume":"627","author":"Messeri","year":"2024","journal-title":"Nature"},{"key":"2025071107013106700_ref31","doi-asserted-by":"publisher","first-page":"1930","DOI":"10.1038\/s41591-023-02448-8","article-title":"Large language models in medicine","volume":"29","author":"Thirunavukarasu","year":"2023","journal-title":"Nat Med"},{"key":"2025071107013106700_ref32","doi-asserted-by":"publisher","first-page":"172","DOI":"10.1038\/s41586-023-06291-2","article-title":"Large language models encode clinical knowledge","volume":"620","author":"Singhal","year":"2023","journal-title":"Nature"},{"key":"2025071107013106700_ref33","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2303.18223","article-title":"A survey of large language models","author":"Zhao","year":"2023"},{"key":"2025071107013106700_ref34","doi-asserted-by":"publisher","DOI":"10.36227\/techrxiv.23589741.v1","article-title":"A survey on large language models: applications, challenges, limitations, and practical usage","author":"Hadi","year":"2023"},{"key":"2025071107013106700_ref35","article-title":"Reflexion: language agents with verbal reinforcement learning","volume":"36","author":"Shinn","year":"2024","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2025071107013106700_ref36","doi-asserted-by":"publisher","first-page":"5","DOI":"10.7554\/eLife.27041","article-title":"The Human Cell Atlas","volume":"6","author":"Regev","year":"2017","journal-title":"eLife"},{"key":"2025071107013106700_ref37","doi-asserted-by":"publisher","first-page":"844","DOI":"10.1038\/d41586-024-00173-x","article-title":"Seven technologies to watch in 2024","volume":"625","author":"Eisenstein","year":"2024","journal-title":"Nature"},{"key":"2025071107013106700_ref38","doi-asserted-by":"publisher","first-page":"1563","DOI":"10.1038\/s41591-023-02327-2","article-title":"An integrated cell atlas of the lung in health and disease","volume":"29","author":"Sikkema","year":"2023","journal-title":"Nat 
Med"},{"key":"2025071107013106700_ref39","doi-asserted-by":"publisher","first-page":"e9620","DOI":"10.15252\/msb.20209620","article-title":"Probabilistic harmonization and annotation of single-cell transcriptomics data with deep generative models","volume":"17","author":"Xu","year":"2021","journal-title":"Mol Syst Biol"},{"key":"2025071107013106700_ref40","doi-asserted-by":"publisher","first-page":"vbac016","DOI":"10.1093\/bioadv\/vbac016","article-title":"decoupleR: ensemble of computational methods to infer biological activities from omics data","volume":"2","author":"Badia-I-Mompel","year":"2022","journal-title":"Bioinform Adv"},{"key":"2025071107013106700_ref41","doi-asserted-by":"publisher","DOI":"10.1038\/s41592-024-02235-4","article-title":"Assessing GPT-4 for cell type annotation in single-cell RNA-seq analysis","volume":"21","author":"Hou","journal-title":"Nat Methods"},{"key":"2025071107013106700_ref42","doi-asserted-by":"publisher","first-page":"2739","DOI":"10.1016\/j.cell.2022.06.031","article-title":"What is a cell type and how to define it?","volume":"185","author":"Zeng","year":"2022","journal-title":"Cell"},{"key":"2025071107013106700_ref43","doi-asserted-by":"publisher","first-page":"W90","DOI":"10.1093\/nar\/gkw377","article-title":"Enrichr: a comprehensive gene set enrichment analysis web server 2016 update","volume":"44","author":"Kuleshov","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2025071107013106700_ref44","doi-asserted-by":"publisher","first-page":"e90","DOI":"10.1002\/cpz1.90","article-title":"Gene set knowledge discovery with Enrichr","volume":"1","author":"Xie","year":"2021","journal-title":"Curr Protoc"},{"key":"2025071107013106700_ref45","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2302.11382","article-title":"A prompt pattern catalog to enhance prompt engineering with 
ChatGPT","author":"White","year":"2023"},{"key":"2025071107013106700_ref46","doi-asserted-by":"crossref","first-page":"131","DOI":"10.1016\/S0004-3702(03)00055-9","article-title":"Embodied artificial intelligence","volume":"149","author":"Chrisley","year":"2003","journal-title":"Artif Intell"},{"key":"2025071107013106700_ref47","doi-asserted-by":"publisher","first-page":"663","DOI":"10.1038\/s42256-020-00250-6","article-title":"Embodied intelligence weaves a better future","volume":"2","author":"Jin","year":"2020","journal-title":"Nat Mach Intell"},{"key":"2025071107013106700_ref48","doi-asserted-by":"publisher","first-page":"5721","DOI":"10.1038\/s41467-021-25874-z","article-title":"Embodied intelligence via learning and evolution","volume":"12","author":"Gupta","year":"2021","journal-title":"Nat Commun"},{"key":"2025071107013106700_ref49","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1803.10122","article-title":"World models","author":"Ha","year":"2018"},{"key":"2025071107013106700_ref50","doi-asserted-by":"publisher","first-page":"030802","DOI":"10.1115\/1.4050244","article-title":"Digital twins: review and challenges","volume":"21","author":"Juarez","year":"2021","journal-title":"J Comput Inf Sci Eng"}],"container-title":["Briefings in 
Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/26\/4\/bbaf312\/63725053\/bbaf312.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/26\/4\/bbaf312\/63725053\/bbaf312.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,11]],"date-time":"2025-07-11T11:01:40Z","timestamp":1752231700000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbaf312\/8196318"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7]]},"references-count":50,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,7,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbaf312","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,7]]},"published":{"date-parts":[[2025,7]]},"article-number":"bbaf312"}}