{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T02:18:49Z","timestamp":1767925129618,"version":"3.49.0"},"reference-count":46,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2025,10,15]],"date-time":"2025-10-15T00:00:00Z","timestamp":1760486400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Bioinform."],"abstract":"<jats:p>Biological databases are essential for providing curated knowledge, but their rigid data structures and restrictive query formats often limit flexible and exploratory user interactions. In the field of plant phosphorylation, manually curated and reviewed data represent only a small portion of the available knowledge, and users often seek information that goes beyond what is provided in structured databases. While large language models (LLMs) like ChatGPT-4o possess extensive contextual knowledge, integrating this capability into bioinformatics tools remains an open challenge. Here, we present a multimodal question-answering widget that integrates ChatGPT-4o with our Plant Protein Phosphorylation Database (P3DB). This system supports natural language queries and dynamic prompt formulation, enabling users to explore phosphorylation events, kinase-substrate relationships, and protein-protein interactions through a global entry. In another application, the widget leverages ChatGPT\u2019s image interpretation functionality to extract regulatory pathways and phosphorylation markers from complex scientific figures. To build this widget effectively, we have explored multiple prompt strategies, including one-step, two-step, few-shot, and image-cropping techniques, demonstrating their impact on output accuracy and consistency. 
In addition, recent multimodal LLMs such as ChatGPT-5 and Gemini 1.5 have demonstrated comparable capabilities and adaptability when applied to our test cases and the developed widgets. Together, our application widget and results highlight the development of the ChatGPT-P3DB integration as a system that enhances user accessibility, enables visual extraction, and extends the current utility of biological knowledgebases through a flexible and adaptive framework. Our \u201cChatGPT-P3DB\u201d is open-source and can be accessed on GitHub (<jats:ext-link>https:\/\/github.com\/yao-laboratory\/p3db-chat<\/jats:ext-link>). The frontend interface, \u201cP3DB askAI\u201d web module, can be accessed freely through <jats:ext-link>https:\/\/www.p3db.org\/ask-ai<\/jats:ext-link>.<\/jats:p>","DOI":"10.3389\/fbinf.2025.1687687","type":"journal-article","created":{"date-parts":[[2025,10,15]],"date-time":"2025-10-15T05:42:06Z","timestamp":1760506926000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Multimodal knowledge expansion widget powered by plant protein phosphorylation database and ChatGPT"],"prefix":"10.3389","volume":"5","author":[{"given":"Chunhui","family":"Xu","sequence":"first","affiliation":[]},{"given":"Yang","family":"Yu","sequence":"additional","affiliation":[]},{"given":"Govardhan","family":"Khadakkar","sequence":"additional","affiliation":[]},{"given":"Jiacheng","family":"Xie","sequence":"additional","affiliation":[]},{"given":"Dong","family":"Xu","sequence":"additional","affiliation":[]},{"given":"Qiuming","family":"Yao","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2025,10,15]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"937","DOI":"10.1021\/pr3009995","article-title":"A versatile mass spectrometry-based method to both identify kinase client-relationships and characterize signaling network 
topology","volume":"12","author":"Ahsan","year":"2013","journal-title":"J. Proteome Res."},{"key":"B2","doi-asserted-by":"publisher","first-page":"1481","DOI":"10.3390\/plants13111481","article-title":"Decoding Arabidopsis thaliana CPK\/SnRK superfamily kinase client signaling networks using peptide library and mass spectrometry","volume":"13","author":"Ahsan","year":"2024","journal-title":"Plants"},{"key":"B3","doi-asserted-by":"publisher","first-page":"D444","DOI":"10.1093\/nar\/gkae1082","article-title":"InterPro: the protein sequence classification resource in 2025","volume":"53","author":"Blum","year":"2025","journal-title":"Nucleic Acids Res."},{"key":"B4","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/978-1-4939-6658-5_1","article-title":"Ensembl plants: integrating tools for visualizing, mining, and analyzing plant genomic data","volume":"1533","author":"Bolser","year":"2017","journal-title":"Methods Mol. Biol."},{"key":"B5","doi-asserted-by":"publisher","first-page":"15493","DOI":"10.1038\/s41598-025-99290-4","article-title":"The influence of prompt engineering on large language models for protein-protein interaction identification in biomedical literature","volume":"15","author":"Chang","year":"2025","journal-title":"Sci. Rep."},{"key":"B6","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1111\/jipb.13061","article-title":"Protein kinases in plant responses to drought, salt, and cold stress","volume":"63","author":"Chen","year":"2021","journal-title":"J. Integr. Plant Biol."},{"key":"B7","doi-asserted-by":"publisher","first-page":"13930","DOI":"10.1038\/s41598-024-64585-5","article-title":"Multi role ChatGPT framework for transforming medical data analysis","volume":"14","author":"Chen","year":"","journal-title":"Sci. 
Rep."},{"key":"B8","doi-asserted-by":"publisher","first-page":"2450005","DOI":"10.1142\/S2972335324500054","article-title":"Iterative prompt refinement for mining gene relationships from ChatGPT","volume":"1","author":"Chen","year":"","journal-title":"Int. J. Artif. Intell. Robotics Res."},{"key":"B9","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1162\/COLI_a_00403","article-title":"Ryansql: recursively applying sketch-based slot fillings for complex text-to-sql in cross-domain databases","volume":"47","author":"Choi","year":"2021","journal-title":"Comput. Linguist."},{"key":"B10","doi-asserted-by":"publisher","first-page":"5948","DOI":"10.1038\/s41467-022-33570-9","article-title":"Inferring differential subcellular localisation in comparative spatial proteomics using BANDLE","volume":"13","author":"Crook","year":"2022","journal-title":"Nat. Commun."},{"key":"B11","doi-asserted-by":"publisher","first-page":"2207","DOI":"10.1038\/s41467-024-46600-5","article-title":"Simultaneous proteome localization and turnover analysis reveals spatiotemporal features of protein homeostasis disruptions","volume":"15","author":"Currie","year":"2024","journal-title":"Nat. Commun."},{"key":"B12","doi-asserted-by":"publisher","first-page":"544","DOI":"10.1016\/j.molp.2020.02.004","article-title":"Molecular regulation of plant responses to environmental temperatures","volume":"13","author":"Ding","year":"2020","journal-title":"Mol. Plant"},{"key":"B13","doi-asserted-by":"publisher","first-page":"1185","DOI":"10.3390\/ijms25021185","article-title":"Molecular mechanisms and regulatory pathways underlying drought stress response in rice","volume":"25","author":"Geng","year":"2024","journal-title":"Int. J. Mol. 
Sci."},{"key":"B14","doi-asserted-by":"publisher","first-page":"D1178","DOI":"10.1093\/nar\/gkr944","article-title":"Phytozome: a comparative platform for green plant genomics","volume":"40","author":"Goodstein","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"B15","doi-asserted-by":"publisher","first-page":"1137","DOI":"10.1016\/j.cell.2024.11.028","article-title":"Global organelle profiling reveals subcellular localization and remodeling at proteome scale","volume":"188","author":"Hein","year":"2025","journal-title":"Cell"},{"key":"B16","doi-asserted-by":"publisher","first-page":"1149","DOI":"10.1002\/elps.200305795","article-title":"An efficient protocol for the identification of protein phosphorylation in a seedless plant, sensitive enough to detect members of signalling cascades","volume":"25","author":"Heintz","year":"2004","journal-title":"Electrophoresis"},{"key":"B17","doi-asserted-by":"publisher","first-page":"e67677","DOI":"10.2196\/67677","article-title":"Improving dietary supplement information retrieval: development of a retrieval-augmented generation system with large language models","volume":"27","author":"Hou","year":"2025","journal-title":"J. Med. Internet Res."},{"key":"B18","doi-asserted-by":"publisher","first-page":"82","DOI":"10.1038\/s41592-024-02525-x","article-title":"Evaluation of large language models for discovery of gene set function","volume":"22","author":"Hu","year":"2025","journal-title":"Nat. Methods"},{"key":"B19","doi-asserted-by":"publisher","first-page":"829645","DOI":"10.3389\/fpls.2021.829645","article-title":"Protein kinase signaling pathways in plant-colletotrichum interaction","volume":"12","author":"Jiang","year":"2021","journal-title":"Front. 
Plant Sci."},{"key":"B20","doi-asserted-by":"crossref","first-page":"15157","DOI":"10.18653\/v1\/2024.acl-long.809","article-title":"ArtPrompt: ASCII art-based jailbreak attacks against aligned LLMs","volume-title":"Proceedings of the 62nd annual meeting of the association for computational linguistics (volume 1: long papers)","author":"Jiang","year":"2024"},{"key":"B21","doi-asserted-by":"publisher","first-page":"1372361","DOI":"10.3389\/fpls.2024.1372361","article-title":"Unveiling orphan receptor-like kinases in plants: novel client discovery using high-confidence library predictions in the kinase-client (KiC) assay","volume":"15","author":"Jorge","year":"2024","journal-title":"Front. Plant Sci."},{"key":"B22","doi-asserted-by":"publisher","first-page":"100926","DOI":"10.1016\/j.mcpro.2025.100926","article-title":"Identifying receptor kinase substrates using an 8000 peptide kinase client library enriched for conserved phosphorylation sites","volume":"24","author":"Kim","year":"2025","journal-title":"Mol. Cell Proteomics"},{"key":"B23","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3665939.3665969","article-title":"LLMs as an interactive database interface for designing large queries","volume-title":"Proceedings of the 2024 workshop on human-in-the-loop data analytics","author":"Li","year":"2024"},{"key":"B24","doi-asserted-by":"publisher","first-page":"908","DOI":"10.1016\/j.tplants.2022.03.012","article-title":"Emerging roles of protein phosphorylation in plant iron homeostasis","volume":"27","author":"Li","year":"2022","journal-title":"Trends Plant Sci."},{"key":"B25","doi-asserted-by":"publisher","first-page":"613","DOI":"10.1038\/s41467-020-14477-9","article-title":"A RAF-SnRK2 kinase cascade mediates early osmotic stress signaling in higher plants","volume":"11","author":"Lin","year":"2020","journal-title":"Nat. 
Commun."},{"key":"B26","doi-asserted-by":"publisher","first-page":"eabg8723","DOI":"10.1126\/sciadv.abg8723","article-title":"An MKP-MAPK protein phosphorylation cascade controls vascular immunity in plants","volume":"8","author":"Lin","year":"2022","journal-title":"Sci. Adv."},{"key":"B27","doi-asserted-by":"publisher","first-page":"vbaf019","DOI":"10.1093\/bioadv\/vbaf019","article-title":"Boosting GPT models for genomics analysis: generating trusted genetic variant annotations and interpretations through RAG and fine-tuning","volume":"5","author":"Lu","year":"2025","journal-title":"Bioinforma. Adv."},{"key":"B28","doi-asserted-by":"publisher","first-page":"7113","DOI":"10.1038\/s41467-021-27398-y","article-title":"Spatial-proteomics reveals phospho-signaling dynamics at subcellular resolution","volume":"12","author":"Martinez-Val","year":"2021","journal-title":"Nat. Commun."},{"key":"B29","doi-asserted-by":"publisher","first-page":"104649","DOI":"10.1016\/j.jbi.2024.104649","article-title":"Criteria2Query 3.0: leveraging generative large language models for clinical trial eligibility query generation","volume":"154","author":"Park","year":"2024","journal-title":"J. Biomed. Inf."},{"key":"B30","doi-asserted-by":"publisher","first-page":"vbaf044","DOI":"10.1093\/bioadv\/vbaf044","article-title":"Biological databases in the age of generative artificial intelligence","volume":"5","author":"Pop","year":"2025","journal-title":"Bioinforma. Adv."},{"key":"B31","doi-asserted-by":"publisher","first-page":"vbae133","DOI":"10.1093\/bioadv\/vbae133","article-title":"Evaluating GPT and BERT models for protein-protein interaction identification in biomedical text","volume":"4","author":"Rehana","year":"2024","journal-title":"Bioinforma. 
Adv."},{"key":"B32","article-title":"A review of different approaches in natural language interfaces to databases","volume-title":"Proceedings of the international conference on intelligent sustainable systems, ICISS 2017","author":"Reshma","year":"2018"},{"key":"B33","doi-asserted-by":"publisher","first-page":"713","DOI":"10.1186\/s12864-023-09816-1","article-title":"Using published pathway figures in enrichment analysis and machine learning","volume":"24","author":"Shin","year":"2023","journal-title":"BMC Genomics"},{"key":"B34","doi-asserted-by":"publisher","DOI":"10.1101\/2023.11.08.566195","article-title":"ChatGPT usage in the Reactome curation process","author":"Tiwari","year":"2023","journal-title":"bioRxiv"},{"key":"B35","doi-asserted-by":"publisher","first-page":"258","DOI":"10.1093\/nar\/gkg034","article-title":"STRING: a database of predicted functional associations between proteins","volume":"31","author":"von Mering","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"B36","doi-asserted-by":"publisher","first-page":"754982","DOI":"10.3389\/fpls.2021.754982","article-title":"Plant autophagy: an intricate process controlled by various signaling pathways","volume":"12","author":"Wang","year":"2021","journal-title":"Front. Plant Sci."},{"key":"B37","doi-asserted-by":"publisher","DOI":"10.1002\/qub2.67","article-title":"Bioinformatics and biomedical informatics with ChatGPT: year one review","author":"Wang","year":"","journal-title":"Quantitative Biology"},{"key":"B38","doi-asserted-by":"publisher","first-page":"84","DOI":"10.1038\/s41698-024-00576-z","article-title":"Scientific figures interpreted by ChatGPT: strengths in plot recognition and limits in color perception","volume":"8","author":"Wang","year":"","journal-title":"NPJ Precis. 
Oncol."},{"key":"B39","doi-asserted-by":"publisher","first-page":"2024.07.08.602488","DOI":"10.1101\/2024.07.08.602488","article-title":"Quantifying the massive pleiotropy of microRNA: a human microRNA-disease causal association database generated with ChatGPT","author":"Wang","year":"","journal-title":"bioRxiv"},{"key":"B40","doi-asserted-by":"publisher","first-page":"1677","DOI":"10.1038\/s41592-025-02748-6","article-title":"GeneAgent: self-verification language agent for gene-set analysis using domain databases","volume":"22","author":"Wang","year":"2025","journal-title":"Nat. Methods"},{"key":"B41","doi-asserted-by":"publisher","first-page":"2734","DOI":"10.1111\/nph.19739","article-title":"CPK1-HSP90 phosphorylation and effector XopC2-HSP90 interaction underpin the antagonism during cassava defense-pathogen infection","volume":"242","author":"Wei","year":"2024","journal-title":"New Phytol."},{"key":"B42","doi-asserted-by":"publisher","first-page":"206","DOI":"10.3389\/fpls.2012.00206","article-title":"P3DB: an integrated database for plant protein phosphorylation","volume":"3","author":"Yao","year":"2012","journal-title":"Front. 
Plant Sci."},{"key":"B43","doi-asserted-by":"publisher","first-page":"D1206","DOI":"10.1093\/nar\/gkt1135","article-title":"P3DB 3.0: from plant phosphorylation sites to protein networks","volume":"42","author":"Yao","year":"2014","journal-title":"Nucleic Acids Res."},{"key":"B44","doi-asserted-by":"crossref","DOI":"10.1145\/3539618.3591708","article-title":"Large language models are versatile decomposers: decomposing evidence and questions for table-based reasoning","volume-title":"Sigir 2023 - proceedings of the 46th international ACM SIGIR conference on research and development in information retrieval","author":"Ye","year":"2023"},{"key":"B45","doi-asserted-by":"publisher","first-page":"301","DOI":"10.1111\/jipb.13215","article-title":"Mitogen-activated protein kinase cascades in plant signaling","volume":"64","author":"Zhang","year":"2022","journal-title":"J. Integr. Plant Biol."},{"key":"B46","doi-asserted-by":"publisher","first-page":"e03926","DOI":"10.1002\/advs.202503926","article-title":"PlantGPT: an arabidopsis-based intelligent agent that answers questions about plant functional genomics","volume":"12","author":"Zhang","year":"2025","journal-title":"Adv. Sci. 
(Weinh)"}],"container-title":["Frontiers in Bioinformatics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2025.1687687\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,15]],"date-time":"2025-10-15T05:42:10Z","timestamp":1760506930000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2025.1687687\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,15]]},"references-count":46,"alternative-id":["10.3389\/fbinf.2025.1687687"],"URL":"https:\/\/doi.org\/10.3389\/fbinf.2025.1687687","relation":{},"ISSN":["2673-7647"],"issn-type":[{"value":"2673-7647","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,15]]},"article-number":"1687687"}}