{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T09:15:11Z","timestamp":1774689311086,"version":"3.50.1"},"reference-count":24,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,12,5]],"date-time":"2025-12-05T00:00:00Z","timestamp":1764892800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,12,5]],"date-time":"2025-12-05T00:00:00Z","timestamp":1764892800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100000005","name":"U.S. Department of Defense","doi-asserted-by":"publisher","award":["W81XWH2110432"],"award-info":[{"award-number":["W81XWH2110432"]}],"id":[{"id":"10.13039\/100000005","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000054","name":"U.S. Department of Health & Human Services | NIH | National Cancer Institute","doi-asserted-by":"publisher","award":["R01CA255064"],"award-info":[{"award-number":["R01CA255064"]}],"id":[{"id":"10.13039\/100000054","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000054","name":"U.S. Department of Health & Human Services | NIH | National Cancer Institute","doi-asserted-by":"publisher","award":["R01CA252878"],"award-info":[{"award-number":["R01CA252878"]}],"id":[{"id":"10.13039\/100000054","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000054","name":"U.S. Department of Health & Human Services | NIH | National Cancer Institute","doi-asserted-by":"publisher","award":["R01CA280097"],"award-info":[{"award-number":["R01CA280097"]}],"id":[{"id":"10.13039\/100000054","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Nat Comput Sci"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Gene perturbation experiments followed by transcriptomic profiling are vital for uncovering causal gene effects. However, their limited throughput leaves many perturbations of interest unexplored. Computational methods are therefore needed to predict genome-wide transcriptional responses to gene perturbations that were not experimentally assayed within a given dataset. Existing approaches often rely on Gene Ontology graphs to encode prior knowledge, but their predictive power and applicability are constrained by the graphs\u2019 sparsity and incomplete gene coverage. Here we present Scouter, a computational method that uses gene embeddings generated by large language models and a lightweight compressor\u2013generator neural network. Scouter accurately predicts transcriptional responses to both single- and two-gene perturbations, reducing errors from state-of-the-art Gene Ontology-term-based methods (GEARS and biolord) by half or more. Unlike recent approaches based on fine-tuning gene expression foundation models, Scouter offers substantially better accuracy and greater accessibility; it requires no pretraining and runs efficiently on standard hardware.<\/jats:p>","DOI":"10.1038\/s43588-025-00912-8","type":"journal-article","created":{"date-parts":[[2025,12,5]],"date-time":"2025-12-05T10:03:19Z","timestamp":1764928999000},"page":"21-28","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["Scouter predicts transcriptional responses to genetic perturbations with large language model embeddings"],"prefix":"10.1038","volume":"6","author":[{"given":"Ouyang","family":"Zhu","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4353-5761","authenticated-orcid":false,"given":"Jun","family":"Li","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,12,5]]},"reference":[{"key":"912_CR1","doi-asserted-by":"publisher","first-page":"e1000655","DOI":"10.1371\/journal.pcbi.1000655","volume":"6","author":"F Markowetz","year":"2010","unstructured":"Markowetz, F. How to understand the cell by breaking it: network analysis of gene perturbation screens. PLoS Comput. Biol. 6, e1000655 (2010).","journal-title":"PLoS Comput. Biol."},{"key":"912_CR2","doi-asserted-by":"crossref","unstructured":"Oberlin, S. & McManus, M. T. Decoding gene regulation with CRISPR perturbations. Nat. Biotechnol. 43, 304\u2013305 (2025).","DOI":"10.1038\/s41587-024-02222-2"},{"key":"912_CR3","doi-asserted-by":"publisher","first-page":"1853","DOI":"10.1016\/j.cell.2016.11.038","volume":"167","author":"A Dixit","year":"2016","unstructured":"Dixit, A. et al. Perturb-seq: dissecting molecular circuits with scalable single-cell rna profiling of pooled genetic screens. Cell 167, 1853\u20131866 (2016).","journal-title":"Cell"},{"key":"912_CR4","doi-asserted-by":"publisher","first-page":"927","DOI":"10.1038\/s41587-023-01905-6","volume":"42","author":"Y Roohani","year":"2024","unstructured":"Roohani, Y., Huang, K. & Leskovec, J. Predicting transcriptional outcomes of novel multigene perturbations with GEARS. Nat. Biotechnol. 42, 927\u2013935 (2024).","journal-title":"Nat. Biotechnol."},{"key":"912_CR5","doi-asserted-by":"crossref","unstructured":"Piran, Z., Cohen, N., Hoshen, Y. & Nitzan, M. Disentanglement of single-cell data with biolord. Nat. Biotechnol. 42, 1678\u20131683 (2024).","DOI":"10.1038\/s41587-023-02079-x"},{"key":"912_CR6","doi-asserted-by":"publisher","first-page":"D440","DOI":"10.1093\/nar\/gkm883","volume":"36","author":"Gene Ontology Consortium.","year":"2008","unstructured":"Gene Ontology Consortium. The Gene Ontology project in 2008. Nucleic Acids Res. 36, D440\u2013D444 (2008).","journal-title":"Nucleic Acids Res."},{"key":"912_CR7","doi-asserted-by":"publisher","first-page":"852","DOI":"10.1038\/s42256-022-00534-z","volume":"4","author":"F Yang","year":"2022","unstructured":"Yang, F. et al. scBERT as a large-scale pretrained deep language model for cell type annotation of single-cell RNA-seq data. Nat. Mach. Intell. 4, 852\u2013866 (2022).","journal-title":"Nat. Mach. Intell."},{"key":"912_CR8","doi-asserted-by":"publisher","first-page":"616","DOI":"10.1038\/s41586-023-06139-9","volume":"618","author":"CV Theodoris","year":"2023","unstructured":"Theodoris, C. V. et al. Transfer learning enables predictions in network biology. Nature 618, 616\u2013624 (2023).","journal-title":"Nature"},{"key":"912_CR9","doi-asserted-by":"crossref","unstructured":"Cui, H. et al. scGPT: toward building a foundation model for single-cell multi-omics using generative ai. Nat. Methods 21, 1470\u20131480 (2024).","DOI":"10.1038\/s41592-024-02201-0"},{"key":"912_CR10","doi-asserted-by":"crossref","unstructured":"Hao, M. et al. Large-scale foundation model on single-cell transcriptomics. Nat. Methods 21, 1481\u20131491 (2024).","DOI":"10.1038\/s41592-024-02305-7"},{"key":"912_CR11","doi-asserted-by":"publisher","first-page":"483","DOI":"10.1038\/s41551-024-01284-6","volume":"9","author":"Y Chen","year":"2023","unstructured":"Chen, Y. & Zou, J. Simple and effective embedding model for single-cell biology built from ChatGPT. Nat. Biomed. Eng. 9, 483\u2013493 (2023).","journal-title":"Nat. Biomed. Eng."},{"key":"912_CR12","doi-asserted-by":"publisher","first-page":"1462","DOI":"10.1038\/s41592-024-02235-4","volume":"21","author":"W Hou","year":"2024","unstructured":"Hou, W. & Ji, Z. Assessing GPT-4 for cell type annotation in single-cell RNA-seq analysis. Nat. Methods 21, 1462\u20131465 (2024).","journal-title":"Nat. Methods"},{"key":"912_CR13","doi-asserted-by":"publisher","unstructured":"Gabbay, A. & Hoshen, Y. Demystifying inter-class disentanglement. Preprint at https:\/\/doi.org\/10.48550\/arXiv.1906.11796 (2019).","DOI":"10.48550\/arXiv.1906.11796"},{"key":"912_CR14","doi-asserted-by":"publisher","first-page":"1867","DOI":"10.1016\/j.cell.2016.11.048","volume":"167","author":"B Adamson","year":"2016","unstructured":"Adamson, B. et al. A multiplexed single-cell crispr screening platform enables systematic dissection of the unfolded protein response. Cell 167, 1867\u20131882 (2016).","journal-title":"Cell"},{"key":"912_CR15","doi-asserted-by":"publisher","first-page":"786","DOI":"10.1126\/science.aax4438","volume":"365","author":"TM Norman","year":"2019","unstructured":"Norman, T. M. et al. Exploring genetic interaction manifolds constructed from rich single-cell phenotypes. Science 365, 786\u2013793 (2019).","journal-title":"Science"},{"key":"912_CR16","doi-asserted-by":"publisher","first-page":"2559","DOI":"10.1016\/j.cell.2022.05.013","volume":"185","author":"JM Replogle","year":"2022","unstructured":"Replogle, J. M. et al. Mapping information-rich genotype-phenotype landscapes with genome-scale perturb-seq. Cell 185, 2559\u20132575 (2022).","journal-title":"Cell"},{"key":"912_CR17","doi-asserted-by":"publisher","unstructured":"Liu, T., Chen, T., Zheng, W., Luo, X. & Zhao, H. scELMo: embeddings from language models are good learners for single-cell data analysis. Preprint at bioRxiv https:\/\/doi.org\/10.1101\/2023.12.07.569910 (2023).","DOI":"10.1101\/2023.12.07.569910"},{"key":"912_CR18","unstructured":"Greene, R., Sanders, T., Weng, L. & Neelakantan, A. New and improved embedding model. OpenAI Blog https:\/\/openai.com\/blog\/new-and-improved-embedding-model (2022)."},{"key":"912_CR19","doi-asserted-by":"crossref","unstructured":"Piran, Z., Cohen, N., Hoshen, Y. & Nitzan, M. Disentanglement of single-cell data with biolord. GitHub https:\/\/github.com\/nitzanlab\/biolord_reproducibility\/tree\/main\/scripts\/biolord (2024).","DOI":"10.1038\/s41587-023-02079-x"},{"key":"912_CR20","unstructured":"Roohani, Y., Huang, K. & Leskovec, J. Predicting transcriptional outcomes of novel multigene perturbations with GEARS. GitHub https:\/\/github.com\/yhr91\/gears_misc\/blob\/main\/paper\/fig2_train.py (2024)."},{"key":"912_CR21","doi-asserted-by":"crossref","unstructured":"Cui, H. et al. scGPT: toward building a foundation model for single-cell multi-omics using generative ai. GitHub https:\/\/github.com\/bowang-lab\/scGPT\/blob\/main\/tutorials\/Tutorial_Perturbation.ipynb (2024).","DOI":"10.1101\/2023.04.30.538439"},{"key":"912_CR22","doi-asserted-by":"crossref","unstructured":"Liu, T., Chen, T., Zheng, W., Luo, X. & Zhao, H. scELMo: embeddings from language models are good learners for single-cell data analysis. GitHub https:\/\/github.com\/HelloWorldLTY\/scELMo\/blob\/main\/Perturbation%20Analysis\/gears_example.ipynb (2023).","DOI":"10.1101\/2023.12.07.569910"},{"key":"912_CR23","doi-asserted-by":"crossref","unstructured":"Hao, M. et al. Large-scale foundation model on single-cell transcriptomics. GitHub https:\/\/github.com\/biomap-research\/scFoundation\/tree\/main\/GEARS (2024).","DOI":"10.1101\/2023.05.29.542705"},{"key":"912_CR24","doi-asserted-by":"publisher","unstructured":"Zhu, O. Scouter predicts transcriptional responses to genetic perturbations with llm embeddings. Zenodo https:\/\/doi.org\/10.5281\/zenodo.17239634 (2025).","DOI":"10.5281\/zenodo.17239634"}],"container-title":["Nature Computational Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s43588-025-00912-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s43588-025-00912-8","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s43588-025-00912-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T12:18:53Z","timestamp":1769689133000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s43588-025-00912-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,5]]},"references-count":24,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2026,1]]}},"alternative-id":["912"],"URL":"https:\/\/doi.org\/10.1038\/s43588-025-00912-8","relation":{},"ISSN":["2662-8457"],"issn-type":[{"value":"2662-8457","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,12,5]]},"assertion":[{"value":"9 October 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 October 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 December 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}]}}