{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T17:08:57Z","timestamp":1773767337112,"version":"3.50.1"},"reference-count":25,"publisher":"Oxford University Press (OUP)","issue":"8","license":[{"start":{"date-parts":[[2024,7,29]],"date-time":"2024-07-29T00:00:00Z","timestamp":1722211200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"supported by the Israel Science Foundation"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,8,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>CRISPR\/Cas9 technology has been revolutionizing the field of gene editing. Guide RNAs (gRNAs) enable Cas9 proteins to target specific genomic loci for editing. However, editing efficiency varies between gRNAs and so computational methods were developed to predict editing efficiency for any gRNA of interest. High-throughput datasets of Cas9 editing efficiencies were produced to train machine-learning models to predict editing efficiency. However, these high-throughput datasets have a low correlation with functional and endogenous datasets, which are too small to train accurate machine-learning models on.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We developed DeepCRISTL, a deep-learning model to predict the editing efficiency in a specific cellular context. DeepCRISTL takes advantage of high-throughput datasets to learn general patterns of gRNA editing efficiency and then fine-tunes the model on functional or endogenous data to fit a specific cellular context. We tested two state-of-the-art models trained on high-throughput datasets for editing efficiency prediction, our newly improved DeepHF and CRISPRon, combined with various transfer-learning approaches. The combination of CRISPRon and fine-tuning all model weights was the overall best performer. DeepCRISTL outperformed state-of-the-art methods in predicting editing efficiency in a specific cellular context on functional and endogenous datasets. Using saliency maps, we identified and compared the important features learned by DeepCRISTL across cellular contexts. We believe DeepCRISTL will improve prediction performance in many other CRISPR\/Cas9 editing contexts by leveraging transfer learning to utilize both high-throughput datasets and smaller and more biologically relevant datasets.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>DeepCRISTL is available via https:\/\/github.com\/OrensteinLab\/DeepCRISTL.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae481","type":"journal-article","created":{"date-parts":[[2024,7,29]],"date-time":"2024-07-29T23:32:56Z","timestamp":1722295976000},"source":"Crossref","is-referenced-by-count":13,"title":["DeepCRISTL: deep transfer learning to predict CRISPR\/Cas9 on-target editing efficiency in specific cellular contexts"],"prefix":"10.1093","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0900-8573","authenticated-orcid":false,"given":"Shai","family":"Elkayam","sequence":"first","affiliation":[{"name":"School of Electrical and Computer Engineering, Ben-Gurion University of the Negev , Beer-Sheva 8410501, Israel"}]},{"given":"Ido","family":"Tziony","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Bar-Ilan University , Ramat Gan 5290002, Israel"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3583-3112","authenticated-orcid":false,"given":"Yaron","family":"Orenstein","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Bar-Ilan University , Ramat Gan 5290002, Israel"},{"name":"The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University , Ramat Gan 5290002, Israel"}]}],"member":"286","published-online":{"date-parts":[[2024,7,29]]},"reference":[{"key":"2024081305561590700_btae481-B1","doi-asserted-by":"crossref","first-page":"ii62","DOI":"10.1093\/bioinformatics\/btac469","article-title":"DeepZF: improved DNA-binding prediction of C2H2-zinc-finger proteins by deep transfer learning","volume":"38","author":"Aizenshtein-Gazit","year":"2022","journal-title":"Bioinformatics"},{"key":"2024081305561590700_btae481-B2","doi-asserted-by":"crossref","first-page":"177","DOI":"10.1186\/s13059-018-1534-x","article-title":"CRISPR-Cas9 off-targeting assessment with nucleic acid duplex energy parameters","volume":"19","author":"Alkan","year":"2018","journal-title":"Genome Biol"},{"key":"2024081305561590700_btae481-B3","doi-asserted-by":"crossref","first-page":"315","DOI":"10.1146\/annurev-biodatasci-022020-021940","article-title":"Identifying regulatory elements via deep learning","volume":"3","author":"Barshai","year":"2020","journal-title":"Annu Rev Biomed Data Sci"},{"key":"2024081305561590700_btae481-B4","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1186\/s13059-018-1459-4","article-title":"DeepCRISPR: optimized CRISPR guide RNA design by deep learning","volume":"19","author":"Chuai","year":"2018","journal-title":"Genome Biol"},{"key":"2024081305561590700_btae481-B5","doi-asserted-by":"crossref","first-page":"btad327","DOI":"10.1093\/bioinformatics\/btad327","article-title":"Testing on external independent datasets is necessary to corroborate machine learning model improvement","volume":"39","author":"Corsi","year":"2023","journal-title":"Bioinformatics"},{"key":"2024081305561590700_btae481-B6","doi-asserted-by":"crossref","first-page":"455","DOI":"10.1007\/s12539-018-0298-z","article-title":"Review of CRISPR\/Cas9 sgRNA design tools","volume":"10","author":"Cui","year":"2018","journal-title":"Interdiscip Sci"},{"key":"2024081305561590700_btae481-B7","doi-asserted-by":"crossref","first-page":"148","DOI":"10.1186\/s13059-016-1012-2","article-title":"Evaluation of off-target and on-target scoring algorithms and integration into the guide RNA selection tool CRISPOR","volume":"17","author":"Haeussler","year":"2016","journal-title":"Genome Biol"},{"key":"2024081305561590700_btae481-B8","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1186\/s13059-018-1614-y","article-title":"Accurate prediction of cell type-specific transcription factor binding","volume":"20","author":"Keilwagen","year":"2019","journal-title":"Genome Biol"},{"key":"2024081305561590700_btae481-B9","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1038\/nbt.4061","article-title":"Deep learning improves prediction of CRISPR\u2013Cpf1 guide RNA activity","volume":"36","author":"Kim","year":"2018","journal-title":"Nat Biotechnol"},{"key":"2024081305561590700_btae481-B10","doi-asserted-by":"crossref","first-page":"eaax9249","DOI":"10.1126\/sciadv.aax9249","article-title":"SpCas9 activity prediction by DeepSpCas9, a deep learning-based model with high generalization performance","volume":"5","author":"Kim","year":"2019","journal-title":"Sci Adv"},{"key":"2024081305561590700_btae481-B11","doi-asserted-by":"crossref","first-page":"3616","DOI":"10.1093\/nar\/gkac192","article-title":"CRISPR\u2013Cas9 gRNA efficiency prediction: an overview of predictive tools and the role of deep learning","volume":"50","author":"Konstantakos","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2024081305561590700_btae481-B12","author":"Kota","year":"2021"},{"key":"2024081305561590700_btae481-B13","first-page":"254","author":"Lanchantin","year":"2017"},{"key":"2024081305561590700_btae481-B14","doi-asserted-by":"crossref","first-page":"1034","DOI":"10.1038\/s41587-019-0203-2","article-title":"Large dataset enables prediction of repair after CRISPR\u2013Cas9 editing in primary T cells","volume":"37","author":"Leenay","year":"2019","journal-title":"Nat Biotechnol"},{"key":"2024081305561590700_btae481-B15","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1748-7188-6-26","article-title":"ViennaRNA package 2.0","volume":"6","author":"Lorenz","year":"2011","journal-title":"Algorithms Mol Biol"},{"key":"2024081305561590700_btae481-B16","doi-asserted-by":"crossref","first-page":"e1249","DOI":"10.1002\/widm.1249","article-title":"Ensemble learning: a survey","volume":"8","author":"Sagi","year":"2018","journal-title":"Wiley Interdiscipl Rev Data Min Knowledge Discov"},{"key":"2024081305561590700_btae481-B17","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1186\/s13059-020-01977-6","article-title":"Avocado: a multi-scale deep tensor factorization method learns a latent representation of the human epigenome","volume":"21","author":"Schreiber","year":"2020","journal-title":"Genome Biol"},{"key":"2024081305561590700_btae481-B18","first-page":"270","author":"Tan","year":"2018"},{"key":"2024081305561590700_btae481-B19","doi-asserted-by":"crossref","first-page":"4284","DOI":"10.1038\/s41467-019-12281-8","article-title":"Optimized CRISPR guide RNA design for two high-fidelity Cas9 variants by deep learning","volume":"10","author":"Wang","year":"2019","journal-title":"Nat Commun"},{"key":"2024081305561590700_btae481-B20","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1080\/15476286.2019.1669406","article-title":"An overview and metanalysis of machine and deep learning-based CRISPR gRNA design tools","volume":"17","author":"Wang","year":"2020","journal-title":"RNA Biol"},{"key":"2024081305561590700_btae481-B21","doi-asserted-by":"crossref","first-page":"3238","DOI":"10.1038\/s41467-021-23576-0","article-title":"Enhancing CRISPR-Cas9 gRNA efficiency prediction by data integration and deep learning","volume":"12","author":"Xiang","year":"2021","journal-title":"Nat Commun"},{"key":"2024081305561590700_btae481-B22","doi-asserted-by":"crossref","first-page":"gkae428","DOI":"10.1093\/nar\/gkae428","article-title":"Generating, modeling and evaluating a large-scale set of CRISPR\/Cas9 off-target sites with bulges","volume":"52","author":"Yaish","year":"2024","journal-title":"Nucleic Acids Res"},{"key":"2024081305561590700_btae481-B23","doi-asserted-by":"crossref","first-page":"bbac157","DOI":"10.1093\/bib\/bbac157","article-title":"A systematic evaluation of data processing and problem formulation of CRISPR off-target site prediction","volume":"23","author":"Yaish","year":"2022","journal-title":"Brief Bioinform"},{"key":"2024081305561590700_btae481-B24","doi-asserted-by":"crossref","first-page":"487","DOI":"10.1038\/nature13166","article-title":"High-throughput screening of a CRISPR\/Cas9 library for functional genomics in human cells","volume":"509","author":"Zhou","year":"2014","journal-title":"Nature"},{"key":"2024081305561590700_btae481-B25","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1109\/JPROC.2020.3004555","article-title":"A comprehensive survey on transfer learning","volume":"109","author":"Zhuang","year":"2021","journal-title":"Proc IEEE"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae481\/58679560\/btae481.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/8\/btae481\/58802718\/btae481.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/8\/btae481\/58802718\/btae481.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,13]],"date-time":"2024-08-13T06:39:13Z","timestamp":1723531153000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae481\/7723478"}},"subtitle":[],"editor":[{"given":"Alfonso","family":"Valencia","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,7,29]]},"references-count":25,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2024,8,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae481","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,8]]},"published":{"date-parts":[[2024,7,29]]},"article-number":"btae481"}}