{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,28]],"date-time":"2026-01-28T11:24:01Z","timestamp":1769599441234,"version":"3.49.0"},"reference-count":39,"publisher":"Oxford University Press (OUP)","issue":"17","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":772,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/3.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2014,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: The discovery of CRISPR-Cas systems almost 20 years ago rapidly changed our perception of the bacterial and archaeal immune systems. CRISPR loci consist of several repetitive DNA sequences called repeats, inter-spaced by stretches of variable length sequences called spacers. This CRISPR array is transcribed and processed into multiple mature RNA species (crRNAs). A single crRNA is integrated into an interference complex, together with CRISPR-associated (Cas) proteins, to bind and degrade invading nucleic acids. Although existing bioinformatics tools can recognize CRISPR loci by their characteristic repeat-spacer architecture, they generally output CRISPR arrays of ambiguous orientation and thus do not determine the strand from which crRNAs are processed. Knowledge of the correct orientation is crucial for many tasks, including the classification of CRISPR conservation, the detection of leader regions, the identification of target sites (protospacers) on invading genetic elements and the characterization of protospacer-adjacent motifs.<\/jats:p>\n               <jats:p>Results: We present a fast and accurate tool to determine the crRNA-encoding strand at CRISPR loci by predicting the correct orientation of repeats based on an advanced machine learning approach. 
Both the repeat sequence and mutation information were encoded and processed by an efficient graph kernel to learn higher-order correlations. The model was trained and tested on curated data comprising &gt;4500 CRISPRs and yielded a remarkable performance of 0.95 AUC ROC (area under the curve of the receiver operator characteristic). In addition, we show that accurate orientation information greatly improved detection of conserved repeat sequence families and structure motifs. We integrated CRISPRstrand predictions into our CRISPRmap web server of CRISPR conservation and updated the latter to version 2.0.<\/jats:p>\n               <jats:p>Availability: CRISPRmap and CRISPRstrand are available at http:\/\/rna.informatik.uni-freiburg.de\/CRISPRmap.<\/jats:p>\n               <jats:p>Contact: backofen@informatik.uni-freiburg.de<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btu459","type":"journal-article","created":{"date-parts":[[2014,8,26]],"date-time":"2014-08-26T11:23:57Z","timestamp":1409052237000},"page":"i489-i496","source":"Crossref","is-referenced-by-count":64,"title":["CRISPRstrand: predicting repeat orientations to determine the crRNA-encoding strand at CRISPR loci"],"prefix":"10.1093","volume":"30","author":[{"given":"Omer S.","family":"Alkhnbashi","sequence":"first","affiliation":[{"name":"1 Department of Computer Science, University of Freiburg, Georges-K\u00f6hler-Allee 106, 79110 Freiburg, Germany, 2Department of Biology, University of Copenhagen, Archaea Centre, Ole Maaloes Vej 5, DK2200 Copenhagen, Denmark and 3BIOSS Centre for Biological Signalling Studies, Cluster of Excellence, University of Freiburg, Germany"}]},{"given":"Fabrizio","family":"Costa","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, University of Freiburg, Georges-K\u00f6hler-Allee 106, 79110 Freiburg, Germany, 2Department of 
Biology, University of Copenhagen, Archaea Centre, Ole Maaloes Vej 5, DK2200 Copenhagen, Denmark and 3BIOSS Centre for Biological Signalling Studies, Cluster of Excellence, University of Freiburg, Germany"}]},{"given":"Shiraz A.","family":"Shah","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, University of Freiburg, Georges-K\u00f6hler-Allee 106, 79110 Freiburg, Germany, 2Department of Biology, University of Copenhagen, Archaea Centre, Ole Maaloes Vej 5, DK2200 Copenhagen, Denmark and 3BIOSS Centre for Biological Signalling Studies, Cluster of Excellence, University of Freiburg, Germany"}]},{"given":"Roger A.","family":"Garrett","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, University of Freiburg, Georges-K\u00f6hler-Allee 106, 79110 Freiburg, Germany, 2Department of Biology, University of Copenhagen, Archaea Centre, Ole Maaloes Vej 5, DK2200 Copenhagen, Denmark and 3BIOSS Centre for Biological Signalling Studies, Cluster of Excellence, University of Freiburg, Germany"}]},{"given":"Sita J.","family":"Saunders","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, University of Freiburg, Georges-K\u00f6hler-Allee 106, 79110 Freiburg, Germany, 2Department of Biology, University of Copenhagen, Archaea Centre, Ole Maaloes Vej 5, DK2200 Copenhagen, Denmark and 3BIOSS Centre for Biological Signalling Studies, Cluster of Excellence, University of Freiburg, Germany"}]},{"given":"Rolf","family":"Backofen","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, University of Freiburg, Georges-K\u00f6hler-Allee 106, 79110 Freiburg, Germany, 2Department of Biology, University of Copenhagen, Archaea Centre, Ole Maaloes Vej 5, DK2200 Copenhagen, Denmark and 3BIOSS Centre for Biological Signalling Studies, Cluster of Excellence, University of Freiburg, Germany"},{"name":"1 Department of Computer Science, University of Freiburg, Georges-K\u00f6hler-Allee 106, 
79110 Freiburg, Germany, 2Department of Biology, University of Copenhagen, Archaea Centre, Ole Maaloes Vej 5, DK2200 Copenhagen, Denmark and 3BIOSS Centre for Biological Signalling Studies, Cluster of Excellence, University of Freiburg, Germany"}]}],"member":"286","published-online":{"date-parts":[[2014,8,22]]},"reference":[{"key":"2023012711544734200_btu459-B1","author":"Barrangou"},{"key":"2023012711544734200_btu459-B2","doi-asserted-by":"crossref","first-page":"1805","DOI":"10.1093\/bioinformatics\/btu114","article-title":"Accurate computational prediction of the transcribed strand of CRISPR noncoding RNAs","volume":"30","author":"Biswas","year":"2014","journal-title":"Bioinformatics"},{"key":"2023012711544734200_btu459-B3","doi-asserted-by":"crossref","first-page":"209","DOI":"10.1186\/1471-2105-8-209","article-title":"CRISPR recognition tool (CRT): a tool for automatic detection of clustered regularly interspaced palindromic repeats","volume":"8","author":"Bland","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023012711544734200_btu459-B4","first-page":"177","article-title":"Large-Scale Machine Learning with Stochastic Gradient Descent","volume-title":"Proceedings of the 19th International Conference on Computational Statistics (COMPSTAT\u20192010)","author":"Bottou","year":"2010"},{"key":"2023012711544734200_btu459-B5","doi-asserted-by":"crossref","first-page":"960","DOI":"10.1126\/science.1159689","article-title":"Small CRISPR RNAs guide antiviral defense in prokaryotes","volume":"321","author":"Brouns","year":"2008","journal-title":"Science"},{"key":"2023012711544734200_btu459-B6","first-page":"255","article-title":"Fast neighborhood subgraph pairwise distance kernel","volume-title":"Proceedings of the 26th International Conference on Machine Learning","author":"Costa","year":"2010"},{"key":"2023012711544734200_btu459-B7","doi-asserted-by":"crossref","first-page":"602","DOI":"10.1038\/nature09886","article-title":"CRISPR RNA maturation by 
trans-encoded small RNA and host factor RNase III","volume":"471","author":"Deltcheva","year":"2011","journal-title":"Nature"},{"key":"2023012711544734200_btu459-B8","doi-asserted-by":"crossref","first-page":"e1002195","DOI":"10.1371\/journal.pcbi.1002195","article-title":"Accelerated Profile HMM Searches","volume":"7","author":"Eddy","year":"2011","journal-title":"PLoS Comput. Biol."},{"key":"2023012711544734200_btu459-B9","doi-asserted-by":"crossref","first-page":"1575","DOI":"10.1093\/nar\/30.7.1575","article-title":"An efficient algorithm for large-scale detection of protein families","volume":"30","author":"Enright","year":"2002","journal-title":"Nucleic Acids Res."},{"key":"2023012711544734200_btu459-B10","doi-asserted-by":"crossref","first-page":"2020","DOI":"10.1261\/rna.033100.112","article-title":"Cas5d processes pre-crRNA and is a member of a larger family of CRISPR RNA endonucleases","volume":"18","author":"Garside","year":"2012","journal-title":"RNA"},{"key":"2023012711544734200_btu459-B11","doi-asserted-by":"crossref","first-page":"688","DOI":"10.1038\/nsmb.2042","article-title":"Recognition and maturation of effector RNAs in a CRISPR interference pathway","volume":"18","author":"Gesner","year":"2011","journal-title":"Nat. Struct. Mol. Biol."},{"key":"2023012711544734200_btu459-B12","doi-asserted-by":"crossref","first-page":"W52","DOI":"10.1093\/nar\/gkm360","article-title":"CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats","volume":"35","author":"Grissa","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023012711544734200_btu459-B13","doi-asserted-by":"crossref","first-page":"e60","DOI":"10.1371\/journal.pcbi.0010060","article-title":"A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR\/Cas subtypes exist in prokaryotic genomes","volume":"1","author":"Haft","year":"2005","journal-title":"PLoS Comput. 
Biol."},{"key":"2023012711544734200_btu459-B14","doi-asserted-by":"crossref","first-page":"D387","DOI":"10.1093\/nar\/gks1234","article-title":"TIGRFAMs and genome properties in 2013","volume":"41","author":"Haft","year":"2013","journal-title":"Nucleic Acids Res."},{"key":"2023012711544734200_btu459-B15","doi-asserted-by":"crossref","first-page":"21218","DOI":"10.1073\/pnas.1112832108","article-title":"Mature clustered, regularly interspaced, short palindromic repeats RNA (crRNA) length is measured by a ruler mechanism anchored at the precursor processing site","volume":"108","author":"Hatoum-Aslan","year":"2011","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012711544734200_btu459-B16","doi-asserted-by":"crossref","first-page":"1355","DOI":"10.1126\/science.1192272","article-title":"Sequence- and structure-specific RNA processing by a CRISPR endonuclease","volume":"329","author":"Haurwitz","year":"2010","journal-title":"Science"},{"key":"2023012711544734200_btu459-B17","doi-asserted-by":"crossref","first-page":"2824","DOI":"10.1038\/emboj.2012.107","article-title":"Csy4 relies on an unusual catalytic dyad to position and cleave CRISPR RNA","volume":"31","author":"Haurwitz","year":"2012","journal-title":"EMBO J."},{"key":"2023012711544734200_btu459-B18","doi-asserted-by":"crossref","first-page":"783","DOI":"10.1261\/rna.031468.111","article-title":"A genome-wide view of the expression and processing patterns of Thermus thermophilus HB8 CRISPR RNAs","volume":"18","author":"Juranek","year":"2012","journal-title":"RNA"},{"key":"2023012711544734200_btu459-B19","doi-asserted-by":"crossref","first-page":"3059","DOI":"10.1093\/nar\/gkf436","article-title":"MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform","volume":"30","author":"Katoh","year":"2002","journal-title":"Nucleic Acids 
Res."},{"key":"2023012711544734200_btu459-B20","doi-asserted-by":"crossref","first-page":"R61","DOI":"10.1186\/gb-2007-8-4-r61","article-title":"Evolutionary conservation of sequence and secondary structures in CRISPR repeats","volume":"8","author":"Kunin","year":"2007","journal-title":"Genome Biol."},{"key":"2023012711544734200_btu459-B21","doi-asserted-by":"crossref","first-page":"8034","DOI":"10.1093\/nar\/gkt606","article-title":"CRISPRmap: an automated classification of repeat conservation in prokaryotic adaptive immune systems","volume":"41","author":"Lange","year":"2013","journal-title":"Nucleic Acids Res."},{"key":"2023012711544734200_btu459-B22","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1186\/1745-6150-1-7","article-title":"A putative RNA-interference-based immune system in prokaryotes: computational analysis of the predicted enzymatic machinery, functional analogies with eukaryotic RNAi, and hypothetical mechanisms of action","volume":"1","author":"Makarova","year":"2006","journal-title":"Biol. Direct."},{"key":"2023012711544734200_btu459-B23","doi-asserted-by":"crossref","first-page":"467","DOI":"10.1038\/nrmicro2577","article-title":"Evolution and classification of the CRISPR-Cas systems","volume":"9","author":"Makarova","year":"2011","journal-title":"Nat. Rev. Microbiol."},{"key":"2023012711544734200_btu459-B24","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1186\/1745-6150-6-38","article-title":"Unification of Cas protein families and a simple scenario for the origin and evolution of CRISPR-Cas systems","volume":"6","author":"Makarova","year":"2011","journal-title":"Biol. 
Direct."},{"key":"2023012711544734200_btu459-B25","first-page":"D225","article-title":"CDD: a conserved domain database for the functional annotation of proteins","volume":"39","author":"Marchler-Bauer","year":"2011","journal-title":"Database"},{"key":"2023012711544734200_btu459-B26","doi-asserted-by":"crossref","first-page":"1574","DOI":"10.1016\/j.str.2012.06.016","article-title":"Cas5d protein processes pre-crRNA and assembles into a cascade-like interference complex in subtype I-C\/Dvulg CRISPR-Cas system","volume":"20","author":"Nam","year":"2012","journal-title":"Structure"},{"key":"2023012711544734200_btu459-B27","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1016\/0022-2836(70)90057-4","article-title":"A general method applicable to the search for similarities in the amino acid sequence of two proteins","volume":"48","author":"Needleman","year":"1970","journal-title":"J. Mol. Biol."},{"key":"2023012711544734200_btu459-B28","doi-asserted-by":"crossref","first-page":"779","DOI":"10.4161\/rna.23928","article-title":"Two CRISPR-Cas systems in Methanosarcina mazei strain Go1 display common processing features despite belonging to different types I and III","volume":"10","author":"Nickel","year":"2013","journal-title":"RNA Biol."},{"key":"2023012711544734200_btu459-B29","doi-asserted-by":"crossref","first-page":"D290","DOI":"10.1093\/nar\/gkr1065","article-title":"The Pfam protein families database","volume":"40","author":"Punta","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023012711544734200_btu459-B30","doi-asserted-by":"crossref","first-page":"276","DOI":"10.1016\/S0168-9525(00)02024-2","article-title":"EMBOSS: the European Molecular Biology open software suite","volume":"16","author":"Rice","year":"2000","journal-title":"Trends Genet."},{"key":"2023012711544734200_btu459-B31","doi-asserted-by":"crossref","first-page":"9887","DOI":"10.1093\/nar\/gks737","article-title":"Characterization of CRISPR RNA processing in Clostridium 
thermocellum and Methanococcus maripaludis","volume":"40","author":"Richter","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023012711544734200_btu459-B32","doi-asserted-by":"crossref","first-page":"680","DOI":"10.1038\/nsmb.2043","article-title":"An RNA-induced conformational change required for CRISPR RNA cleavage by the endoribonuclease Cse3","volume":"18","author":"Sashital","year":"2011","journal-title":"Nat. Struct. Mol. Biol."},{"key":"2023012711544734200_btu459-B33","doi-asserted-by":"crossref","first-page":"e56470","DOI":"10.1371\/journal.pone.0056470","article-title":"CRISPR-Cas Systems in the Cyanobacterium Synechocystis sp. PCC6803 exhibit distinct processing pathways involving at least two Cas6 and a Cmr2 protein","volume":"8","author":"Scholz","year":"2013","journal-title":"PLoS One"},{"key":"2023012711544734200_btu459-B34","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1016\/j.resmic.2010.09.001","article-title":"CRISPR\/Cas and Cmr modules, mobility and evolution of adaptive immune systems","volume":"162","author":"Shah","year":"2011","journal-title":"Res. 
Microbiol."},{"key":"2023012711544734200_btu459-B35","doi-asserted-by":"crossref","first-page":"W373","DOI":"10.1093\/nar\/gkq316","article-title":"Freiburg RNA Tools: a web server integrating IntaRNA, ExpaRNA and LocARNA","volume":"38","author":"Smith","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2023012711544734200_btu459-B36","doi-asserted-by":"crossref","first-page":"661","DOI":"10.1261\/rna.030882.111","article-title":"Mechanism of substrate selection by a highly specific CRISPR endoribonuclease","volume":"18","author":"Sternberg","year":"2012","journal-title":"RNA"},{"key":"2023012711544734200_btu459-B37","doi-asserted-by":"crossref","first-page":"157","DOI":"10.4161\/rna.27990","article-title":"CRISPR adaptive immune systems of Archaea","volume":"11","author":"Vestergaard","year":"2014","journal-title":"RNA Biol."},{"key":"2023012711544734200_btu459-B38","doi-asserted-by":"crossref","first-page":"e65","DOI":"10.1371\/journal.pcbi.0030065","article-title":"Inferring non-coding RNA families and classes by means of genome-scale structure-based clustering","volume":"3","author":"Will","year":"2007","journal-title":"PLoS Comput. 
Biol."},{"key":"2023012711544734200_btu459-B39","doi-asserted-by":"crossref","first-page":"900","DOI":"10.1261\/rna.029041.111","article-title":"LocARNA-P: accurate boundary prediction and improved detection of structural RNAs","volume":"18","author":"Will","year":"2012","journal-title":"RNA"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/17\/i489\/48927799\/bioinformatics_30_17_i489.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/17\/i489\/48927799\/bioinformatics_30_17_i489.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,27]],"date-time":"2023-01-27T12:33:13Z","timestamp":1674822793000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/30\/17\/i489\/200646"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,8,22]]},"references-count":39,"journal-issue":{"issue":"17","published-print":{"date-parts":[[2014,9,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btu459","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2014,9,1]]},"published":{"date-parts":[[2014,8,22]]}}}