{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,29]],"date-time":"2025-10-29T19:19:16Z","timestamp":1761765556533},"reference-count":50,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":1576,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/3.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,6,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Clustering according to sequence\u2013structure similarity has now become a generally accepted scheme for ncRNA annotation. Its application to complete genomic sequences as well as whole transcriptomes is therefore desirable but hindered by extremely high computational costs.<\/jats:p>\n               <jats:p>Results: We present a novel linear-time, alignment-free method for comparing and clustering RNAs according to sequence and structure. The approach scales to datasets of hundreds of thousands of sequences. The quality of the retrieved clusters has been benchmarked against known ncRNA datasets and is comparable to state-of-the-art sequence\u2013structure methods although achieving speedups of several orders of magnitude. A selection of applications aiming at the detection of novel structural ncRNAs are presented. Exemplarily, we predicted local structural elements specific to lincRNAs likely functionally associating involved transcripts to vital processes of the human nervous system. 
In total, we predicted 349 local structural RNA elements.<\/jats:p>\n               <jats:p>Availability: The GraphClust pipeline is available on request.<\/jats:p>\n               <jats:p>Contact: \u00a0backofen@informatik.uni-freiburg.de<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/bts224","type":"journal-article","created":{"date-parts":[[2012,6,11]],"date-time":"2012-06-11T14:09:18Z","timestamp":1339423758000},"page":"i224-i232","source":"Crossref","is-referenced-by-count":67,"title":["GraphClust: alignment-free structural clustering of local RNA secondary structures"],"prefix":"10.1093","volume":"28","author":[{"given":"Steffen","family":"Heyne","sequence":"first","affiliation":[]},{"given":"Fabrizio","family":"Costa","sequence":"additional","affiliation":[]},{"given":"Dominic","family":"Rose","sequence":"additional","affiliation":[]},{"given":"Rolf","family":"Backofen","sequence":"additional","affiliation":[{"name":"1 Bioinformatics Group, Department of Computer Science, University of Freiburg,Georges-K\u00f6hler-Allee 106, D-79110 Freiburg, Germany"}]}],"member":"286","published-online":{"date-parts":[[2012,6,9]]},"reference":[{"key":"2023012512344340100_B1","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023012512344340100_B2","doi-asserted-by":"crossref","first-page":"1787","DOI":"10.1126\/science.1155472","article-title":"The eukaryotic genome as an RNA machine","volume":"319","author":"Amaral","year":"2008","journal-title":"Science"},{"key":"2023012512344340100_B3","doi-asserted-by":"crossref","first-page":"474","DOI":"10.1186\/1471-2105-9-474","article-title":"RNAalifold: improved consensus structure 
prediction for RNA alignments","volume":"9","author":"Bernhart","year":"2008","journal-title":"BMC Bioinform."},{"key":"2023012512344340100_B4","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1002\/jez.b.21130","article-title":"RNAs everywhere: genome-wide annotation of structured RNAs","volume":"308","author":"Bompf\u00fcnewerer Consortium.et al.","year":"2007","journal-title":"J. Exp. Zoolog. B. Mol. Dev. Evol."},{"key":"2023012512344340100_B5","first-page":"21","article-title":"On the resemblance and containment of documents","volume-title":"In Compression and Complexity of Sequences (SEQUENCES97)","author":"Broder","year":"1997"},{"key":"2023012512344340100_B6","doi-asserted-by":"crossref","first-page":"416","DOI":"10.1016\/j.ceb.2009.04.001","article-title":"The long and the short of noncoding RNAs","volume":"21","author":"Brosnan","year":"2009","journal-title":"Curr. Opini. Cell Biolo."},{"key":"2023012512344340100_B7","doi-asserted-by":"crossref","first-page":"1915","DOI":"10.1101\/gad.17446611","article-title":"Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses","volume":"25","author":"Cabili","year":"2011","journal-title":"Genes Dev."},{"key":"2023012512344340100_B8","doi-asserted-by":"crossref","first-page":"R72","DOI":"10.1186\/gb-2010-11-7-r72","article-title":"Long noncoding RNA genes: conservation of sequence and brain expression among diverse amniotes","volume":"11","author":"Chodroff","year":"2010","journal-title":"Genome Biol"},{"key":"2023012512344340100_B9","doi-asserted-by":"crossref","first-page":"1146","DOI":"10.1093\/molbev\/msh114","article-title":"Fugu genome analysis provides evidence for a whole-genome duplication early during the evolution of ray-finned fishes","volume":"21","author":"Christoffels","year":"2004","journal-title":"Mol. Biol. 
Evol."},{"key":"2023012512344340100_B10","doi-asserted-by":"crossref","first-page":"e1000625","DOI":"10.1371\/journal.pbio.1000625","article-title":"The reality of pervasive transcription","volume":"9","author":"Clark","year":"2011","journal-title":"PLoS Biol."},{"key":"2023012512344340100_B11","first-page":"255","article-title":"Fast neighborhood subgraph pairwise distance kernel","volume-title":"Proceedings of the 27th International Conference on Machine Learning (ICML-10)","author":"Costa","year":"2010"},{"key":"2023012512344340100_B12","doi-asserted-by":"crossref","first-page":"799","DOI":"10.1038\/nature05874","article-title":"Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project","volume":"447","author":"ENCODE Project Consortium","year":"2007","journal-title":"Nature"},{"key":"2023012512344340100_B13","doi-asserted-by":"crossref","first-page":"2926","DOI":"10.1093\/nar\/gkg365","article-title":"Exploring the repertoire of RNA secondary motifs using graph theory; implications for RNA design","volume":"31","author":"Gan","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023012512344340100_B14","doi-asserted-by":"crossref","first-page":"2433","DOI":"10.1093\/nar\/gki541","article-title":"A benchmark of multiple sequence alignment programs upon structural RNAs","volume":"33","author":"Gardner","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023012512344340100_B15","doi-asserted-by":"crossref","first-page":"D141","DOI":"10.1093\/nar\/gkq1129","article-title":"Rfam: Wikipedia, clans and the \u201cdecimal\u201d release","volume":"39","author":"Gardner","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"2023012512344340100_B16","doi-asserted-by":"crossref","first-page":"4843","DOI":"10.1093\/nar\/gkh779","article-title":"Abstract shapes of RNA","volume":"32","author":"Giegerich","year":"2004","journal-title":"Nucleic Acids 
Res."},{"key":"2023012512344340100_B17","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1016\/j.tibtech.2009.09.006","article-title":"De novo prediction of structured RNAs from genomic sequences","volume":"28","author":"Gorodkin","year":"2010","journal-title":"Trends Biotechnol"},{"key":"2023012512344340100_B18","volume-title":"Convolution kernels on discrete structures.","author":"Haussler","year":"1999"},{"key":"2023012512344340100_B19","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1109\/TCBB.2004.11","article-title":"Pure multiple RNA secondary structure alignments: a progressive profile approach","volume":"1","author":"Hochsmann","year":"2004","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinform."},{"key":"2023012512344340100_B20","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1038\/nprot.2008.211","article-title":"Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources","volume":"4","author":"Huang","year":"2009","journal-title":"Nat Protoc"},{"key":"2023012512344340100_B21","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1007\/BF01908075","article-title":"Comparing partitions","volume":"2","author":"Hubert","year":"1985","journal-title":"J. 
Classification"},{"key":"2023012512344340100_B22","doi-asserted-by":"crossref","first-page":"604","DOI":"10.1145\/276698.276876","article-title":"Approximate nearest neighbors: Towards removing the curse of dimensionality","volume-title":"Proceedings of the thirtieth annual ACM symposium on Theory of computing, STOC '98","author":"Indyk","year":"1998"},{"key":"2023012512344340100_B23","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1093\/bioinformatics\/btn628","article-title":"Structural profiles of human miRNA families from pairwise clustering","volume":"25","author":"Kaczkowski","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012512344340100_B24","doi-asserted-by":"crossref","first-page":"W300","DOI":"10.1093\/nar\/gkm253","article-title":"RADAR: a web server for RNA data analysis and research","volume":"35","author":"Khaladkar","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023012512344340100_B25","doi-asserted-by":"crossref","first-page":"R61","DOI":"10.1186\/gb-2007-8-4-r61","article-title":"Evolutionary conservation of sequence and secondary structures in CRISPR repeats","volume":"8","author":"Kunin","year":"2007","journal-title":"Genome Biol"},{"key":"2023012512344340100_B26","doi-asserted-by":"crossref","first-page":"493","DOI":"10.1186\/1471-2105-7-493","article-title":"A method for rapid similarity analysis of RNA secondary structures","volume":"7","author":"Liu","year":"2006","journal-title":"BMC Bioinform."},{"key":"2023012512344340100_B27","doi-asserted-by":"crossref","first-page":"1105","DOI":"10.1002\/bip.360290621","article-title":"The equilibrium partition function and base pair binding probabilities for RNA secondary structure","volume":"29","author":"McCaskill","year":"1990","journal-title":"Biopolymers"},{"key":"2023012512344340100_B28","doi-asserted-by":"crossref","first-page":"1335","DOI":"10.1093\/bioinformatics\/btp157","article-title":"Infernal 1.0: inference of RNA 
alignments","volume":"25","author":"Nawrocki","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012512344340100_B29","doi-asserted-by":"crossref","first-page":"1929","DOI":"10.1101\/gr.112516.110","article-title":"New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes","volume":"21","author":"Parker","year":"2011","journal-title":"Genome Research"},{"key":"2023012512344340100_B30","doi-asserted-by":"crossref","first-page":"577","DOI":"10.1101\/gr.133009.111","article-title":"Systematic identification of long non-coding RNAs expressed during zebrafish embryogenesis","volume":"22","author":"Pauli","year":"2011","journal-title":"Genome Research"},{"key":"2023012512344340100_B31","doi-asserted-by":"crossref","first-page":"e33","DOI":"10.1371\/journal.pcbi.0020033","article-title":"Identification and classification of conserved RNA secondary structures in the human genome","volume":"2","author":"Pedersen","year":"2006","journal-title":"PLoS Comput. 
Biol."},{"issue":"Database issue","key":"2023012512344340100_B32","doi-asserted-by":"crossref","first-page":"D32","DOI":"10.1093\/nar\/gkn721","article-title":"NCBI reference sequences: current status, policy and new initiatives","volume":"37","author":"Pruitt","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023012512344340100_B33","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1016\/j.brainres.2010.03.110","article-title":"Long non-coding RNAs in nervous system function and disease","volume":"1338","author":"Qureshi","year":"2010","journal-title":"Brain Res"},{"key":"2023012512344340100_B34","doi-asserted-by":"crossref","first-page":"457","DOI":"10.1261\/rna.366507","article-title":"RNA stem-loops: to be or not to be cleaved by RNAse III","volume":"13","author":"Ritchie","year":"2007","journal-title":"RNA"},{"key":"2023012512344340100_B35","doi-asserted-by":"crossref","first-page":"406","DOI":"10.1186\/1471-2164-8-406","article-title":"Computational RNomics of drosophilids","volume":"8","author":"Rose","year":"2007","journal-title":"BMC Genomics"},{"key":"2023012512344340100_B36","doi-asserted-by":"crossref","first-page":"1157","DOI":"10.1142\/S0219720008003886","article-title":"Duplicated RNA genes in teleost fish genomes","volume":"6","author":"Rose","year":"2008","journal-title":"J Bioinform Comput Biol"},{"key":"2023012512344340100_B37","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1016\/S0022-2836(02)01371-2","article-title":"COMPASS: a tool for comparison of multiple protein alignments with assessment of statistical significance","volume":"326","author":"Sadreyev","year":"2003","journal-title":"J. Mole. 
Biolo."},{"issue":"Suppl 1","key":"2023012512344340100_B38","doi-asserted-by":"crossref","first-page":"S48","DOI":"10.1186\/1471-2105-12-S1-S48","article-title":"Fast and accurate clustering of noncoding RNAs using ensembles of sequence alignments and secondary structures","volume":"12","author":"Saito","year":"2011","journal-title":"BMC Bioinform."},{"key":"2023012512344340100_B39","doi-asserted-by":"crossref","first-page":"810","DOI":"10.1137\/0145048","article-title":"Simultaneous solution of the RNA folding, alignment and protosequence problems","volume":"45","author":"Sankoff","year":"1985","journal-title":"SIAM J. Appl. Math."},{"key":"2023012512344340100_B40","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1186\/1471-2105-9-318","article-title":"Directed acyclic graph kernels for structural RNA analysis","volume":"9","author":"Sato","year":"2008","journal-title":"BMC Bioinform."},{"key":"2023012512344340100_B41","doi-asserted-by":"crossref","first-page":"6355","DOI":"10.1093\/nar\/gkn544","article-title":"Unifying evolutionary and thermodynamic information for RNA folding of multiple alignments","volume":"36","author":"Seemann","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012512344340100_B42","doi-asserted-by":"crossref","first-page":"266","DOI":"10.1038\/nature08055","article-title":"Metatranscriptomics reveals unique microbial small RNAs in the ocean's water column","volume":"459","author":"Shi","year":"2009","journal-title":"Nature"},{"key":"2023012512344340100_B43","doi-asserted-by":"crossref","first-page":"3352","DOI":"10.1093\/bioinformatics\/bti550","article-title":"MARNA: multiple alignment and consensus structure prediction of RNAs based on sequence structure comparisons","volume":"21","author":"Siebert","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012512344340100_B44","doi-asserted-by":"crossref","first-page":"926","DOI":"10.1093\/bioinformatics\/btm049","article-title":"Multiple structural alignment and 
clustering of RNA sequences","volume":"23","author":"Torarinsson","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012512344340100_B45","doi-asserted-by":"crossref","first-page":"373","DOI":"10.1142\/S0219720009004126","article-title":"Finding non-coding RNAs through genome-scale clustering","volume":"7","author":"Tseng","year":"2009","journal-title":"J. Bioinform. Comput. Biol."},{"key":"2023012512344340100_B46","doi-asserted-by":"crossref","first-page":"2454","DOI":"10.1073\/pnas.0409169102","article-title":"Fast and reliable prediction of noncoding RNAs","volume":"102","author":"Washietl","year":"2005","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023012512344340100_B47","doi-asserted-by":"crossref","first-page":"R31","DOI":"10.1186\/gb-2010-11-3-r31","article-title":"Comparative genomics reveals 104 candidate structured RNAs from bacteria, archaea, and their metagenomes","volume":"11","author":"Weinberg","year":"2010","journal-title":"Genome Biol."},{"key":"2023012512344340100_B48","doi-asserted-by":"crossref","first-page":"e65","DOI":"10.1371\/journal.pcbi.0030065","article-title":"Inferring non-coding RNA families and classes by means of genome-scale structure-based clustering","volume":"3","author":"Will","year":"2007","journal-title":"PLoS Computa. 
Biolo."},{"key":"2023012512344340100_B49","doi-asserted-by":"crossref","first-page":"900","DOI":"10.1261\/rna.029041.111","article-title":"LocARNA-P: Accurate boundary prediction and improved detection of structural RNAs","volume":"18","author":"Will","year":"2012","journal-title":"RNA"},{"key":"2023012512344340100_B50","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1093\/bioinformatics\/btk008","article-title":"CMfinder \u2014 a covariance model based RNA motif finding algorithm","volume":"22","author":"Yao","year":"2006","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/12\/i224\/48881028\/bioinformatics_28_12_i224.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/12\/i224\/48881028\/bioinformatics_28_12_i224.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T16:23:38Z","timestamp":1674663818000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/28\/12\/i224\/269136"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,6,9]]},"references-count":50,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2012,6,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bts224","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2012,6,15]]},"published":{"date-parts":[[2012,6,9]]}}}