{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,10,31]],"date-time":"2023-10-31T14:10:46Z","timestamp":1698761446892},"reference-count":24,"publisher":"Oxford University Press (OUP)","issue":"18","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2006,9,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Repeat sequences in ESTs are a source of problems, in particular for clustering. ESTs are therefore commonly masked against a library of known repeats. High quality repeat libraries are available for the widely studied organisms, but for most other organisms the lack of such libraries is likely to compromise the quality of EST analysis.<\/jats:p>\n               <jats:p>Results: We present a fast, flexible and library-less method for masking repeats in EST sequences, based on match statistics within the EST collection. The method is not linked to a particular clustering algorithm. 
Extensive testing on datasets using different clustering methods and a genomic mapping as reference shows that this method gives results that are better than or as good as those obtained using RepeatMasker with a repeat library.<\/jats:p>\n               <jats:p>Availability: The implementation of RBR is available under the terms of the GPL from<\/jats:p>\n               <jats:p>Contact: ketil.malde@bccs.uib.no<\/jats:p>\n               <jats:p>Supplementary Information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btl368","type":"journal-article","created":{"date-parts":[[2006,7,13]],"date-time":"2006-07-13T00:39:14Z","timestamp":1152751154000},"page":"2232-2236","source":"Crossref","is-referenced-by-count":14,"title":["RBR: library-less repeat detection for ESTs"],"prefix":"10.1093","volume":"22","author":[{"given":"Ketil","family":"Malde","sequence":"first","affiliation":[{"name":"Computational Biology Unit, Bergen Centre for Computational Sciences, University of Bergen, Norway"}]},{"given":"Korbinian","family":"Schneeberger","sequence":"additional","affiliation":[{"name":"Genome-Oriented Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universit\u00e4t M\u00fcnchen, Germany"}]},{"given":"Eivind","family":"Coward","sequence":"additional","affiliation":[{"name":"Department of Informatics, University of Bergen, Norway"}]},{"given":"Inge","family":"Jonassen","sequence":"additional","affiliation":[{"name":"Computational Biology Unit, Bergen Centre for Computational Sciences, University of Bergen, Norway"},{"name":"Department of Informatics, University of Bergen, 
Norway"}]}],"member":"286","published-online":{"date-parts":[[2006,7,12]]},"reference":[{"key":"2023012409212369200_b1","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"A basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. Biol."},{"key":"2023012409212369200_b2","doi-asserted-by":"crossref","DOI":"10.1186\/1471-2105-6-S4-S9","article-title":"ParPEST: a pipeline for EST data analysis based on parallel computing","volume":"6","author":"D'Agostino","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023012409212369200_b3","first-page":"185","article-title":"Base-calling of automated sequencer traces using Phred. II Error probabilities","volume":"8","author":"Ewing","year":"1998","journal-title":"Genome Res."},{"key":"2023012409212369200_b4","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/978-1-4615-3424-2_12","article-title":"RNA trans-splicing","volume":"14","author":"Huang","year":"1992","journal-title":"Genetic Eng."},{"key":"2023012409212369200_b5","volume-title":"Algorithms for Clustering Data","author":"Jain","year":"1988"},{"key":"2023012409212369200_b6","doi-asserted-by":"crossref","first-page":"462","DOI":"10.1159\/000084979","article-title":"Repbase update, a database of eukaryotic repetitive elements","volume":"110","author":"Jurka","year":"2005","journal-title":"Cytogenetic and Genome Research"},{"key":"2023012409212369200_b7","doi-asserted-by":"crossref","first-page":"2963","DOI":"10.1093\/nar\/gkg379","article-title":"Efficient clustering of large EST data sets on parallel computers","volume":"31","author":"Kalyanaraman","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023012409212369200_b8","first-page":"656","article-title":"BLAT\u2014the BLAST-like alignment tool","volume":"12","author":"Kent","year":"2002","journal-title":"Genome 
Res."},{"key":"2023012409212369200_b9","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1093\/nar\/30.1.299","article-title":"SYSTERS, GeneNest, SpliceNest: exploring sequence space from genome to protein","volume":"1","author":"Krause","year":"2002","journal-title":"Nucleic Acids Res."},{"key":"2023012409212369200_b10","doi-asserted-by":"crossref","first-page":"3657","DOI":"10.1093\/nar\/28.18.3657","article-title":"An optimized protocol for analysis of EST sequences","volume":"28","author":"Liang","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023012409212369200_b11","doi-asserted-by":"crossref","first-page":"1221","DOI":"10.1093\/bioinformatics\/btg138","article-title":"Fast sequence clustering using a suffix array algorithm","volume":"19","author":"Malde","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012409212369200_b12","article-title":"Comparing clusterings\u2014an axiomatic view","author":"Meila","year":"2005"},{"key":"2023012409212369200_b13","doi-asserted-by":"crossref","first-page":"1143","DOI":"10.1101\/gr.9.11.1143","article-title":"A comprehensive approach to clustering of expressed human gene sequence: the sequence tag alignment and consensus database","volume":"9","author":"Miller","year":"1999","journal-title":"Genome Res."},{"key":"2023012409212369200_b14","doi-asserted-by":"crossref","first-page":"1028","DOI":"10.1089\/cmb.2006.13.1028","article-title":"A fast and symmetric DUST implementation to mask low-complexity DNA sequences","volume":"13","author":"Morgulis","year":"2006","journal-title":"J. Comput. 
Biol."},{"key":"2023012409212369200_b15","doi-asserted-by":"crossref","first-page":"651","DOI":"10.1093\/bioinformatics\/btg034","article-title":"TIGR gene indices clustering tools (TGICL): a software system for fast clustering of large EST datasets","volume":"19","author":"Pertea","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012409212369200_b16","author":"Pontius","year":"2003","journal-title":"UniGene: A Unified View of the Transcriptome"},{"key":"2023012409212369200_b17","doi-asserted-by":"crossref","first-page":"2176","DOI":"10.1093\/nar\/gki511","article-title":"Masking repeats while clustering ESTs","volume":"33","author":"Schneeberger","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023012409212369200_b18","doi-asserted-by":"crossref","DOI":"10.1186\/gb-2002-3-9-research0044","article-title":"Computational discovery of sense-antisense transcription in the human and mouse genomes","volume":"3","author":"Shendure","year":"2002","journal-title":"Genome Biol."},{"key":"2023012409212369200_b19","author":"Smit","year":"1996"},{"key":"2023012409212369200_b20","doi-asserted-by":"crossref","first-page":"2973","DOI":"10.1093\/bioinformatics\/bth342","article-title":"EST clustering error evaluation and correction","volume":"20","author":"Wang","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012409212369200_b21","doi-asserted-by":"crossref","first-page":"1859","DOI":"10.1093\/bioinformatics\/bti310","article-title":"GMAP: a genomic mapping and alignment program for mRNA and EST sequences","volume":"21","author":"Wu","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012409212369200_b22","article-title":"ESTmapper: efficiently clustering EST sequences using genome maps","author":"Wu","year":"2004"},{"key":"2023012409212369200_b23","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1038\/nbt808","article-title":"Widespread occurrence of antisense transcription in the human 
genome","volume":"21","author":"Yelin","year":"2003","journal-title":"Nat. Biotechnol."},{"key":"2023012409212369200_b24","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1089\/10665270050081478","article-title":"A greedy algorithm for aligning DNA sequences","volume":"7","author":"Zhang","year":"2000","journal-title":"J. Comput. Biol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/18\/2232\/48841549\/bioinformatics_22_18_2232.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/18\/2232\/48841549\/bioinformatics_22_18_2232.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,24]],"date-time":"2023-01-24T10:00:07Z","timestamp":1674554407000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/22\/18\/2232\/316780"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,7,12]]},"references-count":24,"journal-issue":{"issue":"18","published-print":{"date-parts":[[2006,9,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btl368","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2006,9,15]]},"published":{"date-parts":[[2006,7,12]]}}}