{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,27]],"date-time":"2025-11-27T13:47:48Z","timestamp":1764251268567},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"23","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2011,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Multiple sequence alignment is a basic part of much biological research, including phylogeny estimation and protein structure and function prediction. Different alignments on the same set of unaligned sequences are often compared, sometimes in order to assess the accuracy of alignment methods or to infer a consensus alignment from a set of estimated alignments.<\/jats:p>\n               <jats:p>Three of the standard techniques for comparing alignments, Developer, Modeler and Total Column (TC) scores can be derived through calculations of the set of homologies that the alignments share. However, the brute-force technique for calculating this set is quadratic in the input size. The remaining standard technique, Cline Shift Score, inherently requires quadratic time.<\/jats:p>\n               <jats:p>Results: In this article, we prove that each of these scores can be computed in linear time, and we present FastSP, a linear-time algorithm for calculating these scores. Even on the largest alignments we explored (one with 50 000 sequences), FastSP completed &amp;lt;2 min and used at most 2 GB of the main memory. The best alternative is qscore, a method whose empirical running time is approximately the same as FastSP when given sufficient memory (at least 8 GB), but whose asymptotic running time has never been theoretically established. In addition, for comparisons of large alignments under lower memory conditions (at most 4 GB of main memory), qscore uses substantial memory (up to 10 GB for the datasets we studied), took more time and failed to analyze the largest datasets.<\/jats:p>\n               <jats:p>Availability: The open-source software and executables are available online at http:\/\/www.cs.utexas.edu\/~phylo\/software\/fastsp\/.<\/jats:p>\n               <jats:p>Contact: \u00a0tandy@cs.utexas.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/btr553","type":"journal-article","created":{"date-parts":[[2011,10,8]],"date-time":"2011-10-08T02:49:40Z","timestamp":1318042180000},"page":"3250-3258","source":"Crossref","is-referenced-by-count":60,"title":["F<scp>AST<\/scp>SP: linear time calculation of alignment accuracy"],"prefix":"10.1093","volume":"27","author":[{"given":"Siavash","family":"Mirarab","sequence":"first","affiliation":[{"name":"Department of Computer Science, The University of Texas at Austin, 1616 Guadalupe Street, Austin 78701, USA"}]},{"given":"Tandy","family":"Warnow","sequence":"additional","affiliation":[{"name":"Department of Computer Science, The University of Texas at Austin, 1616 Guadalupe Street, Austin 78701, USA"}]}],"member":"286","published-online":{"date-parts":[[2011,10,7]]},"reference":[{"key":"2023012511050894100_B1","doi-asserted-by":"crossref","first-page":"7353","DOI":"10.1093\/nar\/gkq625","article-title":"Issues in bioinformatics benchmarking: the case study of multiple sequence alignment","volume":"38","author":"Aniba","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2023012511050894100_B2","doi-asserted-by":"crossref","first-page":"675","DOI":"10.2478\/v10006-009-0054-y","article-title":"Some remarks on evaluating the quality of the multiple sequence alignment based on the BAliBASE benchmark","volume":"19","author":"Bla\u017cewicz,J.","year":"2009","journal-title":"Int. J. Appl. Math. Comput. Sci."},{"key":"2023012511050894100_B3","doi-asserted-by":"crossref","first-page":"306","DOI":"10.1093\/bioinformatics\/18.2.306","article-title":"Predicting reliable regions in protein sequence alignments","volume":"18","author":"Cline","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012511050894100_B4","doi-asserted-by":"crossref","first-page":"5069","DOI":"10.1128\/AEM.03006-05","article-title":"Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB","volume":"72","author":"DeSantis","year":"2006","journal-title":"Appl. Environ. Microbiol."},{"key":"2023012511050894100_B5","doi-asserted-by":"crossref","first-page":"1792","DOI":"10.1093\/nar\/gkh340","article-title":"MUSCLE: multiple sequence alignment with high accuracy and high throughput","volume":"32","author":"Edgar","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012511050894100_B6","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1111\/j.1096-0031.2009.00255.x","article-title":"Phylogenetic analysis of 73,060 taxa corroborates major eukaryotic groups","volume":"25","author":"Goloboff","year":"2009","journal-title":"Cladistics"},{"key":"2023012511050894100_B7","doi-asserted-by":"crossref","first-page":"372","DOI":"10.1093\/bioinformatics\/btl592","article-title":"PartTree: an algorithm to build an approximate tree from a large number of unaligned sequences","volume":"23","author":"Katoh","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012511050894100_B8","doi-asserted-by":"crossref","first-page":"286","DOI":"10.1093\/bib\/bbn013","article-title":"Recent developments in the MAFFT multiple sequence alignment program","volume":"9","author":"Katoh","year":"2008","journal-title":"Brief. Bioinformatics"},{"key":"2023012511050894100_B9","doi-asserted-by":"crossref","first-page":"2455","DOI":"10.1093\/bioinformatics\/btp452","article-title":"Upcoming challenges for multiple sequence alignment methods in the high-throughput era","volume":"25","author":"Kemena","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012511050894100_B10","doi-asserted-by":"crossref","first-page":"6359","DOI":"10.1093\/nar\/gkr334","article-title":"PSAR: measuring multiple sequence alignment reliability by probabilistic sampling","volume":"39","author":"Kim","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"2023012511050894100_B11","first-page":"15","article-title":"Local reliability measures from sets of co-optimal multiple sequence aligments","volume":"13","author":"Landan","year":"2008","journal-title":"Proc. Pac. Symp. Biocomput."},{"key":"2023012511050894100_B12","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1186\/1471-2105-8-471","article-title":"Predicting and improving the protein sequence alignment quality by support vector regression","volume":"8","author":"Lee","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023012511050894100_B13","doi-asserted-by":"crossref","DOI":"10.1109\/DCS.1988.12507","article-title":"Condor - a hunter of idle workstations","volume-title":"Proceedings of the 8th International Conference of Distributed Computing Systems.","author":"Litzkow","year":"1988"},{"key":"2023012511050894100_B14","doi-asserted-by":"crossref","first-page":"1561","DOI":"10.1126\/science.1171243","article-title":"Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees","volume":"324","author":"Liu","year":"2009","journal-title":"Science"},{"key":"2023012511050894100_B15","first-page":"RRN1198","article-title":"Multiple sequence alignment: a major challenge to large-scale phylogenetics","volume":"2","author":"Liu","year":"2010","journal-title":"PLoS Curr. Tree Life"},{"key":"2023012511050894100_B16","article-title":"SAT\u00e9-II: very fast and accurate simultaneous estimation of multiple sequence alignments and phylogenetic trees","author":"Liu","year":"2011","journal-title":"Syst. Biol."},{"key":"2023012511050894100_B17","doi-asserted-by":"crossref","first-page":"10557","DOI":"10.1073\/pnas.0409137102","article-title":"An algorithm for progressive multiple alignment of sequences with insertions","volume":"102","author":"Loytynoja","year":"2005","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012511050894100_B18","doi-asserted-by":"crossref","first-page":"645","DOI":"10.1093\/nar\/gkm333","article-title":"The M-Coffee web server: a meta-method for computing multiple sequence alignments by combining alternative alignment methods","volume":"35","author":"Moretti","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023012511050894100_B19","doi-asserted-by":"crossref","first-page":"3503","DOI":"10.1093\/nar\/gkg522","article-title":"Tcoffee@igs: a web server for computing, evaluating and combining multiple sequence alignments","volume":"31","author":"Poirot","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023012511050894100_B20","doi-asserted-by":"crossref","first-page":"1682","DOI":"10.1093\/bioinformatics\/btg211","article-title":"Consensus alignment for reliable framework prediction in homology modeling","volume":"19","author":"Prasad","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012511050894100_B21","doi-asserted-by":"crossref","first-page":"W50","DOI":"10.1093\/nar\/gkh456","article-title":"Consensus alignment server for reliable comparative modeling with distant templates","volume":"32","author":"Prasad","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012511050894100_B22","doi-asserted-by":"crossref","first-page":"e9490","DOI":"10.1371\/journal.pone.0009490","article-title":"FastTree 2: approximately maximum-likelihood trees for large alignments","volume":"5","author":"Price","year":"2010","journal-title":"PLoS One"},{"key":"2023012511050894100_B23","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1186\/1471-2148-9-217","article-title":"BigFoot: Bayesian alignment and phylogenetic footprinting with MCMC","volume":"9","author":"Satija","year":"2009","journal-title":"BMC Evol. Biol."},{"key":"2023012511050894100_B24","doi-asserted-by":"crossref","DOI":"10.1186\/1471-2148-9-37","article-title":"Mega-phylogeny approach for comparative biology: an alternative to supertree and supermatrix approaches","volume":"9","author":"Smith","year":"2009","journal-title":"BMC Evol. Biol."},{"key":"2023012511050894100_B25","doi-asserted-by":"crossref","first-page":"2047","DOI":"10.1093\/bioinformatics\/btl175","article-title":"BAli-Phy: simultaneous Bayesian inference of alignment and phylogeny","volume":"22","author":"Suchard","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012511050894100_B26","doi-asserted-by":"crossref","first-page":"1569","DOI":"10.1093\/bioinformatics\/btq228","article-title":"DendroPy: a Python library for phylogenetic computing","volume":"26","author":"Sukumaran","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012511050894100_B27","doi-asserted-by":"crossref","first-page":"4673","DOI":"10.1093\/nar\/22.22.4673","article-title":"CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice","volume":"22","author":"Thompson","year":"1994","journal-title":"Nucleic Acids Res."},{"key":"2023012511050894100_B28","doi-asserted-by":"crossref","first-page":"i559","DOI":"10.1093\/bioinformatics\/btm226","article-title":"Multiple alignment by aligning alignments","volume":"23","author":"Wheeler","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012511050894100_B29","doi-asserted-by":"crossref","first-page":"473","DOI":"10.1126\/science.1151532","article-title":"Alignment uncertainty and genomic analysis","volume":"319","author":"Wong","year":"2008","journal-title":"Science"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/23\/3250\/48862947\/bioinformatics_27_23_3250.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/23\/3250\/48862947\/bioinformatics_27_23_3250.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T11:10:20Z","timestamp":1674645020000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/27\/23\/3250\/234345"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,10,7]]},"references-count":29,"journal-issue":{"issue":"23","published-print":{"date-parts":[[2011,12,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btr553","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2011,12,1]]},"published":{"date-parts":[[2011,10,7]]}}}