{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T10:20:49Z","timestamp":1774952449784,"version":"3.50.1"},"reference-count":27,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Algorithms Mol Biol"],"published-print":{"date-parts":[[2014,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Multiple sequence alignment (MSA) is an extremely useful tool for molecular and evolutionary biology and there are several programs and algorithms available for this purpose. Although previous studies have compared the alignment accuracy of different MSA programs, their computational time and memory usage have not been systematically evaluated. Given the unprecedented amount of data produced by next generation deep sequencing platforms, and increasing demand for large-scale data analysis, it is imperative to optimize the application of software. Therefore, a balance between alignment accuracy and computational cost has become a critical indicator of the most suitable MSA program. We compared both accuracy and cost of nine popular MSA programs, namely CLUSTALW, CLUSTAL OMEGA, DIALIGN-TX, MAFFT, MUSCLE, POA, Probalign, Probcons and T-Coffee, against the benchmark alignment dataset BAliBASE and discuss the relevance of some implementations embedded in each program\u2019s algorithm. 
Accuracy of alignment was calculated with the two standard scoring functions provided by BAliBASE, the sum-of-pairs and total-column scores, and computational costs were determined by collecting peak memory usage and time of execution.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Our results indicate that, for the most part, the consistency-based programs Probcons, T-Coffee, Probalign and MAFFT outperformed the other programs in accuracy. Whenever sequences with large N\/C terminal extensions were present in the BAliBASE suite, Probalign, MAFFT and also CLUSTAL OMEGA outperformed Probcons and T-Coffee. The drawback of these programs is that they are more memory-greedy and slower than POA, CLUSTALW, DIALIGN-TX, and MUSCLE. CLUSTALW and MUSCLE were the fastest programs, and CLUSTALW was the least memory-demanding program.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>Based on the results presented herein, all four programs Probcons, T-Coffee, Probalign and MAFFT are recommended for higher accuracy of multiple sequence alignments. T-Coffee and recent versions of MAFFT can deliver faster, reliable alignments, which are especially suited for larger datasets than those encountered in the BAliBASE suite, if multi-core computers are available. 
In fact, parallelization of alignments for multi-core computers should probably be addressed by more programs in a near future, which will certainly improve performance significantly.<\/jats:p><\/jats:sec>","DOI":"10.1186\/1748-7188-9-4","type":"journal-article","created":{"date-parts":[[2014,3,6]],"date-time":"2014-03-06T18:02:45Z","timestamp":1394128965000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":94,"title":["Assessing the efficiency of multiple sequence alignment programs"],"prefix":"10.1186","volume":"9","author":[{"given":"Fabiano Sviatopolk-Mirsky","family":"Pais","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Patr\u00edcia de C\u00e1ssia","family":"Ruy","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Guilherme","family":"Oliveira","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Roney Santos","family":"Coimbra","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2014,3,6]]},"reference":[{"issue":"3","key":"225_CR1","doi-asserted-by":"publisher","first-page":"443","DOI":"10.1016\/0022-2836(70)90057-4","volume":"48","author":"SB Needleman","year":"1970","unstructured":"Needleman SB, Wunsch CD: A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol. 1970, 48 (3): 443-453. 10.1016\/0022-2836(70)90057-4.","journal-title":"J Mol Biol"},{"issue":"1","key":"225_CR2","doi-asserted-by":"publisher","first-page":"38","DOI":"10.1007\/BF01733210","volume":"18","author":"TF Smith","year":"1981","unstructured":"Smith TF, Waterman MS, Fitch WM: Comparative biosequence metrics. J Mol Evol. 1981, 18 (1): 38-46. 
10.1007\/BF01733210.","journal-title":"J Mol Evol"},{"issue":"4","key":"225_CR3","doi-asserted-by":"publisher","first-page":"351","DOI":"10.1007\/BF02603120","volume":"25","author":"DF Feng","year":"1987","unstructured":"Feng DF, Doolittle RF: Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J Mol Evol. 1987, 25 (4): 351-360. 10.1007\/BF02603120.","journal-title":"J Mol Evol"},{"issue":"22","key":"225_CR4","doi-asserted-by":"publisher","first-page":"4673","DOI":"10.1093\/nar\/22.22.4673","volume":"22","author":"JD Thompson","year":"1994","unstructured":"Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994, 22 (22): 4673-4680. 10.1093\/nar\/22.22.4673.","journal-title":"Nucleic Acids Res"},{"key":"225_CR5","doi-asserted-by":"publisher","first-page":"6","DOI":"10.1186\/1748-7188-3-6","volume":"3","author":"AR Subramanian","year":"2008","unstructured":"Subramanian AR, Kaufmann M, Morgenstern B: DIALIGN-TX: greedy and progressive approaches for segment-based multiple sequence alignment. Algorithms Mol Biol. 2008, 3: 6-10.1186\/1748-7188-3-6.","journal-title":"Algorithms Mol Biol"},{"issue":"1","key":"225_CR6","doi-asserted-by":"publisher","first-page":"205","DOI":"10.1006\/jmbi.2000.4042","volume":"302","author":"C Notredame","year":"2000","unstructured":"Notredame C, Higgins DG, Heringa J: T-coffee: a novel method for fast and accurate multiple sequence alignment. J Mol Biol. 2000, 302 (1): 205-217. 10.1006\/jmbi.2000.4042.","journal-title":"J Mol Biol"},{"issue":"2","key":"225_CR7","doi-asserted-by":"publisher","first-page":"330","DOI":"10.1101\/gr.2821705","volume":"15","author":"CB Do","year":"2005","unstructured":"Do CB, Mahabhashyam MS, Brudno M, Batzoglou S: ProbCons: probabilistic consistency-based multiple sequence alignment. Genome Res. 2005, 15 (2): 330-340. 
10.1101\/gr.2821705.","journal-title":"Genome Res"},{"issue":"22","key":"225_CR8","doi-asserted-by":"publisher","first-page":"2715","DOI":"10.1093\/bioinformatics\/btl472","volume":"22","author":"U Roshan","year":"2006","unstructured":"Roshan U, Livesay DR: Probalign: multiple sequence alignment using partition function posterior probabilities. Bioinformatics. 2006, 22 (22): 2715-2721. 10.1093\/bioinformatics\/btl472.","journal-title":"Bioinformatics"},{"key":"225_CR9","doi-asserted-by":"publisher","first-page":"539","DOI":"10.1038\/msb.2011.75","volume":"7","author":"F Sievers","year":"2011","unstructured":"Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, S\u00f6ding J, Thompson JD, Higgins DG: Fast, scalable generation of high-quality protein multiple sequence alignments using clustal omega. Mol Syst Biol. 2011, 7: 539-","journal-title":"Mol Syst Biol"},{"issue":"3","key":"225_CR10","doi-asserted-by":"publisher","first-page":"452","DOI":"10.1093\/bioinformatics\/18.3.452","volume":"18","author":"C Lee","year":"2002","unstructured":"Lee C, Grasso C, Sharlow MF: Multiple sequence alignment using partial order graphs. Bioinformatics. 2002, 18 (3): 452-464. 10.1093\/bioinformatics\/18.3.452.","journal-title":"Bioinformatics"},{"issue":"4","key":"225_CR11","doi-asserted-by":"publisher","first-page":"823","DOI":"10.1006\/jmbi.1996.0679","volume":"264","author":"O Gotoh","year":"1996","unstructured":"Gotoh O: Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments. J Mol Biol. 1996, 264 (4): 823-838. 10.1006\/jmbi.1996.0679.","journal-title":"J Mol Biol"},{"key":"225_CR12","doi-asserted-by":"publisher","first-page":"113","DOI":"10.1186\/1471-2105-5-113","volume":"5","author":"RC Edgar","year":"2004","unstructured":"Edgar RC: MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinforma. 
2004, 5: 113-10.1186\/1471-2105-5-113.","journal-title":"BMC Bioinforma"},{"issue":"14","key":"225_CR13","doi-asserted-by":"publisher","first-page":"3059","DOI":"10.1093\/nar\/gkf436","volume":"30","author":"K Katoh","year":"2002","unstructured":"Katoh K, Misawa K, Kuma K, Miyata T: MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002, 30 (14): 3059-3066. 10.1093\/nar\/gkf436.","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"225_CR14","first-page":"13","volume":"11","author":"M Hirosawa","year":"1995","unstructured":"Hirosawa M, Totoki Y, Hoshida M, Ishikawa M: Comprehensive study on iterative algorithms of multiple sequence alignment. Comput Appl Biosci. 1995, 11 (1): 13-18.","journal-title":"Comput Appl Biosci"},{"issue":"2","key":"225_CR15","doi-asserted-by":"publisher","first-page":"511","DOI":"10.1093\/nar\/gki198","volume":"33","author":"K Katoh","year":"2005","unstructured":"Katoh K, Kuma K, Toh H, Miyata T: MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 2005, 33 (2): 511-518. 10.1093\/nar\/gki198.","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"225_CR16","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1002\/prot.20527","volume":"61","author":"JD Thompson","year":"2005","unstructured":"Thompson JD, Koehl P, Ripp R, Poch O: BAliBASE 3.0: latest developments of the multiple sequence alignment benchmark. Proteins. 2005, 61 (1): 127-136. 10.1002\/prot.20527.","journal-title":"Proteins"},{"issue":"1","key":"225_CR17","doi-asserted-by":"publisher","first-page":"323","DOI":"10.1093\/nar\/29.1.323","volume":"29","author":"A Bahr","year":"2001","unstructured":"Bahr A, Thompson JD, Thierry JC, Poch O: BAliBASE (benchmark alignment dataBASE): enhancements for repeats, transmembrane sequences and circular permutations. Nucleic Acids Res. 2001, 29 (1): 323-326. 
10.1093\/nar\/29.1.323.","journal-title":"Nucleic Acids Res"},{"key":"225_CR18","doi-asserted-by":"publisher","first-page":"213","DOI":"10.1186\/1471-2105-9-213","volume":"9","author":"E Perrodou","year":"2008","unstructured":"Perrodou E, Chica C, Poch O, Gibson TJ, Thompson JD: A new protein linear motif benchmark for multiple sequence alignment software. BMC Bioinforma. 2008, 9: 213-10.1186\/1471-2105-9-213.","journal-title":"BMC Bioinforma"},{"issue":"1","key":"225_CR19","doi-asserted-by":"publisher","first-page":"126","DOI":"10.1016\/S0014-5793(02)03189-7","volume":"529","author":"T Lassmann","year":"2002","unstructured":"Lassmann T, Sonnhammer EL: Quality assessment of multiple alignment programs. FEBS Lett. 2002, 529 (1): 126-130. 10.1016\/S0014-5793(02)03189-7.","journal-title":"FEBS Lett"},{"issue":"13","key":"225_CR20","doi-asserted-by":"publisher","first-page":"2682","DOI":"10.1093\/nar\/27.13.2682","volume":"27","author":"JD Thompson","year":"1999","unstructured":"Thompson JD, Plewniak F, Poch O: A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Res. 1999, 27 (13): 2682-2690. 10.1093\/nar\/27.13.2682.","journal-title":"Nucleic Acids Res"},{"issue":"4","key":"225_CR21","doi-asserted-by":"crossref","first-page":"321","DOI":"10.3233\/ISB-00245","volume":"6","author":"G Blackshields","year":"2006","unstructured":"Blackshields G, Wallace IM, Larkin M, Higgins DG: Analysis and comparison of benchmarks for multiple sequence alignment. In Silico Biol. 2006, 6 (4): 321-339.","journal-title":"In Silico Biol"},{"key":"225_CR22","doi-asserted-by":"publisher","first-page":"471","DOI":"10.1186\/1471-2105-7-471","volume":"7","author":"PA Nuin","year":"2006","unstructured":"Nuin PA, Wang Z, Tillier ER: The accuracy of several multiple sequence alignment programs for proteins. BMC Bioinforma. 
2006, 7: 471-10.1186\/1471-2105-7-471.","journal-title":"BMC Bioinforma"},{"issue":"1","key":"225_CR23","first-page":"11","volume":"4","author":"EW Myers","year":"1988","unstructured":"Myers EW, Miller W: Optimal alignments in linear space. Comput Appl Biosci. 1988, 4 (1): 11-17.","journal-title":"Comput Appl Biosci"},{"key":"225_CR24","doi-asserted-by":"publisher","first-page":"396","DOI":"10.1186\/1471-2105-10-396","volume":"10","author":"RC Edgar","year":"2009","unstructured":"Edgar RC: Optimizing substitution matrix choice and gap parameters for sequence alignment. BMC Bioinforma. 2009, 10: 396-10.1186\/1471-2105-10-396.","journal-title":"BMC Bioinforma"},{"issue":"4","key":"225_CR25","doi-asserted-by":"publisher","first-page":"286","DOI":"10.1093\/bib\/bbn013","volume":"9","author":"K Katoh","year":"2008","unstructured":"Katoh K, Toh H: Recent developments in the MAFFT multiple sequence alignment program. Brief Bioinform. 2008, 9 (4): 286-298. 10.1093\/bib\/bbn013.","journal-title":"Brief Bioinform"},{"issue":"15","key":"225_CR26","doi-asserted-by":"publisher","first-page":"1899","DOI":"10.1093\/bioinformatics\/btq224","volume":"26","author":"K Katoh","year":"2010","unstructured":"Katoh K, Toh H: Parallelization of the MAFFT multiple sequence alignment program. Bioinformatics. 2010, 26 (15): 1899-1900. 10.1093\/bioinformatics\/btq224.","journal-title":"Bioinformatics"},{"key":"225_CR27","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1186\/1748-7188-5-21","volume":"5","author":"G Blackshields","year":"2010","unstructured":"Blackshields G, Sievers F, Shi W, Wilm A, Higgins DG: Sequence embedding for fast construction of guide trees for multiple sequence alignment. Algorithms Mol Biol. 
2010, 5: 21-10.1186\/1748-7188-5-21.","journal-title":"Algorithms Mol Biol"}],"container-title":["Algorithms for Molecular Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1748-7188-9-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,5,2]],"date-time":"2025-05-02T00:28:18Z","timestamp":1746145698000},"score":1,"resource":{"primary":{"URL":"https:\/\/almob.biomedcentral.com\/articles\/10.1186\/1748-7188-9-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,3,6]]},"references-count":27,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2014,12]]}},"alternative-id":["225"],"URL":"https:\/\/doi.org\/10.1186\/1748-7188-9-4","relation":{},"ISSN":["1748-7188"],"issn-type":[{"value":"1748-7188","type":"electronic"}],"subject":[],"published":{"date-parts":[[2014,3,6]]},"assertion":[{"value":"8 July 2013","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 February 2014","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 March 2014","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"4"}}