{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T01:07:03Z","timestamp":1773277623025,"version":"3.50.1"},"reference-count":43,"publisher":"Oxford University Press (OUP)","issue":"19","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2008,10,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Multiple sequence alignment (MSA) is an essential prerequisite for many sequence analysis methods and valuable tool itself for describing relationships between protein sequences. Since the success of the sequence analysis is highly dependent on the reliability of alignments, measures for assessing the quality of alignments are highly requisite.<\/jats:p><jats:p>Results: We present a statistical model-based alignment quality score. Unlike other quality scores, it does not require several parallel alignments for the same set of sequences or additional structural information. Our quality score is based on measuring the conservation level of reference alignments in Homstrad. Reference sequences were realigned with the Mafft, Muscle and Probcons alignment programs, and a sum-of-pairs (SP) score was used to measure the quality of the realignments. Statistical modelling of the SP score as a function of conservation level and other alignment characteristics makes it possible to predict the SP score for any global MSA. The predicted SP scores are highly correlated with the correct SP scores, when tested on the Homstrad and SABmark databases. The results are comparable to that of multiple overlap score (MOS) and better than those of normalized mean distance (NorMD) and normalized iRMSD (NiRMSD) alignment quality criteria. 
Furthermore, the predicted SP score is able to detect alignments with badly aligned or unrelated sequences.<\/jats:p><jats:p>Availability: The method is freely available at http:\/\/www.mtt.fi\/AlignmentQuality\/<\/jats:p><jats:p>Contact: \u00a0virpi.ahola@mtt.fi<\/jats:p><jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btn414","type":"journal-article","created":{"date-parts":[[2008,8,5]],"date-time":"2008-08-05T00:30:04Z","timestamp":1217896204000},"page":"2165-2171","source":"Crossref","is-referenced-by-count":19,"title":["Model-based prediction of sequence alignment quality"],"prefix":"10.1093","volume":"24","author":[{"given":"Virpi","family":"Ahola","sequence":"first","affiliation":[{"name":"1 Biotechnology and Food Research, MTT Agrifood Research Finland, FI-31600 Jokioinen, 2Department of Statistics, 3Department of Mathematics, FI-20014, University of Turku, 4Institute of Medical Technology, FI-33014, University of Tampere and 5Tampere University Hospital, FI-33520 Tampere, Finland"},{"name":"1 Biotechnology and Food Research, MTT Agrifood Research Finland, FI-31600 Jokioinen, 2Department of Statistics, 3Department of Mathematics, FI-20014, University of Turku, 4Institute of Medical Technology, FI-33014, University of Tampere and 5Tampere University Hospital, FI-33520 Tampere, Finland"}]},{"given":"Tero","family":"Aittokallio","sequence":"additional","affiliation":[{"name":"1 Biotechnology and Food Research, MTT Agrifood Research Finland, FI-31600 Jokioinen, 2Department of Statistics, 3Department of Mathematics, FI-20014, University of Turku, 4Institute of Medical Technology, FI-33014, University of Tampere and 5Tampere University Hospital, FI-33520 Tampere, Finland"}]},{"given":"Mauno","family":"Vihinen","sequence":"additional","affiliation":[{"name":"1 Biotechnology and Food Research, MTT Agrifood Research Finland, FI-31600 Jokioinen, 2Department of Statistics, 
3Department of Mathematics, FI-20014, University of Turku, 4Institute of Medical Technology, FI-33014, University of Tampere and 5Tampere University Hospital, FI-33520 Tampere, Finland"},{"name":"1 Biotechnology and Food Research, MTT Agrifood Research Finland, FI-31600 Jokioinen, 2Department of Statistics, 3Department of Mathematics, FI-20014, University of Turku, 4Institute of Medical Technology, FI-33014, University of Tampere and 5Tampere University Hospital, FI-33520 Tampere, Finland"}]},{"given":"Esa","family":"Uusipaikka","sequence":"additional","affiliation":[{"name":"1 Biotechnology and Food Research, MTT Agrifood Research Finland, FI-31600 Jokioinen, 2Department of Statistics, 3Department of Mathematics, FI-20014, University of Turku, 4Institute of Medical Technology, FI-33014, University of Tampere and 5Tampere University Hospital, FI-33520 Tampere, Finland"}]}],"member":"286","published-online":{"date-parts":[[2008,8,4]]},"reference":[{"key":"2023020211122022500_B1","doi-asserted-by":"crossref","first-page":"484","DOI":"10.1186\/1471-2105-7-484","article-title":"A statistical score for assessing the quality of multiple sequence alignments","volume":"7","author":"Ahola","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023020211122022500_B2","doi-asserted-by":"crossref","first-page":"e35","DOI":"10.1093\/bioinformatics\/btl218","article-title":"The irmsd: a local measure of sequence alignment accuracy using structural information","volume":"22","author":"Armougom","year":"2006","journal-title":"Bioinformatics"},{"key":"2023020211122022500_B3","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1093\/nar\/29.1.323","article-title":"Balibase (benchmark alignment database): enhancements for repeats, transmembrane sequences and circular permutations","volume":"29","author":"Bahr","year":"2001","journal-title":"Nucleic Acids 
Res"},{"key":"2023020211122022500_B4","doi-asserted-by":"crossref","first-page":"2230","DOI":"10.1093\/bioinformatics\/bti335","article-title":"A word-oriented approach to alignment validation","volume":"21","author":"Beiko","year":"2005","journal-title":"Bioinformatics"},{"key":"2023020211122022500_B5","doi-asserted-by":"crossref","first-page":"1165","DOI":"10.1214\/aos\/1013699998","article-title":"The control of the false discovery rate in multiple testing under dependency","volume":"29","author":"Benjamini","year":"2001","journal-title":"Ann. Stat"},{"key":"2023020211122022500_B6","doi-asserted-by":"crossref","first-page":"321","DOI":"10.3233\/ISB-00245","article-title":"Analysis and comparison of benchmarks for multiple sequence alignment","volume":"6","author":"Blackshields","year":"2006","journal-title":"In Silico Biol"},{"key":"2023020211122022500_B7","first-page":"1125","article-title":"Delta method","volume-title":"Encyclopedia of Biostatistics.","author":"Cox","year":"1998"},{"key":"2023020211122022500_B8","doi-asserted-by":"crossref","first-page":"330","DOI":"10.1101\/gr.2821705","article-title":"Probcons: probabilistic consistency-based multiple sequence alignment","volume":"15","author":"Do","year":"2005","journal-title":"Genome Res"},{"key":"2023020211122022500_B9","doi-asserted-by":"crossref","first-page":"1003","DOI":"10.1006\/jmbi.2000.3615","article-title":"Structure-based evaluation of sequence comparison and fold recognition alignment accuracy","volume":"297","author":"Domingues","year":"2000","journal-title":"J. Mol. 
Biol"},{"key":"2023020211122022500_B10","doi-asserted-by":"crossref","first-page":"1792","DOI":"10.1093\/nar\/gkh340","article-title":"Muscle: multiple sequence alignment with high accuracy and high throughput","volume":"32","author":"Edgar","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023020211122022500_B11","doi-asserted-by":"crossref","first-page":"368","DOI":"10.1016\/j.sbi.2006.04.004","article-title":"Multiple sequence alignment","volume":"16","author":"Edgar","year":"2006","journal-title":"Curr. Opin. Struct. Biol"},{"key":"2023020211122022500_B12","doi-asserted-by":"crossref","first-page":"1813","DOI":"10.1110\/ps.0242903","article-title":"Structural genomics: computational methods for structure analysis","volume":"12","author":"Goldsmith-Fischman","year":"2003","journal-title":"Protein Sci"},{"key":"2023020211122022500_B13","doi-asserted-by":"crossref","first-page":"2433","DOI":"10.1093\/molbev\/msm176","article-title":"Mind the gaps: evidence of bias in estimates of multiple sequence alignments","volume":"24","author":"Golubchik","year":"2007","journal-title":"Mol. Biol. Evol"},{"key":"2023020211122022500_B14","doi-asserted-by":"crossref","first-page":"509","DOI":"10.1007\/BF02462264","article-title":"Consistency of optimal sequence alignments","volume":"52","author":"Gotoh","year":"1990","journal-title":"Bull. Math. 
Biol"},{"key":"2023020211122022500_B15","doi-asserted-by":"crossref","first-page":"1546","DOI":"10.1093\/bioinformatics\/bth126","article-title":"Combining partial order alignment and progressive multiple sequence alignment increases alignment speed and scalability to very large alignment problems","volume":"20","author":"Grasso","year":"2004","journal-title":"Bioinformatics"},{"key":"2023020211122022500_B16","doi-asserted-by":"crossref","first-page":"563","DOI":"10.1093\/bioinformatics\/15.7.563","article-title":"Identifying DNA and protein patterns with statistically significant alignments of multiple sequences","volume":"15","author":"Hertz","year":"1999","journal-title":"Bioinformatics"},{"key":"2023020211122022500_B17","doi-asserted-by":"crossref","first-page":"3059","DOI":"10.1093\/nar\/gkf436","article-title":"Mafft: a novel method for rapid multiple sequence alignment based on fast fourier transform","volume":"30","author":"Katoh","year":"2002","journal-title":"Nucleic Acids Res"},{"key":"2023020211122022500_B18","doi-asserted-by":"crossref","first-page":"511","DOI":"10.1093\/nar\/gki198","article-title":"Mafft version 5: improvement in accuracy of multiple sequence alignment","volume":"33","author":"Katoh","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2023020211122022500_B19","doi-asserted-by":"crossref","first-page":"745","DOI":"10.1093\/protein\/13.11.745","article-title":"Prosup: a refined tool for protein structure alignment","volume":"13","author":"Lackner","year":"2000","journal-title":"Protein Eng"},{"key":"2023020211122022500_B20","doi-asserted-by":"crossref","first-page":"1380","DOI":"10.1093\/molbev\/msm060","article-title":"Head or tails: a simple reliability check for multiple sequence alignments","volume":"24","author":"Landan","year":"2007","journal-title":"Mol. Biol. 
Evol"},{"key":"2023020211122022500_B21","first-page":"15","article-title":"Local reliability measures from sets of co-optimal multiple sequence alignments","volume":"13","author":"Landan","year":"2008","journal-title":"Pac. Symp. Biocomput"},{"key":"2023020211122022500_B22","doi-asserted-by":"crossref","first-page":"298","DOI":"10.1186\/1471-2105-6-298","article-title":"Kalign\u2013an accurate and fast multiple sequence alignment algorithm","volume":"6","author":"Lassmann","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023020211122022500_B23","doi-asserted-by":"crossref","first-page":"7120","DOI":"10.1093\/nar\/gki1020","article-title":"Automatic assessment of alignment quality","volume":"33","author":"Lassmann","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2023020211122022500_B24","volume-title":"Detection Theory; A User's Guide.","author":"McMillian","year":"2005"},{"key":"2023020211122022500_B25","doi-asserted-by":"crossref","first-page":"2469","DOI":"10.1002\/pro.5560071126","article-title":"Homstrad: a database of protein structure alignments for homologous families","volume":"7","author":"Mizuguchi","year":"1998","journal-title":"Protein Sci"},{"key":"2023020211122022500_B26","doi-asserted-by":"crossref","first-page":"425","DOI":"10.1093\/bioinformatics\/btf882","article-title":"AltAVisT: comparing alternative multiple sequence alignments","volume":"19","author":"Morgenstern","year":"2003","journal-title":"Bioinformatics"},{"key":"2023020211122022500_B27","doi-asserted-by":"crossref","first-page":"131","DOI":"10.1517\/14622416.3.1.131","article-title":"Recent progress in multiple sequence alignment: a survey","volume":"3","author":"Notredame","year":"2002","journal-title":"Pharmacogenomics"},{"key":"2023020211122022500_B28","doi-asserted-by":"crossref","first-page":"e123","DOI":"10.1371\/journal.pcbi.0030123","article-title":"Recent evolutions of multiple sequence alignment 
algorithms","volume":"3","author":"Notredame","year":"2007","journal-title":"PLoS Comput. Biol"},{"key":"2023020211122022500_B29","first-page":"30","article-title":"Using multiple alignment methods to assess the quality of genomic data analysis","volume-title":"Bioinformatics and Genomes: Current Perspectives.","author":"Notredame","year":"2003"},{"key":"2023020211122022500_B30","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1006\/jmbi.2000.4042","article-title":"T-coffee: a novel method for fast and accurate multiple sequence alignment","volume":"302","author":"Notredame","year":"2000","journal-title":"J. Mol. Biol"},{"key":"2023020211122022500_B31","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1186\/1471-2105-7-471","article-title":"The accuracy of several multiple sequence alignment programs for proteins","volume":"7","author":"Nuin","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023020211122022500_B32","doi-asserted-by":"crossref","first-page":"4364","DOI":"10.1093\/nar\/gkl514","article-title":"Mummals: multiple sequence alignment improved by using hidden Markov models with local structural information","volume":"34","author":"Pei","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023020211122022500_B33","doi-asserted-by":"crossref","first-page":"802","DOI":"10.1093\/bioinformatics\/btm017","article-title":"Promals: towards accurate multiple sequence alignments of distantly related proteins","volume":"23","author":"Pei","year":"2007","journal-title":"Bioinformatics"},{"key":"2023020211122022500_B34","doi-asserted-by":"crossref","first-page":"700","DOI":"10.1093\/bioinformatics\/17.8.700","article-title":"Al2co: calculation of positional conservation in a protein sequence alignment","volume":"17","author":"Pei","year":"2001","journal-title":"Bioinformatics"},{"key":"2023020211122022500_B35","first-page":"395","article-title":"Using the sir algorithm to simulate posterior distributions","volume-title":"Bayesian 
Statistics 3.","author":"Rubin","year":"1988"},{"key":"2023020211122022500_B36","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1002\/(SICI)1097-0134(20000701)40:1<6::AID-PROT30>3.0.CO;2-7","article-title":"Large-scale comparison of protein sequence alignment algorithms with structure alignments","volume":"40","author":"Sauder","year":"2000","journal-title":"Proteins"},{"key":"2023020211122022500_B37","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1037\/1082-989X.11.1.54","article-title":"A better lemon squeezer? Maximum-likelihood regression with beta-distributed dependent variables","volume":"11","author":"Smithson","year":"2006","journal-title":"Psychol. Methods"},{"key":"2023020211122022500_B38","doi-asserted-by":"crossref","first-page":"4876","DOI":"10.1093\/nar\/25.24.4876","article-title":"The clustal_x windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools","volume":"25","author":"Thompson","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2023020211122022500_B39","doi-asserted-by":"crossref","first-page":"2682","DOI":"10.1093\/nar\/27.13.2682","article-title":"A comprehensive comparison of multiple sequence alignment programs","volume":"27","author":"Thompson","year":"1999","journal-title":"Nucleic Acids Res"},{"key":"2023020211122022500_B40","doi-asserted-by":"crossref","first-page":"937","DOI":"10.1006\/jmbi.2001.5187","article-title":"Towards a reliable objective function for multiple sequence alignments","volume":"314","author":"Thompson","year":"2001","journal-title":"J. Mol. Biol"},{"key":"2023020211122022500_B41","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1016\/S0959-440X(96)80054-6","article-title":"Near-optimal sequence alignment","volume":"6","author":"Vingron","year":"1996","journal-title":"Curr. Opin. Struct. 
Biol"},{"key":"2023020211122022500_B42","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1016\/0022-2836(91)90871-3","article-title":"Motif recognition and alignment for many sequences by comparison of dot-matrices","volume":"218","author":"Vingron","year":"1991","journal-title":"J. Mol. Biol"},{"key":"2023020211122022500_B43","doi-asserted-by":"crossref","first-page":"1267","DOI":"10.1093\/bioinformatics\/bth493","article-title":"Sabmark\u2013a benchmark for sequence alignment that covers the entire known fold space","volume":"21","author":"Walle","year":"2005","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/19\/2165\/49052264\/bioinformatics_24_19_2165.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/19\/2165\/49052264\/bioinformatics_24_19_2165.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,31]],"date-time":"2025-01-31T09:17:07Z","timestamp":1738315027000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/24\/19\/2165\/248086"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,8,4]]},"references-count":43,"journal-issue":{"issue":"19","published-print":{"date-parts":[[2008,10,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btn414","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2008,10,1]]},"published":{"date-parts":[[2008,8,4]]}}}