{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T22:47:39Z","timestamp":1775861259642,"version":"3.50.1"},"reference-count":26,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2023,11,1]],"date-time":"2023-11-01T00:00:00Z","timestamp":1698796800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Deanship of Scientific Research, Imam Mohammad Ibn Saud Islamic University, Saudi Arabia","award":["20-12-18-012"],"award-info":[{"award-number":["20-12-18-012"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computation"],"abstract":"<jats:p>This study explores the accuracy and efficiency of multiple sequence alignment (MSA) programs, focusing on Clustal\u03a9, MAFFT, and MUSCLE in the context of genotyping SARS-CoV-2 for the Saudi population. Our results indicate that MAFFT outperforms the others, making it an ideal choice for large-scale genomic analyses. The comparative performance of MSAs assembled using MergeAlign demonstrates that MAFFT and MUSCLE consistently exhibit higher accuracy than Clustal\u03a9 in both reference-based and consensus-based approaches. The evaluation of genotyping effectiveness reveals that the addition of a reference sequence, such as the SARS-CoV-2 Wuhan-Hu-1 isolate, does not significantly affect the alignment process, suggesting that using consensus sequences derived from individual MSA alignments may yield comparable genotyping outcomes. Investigating single-nucleotide polymorphisms (SNPs) and mutations highlights distinctive features of MSA programs. Clustal\u03a9 and MAFFT show similar counts, while MUSCLE displays the highest SNP count. High-frequency SNP analysis identifies MAFFT as the most accurate MSA program, emphasizing its reliability. Comparisons between Saudi and global SARS-CoV-2 populations underscore regional genetic variations. Saudis exhibit consistently higher frequencies of high-frequency SNPs, attributed to genetic similarity within the population. Transmission dynamics analysis reveals a higher frequency of co-mutations in the Saudi dataset, suggesting shared evolutionary patterns. These findings emphasize the importance of considering regional diversity in genetic analyses.<\/jats:p>","DOI":"10.3390\/computation11110212","type":"journal-article","created":{"date-parts":[[2023,11,1]],"date-time":"2023-11-01T07:24:04Z","timestamp":1698823444000},"page":"212","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Evaluating the Performance of Multiple Sequence Alignment Programs with Application to Genotyping SARS-CoV-2 in the Saudi Population"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0009-0006-5376-6554","authenticated-orcid":false,"given":"Aminah","family":"Alqahtani","sequence":"first","affiliation":[{"name":"Computer Science Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University, Riyadh 11564, Saudi Arabia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7254-1867","authenticated-orcid":false,"given":"Meznah","family":"Almutairy","sequence":"additional","affiliation":[{"name":"Computer Science Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University, Riyadh 11564, Saudi Arabia"}]}],"member":"1968","published-online":{"date-parts":[[2023,11,1]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"3.1.1","DOI":"10.1002\/0471250953.bi0301s42","article-title":"An introduction to sequence similarity (\u201chomology\u201d) searching","volume":"42","author":"Pearson","year":"2013","journal-title":"Curr. Protoc. Bioinform."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Sievers, F., Wilm, A., Dineen, D., Gibson, T., Karplus, K., Li, W., Lopez, R., McWilliam, H., Remmert, M., and S\u00f6ding, J. (2011). Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol., 7.","DOI":"10.1038\/msb.2011.75"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"3059","DOI":"10.1093\/nar\/gkf436","article-title":"MAFFT: A novel method for rapid multiple sequence alignment based on fast Fourier transform","volume":"30","author":"Katoh","year":"2002","journal-title":"Nucleic Acids Res."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1792","DOI":"10.1093\/nar\/gkh340","article-title":"MUSCLE: Multiple sequence alignment with high accuracy and high throughput","volume":"32","author":"Edgar","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"122","DOI":"10.1093\/bioinformatics\/15.2.122","article-title":"Combining many multiple alignments in one improved alignment","volume":"15","author":"Caprani","year":"1999","journal-title":"Bioinformatics"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Collingridge, P., and Kelly, S. (2012). MergeAlign: Improving multiple sequence alignment performance by dynamic reconstruction of consensus multiple sequence alignments. BMC Bioinform., 13.","DOI":"10.1186\/1471-2105-13-117"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1009","DOI":"10.1093\/bib\/bbv099","article-title":"Multiple sequence alignment modeling: Methods and applications","volume":"17","author":"Chatzou","year":"2015","journal-title":"Briefings Bioinform."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"3588","DOI":"10.1016\/j.ygeno.2020.04.016","article-title":"Genotyping coronavirus SARS-CoV-2: Methods and implications","volume":"112","author":"Yin","year":"2020","journal-title":"Genomics"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1106","DOI":"10.1093\/bib\/bbab025","article-title":"Whole genome analysis of more than 10000 SARS-CoV-2 virus unveils global genetic diversity and target region of NSP6","volume":"22","author":"Saha","year":"2021","journal-title":"Briefings Bioinform."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"2502","DOI":"10.1016\/j.sjbs.2021.01.051","article-title":"Computational drug screening against the SARS-CoV-2 Saudi Arabia isolates through a multiple-sequence alignment approach","volume":"28","author":"Mok","year":"2021","journal-title":"Saudi J. Biol. Sci."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"3325","DOI":"10.1016\/j.sjbs.2021.02.077","article-title":"Molecular adaptive evolution of SARS-COV-2 spike protein in Saudi Arabia","volume":"28","author":"Nour","year":"2021","journal-title":"Saudi J. Biol. Sci."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"e06035","DOI":"10.1016\/j.heliyon.2021.e06035","article-title":"Temporal increase in D614G mutation of SARS-CoV-2 in the Middle East and North Africa","volume":"7","author":"Sallam","year":"2021","journal-title":"Heliyon"},{"key":"ref_13","unstructured":"Wang, L. (1995). Algorithms for Multiple Sequences Alignment, Comparison of Trees, and Steiner Trees. [Ph.D. Thesis, McMaster University]."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Wang, Y., Wu, H., and Cai, Y. (2018). A benchmark study of sequence alignment methods for protein clustering. BMC Bioinform., 19.","DOI":"10.1186\/s12859-018-2524-4"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Zhang, Q., Zhou, J., and Zou, Q. (2022). A survey on the algorithm and development of multiple sequence alignment. Briefings Bioinform., 23.","DOI":"10.1093\/bib\/bbac069"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Pais, F., Ruy, P., Oliveira, G., and Coimbra, R. (2014). Assessing the efficiency of multiple sequence alignment programs. Algorithms Mol. Biol., 9.","DOI":"10.1186\/1748-7188-9-4"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Ballouz, S., Dobin, A., and Gillis, J. (2019). Is it time to change the reference genome?. Genome Biol., 20.","DOI":"10.1186\/s13059-019-1774-4"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"30494","DOI":"10.2807\/1560-7917.ES.2017.22.13.30494","article-title":"GISAID: Global initiative on sharing all influenza data\u2013from vision to reality","volume":"22","author":"Shu","year":"2017","journal-title":"Eurosurveillance"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1038\/nrg3642","article-title":"Sequencing depth and coverage: Key considerations in genomic analyses","volume":"15","author":"Sims","year":"2014","journal-title":"Nat. Rev. Genet."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1038\/s41586-020-2008-3","article-title":"A new coronavirus associated with human respiratory disease in China","volume":"579","author":"Wu","year":"2020","journal-title":"Nature"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"104522","DOI":"10.1016\/j.meegid.2020.104522","article-title":"Inferring the genetic variability in Indian SARS-CoV-2 genomes using consensus of multiple sequence alignment techniques","volume":"85","author":"Saha","year":"2020","journal-title":"Infect. Genet. Evol."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Gusfield, D. (1997). Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, Cambridge University Press.","DOI":"10.1017\/CBO9780511574931"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12920-015-0115-z","article-title":"Defining \u201cmutation\u201d and \u201cpolymorphism\u201d in the era of personal genomics","volume":"8","author":"Karki","year":"2015","journal-title":"BMC Med Genom."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1422","DOI":"10.1093\/bioinformatics\/btp163","article-title":"Biopython: Freely available Python tools for computational molecular biology and bioinformatics","volume":"25","author":"Cock","year":"2009","journal-title":"Bioinformatics"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Huang, T., Shu, Y., and Cai, Y. (2015). Genetic differences among ethnic groups. BMC Genom., 16.","DOI":"10.1186\/s12864-015-2328-0"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Choudhury, A., Hazelhurst, S., Meintjes, A., Achinike-Oduaran, O., Aron, S., Gamieldien, J., Jalali Sefid Dashti, M., Mulder, N., Tiffin, N., and Ramsay, M. (2014). Population-specific common SNPs reflect demographic histories and highlight regions of genomic plasticity with functional relevance. BMC Genom., 15.","DOI":"10.1186\/1471-2164-15-437"}],"container-title":["Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2079-3197\/11\/11\/212\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:15:29Z","timestamp":1760130929000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2079-3197\/11\/11\/212"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11,1]]},"references-count":26,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2023,11]]}},"alternative-id":["computation11110212"],"URL":"https:\/\/doi.org\/10.3390\/computation11110212","relation":{},"ISSN":["2079-3197"],"issn-type":[{"value":"2079-3197","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,11,1]]}}}