{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,7]],"date-time":"2026-05-07T00:30:39Z","timestamp":1778113839309,"version":"3.51.4"},"reference-count":27,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,10,24]],"date-time":"2021-10-24T00:00:00Z","timestamp":1635033600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,10,24]],"date-time":"2021-10-24T00:00:00Z","timestamp":1635033600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001711","name":"Schweizerischer Nationalfonds zur F\u00f6rderung der Wissenschaftlichen Forschung","doi-asserted-by":"publisher","award":["31003A_157064"],"award-info":[{"award-number":["31003A_157064"]}],"id":[{"id":"10.13039\/501100001711","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001711","name":"Schweizerischer Nationalfonds zur F\u00f6rderung der Wissenschaftlichen Forschung","doi-asserted-by":"publisher","award":["31003A_176316"],"award-info":[{"award-number":["31003A_176316"]}],"id":[{"id":"10.13039\/501100001711","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Current alignment tools typically lack an explicit model of indel evolution, leading to artificially short inferred alignments (i.e., over-alignment) due to inconsistencies between the indel history and the phylogeny relating the input sequences.<\/jats:p>\n              
<\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>We present a new progressive multiple sequence alignment tool ProPIP. The process of insertions and deletions is described using an explicit evolutionary model\u2014the Poisson Indel Process or PIP. The method is based on dynamic programming and is implemented in a frequentist framework. The source code can be compiled on Linux, macOS and Microsoft Windows platforms. The algorithm is implemented in C++ as standalone program. The source code is freely available on GitHub at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/acg-team\/ProPIP\">https:\/\/github.com\/acg-team\/ProPIP<\/jats:ext-link> and is distributed under the terms of the GNU GPL v3 license.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>The use of an explicit indel evolution model allows to avoid over-alignment, to infer gaps in a phylogenetically consistent way and to make inferences about the rates of insertions and deletions. Instead of the arbitrary gap penalties, the parameters used by ProPIP are the insertion and deletion rates, which have biological interpretation and are contextualized in a probabilistic environment. 
As a result, indel rate settings may be optimised in order to infer phylogenetically meaningful gap patterns.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12859-021-04442-8","type":"journal-article","created":{"date-parts":[[2021,10,24]],"date-time":"2021-10-24T16:02:40Z","timestamp":1635091360000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":16,"title":["ProPIP: a tool for progressive multiple sequence alignment with Poisson Indel Process"],"prefix":"10.1186","volume":"22","author":[{"given":"Massimo","family":"Maiolo","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lorenzo","family":"Gatti","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Diego","family":"Frei","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tiziano","family":"Leidi","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Manuel","family":"Gil","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Maria","family":"Anisimova","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,10,24]]},"reference":[{"issue":"3","key":"4442_CR1","doi-asserted-by":"publisher","first-page":"443","DOI":"10.1016\/0022-2836(70)90057-4","volume":"48","author":"SB Needleman","year":"1970","unstructured":"Needleman SB, Wunsch CD. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol. 1970;48(3):443\u201353. 
https:\/\/doi.org\/10.1016\/0022-2836(70)90057-4.","journal-title":"J Mol Biol"},{"issue":"1","key":"4442_CR2","doi-asserted-by":"publisher","first-page":"168","DOI":"10.1145\/321796.321811","volume":"21","author":"RA Wagner","year":"1974","unstructured":"Wagner RA, Fischer MJ. The string-to-string correction problem. J ACM. 1974;21(1):168\u201373. https:\/\/doi.org\/10.1145\/321796.321811.","journal-title":"J ACM"},{"issue":"5883","key":"4442_CR3","doi-asserted-by":"publisher","first-page":"1632","DOI":"10.1126\/science.1158395","volume":"320","author":"A Loytynoja","year":"2008","unstructured":"Loytynoja A, Goldman N. Phylogeny-aware gap placement prevents errors in sequence alignment and evolutionary analysis. Science. 2008;320(5883):1632\u20135. https:\/\/doi.org\/10.1126\/science.1158395.","journal-title":"Science"},{"key":"4442_CR4","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-13-129","author":"AM Szalkowski","year":"2012","unstructured":"Szalkowski AM. Fast and robust multiple sequence alignment with phylogeny-aware gap placement. BMC Bioinf. 2012. https:\/\/doi.org\/10.1186\/1471-2105-13-129.","journal-title":"BMC Bioinf"},{"issue":"2","key":"4442_CR5","doi-asserted-by":"publisher","first-page":"114","DOI":"10.1007\/BF02193625","volume":"33","author":"JL Thorne","year":"1991","unstructured":"Thorne JL, Kishino H, Felsenstein J. An evolutionary model for maximum likelihood alignment of DNA sequences. J Mol Evol. 1991;33(2):114\u201324.","journal-title":"J Mol Evol"},{"issue":"4","key":"4442_CR6","doi-asserted-by":"publisher","first-page":"1160","DOI":"10.1073\/pnas.1220450110","volume":"110","author":"A Bouchard-C\u00f4t\u00e9","year":"2013","unstructured":"Bouchard-C\u00f4t\u00e9 A, Jordan MI. Evolutionary inference via the Poisson Indel Process. Proc Natl Acad Sci USA. 
2013;110(4):1160.","journal-title":"Proc Natl Acad Sci USA"},{"key":"4442_CR7","doi-asserted-by":"publisher","DOI":"10.1186\/s12859-018-2357-1","author":"M Maiolo","year":"2018","unstructured":"Maiolo M, Zhang X, Gil M, Anisimova M. Progressive multiple sequence alignment with indel evolution. BMC Bioinf. 2018. https:\/\/doi.org\/10.1186\/s12859-018-2357-1.","journal-title":"BMC Bioinf"},{"key":"4442_CR8","doi-asserted-by":"crossref","unstructured":"Maiolo M, Ulzega S, Gil M, Anisimova M. Accelerating phylogeny-aware alignment with indel evolution using short time fourier transform. To appear in NAR Genomics and Bioinformatics (2020).","DOI":"10.1093\/nargab\/lqaa092"},{"issue":"Suppl 2","key":"4442_CR9","doi-asserted-by":"publisher","first-page":"153","DOI":"10.1093\/bioinformatics\/18.suppl_2.S153","volume":"18","author":"U Mueckstein","year":"2002","unstructured":"Mueckstein U, Hofacker IL, Stadler PF. Stochastic pairwise alignments. Bioinformatics. 2002;18(Suppl 2):153\u201360.","journal-title":"Bioinformatics"},{"issue":"1","key":"4442_CR10","doi-asserted-by":"publisher","first-page":"188","DOI":"10.1186\/1471-2105-7-188","volume":"7","author":"J Dutheil","year":"2006","unstructured":"Dutheil J, Gaillard S, Bazin E, Gl\u00e9min S, Ranwez V, Galtier N, Belkhir K. Bio++: a set of c++ libraries for sequence analysis, phylogenetics, molecular evolution and population genetics. BMC Bioinf. 2006;7(1):188.","journal-title":"BMC Bioinf"},{"issue":"2","key":"4442_CR11","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1073\/pnas.1417526112","volume":"112","author":"G Tan","year":"2015","unstructured":"Tan G, Gil M, L\u00f6ytynoja AP, Goldman N, Dessimoz C. Simple chained guide trees give poorer multiple sequence alignments than inferred trees in simulation and phylogenetic benchmarks. Proc Natl Acad Sci. 
2015;112(2):99\u2013100.","journal-title":"Proc Natl Acad Sci"},{"issue":"7","key":"4442_CR12","doi-asserted-by":"publisher","first-page":"685","DOI":"10.1093\/oxfordjournals.molbev.a025808","volume":"14","author":"O Gascuel","year":"1997","unstructured":"Gascuel O. BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Mol Biol Evol. 1997;14(7):685\u201395.","journal-title":"Mol Biol Evol"},{"key":"4442_CR13","doi-asserted-by":"publisher","first-page":"862","DOI":"10.1126\/science.185.4154.862","volume":"185","author":"R Grantham","year":"1974","unstructured":"Grantham R. Amino acid difference formula to help explain protein evolution. Science. 1974;185:862.","journal-title":"Science"},{"issue":"2","key":"4442_CR14","doi-asserted-by":"publisher","first-page":"431","DOI":"10.1137\/0111030","volume":"11","author":"DW Marquardt","year":"1963","unstructured":"Marquardt DW. An algorithm for least-squares estimation of nonlinear parameters. J Soc Ind Appl Math. 1963;11(2):431\u201341.","journal-title":"J Soc Ind Appl Math"},{"issue":"2","key":"4442_CR15","doi-asserted-by":"publisher","first-page":"164","DOI":"10.1090\/qam\/10666","volume":"2","author":"K Levenberg","year":"1944","unstructured":"Levenberg K. A method for the solution of certain non-linear problems in least squares. Q Appl Math. 1944;2(2):164\u20138.","journal-title":"Q Appl Math"},{"issue":"1","key":"4442_CR16","doi-asserted-by":"publisher","first-page":"278","DOI":"10.1186\/1471-2105-6-278","volume":"6","author":"MS Rosenberg","year":"2005","unstructured":"Rosenberg MS. Multiple sequence alignment accuracy and evolutionary distance estimation. BMC Bioinf. 2005;6(1):278. https:\/\/doi.org\/10.1186\/1471-2105-6-278.","journal-title":"BMC Bioinf"},{"key":"4442_CR17","doi-asserted-by":"crossref","unstructured":"Jukes TH, Cantor CR. Mammalian Protein Metabolism, vol. 3, pp. 21\u2013132. Academic Press, New York. 1969. Chap. 24. 
Evolution of Protein Molecules","DOI":"10.1016\/B978-1-4832-3211-9.50009-7"},{"issue":"14","key":"4442_CR18","doi-asserted-by":"publisher","first-page":"3059","DOI":"10.1093\/nar\/gkf436","volume":"30","author":"K Katoh","year":"2002","unstructured":"Katoh K, Misawa K, Kuma K-I, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002;30(14):3059.","journal-title":"Nucleic Acids Res"},{"key":"4442_CR19","unstructured":"Shafee T. AlignStat V1.3.1. https:\/\/www.rdocumentation.org\/packages\/AlignStat."},{"key":"4442_CR20","unstructured":"Edgar R. Qscore. https:\/\/www.drive5.com\/qscore."},{"issue":"2","key":"4442_CR21","doi-asserted-by":"publisher","first-page":"306","DOI":"10.1093\/bioinformatics\/18.2.306","volume":"18","author":"M Cline","year":"2002","unstructured":"Cline M, Hughey R, Karplus K. Predicting reliable regions in protein sequence alignments. Bioinformatics. 2002;18(2):306\u201314. https:\/\/doi.org\/10.1093\/bioinformatics\/18.2.306.","journal-title":"Bioinformatics"},{"key":"4442_CR22","doi-asserted-by":"publisher","first-page":"59","DOI":"10.1007\/978-1-62703-646-7_4","volume":"1079","author":"S Iantorno","year":"2014","unstructured":"Iantorno S, Gori K, Goldman N, Gil M, Dessimoz C. Who watches the watchmen? An appraisal of benchmarks for multiple sequence alignment. Methods Mol Biol. 2014;1079:59\u201373.","journal-title":"Methods Mol Biol"},{"issue":"14","key":"4442_CR23","doi-asserted-by":"publisher","first-page":"360","DOI":"10.1093\/bioinformatics\/btz368","volume":"35","author":"D Sumanaweera","year":"2019","unstructured":"Sumanaweera D, Allison L, Konagurthu AS. Statistical compression of protein sequences and inference of marginal probability landscapes over competing alignments using finite state models and dirichlet priors. Bioinformatics. 2019;35(14):360\u20139. 
https:\/\/doi.org\/10.1093\/bioinformatics\/btz368.","journal-title":"Bioinformatics"},{"key":"4442_CR24","unstructured":"Poulose E. A study of dynamics of indels using propip, prank and mafft. Master\u2019s thesis, Institute of Applied Simulation, ZHAW School of Life Sciences and Facility Management, W\u00e4denswil. Switzerland. (2020)."},{"issue":"29","key":"4442_CR25","doi-asserted-by":"publisher","first-page":"10556","DOI":"10.1073\/pnas.1405628111","volume":"111","author":"K Boyce","year":"2014","unstructured":"Boyce K, Sievers F, Higgins DG. Simple chained guide trees give high-quality protein multiple sequence alignments. Proc Natl Acad Sci. 2014;111(29):10556\u201361. https:\/\/doi.org\/10.1073\/pnas.1405628111.","journal-title":"Proc Natl Acad Sci"},{"issue":"2","key":"4442_CR26","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1073\/pnas.1417526112","volume":"112","author":"G Tan","year":"2015","unstructured":"Tan G, Gil M, L\u00f6ytynoja AP, Goldman N, Dessimoz C. Simple chained guide trees give poorer multiple sequence alignments than inferred trees in simulation and phylogenetic benchmarks. Proc Natl Acad Sci. 2015;112(2):99\u2013100. https:\/\/doi.org\/10.1073\/pnas.1417526112.","journal-title":"Proc Natl Acad Sci"},{"key":"4442_CR27","doi-asserted-by":"publisher","DOI":"10.1186\/s12859-016-1300-6","author":"T Shafee","year":"2016","unstructured":"Shafee T, Cooke I. AlignStat: a web-tool and r package for statistical comparison of alternative multiple sequence alignments. BMC Bioinf. 2016. 
https:\/\/doi.org\/10.1186\/s12859-016-1300-6.","journal-title":"BMC Bioinf"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-021-04442-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-021-04442-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-021-04442-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,10,24]],"date-time":"2021-10-24T16:03:40Z","timestamp":1635091420000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-021-04442-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,24]]},"references-count":27,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["4442"],"URL":"https:\/\/doi.org\/10.1186\/s12859-021-04442-8","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,10,24]]},"assertion":[{"value":"29 September 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 October 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 October 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to 
participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"518"}}