{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T23:38:55Z","timestamp":1773272335499,"version":"3.50.1"},"reference-count":18,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>General protein evolution models help determine the baseline expectations for the evolution of sequences, and they have been extensively useful in sequence analysis and for the computer simulation of artificial sequence data sets.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>We have developed a new method of simulating protein sequence evolution, including insertion and deletion (indel) events in addition to amino-acid substitutions. The simulation generates both the simulated sequence family and a true sequence alignment that captures the evolutionary relationships between amino acids from different sequences. Our statistical model for indel evolution is based on the empirical indel distribution determined by Qian and Goldstein. We have parameterized this distribution so that it applies to sequences diverged by varying evolutionary times and generalized it to provide flexibility in simulation conditions. 
Our method uses a Monte-Carlo simulation strategy, and has been implemented in a C++ program named Simprot.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusion<\/jats:title>\n                <jats:p>Simprot will be useful for testing methods of analysis of protein sequence families particularly alignment methods, phylogenetic tree building, detection of recombination and horizontal gene transfer, and homology detection, where knowing the true course of sequence evolution is essential.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/1471-2105-6-236","type":"journal-article","created":{"date-parts":[[2005,9,28]],"date-time":"2005-09-28T18:14:23Z","timestamp":1127931263000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":34,"title":["SIMPROT: Using an empirically determined indel distribution in simulations of protein evolution"],"prefix":"10.1186","volume":"6","author":[{"given":"Andy","family":"Pang","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andrew D","family":"Smith","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Paulo AS","family":"Nuin","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Elisabeth RM","family":"Tillier","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2005,9,27]]},"reference":[{"key":"561_CR1","doi-asserted-by":"publisher","first-page":"102","DOI":"10.1002\/prot.1129","volume":"45","author":"B Qian","year":"2001","unstructured":"Qian B, Goldstein RA: Distribution of Indel lengths. Proteins 2001, 45: 102\u20134. 
10.1002\/prot.1129","journal-title":"Proteins"},{"key":"561_CR2","doi-asserted-by":"publisher","first-page":"114","DOI":"10.1007\/BF02193625","volume":"33","author":"JL Thorne","year":"1991","unstructured":"Thorne JL, Kishino H, Felsenstein J: An evolutionary model for maximum likelihood alignment of DNA sequences. J Mol Evol 1991, 33: 114\u2013124. 10.1007\/BF02193625","journal-title":"J Mol Evol"},{"key":"561_CR3","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1007\/BF00163848","volume":"34","author":"JL Thorne","year":"1992","unstructured":"Thorne JL, Kishino H, Felsenstein J: Inching toward reality: an improved likelihood model of sequence evolution. J Mol Evol 1992, 34: 3\u201316. 10.1007\/BF00163848","journal-title":"J Mol Evol"},{"key":"561_CR4","doi-asserted-by":"publisher","first-page":"490","DOI":"10.1093\/bioinformatics\/btg026","volume":"19","author":"D Metzler","year":"2003","unstructured":"Metzler D: Statistical alignment based on fragment insertion and deletion models. Bioinformatics 2003, 19: 490\u2013499. 10.1093\/bioinformatics\/btg026","journal-title":"Bioinformatics"},{"key":"561_CR5","doi-asserted-by":"publisher","first-page":"529","DOI":"10.1093\/molbev\/msh043","volume":"21","author":"I Miklos","year":"2004","unstructured":"Miklos I, Lunter GA, Holmes I: A Long Indel model for evolutionary sequence alignment. Mol Biol Evol 2004, 21: 529\u201340. 10.1093\/molbev\/msh043","journal-title":"Mol Biol Evol"},{"key":"561_CR6","doi-asserted-by":"publisher","first-page":"1065","DOI":"10.1006\/jmbi.1993.1105","volume":"229","author":"SA Benner","year":"1993","unstructured":"Benner SA, Cohen MA, Gonnet GH: Empirical and structural models for insertions and deletions in the divergent evolution of proteins. J Mol Biol 1993, 229: 1065\u201382. 
10.1006\/jmbi.1993.1105","journal-title":"J Mol Biol"},{"key":"561_CR7","doi-asserted-by":"publisher","first-page":"617","DOI":"10.1016\/j.jmb.2004.05.045","volume":"341","author":"MS Chang","year":"2004","unstructured":"Chang MS, Benner SA: Empirical analysis of protein insertions and deletions determining parameters for the correct placement of gaps in protein sequence alignments. J Mol Biol 2004, 341: 617\u201331. 10.1016\/j.jmb.2004.05.045","journal-title":"J Mol Biol"},{"key":"561_CR8","first-page":"235","volume":"13","author":"A Rambaut","year":"1997","unstructured":"Rambaut A, Grassly NC: Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. Comput Appl Biosci 1997, 13: 235\u20138.","journal-title":"Comput Appl Biosci"},{"issue":"5","key":"561_CR9","first-page":"559","volume":"13","author":"NC Grassly","year":"1997","unstructured":"Grassly NC, Adachi J, Rambaut A: PSeq-Gen: an application for the Monte Carlo simulation of protein sequence evolution along phylogenetic trees. Comput Appl Biosci 1997, 13(5):559\u201360.","journal-title":"Comput Appl Biosci"},{"key":"561_CR10","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1093\/bioinformatics\/14.2.157","volume":"14","author":"J Stoye","year":"1998","unstructured":"Stoye J, Evers D, Meyer F: Rose: generating sequence families. Bioinformatics 1998, 14: 157\u2013163. 10.1093\/bioinformatics\/14.2.157","journal-title":"Bioinformatics"},{"key":"561_CR11","first-page":"345","volume-title":"Atlas of Protein Sequence and Structure","author":"MO Dayhoff","year":"1978","unstructured":"Dayhoff MO, Schwartz RM, Orcutt BC: A model of evolutionary change in proteins. In Atlas of Protein Sequence and Structure. Volume 5. Edited by: Dayhoff MO. 
National Biomedical Research Foundation; 1978:345\u2013352."},{"key":"561_CR12","first-page":"275","volume":"8","author":"DT Jones","year":"1992","unstructured":"Jones DT, Taylor WR, Thornton JM: The rapid generation of mutation data matrices from protein sequences. Computer Applications in the Biosciences 1992, 8: 275\u2013282.","journal-title":"Computer Applications in the Biosciences"},{"key":"561_CR13","first-page":"1396","volume":"10","author":"Z Yang","year":"1993","unstructured":"Yang Z: Maximum likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites. Mol Biol Evol 1993, 10: 1396\u20131401.","journal-title":"Mol Biol Evol"},{"key":"561_CR14","doi-asserted-by":"publisher","first-page":"177","DOI":"10.1002\/humu.10312","volume":"23","author":"AS Kondrashov","year":"2004","unstructured":"Kondrashov AS, Rogozin IB: Context of deletions and insertions in human coding sequences. Hum Mutat 2004, 23: 177\u201385. 10.1002\/humu.10312","journal-title":"Hum Mutat"},{"key":"561_CR15","doi-asserted-by":"publisher","first-page":"1610","DOI":"10.1101\/gr.2450504","volume":"14","author":"A Ogurtsov","year":"2004","unstructured":"Ogurtsov A, Aleksey Y, Sunyaev S, Kondrashov AS: Indel-based evolutionary distance and mouse-human divergence. Genome Res 2004, 14: 1610\u20136. 10.1101\/gr.2450504","journal-title":"Genome Res"},{"key":"561_CR16","first-page":"679","volume-title":"High mutation rate and predominance of insertions in the Caenorhabditis elegans nuclear genome","author":"D Denver","year":"2004","unstructured":"Denver D, Morris K, Lynch M, Thomas WK: High mutation rate and predominance of insertions in the Caenorhabditis elegans nuclear genome. 2004, 430: 679\u201382."},{"key":"561_CR17","volume-title":"PHYLIP (phylogeny inference package) version 3.6.3","author":"J Felsenstein","year":"2002","unstructured":"Felsenstein J: PHYLIP (phylogeny inference package) version 3.6.3.2002. 
[Http:\/\/evolution.genetics.washington.edu\/phylip.html]"},{"key":"561_CR18","doi-asserted-by":"publisher","first-page":"602","DOI":"10.1016\/S0959-437X(00)00142-8","volume":"10","author":"JL Thorne","year":"2000","unstructured":"Thorne JL: Models of protein sequence evolution and their applications. Curr Opin Genet Dev 2000, 10: 602\u2013605. 10.1016\/S0959-437X(00)00142-8","journal-title":"Curr Opin Genet Dev"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-6-236.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,1]],"date-time":"2024-02-01T17:55:36Z","timestamp":1706810136000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-6-236"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,9,27]]},"references-count":18,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2005,12]]}},"alternative-id":["561"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-6-236","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2005,9,27]]},"assertion":[{"value":"29 April 2005","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 September 2005","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 September 2005","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"236"}}