{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,2]],"date-time":"2025-11-02T02:25:26Z","timestamp":1762050326356,"version":"3.37.3"},"reference-count":24,"publisher":"Oxford University Press (OUP)","issue":"Supplement_1","license":[{"start":{"date-parts":[[2021,7,12]],"date-time":"2021-07-12T00:00:00Z","timestamp":1626048000000},"content-version":"vor","delay-in-days":11,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Institute of Health","award":["R01-GM076275"],"award-info":[{"award-number":["R01-GM076275"]}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["ABI-1458457"],"award-info":[{"award-number":["ABI-1458457"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,8,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Protein domain duplications are a major contributor to the functional diversification of protein families. These duplications can occur one at a time through single domain duplications, or as tandem duplications where several consecutive domains are duplicated together as part of a single evolutionary event. Existing methods for inferring domain-level evolutionary events are based on reconciling domain trees with gene trees. 
While some formulations consider multiple domain duplications, they do not explicitly model tandem duplications; this leads to inaccurate inference of which domains duplicated together over the course of evolution.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Here, we introduce a reconciliation-based framework that considers the relative positions of domains within extant sequences. We use this information to uncover tandem domain duplications within the evolutionary history of these genes. We devise an integer linear programming approach that solves our problem exactly, and a heuristic approach that works well in practice. We perform extensive simulation studies to demonstrate that our approaches can accurately uncover single and tandem domain duplications, and additionally test our approach on a well-studied orthogroup where lineage-specific domain expansions exhibit varying and complex domain duplication patterns.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Code is available on GitHub at https:\/\/github.com\/Singh-Lab\/TandemDuplications.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab329","type":"journal-article","created":{"date-parts":[[2021,5,4]],"date-time":"2021-05-04T03:11:47Z","timestamp":1620097907000},"page":"i133-i141","source":"Crossref","is-referenced-by-count":1,"title":["Improved inference of tandem domain duplications"],"prefix":"10.1093","volume":"37","author":[{"given":"Chaitanya","family":"Aluru","sequence":"first","affiliation":[{"name":"Department of Computer Science and 
Lewis-Sigler Institute for Integrative Genomics, Princeton University , Princeton, NJ 08540, USA"}]},{"given":"Mona","family":"Singh","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Lewis-Sigler Institute for Integrative Genomics, Princeton University , Princeton, NJ 08540, USA"}]}],"member":"286","published-online":{"date-parts":[[2021,7,12]]},"reference":[{"first-page":"1","year":"2020","author":"Aluru","key":"2023062410303384200_btab329-B1"},{"key":"2023062410303384200_btab329-B2","doi-asserted-by":"crossref","first-page":"i283","DOI":"10.1093\/bioinformatics\/bts225","article-title":"Efficient algorithms for the reconciliation problem with gene duplication, horizontal transfer and loss","volume":"28","author":"Bansal","year":"2012","journal-title":"Bioinformatics"},{"key":"2023062410303384200_btab329-B3","doi-asserted-by":"crossref","first-page":"i132","DOI":"10.1093\/bioinformatics\/btn150","article-title":"The multiple gene duplication problem revisited","volume":"24","author":"Bansal","year":"2008","journal-title":"Bioinformatics"},{"key":"2023062410303384200_btab329-B4","doi-asserted-by":"crossref","first-page":"e114","DOI":"10.1371\/journal.pcbi.0020114","article-title":"Expansion of protein domain repeats","volume":"2","author":"Bj\u00f6rklund","year":"2006","journal-title":"PLoS Comput. Biol"},{"key":"2023062410303384200_btab329-B5","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1016\/j.jmb.2010.07.011","article-title":"Nebulin: a study of protein repeat evolution","volume":"402","author":"Bj\u00f6rklund","year":"2010","journal-title":"J. Mol. 
Biol"},{"key":"2023062410303384200_btab329-B6","doi-asserted-by":"crossref","first-page":"1701","DOI":"10.1126\/science.1085371","article-title":"Evolution of the protein repertoire","volume":"300","author":"Chothia","year":"2003","journal-title":"Science"},{"key":"2023062410303384200_btab329-B7","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1186\/s13015-019-0139-6","article-title":"Reconciling multiple genes trees via segmental duplications and losses","volume":"14","author":"Dondi","year":"2019","journal-title":"Algorithms Mol. Biol"},{"key":"2023062410303384200_btab329-B8","doi-asserted-by":"crossref","first-page":"e1002195","DOI":"10.1371\/journal.pcbi.1002195","article-title":"Accelerated profile hmm searches","volume":"7","author":"Eddy","year":"2011","journal-title":"PLoS Comput. Biol"},{"key":"2023062410303384200_btab329-B9","doi-asserted-by":"crossref","first-page":"132","DOI":"10.1093\/sysbio\/28.2.132","article-title":"Fitting the gene lineage into its species lineage, a parsimony strategy illustrated by cladograms constructed from globin sequences","volume":"28","author":"Goodman","year":"1979","journal-title":"Syst. Biol"},{"key":"2023062410303384200_btab329-B10","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1006\/mpev.1996.0071","article-title":"Reconstruction of ancient molecular phylogeny","volume":"6","author":"Guigo","year":"1996","journal-title":"Mol. Phylogenet. 
Evol"},{"first-page":"347","year":"2004","author":"Hallett","key":"2023062410303384200_btab329-B11"},{"key":"2023062410303384200_btab329-B12","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1038\/345273a0","article-title":"A regular pattern of two types of 100-residue motif in the sequence of titin","volume":"345","author":"Labeit","year":"1990","journal-title":"Nature"},{"key":"2023062410303384200_btab329-B13","doi-asserted-by":"crossref","first-page":"822","DOI":"10.1096\/fj.10-157412","article-title":"Nebulin, a major player in muscle health and disease","volume":"25","author":"Labeit","year":"2011","journal-title":"FASEB J"},{"key":"2023062410303384200_btab329-B14","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1109\/TCBB.2018.2846253","article-title":"An integrated reconciliation framework for domain, gene, and species level evolution","volume":"16","author":"Li","year":"2019","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinformatics"},{"key":"2023062410303384200_btab329-B15","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1016\/j.jsb.2012.02.010","article-title":"The evolution of filamin\u2014a protein domain repeat perspective","volume":"179","author":"Light","year":"2012","journal-title":"J. Struct. Biol"},{"year":"2021","key":"2023062410303384200_btab329-B16"},{"year":"2018","author":"Muhammad","key":"2023062410303384200_btab329-B17"},{"key":"2023062410303384200_btab329-B18","doi-asserted-by":"crossref","first-page":"12235","DOI":"10.1073\/pnas.1635157100","article-title":"Evolution of olfactory receptor genes in the human genome","volume":"100","author":"Niimura","year":"2003","journal-title":"Proc. Natl. Acad. Sci. 
USA"},{"key":"2023062410303384200_btab329-B19","first-page":"1","article-title":"Evolution of orthologous tandemly arrayed gene clusters","volume":"12","author":"Savard","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023062410303384200_btab329-B20","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1002\/pro.3290","article-title":"Clustal omega for making accurate alignments of many protein sequences","volume":"27","author":"Sievers","year":"2018","journal-title":"Protein Sci"},{"key":"2023062410303384200_btab329-B21","doi-asserted-by":"crossref","first-page":"1312","DOI":"10.1093\/bioinformatics\/btu033","article-title":"RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies","volume":"30","author":"Stamatakis","year":"2014","journal-title":"Bioinformatics"},{"key":"2023062410303384200_btab329-B22","doi-asserted-by":"crossref","first-page":"S8","DOI":"10.1186\/1471-2105-16-S14-S8","article-title":"Event inference in multidomain families with phylogenetic reconciliation","volume":"16","author":"Stolzer","year":"2015","journal-title":"BMC Bioinformatics"},{"key":"2023062410303384200_btab329-B23","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1093\/sysbio\/sys076","article-title":"TreeFix: statistically informed gene tree error correction using species trees","volume":"62","author":"Wu","year":"2013","journal-title":"Syst. Biol"},{"key":"2023062410303384200_btab329-B24","doi-asserted-by":"crossref","first-page":"689","DOI":"10.1093\/molbev\/msr222","article-title":"Evolution at the subgene level: domain rearrangements in the drosophila phylogeny","volume":"29","author":"Yi-Chieh","year":"2012","journal-title":"Mol. Biol. 
Evol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/Supplement_1\/i133\/50694330\/btab329.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/Supplement_1\/i133\/50694330\/btab329.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,25]],"date-time":"2023-06-25T00:22:50Z","timestamp":1687652570000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/Supplement_1\/i133\/6319657"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,1]]},"references-count":24,"journal-issue":{"issue":"Supplement_1","published-print":{"date-parts":[[2021,8,4]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab329","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2021,7,1]]},"published":{"date-parts":[[2021,7,1]]}}}