{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,29]],"date-time":"2025-12-29T18:58:04Z","timestamp":1767034684421,"version":"3.37.3"},"reference-count":76,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,2,6]],"date-time":"2020-02-06T00:00:00Z","timestamp":1580947200000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2020,2,6]],"date-time":"2020-02-06T00:00:00Z","timestamp":1580947200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"the Doctoral Research Grant of Southwest University of Science and Technology","award":["16zx7112"],"award-info":[{"award-number":["16zx7112"]}]},{"name":"Thousand Talents Program\u201d of Sichuan Province, P.R. China","award":["17QR003"],"award-info":[{"award-number":["17QR003"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>The evolutionary history of genes serves as a cornerstone of contemporary biology. Most conserved sequences in mammalian genomes don\u2019t code for proteins, yielding a need to infer evolutionary history of sequences irrespective of what kind of functional element they may encode. Thus, sequence-, as opposed to gene-, centric modes of inferring paths of sequence evolution are increasingly relevant. Customarily, homologous sequences derived from the same direct ancestor, whose ancestral position in two genomes is usually conserved, are termed \u201cprimary\u201d (or \u201cpositional\u201d) orthologs. Methods based solely on similarity don\u2019t reliably distinguish primary orthologs from other homologs; for this, genomic context is often essential. Context-dependent identification of orthologs traditionally relies on genomic context over length scales characteristic of conserved gene order or whole-genome sequence alignment, and can be computationally intensive.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We demonstrate that short-range sequence context\u2014as short as a single \u201cmaximal\u201d match\u2014 distinguishes primary orthologs from other homologs across whole genomes. On mammalian whole genomes not preprocessed by repeat-masker, potential orthologs are extracted by genome intersection as \u201cnon-nested maximal matches:\u201d maximal matches that are not nested into other maximal matches. It emerges that on both nucleotide and gene scales, non-nested maximal matches recapitulate primary or positional orthologs with high precision and high recall, while the corresponding computation consumes less than one thirtieth of the computation time required by commonly applied whole-genome alignment methods. In regions of genomes that would be masked by repeat-masker, non-nested maximal matches recover orthologs that are inaccessible to Lastz net alignment, for which repeat-masking is a prerequisite. mmRBHs, reciprocal best hits of genes containing non-nested maximal matches, yield novel putative orthologs, e.g. around 1000 pairs of genes for human-chimpanzee.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>We describe an intersection-based method that requires neither repeat-masking nor alignment to infer evolutionary history of sequences based on short-range genomic sequence context. Ortholog identification based on non-nested maximal matches is parameter-free, and less computationally intensive than many alignment-based methods. It is especially suitable for genome-wide identification of orthologs, and may be applicable to unassembled genomes. We are agnostic as to the reasons for its effectiveness, which may reflect local variation of mean mutation rate.<\/jats:p><\/jats:sec>","DOI":"10.1186\/s12859-020-3384-2","type":"journal-article","created":{"date-parts":[[2020,2,6]],"date-time":"2020-02-06T16:07:07Z","timestamp":1581005227000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Primary orthologs from local sequence context"],"prefix":"10.1186","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2284-2375","authenticated-orcid":false,"given":"Kun","family":"Gao","sequence":"first","affiliation":[]},{"given":"Jonathan","family":"Miller","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,2,6]]},"reference":[{"key":"3384_CR1","unstructured":"Brown TA. Molecular phylogenetics. In: Genomes. Wiley-Liss, Oxford; 2002. 2nd ed., Chapter 16."},{"issue":"2","key":"3384_CR2","doi-asserted-by":"publisher","first-page":"99","DOI":"10.2307\/2412448","volume":"19","author":"W Fitch","year":"1970","unstructured":"Fitch W. Distinguishing homologous from analogous proteins. Syst Zool. 1970;19(2):99\u2013113.","journal-title":"Syst Zool"},{"issue":"5","key":"3384_CR3","doi-asserted-by":"publisher","first-page":"227","DOI":"10.1016\/S0168-9525(00)02005-9","volume":"16","author":"W Fitch","year":"2000","unstructured":"Fitch W. Homology: a personal view on some of the problems. Trends Genet. 2000;16(5):227\u201331.","journal-title":"Trends Genet"},{"key":"3384_CR4","doi-asserted-by":"publisher","first-page":"2275","DOI":"10.1093\/molbev\/msi225","volume":"22","author":"JE Blair","year":"2005","unstructured":"Blair JE, Hedges SB. Molecular phylogeny and divergence times of deuterostome animals. Mol Biol Evol. 2005;22:2275\u201384.","journal-title":"Mol Biol Evol"},{"key":"3384_CR5","doi-asserted-by":"publisher","first-page":"1283","DOI":"10.1126\/science.1123061","volume":"311","author":"FD Ciccarelli","year":"2006","unstructured":"Ciccarelli FD, Doerks T, Mering C, Creevey CJ, Snel B, Bork P. Toward automatic reconstruction of a highly resolved tree of life. Science. 2006;311:1283\u20137.","journal-title":"Science."},{"key":"3384_CR6","volume-title":"Evolutionary","author":"AM Altenhoff","year":"2012","unstructured":"Altenhoff AM, Dessimoz C. Inferring orthology and paralogy. In: Anisimova M, editor. Evolutionary. Genomics: Statistical and Computational Methods. Springer Science+Business Media; 2012. Chapter 9."},{"issue":"3","key":"3384_CR7","doi-asserted-by":"publisher","first-page":"e1000703","DOI":"10.1371\/journal.pcbi.1000703","volume":"6","author":"G Fang","year":"2010","unstructured":"Fang G, Bhardwaj N, Robilotto R, Gerstein MB. Getting started in gene Orthology and functional analysis. PLoS Comput Biol. 2010;6(3):e1000703.","journal-title":"PLoS Comput Biol"},{"key":"3384_CR8","unstructured":"Ensembl documentation page. http:\/\/www.ensembl.org\/info\/genome\/compara\/homology_types.html. Accessed 19 Aug 2019."},{"issue":"13","key":"3384_CR9","doi-asserted-by":"publisher","first-page":"366","DOI":"10.1093\/bioinformatics\/bty242","volume":"34","author":"M Lafond","year":"2018","unstructured":"Lafond M, Miardan MM, Sankoff D. Accurate prediction of orthologs in the presence of divergence after duplication. Bioinformatics. 2018;34(13):366\u201375.","journal-title":"Bioinformatics."},{"key":"3384_CR10","doi-asserted-by":"publisher","first-page":"1041","DOI":"10.1006\/jmbi.2000.5197","volume":"314","author":"M Remm","year":"2001","unstructured":"Remm M, Storm CEV, Sonnhammer ELL. Automatic clustering of Orthologs and in-paralogs from pairwise species comparisons. J Mol Biol. 2001;314:1041\u201352.","journal-title":"J Mol Biol"},{"key":"3384_CR11","unstructured":"Jensen RA. Orthologs and paralogs \u2013 we need to get it right. Genome Biol. 2001; 2(8): interactions 1002.1\u20131002.3."},{"key":"3384_CR12","doi-asserted-by":"publisher","first-page":"909","DOI":"10.1093\/bioinformatics\/15.11.909","volume":"15","author":"D Sankoff","year":"1999","unstructured":"Sankoff D. Genome rearrangement with gene families. Bioinformatics. 1999;15:909\u201317.","journal-title":"Bioinformatics."},{"key":"3384_CR13","doi-asserted-by":"publisher","first-page":"1160","DOI":"10.1089\/cmb.2007.0048","volume":"14","author":"Z Fu","year":"2007","unstructured":"Fu Z, Chen X, Vacic V, Nan P, Zhong Y, Jiang T. MSOAR: a high-throughput Ortholog assignment system based on genome rearrangement. J Comput Biol. 2007;14:1160\u201375.","journal-title":"J Comput Biol"},{"key":"3384_CR14","doi-asserted-by":"publisher","first-page":"14","DOI":"10.1186\/1471-2105-3-14","volume":"3","author":"CM Zmasek","year":"2002","unstructured":"Zmasek CM, Eddy SR. RIO: analyzing proteomes by automated phylogenomics using resampled inference of orthologs. BMC Bioinformatics. 2002;3:14.","journal-title":"BMC Bioinformatics"},{"key":"3384_CR15","doi-asserted-by":"publisher","first-page":"428","DOI":"10.1101\/gr.4526006","volume":"16","author":"S Bandyopadhyay","year":"2006","unstructured":"Bandyopadhyay S, Sharan R, Ideker T. Systematic identification of functional orthologs based on protein network comparison. Genome Res. 2006;16:428\u201335.","journal-title":"Genome Res"},{"issue":"Suppl 19","key":"3384_CR16","doi-asserted-by":"publisher","first-page":"S15","DOI":"10.1186\/1471-2105-13-S19-S15","volume":"13","author":"KM Swenson","year":"2012","unstructured":"Swenson KM, EI-Mabrouk N. Gene trees and species trees: irreconcilable differences. BMC Bioinformatics. 2012;13(Suppl 19):S15.","journal-title":"BMC Bioinformatics"},{"issue":"3","key":"3384_CR17","doi-asserted-by":"publisher","first-page":"404","DOI":"10.1093\/oxfordjournals.molbev.a003816","volume":"18","author":"LB Koski","year":"2001","unstructured":"Koski LB, Morton RA, Golding GB. Codon Bias and base composition are poor indicators of horizontally transferred genes. Mol Biol Evol. 2001;18(3):404\u201312.","journal-title":"Mol Biol Evol"},{"issue":"8","key":"3384_CR18","doi-asserted-by":"publisher","first-page":"e75","DOI":"10.1371\/journal.pcbi.0020075","volume":"2","author":"F Swidan","year":"2006","unstructured":"Swidan F, Rocha EPC, Shmoish M, Pinter RY. An integrative method for accurate comparative genome mapping. PLoS Comput Biol. 2006;2(8):e75.","journal-title":"PLoS Comput Biol"},{"issue":"5","key":"3384_CR19","doi-asserted-by":"publisher","first-page":"401","DOI":"10.1093\/bib\/bbr040","volume":"12","author":"CN Dewey","year":"2011","unstructured":"Dewey CN. Positional orthology: putting genomic evolutionary relationships into context. Brief Bioinform. 2011;12(5):401\u201312.","journal-title":"Brief Bioinform"},{"key":"3384_CR20","first-page":"114","volume":"14","author":"MV Han","year":"2009","unstructured":"Han MV, Hahn MW. Identifying parent-daughter relationships among duplicated genes. Pac Symp Biocomput. 2009;14:114\u201325.","journal-title":"Pac Symp Biocomput"},{"key":"3384_CR21","doi-asserted-by":"publisher","first-page":"6164","DOI":"10.1093\/nar\/gki913","volume":"33","author":"RA Notebaart","year":"2005","unstructured":"Notebaart RA, Huynen MA, Teusink B, Siezen RJ, Snel B. Correlation between sequence conservation and the genomic context after gene duplication. Nucleic Acids Res. 2005;33:6164\u201371.","journal-title":"Nucleic Acids Res"},{"key":"3384_CR22","first-page":"77","volume":"2","author":"IJ Burgetz","year":"2006","unstructured":"Burgetz IJ, Shariff S, Pang A, Tillier ERM. Positional homology in bacterial genomes. Evol Bioinformatics Online. 2006;2:77\u201390.","journal-title":"Evol Bioinformatics Online"},{"key":"3384_CR23","doi-asserted-by":"publisher","first-page":"679","DOI":"10.1093\/molbev\/msl199","volume":"24","author":"BP Cusack","year":"2007","unstructured":"Cusack BP, Wolfe KH. Not born equal: increased rate asymmetry in relocated and retrotransposed rodent gene duplicates. Mol Biol Evol. 2007;24:679\u201386.","journal-title":"Mol Biol Evol"},{"key":"3384_CR24","doi-asserted-by":"publisher","first-page":"237","DOI":"10.1186\/1471-2148-7-237","volume":"7","author":"F Lemoine","year":"2007","unstructured":"Lemoine F, Lespinet O, Labedan B. Assessing the evolutionary rate of positional orthologous genes in prokaryotes using synteny data. BMC Evol Biol. 2007;7:237.","journal-title":"BMC Evol Biol"},{"key":"3384_CR25","doi-asserted-by":"publisher","first-page":"1253","DOI":"10.1089\/cmb.2009.0074","volume":"16","author":"J Jun","year":"2009","unstructured":"Jun J, Ryvkin P, Hemphill E, Nelson C. a Duplication mechanism and disruptions in flanking regions determine the fate of Mammalian gene duplicates. J Comput Biol. 2009;16:1253\u201366.","journal-title":"J Comput Biol"},{"issue":"1","key":"3384_CR26","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1186\/1297-9686-42-24","volume":"42","author":"Z Wang","year":"2010","unstructured":"Wang Z, Dong X, Ding GH, Li YX. Comparing the retention mechanisms of tandem duplicates and retrogenes in human and mouse genomes. Genet Sel Evol. 2010;42(1):24.","journal-title":"Genet Sel Evol"},{"key":"3384_CR27","doi-asserted-by":"publisher","first-page":"859","DOI":"10.1101\/gr.085951.108","volume":"19","author":"MV Han","year":"2009","unstructured":"Han MV, Demuth JP, McGrath CL, Casola C, Hahn MW. Adaptive evolution of young gene duplicates in mammals. Genome Res. 2009;19:859\u201367.","journal-title":"Genome Res"},{"key":"3384_CR28","doi-asserted-by":"publisher","first-page":"309","DOI":"10.1146\/annurev.genet.39.073003.114725","volume":"39","author":"EV Koonin","year":"2005","unstructured":"Koonin EV. Orthologs, Paralogs, and evolutionary genomics. The Annual Review of Genetics. 2005;39:309\u201338.","journal-title":"The Annual Review of Genetics"},{"issue":"1","key":"3384_CR29","doi-asserted-by":"publisher","first-page":"1350018","DOI":"10.1142\/S0219720013500182","volume":"12","author":"E Taillefer","year":"2014","unstructured":"Taillefer E, Miller J. Exhaustive computation of exact duplications via super and non-nested local maximal repeats. J Bioinforma Comput Biol. 2014;12(1):1350018.","journal-title":"J Bioinforma Comput Biol"},{"issue":"7","key":"3384_CR30","doi-asserted-by":"publisher","first-page":"e18464","DOI":"10.1371\/journal.pone.0018464","volume":"6","author":"K Gao","year":"2011","unstructured":"Gao K, Miller J. Algebraic distribution of segmental duplication lengths in whole-genome sequence self-alignments. PLoS One. 2011;6(7):e18464.","journal-title":"PLoS One"},{"key":"3384_CR31","doi-asserted-by":"crossref","unstructured":"Taillefer E and Miller J. Algebraic length-distribution of sequence duplications in whole genomes. In Proc of international conf on natural comput. Shanghai, China, Jul 2011; v3: 1454\u20131460.","DOI":"10.1109\/ICNC.2011.6022506"},{"issue":"11","key":"3384_CR32","doi-asserted-by":"publisher","first-page":"2369","DOI":"10.1093\/nar\/27.11.2369","volume":"27","author":"AL Delcher","year":"1999","unstructured":"Delcher AL, Kasif S, Fleischmann RD, Peterson J, White O, Salzberg SL. Alignment of whole genomes. Nucleic Acids Res. 1999;27(11):2369\u201376.","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"3384_CR33","doi-asserted-by":"publisher","first-page":"2478","DOI":"10.1093\/nar\/30.11.2478","volume":"30","author":"AL Delcher","year":"2002","unstructured":"Delcher AL, Phillippy A, Carlton J, Salzberg SL. Fast algorithms for large-scale genome alignment and comparison. Nucleic Acids Res. 2002;30(1):2478\u201383.","journal-title":"Nucleic Acids Res"},{"key":"3384_CR34","doi-asserted-by":"publisher","first-page":"R12","DOI":"10.1186\/gb-2004-5-2-r12","volume":"5","author":"S Kurtz","year":"2004","unstructured":"Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, Antonescu C, Salzberg SL. Versatile and open software for comparing large genomes. Genome Biol. 2004;5:R12.","journal-title":"Genome Biol"},{"key":"3384_CR35","unstructured":"Mummer3 homepage. http:\/\/mummer.sourceforge.net\/. Accessed 19 Aug 2019."},{"key":"3384_CR36","doi-asserted-by":"publisher","first-page":"59","DOI":"10.1016\/j.compbiolchem.2014.08.010","volume":"53A","author":"K Gao","year":"2014","unstructured":"Gao K, Miller J. Human\u2013chimpanzee alignment: Ortholog exponentials and paralog power laws. Comput Biol Chem. 2014;53A:59\u201370.","journal-title":"Comput Biol Chem"},{"key":"3384_CR37","unstructured":"Taillefer E and Miller J. Exhaustive computation of exact sequence duplications in whole genomes via super and local maximal repeats. International Conf on Environ and Bio Sci (IPCBEE) IACSIT Press, Singapore. 2011; v21: 22\u201329."},{"key":"3384_CR38","unstructured":"Smit AFA, Hubley R and Green P. RepeatMasker at http:\/\/repeatmasker.org. Accessed 19 Aug 2019."},{"issue":"5338","key":"3384_CR39","doi-asserted-by":"publisher","first-page":"631","DOI":"10.1126\/science.278.5338.631","volume":"278","author":"RL Tatusov","year":"1997","unstructured":"Tatusov RL, Koonin EV, Lipman DJ. A genomic perspective on protein families. Science. 1997;278(5338):631\u20137.","journal-title":"Science."},{"issue":"4","key":"3384_CR40","doi-asserted-by":"publisher","first-page":"707","DOI":"10.1006\/jmbi.1998.2144","volume":"283","author":"P Bork","year":"1998","unstructured":"Bork P, Dandekar T, Diaz-Lazcoz Y, Eisenhaber F, Huynen M, Yuan Y. Predicting function: from genes to genomes and back. J Mol Biol. 1998;283(4):707\u201325.","journal-title":"J Mol Biol"},{"key":"3384_CR41","doi-asserted-by":"publisher","first-page":"2896","DOI":"10.1073\/pnas.96.6.2896","volume":"96","author":"R Overbeek","year":"1999","unstructured":"Overbeek R, Fonstein M, Souza MD, Pusch GD, Maltsev N. The use of gene clusters to infer functional coupling. Proc Natl Acad Sci U S A. 1999;96:2896\u2013901.","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"12","key":"3384_CR42","doi-asserted-by":"publisher","first-page":"1286","DOI":"10.1093\/gbe\/evs100","volume":"4","author":"YI Wolf","year":"2012","unstructured":"Wolf YI, Koonin EV. A tight link between orthologs and bidirectional best hits in bacterial and archaeal genomes. Genome Biol Evol. 2012;4(12):1286\u201394.","journal-title":"Genome Biol Evol"},{"issue":"3","key":"3384_CR43","doi-asserted-by":"publisher","first-page":"319","DOI":"10.1093\/bioinformatics\/btm585","volume":"24","author":"G Moreno-Hagelsieb","year":"2008","unstructured":"Moreno-Hagelsieb G, Latimer K. Choosing BLAST options for better detection of orthologs as reciprocal best hits. Bioinformatics. 2008;24(3):319\u201324.","journal-title":"Bioinformatics."},{"issue":"7","key":"3384_CR44","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0101850","volume":"9","author":"N Ward","year":"2014","unstructured":"Ward N, Moreno-Hagelsieb G. Quickly finding Orthologs as reciprocal best hits with BLAT, LAST, and UBLAST: how much do we miss? PLoS One. 2014;9(7):e101850.","journal-title":"PLoS One"},{"issue":"4","key":"3384_CR45","doi-asserted-by":"publisher","first-page":"e9844","DOI":"10.1371\/journal.pone.0009844","volume":"5","author":"HD Chen","year":"2010","unstructured":"Chen HD, Fan WL, Kong SG, Lee HC. Universal global imprints of genome growth and evolution: equivalent length and cumulative mutation density. PLoS One. 2010;5(4):e9844.","journal-title":"PLoS One"},{"key":"3384_CR46","doi-asserted-by":"publisher","first-page":"148101","DOI":"10.1103\/PhysRevLett.110.148101","volume":"110","author":"F Massip","year":"2013","unstructured":"Massip F, Arndt PF. Neutral evolution of duplicated DNA: an evolutionary stick-breaking process causes scale-invariant behavior. Phys Rev Lett. 2013;110:148101.","journal-title":"Phys Rev Lett"},{"key":"3384_CR47","unstructured":"Koroteev MV and Miller J. Fragmentation dynamics of DNA sequence duplications. arXiv: 1304.1409v3 [math-ph]."},{"key":"3384_CR48","doi-asserted-by":"publisher","first-page":"1151","DOI":"10.1126\/science.290.5494.1151","volume":"290","author":"M Lynch","year":"2000","unstructured":"Lynch M, Conery JS. The evolutionary fate and consequences of duplicate genes. Science. 2000;290:1151\u20135.","journal-title":"Science."},{"key":"3384_CR49","doi-asserted-by":"publisher","first-page":"1741","DOI":"10.1073\/pnas.82.6.1741","volume":"82","author":"CI Wu","year":"1985","unstructured":"Wu CI, Li WH. Evidence for higher rates of nucleotide substitution in rodents than in man. Proc Natl Acad Sci U S A. 1985;82:1741\u20135.","journal-title":"Proc Natl Acad Sci U S A"},{"key":"3384_CR50","doi-asserted-by":"publisher","first-page":"5974","DOI":"10.1073\/pnas.88.14.5974","volume":"88","author":"M Bulmer","year":"1991","unstructured":"Bulmer M, Wolfe KH, Sharp PM. Synonymous nucleotide substitution rates in mammalian genes: implications for the molecular clock and the relationship of mammalian orders. Proc Natl Acad Sci U S A. 1991;88:5974\u20138.","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"2","key":"3384_CR51","doi-asserted-by":"publisher","first-page":"803","DOI":"10.1073\/pnas.022629899","volume":"99","author":"S. Kumar","year":"2002","unstructured":"Kumar S and Subramanian. Mutation rates in mammalian genomes. Proc. Natl. Acad. Sci. USA. 2002; 99: 803\u2013808.","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"1","key":"3384_CR52","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1093\/genetics\/156.1.297","volume":"156","author":"MW Nachman","year":"2000","unstructured":"Nachman MW, Crowell SL. Estimate of the mutation rate per nucleotide in humans. Genetics. 2000;156(1):297\u2013304.","journal-title":"Genetics."},{"key":"3384_CR53","doi-asserted-by":"publisher","first-page":"9407","DOI":"10.1073\/pnas.95.16.9407","volume":"95","author":"W Makalowski","year":"1998","unstructured":"Makalowski W, Boguski MS. Evolutionary parameters of the transcribed mammalian genome: an analysis of 2820 orthologous rodent and human sequences. Proc Natl Acad Sci U S A. 1998;95:9407\u201312.","journal-title":"Proc Natl Acad Sci U S A"},{"key":"3384_CR54","unstructured":"Harris RS. Improved pairwise alignment of genomic DNA. Ph.D. Thesis, The Pennsylvania State University. 2007."},{"key":"3384_CR55","doi-asserted-by":"publisher","first-page":"103","DOI":"10.1101\/gr.809403","volume":"13","author":"S Schwartz","year":"2003","unstructured":"Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res. 2003;13:103\u20137.","journal-title":"Genome Res"},{"issue":"20","key":"3384_CR56","doi-asserted-by":"publisher","first-page":"11484","DOI":"10.1073\/pnas.1932072100","volume":"100","author":"WJ Kent","year":"2003","unstructured":"Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D. Evolution\u2019s cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A. 2003;100(20):11484\u20139.","journal-title":"Proc Natl Acad Sci U S A"},{"key":"3384_CR57","doi-asserted-by":"publisher","first-page":"327","DOI":"10.1101\/gr.073585.107","volume":"19","author":"AJ Vilella","year":"2009","unstructured":"Vilella AJ, Severin J, Ureta-Vidal A, Durbin R, Heng L, Birney E. Ensembl Compara GeneTrees: complete, duplication-aware phylogenetic trees in vertebrates. Genome Res. 2009;19:327\u201335.","journal-title":"Genome Res"},{"key":"3384_CR58","unstructured":"Ensembl documentation page. http:\/\/www.ensembl.org\/info\/genome\/stable_ids\/index.html. Accessed 19 Aug 2019."},{"key":"3384_CR59","doi-asserted-by":"publisher","first-page":"518","DOI":"10.1186\/1471-2105-9-518","volume":"9","author":"AC Roth","year":"2008","unstructured":"Roth AC, Gonnet GH, Dessimoz C. Algorithm of OMA for large-scale orthology inference. BMC Bioinformatics. 2008;9:518.","journal-title":"BMC Bioinformatics"},{"issue":"10","key":"3384_CR60","doi-asserted-by":"publisher","first-page":"1800","DOI":"10.1093\/gbe\/evt132","volume":"5","author":"DA Dalquen","year":"2013","unstructured":"Dalquen DA, Dessimoz C. Bidirectional best hits miss many Orthologs in duplication-rich clades such as plants and animals. Genome Biol Evol. 2013;5(10):1800\u20136.","journal-title":"Genome Biol Evol"},{"key":"3384_CR61","doi-asserted-by":"publisher","first-page":"81","DOI":"10.1146\/annurev.ge.05.120171.000501","volume":"5","author":"JH Renwick","year":"1971","unstructured":"Renwick JH. The mapping of human chromosomes. Annu Rev Genet. 1971;5:81\u2013120.","journal-title":"Annu Rev Genet"},{"key":"3384_CR62","doi-asserted-by":"publisher","first-page":"387","DOI":"10.1038\/70486","volume":"23","author":"E Passarge","year":"1999","unstructured":"Passarge E, Horsthemke B, Farber RA. Incorrect use of the term synteny. Nat Genet. 1999;23:387.","journal-title":"Nat Genet"},{"key":"3384_CR63","doi-asserted-by":"publisher","first-page":"630","DOI":"10.1186\/1471-2164-10-630","volume":"10","author":"J Jun","year":"2009","unstructured":"Jun J, Mandoiu II, Nelson CE. Identification of mammalian orthologs using local synteny. BMC Genomics. 2009;10:630.","journal-title":"BMC Genomics"},{"key":"3384_CR64","unstructured":"Mount DM. Bioinformatics: sequence and genome analysis (second edition). Cold Spring Harbor Laboratory Press: Cold Spring Harbor, NY. 2004. ISBN978\u2013087969712-9."},{"key":"3384_CR65","doi-asserted-by":"publisher","first-page":"13121","DOI":"10.1073\/pnas.0605735103","volume":"103","author":"W Salerno","year":"2006","unstructured":"Salerno W, Havlak P, Miller J. Scale-invariant structure of strongly conserved sequence in genomic intersections and alignments. Proc Natl Acad Sci U S A. 2006;103:13121\u20135.","journal-title":"Proc Natl Acad Sci U S A"},{"key":"3384_CR66","first-page":"117","volume-title":"String Processing and Information Retrieval","author":"Enno Ohlebusch","year":"2014","unstructured":"Ohlebusch E and Beller T. Alphabet-Independent Algorithms for Finding Context-Sensitive Repeats in Linear Time. In: Moura E and Crochemore M, editors. String Processing and Information Retrieval. Ouro Preto, Brazil, October 20\u201322, 2014. 21st International Symposium, SPIRE 2014, Proceedings. LNCS v8799: 117\u2013128."},{"issue":"2","key":"3384_CR67","doi-asserted-by":"publisher","first-page":"524","DOI":"10.1093\/molbev\/msu313","volume":"32","author":"F Massip","year":"2015","unstructured":"Massip F, Sheinman M, Schbath S, Arndt PF. How evolution of genomes is reflected in exact DNA sequence match statistics. Mol Biol Evol. 2015;32(2):524\u201335.","journal-title":"Mol Biol Evol"},{"key":"3384_CR68","unstructured":"Ensembl ftp site. ftp:\/\/ftp.ensembl.org\/pub\/release-96\/fasta\/homo_sapiens\/dna\/Homo_sapiens.GRCh38.dna.toplevel.fa.gz. Accessed 19 Aug 2019."},{"key":"3384_CR69","unstructured":"Ensembl ftp site. ftp:\/\/ftp.ensembl.org\/pub\/release-96\/fasta\/pan_troglodytes\/dna\/Pan_troglodytes.CHIMP2.1.4.dna.toplevel.fa.gz. Accessed 19 Aug 2019."},{"key":"3384_CR70","unstructured":"Ensembl ftp site. ftp:\/\/ftp.ensembl.org\/pub\/release-96\/fasta\/mus_musculus\/dna\/Mus_musculus.GRCm38.dna.toplevel.fa.gz. Accessed 19 Aug 2019."},{"key":"3384_CR71","unstructured":"Gao K and Miller J. Orthologs from maxmer sequence context. arXiv:1509.04412 [q-bio.QM]."},{"key":"3384_CR72","unstructured":"Ensembl ftp site. ftp:\/\/ftp.ensembl.org\/pub\/release-96\/mysql\/ensembl_compara_96\/. Accessed 19 Aug 2019."},{"key":"3384_CR73","unstructured":"Ensembl documentation page. http:\/\/www.ensembl.org\/info\/docs\/api\/index.html. Accessed 19 Aug 2019."},{"key":"3384_CR74","unstructured":"Ensembl ftp site. ftp:\/\/ftp.ensembl.org\/pub\/release-96\/maf\/ensembl-compara\/pairwise_alignments\/homo_sapiens.GRCh38.vs.pan_troglodytes.CHIMP2.1.4.tar. Accessed 19 Aug 2019."},{"key":"3384_CR75","unstructured":"Ensembl ftp site. ftp:\/\/ftp.ensembl.org\/pub\/release-96\/maf\/ensembl-compara\/pairwise_alignments\/homo_sapiens.GRCh38.vs.mus_musculus.GRCm38.tar. Accessed 19 Aug 2019."},{"key":"3384_CR76","unstructured":"Physics and Biology Unit, Okinawa Institute of Science and Technology Graduate University. https:\/\/groups.oist.jp\/sites\/default\/files\/imce\/u109\/sequanalysis.zip. Accessed 19 Aug 2019."}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-3384-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s12859-020-3384-2\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-3384-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,10,14]],"date-time":"2022-10-14T17:03:55Z","timestamp":1665767035000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-020-3384-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,2,6]]},"references-count":76,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["3384"],"URL":"https:\/\/doi.org\/10.1186\/s12859-020-3384-2","relation":{},"ISSN":["1471-2105"],"issn-type":[{"type":"electronic","value":"1471-2105"}],"subject":[],"published":{"date-parts":[[2020,2,6]]},"assertion":[{"value":"13 February 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 January 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 February 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"48"}}