{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,20]],"date-time":"2026-03-20T15:32:19Z","timestamp":1774020739829,"version":"3.50.1"},"reference-count":167,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2018,5,23]],"date-time":"2018-05-23T00:00:00Z","timestamp":1527033600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001871","name":"FCT","doi-asserted-by":"publisher","award":["UID\/CEC\/00127\/2013"],"award-info":[{"award-number":["UID\/CEC\/00127\/2013"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001871","name":"FCT","doi-asserted-by":"publisher","award":["UID\/BIM\/04501\/2013"],"award-info":[{"award-number":["UID\/BIM\/04501\/2013"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001871","name":"FCT","doi-asserted-by":"publisher","award":["PTCD\/EEI-SII\/6608\/2014"],"award-info":[{"award-number":["PTCD\/EEI-SII\/6608\/2014"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001871","name":"FCT","doi-asserted-by":"publisher","award":["SFRH\/BPD\/111148\/2015"],"award-info":[{"award-number":["SFRH\/BPD\/111148\/2015"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>An efficient DNA compressor furnishes an approximation to measure and compare information quantities present in, between and across DNA sequences, regardless of the characteristics of the sources. In this paper, we compare directly two information measures, the Normalized Compression Distance (NCD) and the Normalized Relative Compression (NRC). 
These measures answer different questions; the NCD measures how similar both strings are (in terms of information content) and the NRC (which, in general, is nonsymmetric) indicates the fraction of one of them that cannot be constructed using information from the other one. This leads to the problem of finding out which measure (or question) is more suitable for the answer we need. For computing both, we use a state of the art DNA sequence compressor that we benchmark with some top compressors in different compression modes. Then, we apply the compressor on DNA sequences with different scales and natures, first using synthetic sequences and then on real DNA sequences. The last include mitochondrial DNA (mtDNA), messenger RNA (mRNA) and genomic DNA (gDNA) of seven primates. We provide several insights into evolutionary acceleration rates at different scales, namely, the observation and confirmation across the whole genomes of a higher variation rate of the mtDNA relative to the gDNA. We also show the importance of relative compression for localizing similar information regions using mtDNA.<\/jats:p>","DOI":"10.3390\/e20060393","type":"journal-article","created":{"date-parts":[[2018,5,23]],"date-time":"2018-05-23T03:14:24Z","timestamp":1527045264000},"page":"393","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Comparison of Compression-Based Measures with Application to the Evolution of Primate Genomes"],"prefix":"10.3390","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1176-552X","authenticated-orcid":false,"given":"Diogo","family":"Pratas","sequence":"first","affiliation":[{"name":"Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, 3810-193 Aveiro, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5926-8042","authenticated-orcid":false,"given":"Raquel M.","family":"Silva","sequence":"additional","affiliation":[{"name":"Institute of 
Electronics and Informatics Engineering of Aveiro, University of Aveiro, 3810-193 Aveiro, Portugal"},{"name":"Department of Medical Sciences and Institute for Biomedicine - iBiMED, University of Aveiro, 3810-193 Aveiro, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9164-0016","authenticated-orcid":false,"given":"Armando J.","family":"Pinho","sequence":"additional","affiliation":[{"name":"Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, 3810-193 Aveiro, Portugal"},{"name":"Department of Electronics, Telecommunications and Informatics, University of Aveiro, 3810-193 Aveiro, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2018,5,23]]},"reference":[{"key":"ref_1","first-page":"1","article-title":"Three approaches to the quantitative definition of information","volume":"1","author":"Kolmogorov","year":"1965","journal-title":"Probl. Inf. Transm."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1140\/epjb\/e2009-00168-5","article-title":"Combinatorial entropies and statistics","volume":"70","author":"Niven","year":"2009","journal-title":"Eur. Phys. J. B"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1007\/s00224-007-9078-6","article-title":"A new combinatorial approach to sequence comparison","volume":"42","author":"Mantaci","year":"2008","journal-title":"Theory Comput. Syst."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1002\/j.1538-7305.1948.tb01338.x","article-title":"A mathematical theory of communication","volume":"27","author":"Shannon","year":"1948","journal-title":"Bell Syst. Tech. J."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/S0019-9958(64)90223-2","article-title":"A formal theory of inductive inference. Part I","volume":"7","author":"Solomonoff","year":"1964","journal-title":"Inf. 
Control"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"224","DOI":"10.1016\/S0019-9958(64)90131-7","article-title":"A formal theory of inductive inference. Part II","volume":"7","author":"Solomonoff","year":"1964","journal-title":"Inf. Control"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"547","DOI":"10.1145\/321356.321363","article-title":"On the length of programs for computing finite binary sequences","volume":"13","author":"Chaitin","year":"1966","journal-title":"J. ACM"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1093\/comjnl\/11.2.185","article-title":"An information measure for classification","volume":"11","author":"Wallace","year":"1968","journal-title":"Comput. J."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"465","DOI":"10.1016\/0005-1098(78)90005-5","article-title":"Modeling by shortest data description","volume":"14","author":"Rissanen","year":"1978","journal-title":"Automatica"},{"key":"ref_10","unstructured":"Hutter, M. (arXiv, 2004). Algorithmic information theory: A brief non-technical guide to the field, arXiv."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Li, M., and Vit\u00e1nyi, P. (2008). An Introduction to Kolmogorov Complexity and Its Applications, Springer. [3rd ed.].","DOI":"10.1007\/978-0-387-49820-1"},{"key":"ref_12","first-page":"30","article-title":"Laws of information conservation (nongrowth) and aspects of the foundation of probability theory","volume":"10","author":"Levin","year":"1974","journal-title":"Problemy Peredachi Informatsii"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Shen, A., Uspensky, V.A., and Vereshchagin, N. (2017). 
Kolmogorov Complexity and Algorithmic Randomness, American Mathematical Society.","DOI":"10.1090\/surv\/220"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"442","DOI":"10.1006\/jcss.1999.1677","article-title":"Inequalities for Shannon entropy and Kolmogorov complexity","volume":"60","author":"Hammer","year":"2000","journal-title":"J. Comput. Syst. Sci."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1101","DOI":"10.1111\/jep.12068","article-title":"Entropy and compression: Two measures of complexity","volume":"19","author":"Henriques","year":"2013","journal-title":"J. Eval. Clin. Pract."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Soler-Toscano, F., Zenil, H., Delahaye, J.P., and Gauvrit, N. (2014). Calculating Kolmogorov complexity from the output frequency distributions of small Turing machines. PLoS ONE, 9.","DOI":"10.1371\/journal.pone.0096223"},{"key":"ref_17","first-page":"7208216","article-title":"A computable measure of algorithmic probability by finite approximations with an application to integer sequences","volume":"2017","author":"Zenil","year":"2017","journal-title":"Complexity"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Gauvrit, N., Zenil, H., Soler-Toscano, F., Delahaye, J.P., and Brugger, P. (2017). Human behavioral complexity peaks at age 25. PLoS Comput. Biol., 13.","DOI":"10.1371\/journal.pcbi.1005408"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Pratas, D., and Pinho, A.J. (2017, January 20\u201323). On the Approximation of the Kolmogorov Complexity for DNA Sequences. Proceedings of the Iberian Conference on Pattern Recognition and Image Analysis, Faro, Portugal.","DOI":"10.1007\/978-3-319-58838-4_29"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Kettunen, K., Sadeniemi, M., Lindh-Knuutila, T., and Honkela, T. (2006). Analysis of EU languages through text compression. 
Advances in Natural Language Processing, Springer.","DOI":"10.1007\/11816508_12"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"738","DOI":"10.1016\/j.jcss.2010.06.018","article-title":"Nonapproximability of the normalized information distance","volume":"77","author":"Terwijn","year":"2011","journal-title":"J. Comput. Syst. Sci."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1016\/j.tcs.2007.02.010","article-title":"On the strongly generic undecidability of the halting problem","volume":"377","author":"Rybalov","year":"2007","journal-title":"Theor. Comput. Sci."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Bloem, P., Mota, F., de Rooij, S., Antunes, L., and Adriaans, P. (2014, January 8\u201310). A safe approximation for Kolmogorov complexity. Proceedings of the International Conference on Algorithmic Learning Theory, Bled, Slovenia.","DOI":"10.1007\/978-3-319-11662-4_24"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1407","DOI":"10.1109\/18.681318","article-title":"Information distance","volume":"44","author":"Bennett","year":"1998","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"3250","DOI":"10.1109\/TIT.2004.838101","article-title":"The similarity metric","volume":"50","author":"Li","year":"2004","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1523","DOI":"10.1109\/TIT.2005.844059","article-title":"Clustering by compression","volume":"51","author":"Cilibrasi","year":"2005","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Ferragina, P., Giancarlo, R., Greco, V., Manzini, G., and Valiente, G. (2007). Compression-based classification of biological sequences and structures via the universal similarity metric: Experimental assessment. 
BMC Bioinform., 8.","DOI":"10.1186\/1471-2105-8-252"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"El-Dirany, M., Wang, F., Furst, J., Rogers, J., and Raicu, D. (2016, January 15\u201318). Compression-based distance methods as an alternative to statistical methods for constructing phylogenetic trees. Proceedings of the 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Shenzhen, China.","DOI":"10.1109\/BIBM.2016.7822676"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Nikvand, N., and Wang, Z. (2010, January 26\u201329). Generic image similarity based on Kolmogorov complexity. Proceedings of the 2010 17th IEEE International Conference on Image Processing (ICIP-2010), Hong Kong, China.","DOI":"10.1109\/ICIP.2010.5653405"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Pratas, D., and Pinho, A.J. (2014, January 26\u201328). A conditional compression distance that unveils insights of the genomic evolution. Proceedings of the Data Compression Conference (DCC-2014), Snowbird, UT, USA.","DOI":"10.1109\/DCC.2014.58"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1895","DOI":"10.1109\/TIT.2007.894669","article-title":"The normalized compression distance is resistant to noise","volume":"53","author":"Alfonseca","year":"2007","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"367","DOI":"10.4310\/CIS.2005.v5.n4.a1","article-title":"Common pitfalls using the normalized compression distance: What to watch out for in a compressor","volume":"5","author":"Alfonseca","year":"2005","journal-title":"Commun. Inf. Syst."},{"key":"ref_33","unstructured":"Seaward, L., and Matwin, S. (2009, January 8\u201310). Intrinsic plagiarism detection using complexity analysis. Proceedings of the SEPLN, San Sebastian, Spain."},{"key":"ref_34","unstructured":"Merivuori, T., and Roos, T. (2009, January 17\u201319). 
Some Observations on the Applicability of Normalized Compression Distance to Stemmatology. Proceedings of the Second Workshop on Information Theoretic Methods in Science and Engineering, Tampere, Finland."},{"key":"ref_35","first-page":"1","article-title":"Kolmogorov complexity as a data similarity metric: Application in mitochondrial DNA","volume":"4","author":"Mota","year":"2018","journal-title":"Nonlinear Dyn."},{"key":"ref_36","unstructured":"Pratas, D., Pinho, A.J., and Garcia, S.P. (2012, January 1\u20134). Computation of the Normalized Compression Distance of DNA Sequences using a Mixture of Finite-context Models. Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms (BIOINFORMATICS-2012), Algarve, Portugal."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"La Rosa, M., Rizzo, R., Urso, A., and Gaglio, S. (2008, January 3\u20135). Comparison of genomic sequences clustering using Normalized Compression Distance and evolutionary distance. Proceedings of the International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, Zagreb, Croatia.","DOI":"10.1007\/978-3-540-85567-5_92"},{"key":"ref_38","unstructured":"Nykter, M., Yli-Harja, O., and Shmulevich, I. (2005, January 22\u201324). Normalized Compression Distance for gene expression analysis. Proceedings of the Workshop on Genomic Signal Processing and Statistics (GENSIPS), Newport, RI, USA."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1897","DOI":"10.1073\/pnas.0711525105","article-title":"Gene expression dynamics in the macrophage exhibit criticality","volume":"105","author":"Nykter","year":"2008","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Mihailovi\u0107, D.T., Mimi\u0107, G., Nikoli\u0107-Djori\u0107, E., and Arseni\u0107, I. (2015). 
Novel measures based on the Kolmogorov complexity for use in complex system behavior studies and time series analysis. Open Phys., 13.","DOI":"10.1515\/phys-2015-0001"},{"key":"ref_41","unstructured":"Tran, N. (February, January 29). The normalized compression distance and image distinguishability. Proceedings of the SPIE Human Vision and Electronic Imaging XII, San Jose, CA, USA."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Coltuc, D., Datcu, M., and Coltuc, D. (2018). On the Use of Normalized Compression Distances for Image Similarity Detection. Entropy, 20.","DOI":"10.3390\/e20020099"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Pinho, A.J., and Ferreira, P.J.S.G. (2011, January 11\u201314). Image similarity using the normalized compression distance based on finite context models. Proceedings of the 2011 18th IEEE International Conference on Image Processing (ICIP-2011), Brussels, Belgium.","DOI":"10.1109\/ICIP.2011.6115866"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"1063","DOI":"10.1007\/s00371-011-0651-2","article-title":"Using Normalized Compression Distance for image similarity measurement: An experimental study","volume":"28","author":"Marco","year":"2012","journal-title":"Vis. Comput."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1007\/s11760-013-0443-4","article-title":"Image distortion analysis based on normalized perceptual information distance","volume":"7","author":"Nikvand","year":"2013","journal-title":"Signal Image Video Process."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"327","DOI":"10.1016\/j.cag.2007.01.024","article-title":"Normalized compression distance for visual analysis of document collections","volume":"31","author":"Telles","year":"2007","journal-title":"Comput. Graph."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Axelsson, S. (2010, January 15\u201318). Using Normalized Compression Distance for classifying file fragments. 
Proceedings of the ARES\u201910 International Conference on Availability, Reliability, and Security, Krakow, Poland.","DOI":"10.1109\/ARES.2010.100"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"1602","DOI":"10.1109\/TPAMI.2014.2375175","article-title":"Normalized compression distance of multisets with applications","volume":"37","author":"Cohen","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1162\/0148926042728449","article-title":"Algorithmic clustering of music based on string compression","volume":"28","author":"Cilibrasi","year":"2004","journal-title":"Comput. Music J."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Alfonseca, M., Cebri\u00e1n Ramos, M., and Ortega, A. (2005, January 17\u201319). Evolving computer-generated music by means of the Normalized Compression Distance. Proceedings of the 5th WSEAS Conference on Simulation, Modeling and Optimization (SMO \u201905), Corfu Island, Greece.","DOI":"10.4310\/CIS.2005.v5.n4.a1"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"993","DOI":"10.1109\/TASLP.2015.2416655","article-title":"Identifying cover songs using information-theoretic measures of similarity","volume":"23","author":"Foster","year":"2015","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process. (TASLP)"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Klenk, S., Thom, D., and Heidemann, G. (2009, January 6\u20139). The Normalized Compression Distance as a distance measure in entity identification. 
Proceedings of the Industrial Conference on Data Mining, Miami, FL, USA.","DOI":"10.1007\/978-3-642-03067-3_26"},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1109\/TPC.2011.2172833","article-title":"Assessing the impact of student peer review in writing instruction by using the Normalized Compression Distance","volume":"55","author":"Yoshizawa","year":"2012","journal-title":"IEEE Trans. Prof. Commun."},{"key":"ref_54","unstructured":"Bailey, M., Oberheide, J., Andersen, J., Mao, Z.M., Jahanian, F., and Nazario, J. (2007, January 5\u20137). Automated classification and analysis of internet malware. Proceedings of the International Workshop on Recent Advances in Intrusion Detection, Gold Coast, Australia."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1007\/s11416-015-0260-0","article-title":"On Normalized Compression Distance and large malware","volume":"12","author":"Borbely","year":"2016","journal-title":"J. Comput. Virol. Hacking Tech."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Threm, D., Yu, L., Ramaswamy, S., and Sudarsan, S.D. (2015, January 2\u20135). Using Normalized Compression Distance to measure the evolutionary stability of software systems. Proceedings of the 2015 IEEE 26th International Symposium on Software Reliability Engineering (ISSRE), Gaithersburg, MD, USA.","DOI":"10.1109\/ISSRE.2015.7381805"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Henard, C., Papadakis, M., Harman, M., Jia, Y., and Le Traon, Y. (2016, January 14\u201322). Comparing white-box and black-box test prioritization. Proceedings of the 2016 IEEE\/ACM 38th International Conference on Software Engineering (ICSE), Austin, TX, USA.","DOI":"10.1145\/2884781.2884791"},{"key":"ref_58","first-page":"8","article-title":"Clustering-based selection for the exploration of compiler optimization sequences","volume":"13","author":"Martins","year":"2016","journal-title":"ACM Trans. Archit. Code Optim. 
(TACO)"},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Rios, R.A., Lopes, C.S., Sikansi, F.H., Pagliosa, P.A., and de Mello, R.F. (2017, January 2\u20135). Analyzing the Public Opinion on the Brazilian Political and Corruption Issues. Proceedings of the 2017 Brazilian Conference on Intelligent Systems (BRACIS), Uberlandia, Brazil.","DOI":"10.1109\/BRACIS.2017.37"},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Ting, C.L., Fisher, A.N., and Bauer, T.L. (2017, January 13\u201315). Compression-Based Algorithms for Deception Detection. Proceedings of the International Conference on Social Informatics, Oxford, UK.","DOI":"10.1007\/978-3-319-67217-5_16"},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Cerra, D., Israel, M., and Datcu, M. (2009, January 12\u201317). Parameter-free clustering: Application to fawns detection. Proceedings of the 2009 IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2009), Cape Town, South Africa.","DOI":"10.1109\/IGARSS.2009.5418293"},{"key":"ref_62","doi-asserted-by":"crossref","first-page":"1270","DOI":"10.1109\/18.243444","article-title":"A measure of relative entropy between individual sequences with application to universal classification","volume":"39","author":"Ziv","year":"1993","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"902","DOI":"10.3390\/e13040902","article-title":"Algorithmic relative complexity","volume":"13","author":"Cerra","year":"2011","journal-title":"Entropy"},{"key":"ref_64","unstructured":"Pratas, D. (2016). Compression and Analysis of Genomic Data. [Ph.D. Thesis, University of Aveiro]."},{"key":"ref_65","doi-asserted-by":"crossref","first-page":"677","DOI":"10.1007\/s00778-012-0263-0","article-title":"Measuring structural similarity of semistructured data based on information-theoretic approaches","volume":"21","author":"Helmer","year":"2012","journal-title":"VLDB J. Int. J. 
Very Large Data Bases"},{"key":"ref_66","doi-asserted-by":"crossref","first-page":"407","DOI":"10.3390\/e15010407","article-title":"Expanding the algorithmic information theory frame for applications to Earth observation","volume":"15","author":"Cerra","year":"2013","journal-title":"Entropy"},{"key":"ref_67","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1016\/j.patrec.2014.01.019","article-title":"Authorship analysis based on data compression","volume":"42","author":"Cerra","year":"2014","journal-title":"Pattern Recognit. Lett."},{"key":"ref_68","doi-asserted-by":"crossref","first-page":"1553004","DOI":"10.1142\/S0218001415530043","article-title":"Text Classification Using Compression-Based Dissimilarity Measures","volume":"29","author":"Coutinho","year":"2015","journal-title":"Int. J. Pattern Recognit. Artif. Intell."},{"key":"ref_69","doi-asserted-by":"crossref","unstructured":"Pinho, A.J., Pratas, D., and Ferreira, P.J.S.G. (April, January 29). Authorship attribution using relative compression. Proceedings of the Data Compression Conference (DCC-2016), Snowbird, UT, USA.","DOI":"10.1109\/DCC.2016.53"},{"key":"ref_70","doi-asserted-by":"crossref","first-page":"467","DOI":"10.3389\/fpsyg.2018.00467","article-title":"Biometric and emotion identification: An ECG compression based method","volume":"9","author":"Ferreira","year":"2018","journal-title":"Front. Psychol."},{"key":"ref_71","doi-asserted-by":"crossref","first-page":"10203","DOI":"10.1038\/srep10203","article-title":"An alignment-free method to find and visualise rearrangements between pairs of DNA sequences","volume":"5","author":"Pratas","year":"2015","journal-title":"Sci. Rep."},{"key":"ref_72","doi-asserted-by":"crossref","unstructured":"Pratas, D., Pinho, A.J., and Ferreira, P.J.S.G. (April, January 29). Efficient compression of genomic sequences. 
Proceedings of the Data Compression Conference (DCC-2016), Snowbird, UT, USA.","DOI":"10.1109\/DCC.2016.60"},{"key":"ref_73","doi-asserted-by":"crossref","unstructured":"Pratas, D., Pinho, A.J., Silva, R.M., Rodrigues, J.M.O.S., Hosseini, M., Caetano, T., and Ferreira, P.J.S.G. (2018). FALCON-meta: A method to infer metagenomic composition of ancient DNA. bioRxiv, 267179.","DOI":"10.1101\/267179"},{"key":"ref_74","unstructured":"Coutinho, D., and Figueiredo, M. (2013, January 15\u201318). An information theoretic approach to text sentiment analysis. Proceedings of the International Conference on Pattern Recognition Applications and Methods (ICPRAM), Barcelona, Spain."},{"key":"ref_75","doi-asserted-by":"crossref","first-page":"e27","DOI":"10.1093\/nar\/gkr1124","article-title":"GReEn: A tool for efficient compression of genome resequencing data","volume":"40","author":"Pinho","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"ref_76","doi-asserted-by":"crossref","first-page":"1275","DOI":"10.1109\/TCBB.2013.122","article-title":"FRESCO: Referential compression of highly similar sequences","volume":"10","author":"Wandelt","year":"2013","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinform."},{"key":"ref_77","doi-asserted-by":"crossref","first-page":"3364","DOI":"10.1093\/bioinformatics\/btx412","article-title":"High-speed and high-ratio referential genome compression","volume":"33","author":"Liu","year":"2017","journal-title":"Bioinformatics"},{"key":"ref_78","unstructured":"Dawy, Z., Hagenauer, J., and Hoffmann, A. (2004, January 23\u201325). Implementing the context tree weighting method for content recognition. Proceedings of the Data Compression Conference (DCC-2004), Snowbird, UT, USA."},{"key":"ref_79","doi-asserted-by":"crossref","unstructured":"Darwin, C., and Bynum, W.F. (1859). 
The Origin of Species by Means of Natural Selection: Or, The Preservation of Favored Races in the Struggle for Life, John Murray.","DOI":"10.5962\/bhl.title.68064"},{"key":"ref_80","doi-asserted-by":"crossref","unstructured":"Huxley, T.H. (1863). Evidence as to Man's Place in Nature by Thomas Henry Huxley, Williams and Norgate.","DOI":"10.5962\/bhl.title.45796"},{"key":"ref_81","doi-asserted-by":"crossref","first-page":"361","DOI":"10.1038\/nrg1603","article-title":"Phylogenomics and the reconstruction of the tree of life","volume":"6","author":"Delsuc","year":"2005","journal-title":"Nat. Rev. Genet."},{"key":"ref_82","doi-asserted-by":"crossref","first-page":"472","DOI":"10.1016\/S0168-9525(02)02744-0","article-title":"Genome trees and the tree of life","volume":"18","author":"Wolf","year":"2002","journal-title":"Trends Genet."},{"key":"ref_83","first-page":"81","article-title":"How genomes are sequenced and why it matters: Implications for studies in comparative genomics of humans and chimpanzees","volume":"4","author":"Tomkins","year":"2011","journal-title":"Answ. Res. J."},{"key":"ref_84","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1016\/j.tig.2014.12.002","article-title":"Accounting for uncertainty in DNA sequencing data","volume":"31","author":"Ferson","year":"2015","journal-title":"Trends Genet."},{"key":"ref_85","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1038\/nrg3931","article-title":"Estimating the mutation load in human genomes","volume":"16","author":"Henn","year":"2015","journal-title":"Nat. Rev. Genet."},{"key":"ref_86","doi-asserted-by":"crossref","first-page":"3439","DOI":"10.1073\/pnas.1418652112","article-title":"Evidence for recent, population-specific evolution of the human mutation rate","volume":"112","author":"Harris","year":"2015","journal-title":"Proc. Natl. Acad. Sci. 
USA"},{"key":"ref_87","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.gde.2014.06.011","article-title":"Adaptations to local environments in modern human populations","volume":"29","author":"Jeong","year":"2014","journal-title":"Curr. Opin. Genet. Dev."},{"key":"ref_88","doi-asserted-by":"crossref","first-page":"e00403-16","DOI":"10.1128\/mBio.00403-16","article-title":"Transcriptome remodeling contributes to epidemic disease caused by the human pathogen Streptococcus pyogenes","volume":"7","author":"Beres","year":"2016","journal-title":"MBio"},{"key":"ref_89","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1016\/j.coi.2014.05.001","article-title":"Human genome variability, natural selection and infectious diseases","volume":"30","author":"Fumagalli","year":"2014","journal-title":"Curr. Opin. Immunol."},{"key":"ref_90","doi-asserted-by":"crossref","first-page":"351","DOI":"10.1016\/S0169-5347(01)02187-5","article-title":"Chromosomal rearrangements and speciation","volume":"16","author":"Rieseberg","year":"2001","journal-title":"Trends Ecol. Evol."},{"key":"ref_91","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1016\/0092-8674(80)90131-2","article-title":"DNA rearrangements associated with a transposable element in yeast","volume":"21","author":"Roeder","year":"1980","journal-title":"Cell"},{"key":"ref_92","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1038\/s41559-017-0425-y","article-title":"Evolutionary determinants of genome-wide nucleotide composition","volume":"2","author":"Long","year":"2018","journal-title":"Nat. Ecol. Evol."},{"key":"ref_93","doi-asserted-by":"crossref","unstructured":"Golan, A. (2017). 
Foundations of Info-Metrics: Modeling and Inference with Imperfect Information, Oxford University Press.","DOI":"10.1093\/oso\/9780199349524.001.0001"},{"key":"ref_94","doi-asserted-by":"crossref","first-page":"294","DOI":"10.1016\/0168-9525(89)90111-X","article-title":"The evolutionary origins of organelles","volume":"5","author":"Gray","year":"1989","journal-title":"Trends Genet."},{"key":"ref_95","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1016\/j.biosystems.2018.03.002","article-title":"Alignment-based and alignment-free methods converge with experimental data on amino acids coded by stop codons at split between nuclear and mitochondrial genetic codes","volume":"167","author":"Seligmann","year":"2018","journal-title":"Biosystems"},{"key":"ref_96","doi-asserted-by":"crossref","unstructured":"Kimura, M. (1983). The Neutral Theory of Molecular Evolution, Cambridge University Press.","DOI":"10.1017\/CBO9780511623486"},{"key":"ref_97","doi-asserted-by":"crossref","first-page":"186","DOI":"10.1186\/s13059-017-1319-7","article-title":"Alignment-free sequence comparison: Benefits, applications, and tools","volume":"18","author":"Zielezinski","year":"2017","journal-title":"Genome Biol."},{"key":"ref_98","doi-asserted-by":"crossref","unstructured":"Ren, J., Bai, X., Lu, Y.Y., Tang, K., Wang, Y., Reinert, G., and Sun, F. (2018). Alignment-Free Sequence Analysis and Applications. Annu. Rev. Biomed. Data Sci., 1.","DOI":"10.1146\/annurev-biodatasci-080917-013431"},{"key":"ref_99","doi-asserted-by":"crossref","unstructured":"Ferreira, P.J.S.G., and Pinho, A.J. (2014, January 4\u20139). Compression-based normal similarity measures for DNA sequences. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP-2014, Florence, Italy.","DOI":"10.1109\/ICASSP.2014.6853630"},{"key":"ref_100","doi-asserted-by":"crossref","unstructured":"Pratas, D., Hosseini, M., and Pinho, A.J. (2017, January 21\u201323). 
Substitutional Tolerant Markov Models for Relative Compression of DNA Sequences. Proceedings of the 11th International Conference on Practical Applications of Computational Biology & Bioinformatics, Porto, Portugal.","DOI":"10.1007\/978-3-319-60816-7_32"},{"key":"ref_101","unstructured":"Bell, T.C., Cleary, J.G., and Witten, I.H. (1990). Text Compression, Prentice Hall."},{"key":"ref_102","doi-asserted-by":"crossref","unstructured":"Pinho, A.J., Pratas, D., and Ferreira, P.J.S.G. (2011, January 28\u201330). Bacteria DNA sequence compression using a mixture of finite-context models. Proceedings of the IEEE Workshop on Statistical Signal Processing, Nice, France.","DOI":"10.1109\/SSP.2011.5967637"},{"key":"ref_103","doi-asserted-by":"crossref","unstructured":"Sayood, K. (2017). Introduction to Data Compression, Morgan Kaufmann.","DOI":"10.1016\/B978-0-12-809474-7.00019-7"},{"key":"ref_104","unstructured":"Pratas, D., and Pinho, A.J. (2014, January 1\u20135). Exploring deep Markov models in genomic data compression using sequence pre-analysis. Proceedings of the 22nd European Signal Processing Conference (EUSIPCO-2014), Lisbon, Portugal."},{"key":"ref_105","doi-asserted-by":"crossref","unstructured":"Pratas, D., Pinho, A.J., and Rodrigues, J.M.O.S. (2014). XS: A FASTQ read simulator. BMC Res. Notes, 7.","DOI":"10.1186\/1756-0500-7-40"},{"key":"ref_106","unstructured":"Grumbach, S., and Tahi, F. (April, January 30). Compression of DNA sequences. Proceedings of the Data Compression Conference (DCC-93), Snowbird, UT, USA."},{"key":"ref_107","doi-asserted-by":"crossref","first-page":"875","DOI":"10.1016\/0306-4573(94)90014-0","article-title":"A new challenge for compression algorithms: Genetic sequences","volume":"30","author":"Grumbach","year":"1994","journal-title":"Inf. Process. Manag."},{"key":"ref_108","unstructured":"Rivals, E., Delahaye, J.P., Dauchet, M., and Delgrange, O. (April, January 31). A guaranteed compression scheme for repetitive DNA sequences. 
Proceedings of the Data Compression Conference (DCC-96), Snowbird, UT, USA."},{"key":"ref_109","unstructured":"Loewenstern, D., and Yianilos, P.N. (1997, January 25\u201327). Significantly lower entropy estimates for natural DNA sequences. Proceedings of the Data Compression Conference (DCC-97), Snowbird, UT, USA."},{"key":"ref_110","first-page":"43","article-title":"Biological sequence compression algorithms","volume":"11","author":"Matsumoto","year":"2000","journal-title":"Genome Inform."},{"key":"ref_111","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1109\/51.940049","article-title":"A compression algorithm for DNA sequences","volume":"20","author":"Chen","year":"2001","journal-title":"IEEE Eng. Med. Biol. Mag."},{"key":"ref_112","doi-asserted-by":"crossref","first-page":"1696","DOI":"10.1093\/bioinformatics\/18.12.1696","article-title":"DNACompress: Fast and effective DNA sequence compression","volume":"18","author":"Chen","year":"2002","journal-title":"Bioinformatics"},{"key":"ref_113","unstructured":"Tabus, I., Korodi, G., and Rissanen, J. (2003, January 25\u201327). DNA sequence compression using the normalized maximum likelihood model for discrete regression. Proceedings of the Data Compression Conference (DCC-2003), Snowbird, UT, USA."},{"key":"ref_114","doi-asserted-by":"crossref","first-page":"1397","DOI":"10.1002\/spe.619","article-title":"A simple and fast DNA compressor","volume":"34","author":"Manzini","year":"2004","journal-title":"Softw. Pract. Exp."},{"key":"ref_115","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1145\/1055709.1055711","article-title":"An efficient normalized maximum likelihood algorithm for DNA sequence compression","volume":"23","author":"Korodi","year":"2005","journal-title":"ACM Trans. Inform. 
Syst."},{"key":"ref_116","doi-asserted-by":"crossref","first-page":"190","DOI":"10.1007\/11496656_17","article-title":"DNA compression challenge revisited","volume":"Volume 3537","author":"Behzadi","year":"2005","journal-title":"Proceedings of the Combinatorial Pattern Matching, CPM-2005"},{"key":"ref_117","doi-asserted-by":"crossref","unstructured":"Korodi, G., and Tabus, I. (2007, January 27\u201329). Normalized maximum likelihood model of order-1 for the compression of DNA sequences. Proceedings of the Data Compression Conference (DCC-2007), Snowbird, UT, USA.","DOI":"10.1109\/DCC.2007.60"},{"key":"ref_118","unstructured":"Cao, M.D., Dix, T.I., Allison, L., and Mears, C. (2007, January 27\u201329). A simple statistical algorithm for biological sequence compression. Proceedings of the Data Compression Conference (DCC-2007), Snowbird, UT, USA."},{"key":"ref_119","doi-asserted-by":"crossref","unstructured":"Kaipa, K.K., Bopardikar, A.S., Abhilash, S., Venkataraman, P., Lee, K., Ahn, T., and Narayanan, R. (2010, January 18). Algorithm for DNA sequence compression based on prediction of mismatch bases and repeat location. Proceedings of 2010 IEEE International Conference on the Bioinformatics and Biomedicine Workshops (BIBMW), Hong Kong, China.","DOI":"10.1109\/BIBMW.2010.5703941"},{"key":"ref_120","first-page":"245","article-title":"A novel approach for compressing DNA sequences using semi-statistical compressor","volume":"33","author":"Gupta","year":"2011","journal-title":"Int. J. Comput. Appl."},{"key":"ref_121","doi-asserted-by":"crossref","unstructured":"Pinho, A.J., Ferreira, P.J.S.G., Neves, A.J.R., and Bastos, C.A.C. (2011). On the representability of complete genomes by multiple competing finite-context (Markov) models. 
PLoS ONE, 6.","DOI":"10.1371\/journal.pone.0021588"},{"key":"ref_122","doi-asserted-by":"crossref","first-page":"643","DOI":"10.1109\/TEVC.2011.2160399","article-title":"DNA sequence compression using adaptive particle swarm optimization-based memetic algorithm","volume":"15","author":"Zhu","year":"2011","journal-title":"IEEE Trans. Evol. Comput."},{"key":"ref_123","doi-asserted-by":"crossref","first-page":"2527","DOI":"10.1093\/bioinformatics\/bts467","article-title":"DELIMINATE\u2013A fast and efficient method for loss-less compression of genomic sequences","volume":"28","author":"Mohammed","year":"2012","journal-title":"Bioinformatics"},{"key":"ref_124","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1093\/bioinformatics\/btt594","article-title":"MFCompress: A compression tool for FASTA and multi-FASTA data","volume":"30","author":"Pinho","year":"2014","journal-title":"Bioinformatics"},{"key":"ref_125","doi-asserted-by":"crossref","unstructured":"Li, P., Wang, S., Kim, J., Xiong, H., Ohno-Machado, L., and Jiang, X. (2013). DNA-COMPACT: DNA Compression Based on a Pattern-Aware Contextual Modeling Technique. PLoS ONE, 8.","DOI":"10.1371\/journal.pone.0080377"},{"key":"ref_126","unstructured":"Dai, W., Xiong, H., Jiang, X., and Ohno-Machado, L. (2013, January 20\u201322). An Adaptive Difference Distribution-Based Coding with Hierarchical Tree Structure for DNA Sequence Compression. Proceedings of the Data Compression Conference (DCC-2013), Snowbird, UT, USA."},{"key":"ref_127","doi-asserted-by":"crossref","unstructured":"Guo, H., Chen, M., Liu, X., and Xie, M. (2015, January 29\u201331). Genome compression based on Hilbert space filling curve. 
Proceedings of the 3rd International Conference on Management, Education, Information and Control (MEICI 2015), Shenyang, China.","DOI":"10.2991\/meici-15.2015.294"},{"key":"ref_128","doi-asserted-by":"crossref","first-page":"1275","DOI":"10.1109\/TCBB.2015.2430331","article-title":"CoGI: Towards compressing genomes as an image","volume":"12","author":"Xie","year":"2015","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinform."},{"key":"ref_129","doi-asserted-by":"crossref","unstructured":"Benoit, G., Lemaitre, C., Lavenier, D., Drezen, E., Dayris, T., Uricaru, R., and Rizk, G. (2015). Reference-free compression of high throughput sequencing data with a probabilistic de Bruijn graph. BMC Bioinform., 16.","DOI":"10.1186\/s12859-015-0709-7"},{"key":"ref_130","doi-asserted-by":"crossref","first-page":"734","DOI":"10.1101\/gr.114819.110","article-title":"Efficient storage of high throughput DNA sequencing data using reference-based compression","volume":"21","author":"Fritz","year":"2011","journal-title":"Genome Res."},{"key":"ref_131","doi-asserted-by":"crossref","first-page":"401","DOI":"10.1089\/cmb.2010.0253","article-title":"Compressing genomic sequence fragments using SlimGene","volume":"18","author":"Kozanitis","year":"2011","journal-title":"J. Comput. Biol."},{"key":"ref_132","doi-asserted-by":"crossref","first-page":"860","DOI":"10.1093\/bioinformatics\/btr014","article-title":"Compression of DNA sequence reads in FASTQ format","volume":"27","author":"Deorowicz","year":"2011","journal-title":"Bioinformatics"},{"key":"ref_133","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1186\/1748-7188-7-30","article-title":"Adaptive efficient compression of genomes","volume":"7","author":"Wandelt","year":"2012","journal-title":"Algorithms Mol. 
Biol."},{"key":"ref_134","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1186\/1471-2105-13-100","article-title":"Handling the data management needs of high-throughput sequencing data: SpeedGene, a compression algorithm for the efficient storage of genetic data","volume":"13","author":"Qiao","year":"2012","journal-title":"BMC Bioinform."},{"key":"ref_135","doi-asserted-by":"crossref","first-page":"626","DOI":"10.1093\/bioinformatics\/btu698","article-title":"iDoComp: A compression scheme for assembled genomes","volume":"31","author":"Ochoa","year":"2014","journal-title":"Bioinformatics"},{"key":"ref_136","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/srep11565","article-title":"GDC 2: Compression of large collections of genomes","volume":"5","author":"Deorowicz","year":"2015","journal-title":"Sci. Rep."},{"key":"ref_137","doi-asserted-by":"crossref","first-page":"3405","DOI":"10.1093\/bioinformatics\/btw505","article-title":"NRGC: A novel referential genome compression algorithm","volume":"32","author":"Saha","year":"2016","journal-title":"Bioinformatics"},{"key":"ref_138","doi-asserted-by":"crossref","unstructured":"Stephens, Z.D., Lee, S.Y., Faghri, F., Campbell, R.H., Zhai, C., Efron, M.J., Iyer, R., Schatz, M.C., Sinha, S., and Robinson, G.E. (2015). Big data: Astronomical or genomical?. PLoS Biol., 13.","DOI":"10.1371\/journal.pbio.1002195"},{"key":"ref_139","doi-asserted-by":"crossref","first-page":"696","DOI":"10.1109\/TIT.2009.2037052","article-title":"Compression of whole genome alignments","volume":"56","author":"Hanus","year":"2010","journal-title":"IEEE Trans. Inf. 
Theory"},{"key":"ref_140","doi-asserted-by":"crossref","first-page":"e171","DOI":"10.1093\/nar\/gks754","article-title":"Compression of next-generation sequencing reads aided by highly efficient de novo assembly","volume":"40","author":"Jones","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"ref_141","doi-asserted-by":"crossref","first-page":"3051","DOI":"10.1093\/bioinformatics\/bts593","article-title":"SCALCE: Boosting sequence compression algorithms using locally consistent encoding","volume":"28","author":"Hach","year":"2012","journal-title":"Bioinformatics"},{"key":"ref_142","doi-asserted-by":"crossref","first-page":"3189","DOI":"10.1109\/TIT.2012.2236605","article-title":"A compression model for DNA multiple sequence alignment blocks","volume":"59","author":"Matos","year":"2013","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_143","doi-asserted-by":"crossref","unstructured":"Bonfield, J.K., and Mahoney, M.V. (2013). Compression of FASTQ and SAM format sequencing data. PLoS ONE, 8.","DOI":"10.1371\/journal.pone.0059190"},{"key":"ref_144","doi-asserted-by":"crossref","unstructured":"Holley, G., Wittler, R., Stoye, J., and Hach, F. (2017, January 3\u20137). Dynamic alignment-free and reference-free read compression. 
Proceedings of the International Conference on Research in Computational Molecular Biology, Hong Kong, China.","DOI":"10.1089\/cmb.2018.0068"},{"key":"ref_145","doi-asserted-by":"crossref","first-page":"1415","DOI":"10.1093\/bioinformatics\/bts173","article-title":"Large-scale compression of genomic sequence databases with the Burrows-Wheeler transform","volume":"28","author":"Cox","year":"2012","journal-title":"Bioinformatics"},{"key":"ref_146","doi-asserted-by":"crossref","first-page":"e27","DOI":"10.1093\/nar\/gks939","article-title":"NGC: Lossless and lossy compression of aligned high-throughput sequencing data","volume":"41","author":"Popitsch","year":"2013","journal-title":"Nucleic Acids Res."},{"key":"ref_147","doi-asserted-by":"crossref","first-page":"628","DOI":"10.1093\/bioinformatics\/btr689","article-title":"Transformations for the compression of FASTQ quality scores of next-generation sequencing data","volume":"28","author":"Wan","year":"2012","journal-title":"Bioinformatics"},{"key":"ref_148","doi-asserted-by":"crossref","unstructured":"Huang, Z.A., Wen, Z., Deng, Q., Chu, Y., Sun, Y., and Zhu, Z. (2017). LW-FQZip 2: A parallelized reference-based compression of FASTQ files. BMC Bioinform., 18.","DOI":"10.1186\/s12859-017-1588-x"},{"key":"ref_149","doi-asserted-by":"crossref","unstructured":"Hosseini, M., Pratas, D., and Pinho, A.J. (2016). A survey on data compression methods for biological sequences. 
Information, 7.","DOI":"10.3390\/info7040056"},{"key":"ref_150","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1038\/nature12228","article-title":"Great ape genetic diversity and population history","volume":"499","author":"Sudmant","year":"2013","journal-title":"Nature"},{"key":"ref_151","doi-asserted-by":"crossref","first-page":"40712","DOI":"10.1038\/srep40712","article-title":"Viral phylogenomics using an alignment-free method: A three-step approach to determine optimal length of k-mer","volume":"7","author":"Zhang","year":"2017","journal-title":"Sci. Rep."},{"key":"ref_152","doi-asserted-by":"crossref","first-page":"347","DOI":"10.1101\/gr.1003303","article-title":"Large-scale variation among human and great ape genomes determined by array comparative genomic hybridization","volume":"13","author":"Locke","year":"2003","journal-title":"Genome Res."},{"key":"ref_153","doi-asserted-by":"crossref","first-page":"1640","DOI":"10.1101\/gr.124461.111","article-title":"Gorilla genome structural variation reveals evolutionary parallelisms with chimpanzee","volume":"21","author":"Ventura","year":"2011","journal-title":"Genome Res."},{"key":"ref_154","doi-asserted-by":"crossref","unstructured":"Roos, C., Zinner, D., Kubatko, L.S., Schwarz, C., Yang, M., Meyer, D., Nash, S.D., Xing, J., Batzer, M.A., and Brameier, M. (2011). Nuclear versus mitochondrial DNA: Evidence for hybridization in colobine monkeys. BMC Evol. Biol., 11.","DOI":"10.1186\/1471-2148-11-77"},{"key":"ref_155","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1038\/ng.437","article-title":"Personalized copy number and segmental duplication maps using next-generation sequencing","volume":"41","author":"Alkan","year":"2009","journal-title":"Nat. Genet."},{"key":"ref_156","doi-asserted-by":"crossref","first-page":"292","DOI":"10.1016\/S0169-5347(03)00033-8","article-title":"Evolution by gene duplication: An update","volume":"18","author":"Zhang","year":"2003","journal-title":"Trends Ecol. 
Evol."},{"key":"ref_157","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1093\/bioinformatics\/bts635","article-title":"STAR: Ultrafast universal RNA-seq aligner","volume":"29","author":"Dobin","year":"2013","journal-title":"Bioinformatics"},{"key":"ref_158","doi-asserted-by":"crossref","first-page":"1147","DOI":"10.1101\/gr.1917404","article-title":"Using the miraEST assembler for reliable and automated mRNA transcript assembly and SNP detection in sequenced ESTs","volume":"14","author":"Chevreux","year":"2004","journal-title":"Genome Res."},{"key":"ref_159","doi-asserted-by":"crossref","first-page":"9054","DOI":"10.1073\/pnas.84.24.9054","article-title":"Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs","volume":"84","author":"Wolfe","year":"1987","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_160","doi-asserted-by":"crossref","first-page":"345","DOI":"10.1016\/j.tig.2010.05.003","article-title":"Evolution of the mutation rate","volume":"26","author":"Lynch","year":"2010","journal-title":"Trends Genet."},{"key":"ref_161","doi-asserted-by":"crossref","unstructured":"Farr\u00e9, M., and Ruiz-Herrera, A. (2014). Role of chromosomal reorganisations in the human-chimpanzee speciation. Encyclopedia of Life Sciences (eLS), John Wiley & Sons.","DOI":"10.1002\/9780470015902.a0025534"},{"key":"ref_162","doi-asserted-by":"crossref","first-page":"853","DOI":"10.1093\/molbev\/mss272","article-title":"Recombination rates and genomic shuffling in human and chimpanzee\u2014A new twist in the chromosomal speciation theory","volume":"30","author":"Micheletti","year":"2013","journal-title":"Mol. Biol. Evol."},{"key":"ref_163","doi-asserted-by":"crossref","unstructured":"Hosseini, M., Pratas, D., and Pinho, A.J. (2017, January 21\u201323). On the role of inverted repeats in DNA sequence similarity. 
Proceedings of the International Conference on Practical Applications of Computational Biology & Bioinformatics, Porto, Portugal.","DOI":"10.1007\/978-3-319-60816-7_28"},{"key":"ref_164","doi-asserted-by":"crossref","unstructured":"Fleagle, J.G. (2013). Primate Adaptation and Evolution, Academic Press.","DOI":"10.1016\/B978-0-12-378632-6.00009-4"},{"key":"ref_165","doi-asserted-by":"crossref","first-page":"1081","DOI":"10.1093\/molbev\/msh110","article-title":"NUMTs in sequenced eukaryotic genomes","volume":"21","author":"Richly","year":"2004","journal-title":"Mol. Biol. Evol."},{"key":"ref_166","doi-asserted-by":"crossref","first-page":"16357","DOI":"10.1038\/s41598-017-16750-2","article-title":"NumtS colonization in mammalian genomes","volume":"7","author":"Calabrese","year":"2017","journal-title":"Sci. Rep."},{"key":"ref_167","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1002\/humu.22452","article-title":"Mitochondrial DNA rearrangements in health and disease\u2014A comprehensive study","volume":"35","author":"Damas","year":"2014","journal-title":"Hum. Mutat."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/20\/6\/393\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T15:05:29Z","timestamp":1760195129000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/20\/6\/393"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,5,23]]},"references-count":167,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2018,6]]}},"alternative-id":["e20060393"],"URL":"https:\/\/doi.org\/10.3390\/e20060393","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,5,23]]}}}