{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,13]],"date-time":"2026-03-13T13:00:25Z","timestamp":1773406825399,"version":"3.50.1"},"reference-count":21,"publisher":"Oxford University Press (OUP)","issue":"23","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":2573,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: DNA sequences can be represented by sequences of four symbols, but it is often useful to convert the symbols into real or complex numbers for further analysis. Several mapping schemes have been used in the past, but they seem unrelated to any intrinsic characteristic of DNA. The objective of this work was to find a mapping scheme directly related to DNA characteristics and that would be useful in discriminating between different species. Mathematical models to explore DNA correlation structures may contribute to a better knowledge of the DNA and to find a concise DNA description.<\/jats:p>\n               <jats:p>Results: We developed a methodology to process DNA sequences based on inter-nucleotide distances. Our main contribution is a method to obtain genomic signatures for complete genomes, based on the inter-nucleotide distances, that are able to discriminate between different species. Using these signatures and hierarchical clustering, it is possible to build phylogenetic trees. Phylogenetic trees lead to genome differentiation and allow the inference of phylogenetic relations. 
The phylogenetic trees generated in this work display related species close to each other, suggesting that the inter-nucleotide distances are able to capture essential information about the genomes. To create the genomic signature, we construct a vector which describes the inter-nucleotide distance distribution of a complete genome and compare it with the reference distance distribution, which is the distribution of a sequence where the nucleotides are placed randomly and independently. It is the residual or relative error between the data and the reference distribution that is used to compare the DNA sequences of different organisms.<\/jats:p>\n               <jats:p>Contact: \u00a0vera@ua.pt<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp546","type":"journal-article","created":{"date-parts":[[2009,9,17]],"date-time":"2009-09-17T01:30:31Z","timestamp":1253151031000},"page":"3064-3070","source":"Crossref","is-referenced-by-count":62,"title":["Genome analysis with inter-nucleotide distances"],"prefix":"10.1093","volume":"25","author":[{"given":"Vera","family":"Afreixo","sequence":"first","affiliation":[{"name":"1 Department of Mathematics, 2 Signal Processing Lab, IEETA and 3 Department of Electronics, Telecommunications and Informatics, University of Aveiro, 3810-193 Aveiro, Portugal"}]},{"given":"Carlos A. 
C.","family":"Bastos","sequence":"additional","affiliation":[{"name":"1 Department of Mathematics, 2 Signal Processing Lab, IEETA and 3 Department of Electronics, Telecommunications and Informatics, University of Aveiro, 3810-193 Aveiro, Portugal"},{"name":"1 Department of Mathematics, 2 Signal Processing Lab, IEETA and 3 Department of Electronics, Telecommunications and Informatics, University of Aveiro, 3810-193 Aveiro, Portugal"}]},{"given":"Armando J.","family":"Pinho","sequence":"additional","affiliation":[{"name":"1 Department of Mathematics, 2 Signal Processing Lab, IEETA and 3 Department of Electronics, Telecommunications and Informatics, University of Aveiro, 3810-193 Aveiro, Portugal"},{"name":"1 Department of Mathematics, 2 Signal Processing Lab, IEETA and 3 Department of Electronics, Telecommunications and Informatics, University of Aveiro, 3810-193 Aveiro, Portugal"}]},{"given":"Sara P.","family":"Garcia","sequence":"additional","affiliation":[{"name":"1 Department of Mathematics, 2 Signal Processing Lab, IEETA and 3 Department of Electronics, Telecommunications and Informatics, University of Aveiro, 3810-193 Aveiro, Portugal"}]},{"given":"Paulo J. S. G.","family":"Ferreira","sequence":"additional","affiliation":[{"name":"1 Department of Mathematics, 2 Signal Processing Lab, IEETA and 3 Department of Electronics, Telecommunications and Informatics, University of Aveiro, 3810-193 Aveiro, Portugal"},{"name":"1 Department of Mathematics, 2 Signal Processing Lab, IEETA and 3 Department of Electronics, Telecommunications and Informatics, University of Aveiro, 3810-193 Aveiro, Portugal"}]}],"member":"286","published-online":{"date-parts":[[2009,9,16]]},"reference":[{"key":"2023013112160693200_B1","doi-asserted-by":"crossref","first-page":"523","DOI":"10.1016\/j.dsp.2004.08.001","article-title":"Fourier analysis of symbolic data: a brief review","volume":"14","author":"Afreixo","year":"2004","journal-title":"Digit. 
Signal Process."},{"key":"2023013112160693200_B2","doi-asserted-by":"crossref","first-page":"031910","DOI":"10.1103\/PhysRevE.70.031910","article-title":"The spectrum and symbol distribution of nucleotide","volume":"70","author":"Afreixo","year":"2004","journal-title":"Phys. Rev. E"},{"key":"2023013112160693200_B3","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1109\/JSTSP.2008.923854","article-title":"Signal processing in sequence analysis: Advances in eukaryotic gene prediction","volume":"2","author":"Akhtar","year":"2008","journal-title":"IEEE J. Sel. Top. Signal Process."},{"key":"2023013112160693200_B4","doi-asserted-by":"crossref","DOI":"10.1109\/GENSIPS.2007.4365821","article-title":"On DNA numerical representation for period-3 based exon prediction","volume-title":"5th International Workshop on Genomic Signal Processing and Statistics.","author":"Akhtar","year":"2007"},{"key":"2023013112160693200_B5","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1109\/79.939833","article-title":"Genomic signal processing","volume":"18","author":"Anastassiou","year":"2001","journal-title":"IEEE Signal Process. Mag."},{"key":"2023013112160693200_B6","first-page":"373","article-title":"Symbol-balanced quaternionic periodicity transform for latent pattern detection in DNA sequences","volume-title":"Proceedings of IEEE ICASSP","author":"Brodzik","year":"2005"},{"key":"2023013112160693200_B7","doi-asserted-by":"crossref","first-page":"5084","DOI":"10.1103\/PhysRevE.51.5084","article-title":"Long-range correlation properties of coding and noncoding DNA sequences: GenBank analysis","volume":"51","author":"Buldyrev","year":"1995","journal-title":"Phys. Rev. 
E"},{"key":"2023013112160693200_B8","doi-asserted-by":"crossref","first-page":"1283","DOI":"10.1126\/science.1123061","article-title":"Toward automatic reconstruction of a highly resolved tree of life","volume":"311","author":"Ciccarelli","year":"2006","journal-title":"Science"},{"key":"2023013112160693200_B9","doi-asserted-by":"crossref","first-page":"871","DOI":"10.1016\/S0165-1684(02)00477-2","article-title":"Large scale features in DNA genomic signals","volume":"83","author":"Cristea","year":"2003","journal-title":"Signal Process."},{"key":"2023013112160693200_B10","article-title":"Overview of human repetitive DNA sequences","author":"Doggett","year":"2001","journal-title":"Curr. Protocols Hum. Genet."},{"key":"2023013112160693200_B11","doi-asserted-by":"crossref","first-page":"3353","DOI":"10.1242\/jcs.113.19.3353","article-title":"A myosin family tree","volume":"113","author":"Hodge","year":"2000","journal-title":"J. Cell Sci."},{"key":"2023013112160693200_B12","doi-asserted-by":"crossref","first-page":"2163","DOI":"10.1093\/nar\/18.8.2163","article-title":"Chaos game representation of gene structure","volume":"18","author":"Jeffrey","year":"1990","journal-title":"Nucleic Acids Res."},{"key":"2023013112160693200_B13","doi-asserted-by":"crossref","first-page":"196","DOI":"10.1016\/j.cplett.2004.11.059","article-title":"Application of 2-d graphical representation of DNA sequence","volume":"401","author":"Liao","year":"2005","journal-title":"Chem. Phys. Lett."},{"key":"2023013112160693200_B14","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1038\/nrg2185","article-title":"Approaches to comparative sequence analysis: towards a functional view of vertebrate genomes","volume":"9","author":"Margulies","year":"2008","journal-title":"Nat. Rev. 
Genet."},{"key":"2023013112160693200_B15","article-title":"Visualization of genomic data using inter-nucleotide distance signals","volume-title":"Proceedings of IEEE Genomic Signal Processing.","author":"Nair","year":"2005"},{"key":"2023013112160693200_B16","first-page":"509","article-title":"Preliminary wavelet analysis of genomic sequences","volume-title":"Proceedings of IEEE Bioinformatics Conference.","author":"Ning","year":"2003"},{"key":"2023013112160693200_B17","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1016\/j.cplett.2008.03.011","article-title":"Another look at the chaos-game representation of DNA","volume":"456","author":"Randic","year":"2008","journal-title":"Chem. Phys. Lett."},{"key":"2023013112160693200_B18","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1016\/S0022-5193(86)80060-1","article-title":"A measure of DNA periodicity","volume":"118","author":"Silverman","year":"1986","journal-title":"J. Theor. Biol."},{"key":"2023013112160693200_B19","doi-asserted-by":"crossref","first-page":"3805","DOI":"10.1103\/PhysRevLett.68.3805","article-title":"Evolution of long-range fractal correlations and 1\/f noise in DNA base sequences","volume":"68","author":"Voss","year":"1992","journal-title":"Phys. Rev. Lett."},{"key":"2023013112160693200_B20","doi-asserted-by":"crossref","first-page":"628","DOI":"10.1109\/78.984752","article-title":"Computing linear transforms of symbolic signals","volume":"50","author":"Wang","year":"2002","journal-title":"IEEE Trans. Signal Process."},{"key":"2023013112160693200_B21","doi-asserted-by":"crossref","first-page":"767","DOI":"10.1080\/07391102.1994.10508031","article-title":"Z curves, an intuitive tool for visualising and analysing the DNA sequences","volume":"11","author":"Zhang","year":"1994","journal-title":"J. Biomol. Struct.
Dyn."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/23\/3064\/48998268\/bioinformatics_25_23_3064.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/23\/3064\/48998268\/bioinformatics_25_23_3064.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T21:56:14Z","timestamp":1675202174000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/23\/3064\/214817"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,9,16]]},"references-count":21,"journal-issue":{"issue":"23","published-print":{"date-parts":[[2009,12,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp546","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,12,1]]},"published":{"date-parts":[[2009,9,16]]}}}