{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,28]],"date-time":"2025-09-28T04:16:54Z","timestamp":1759033014326},"reference-count":21,"publisher":"Oxford University Press (OUP)","issue":"5","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2005,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: The necessity to characterize the spatial uniformity (or lack of it) of symbols in biological sequences, given its implications for identification of the properties of the structures associated with the sequences.<\/jats:p><jats:p>Methods: A one-dimensional version of a recently introduced percolation-based approach is presented, which allows the accurate quantification of symbol distributions even in the presence of co-existing densities. An enhanced version of this methodology, which uses an agglomerative process to organize hierarchically the sequence into subsequences, is also proposed and illustrated.<\/jats:p><jats:p>Results: The potential of the proposed methodology is illustrated with respect to synthetic and real data (1881 zebrafish and 1200 Xenopus proteins) and compared to two alternative multiscale methodologies, with encouraging results including the possibility to identify particularly remarkable amino acid arrangements in proteins.<\/jats:p><jats:p>
Contact: \u00a0luciano@if.sc.usp.br<\/jats:p>","DOI":"10.1093\/bioinformatics\/bti050","type":"journal-article","created":{"date-parts":[[2004,10,1]],"date-time":"2004-10-01T00:24:35Z","timestamp":1096590275000},"page":"608-616","source":"Crossref","is-referenced-by-count":3,"title":["Biological sequence analysis through the one-dimensional percolation transform and its enhanced version"],"prefix":"10.1093","volume":"21","author":[{"given":"Luciano","family":"da Fontoura Costa","sequence":"first","affiliation":[{"name":"Cybernetic Vision Research Group, GII-IFSC, Universidade de S\u00e3o Paulo S\u00e3o Carlos, SP, Caixa Postal 369, 13560-970, Brazil"}]}],"member":"286","published-online":{"date-parts":[[2004,10,12]]},"reference":[{"key":"2023013107213892700_B1","unstructured":"Almeida, J.S., Carrico, J.A., Maretzek, A., Noble, P.A., Fletcher, M. 2001Analysis of genomic sequences by Chaos Game Representation. Bioinformatics17429\u2013437"},{"key":"2023013107213892700_B2","doi-asserted-by":"crossref","unstructured":"Anastassiou, D. 2001Genomic signal processing. IEEE Signal Process. Mag.188\u201320","DOI":"10.1109\/79.939833"},{"key":"2023013107213892700_B3","unstructured":"Baldi, P. and Brunak, S. Bioinformatics2001, Cambridge, MA The MIT Press"},{"key":"2023013107213892700_B4","doi-asserted-by":"crossref","unstructured":"Binney, J.J., Fisher, A.J., Dowrick, N.J., Newman, M.E.J. The Theory of Critical Phenomena1992, London Clarendon Press","DOI":"10.1093\/oso\/9780198513940.001.0001"},{"key":"2023013107213892700_B5","doi-asserted-by":"crossref","unstructured":"Borstnik, B. and Pumpernik, D. 2002Tandem repeats in protein coding regions of primate genes. Genome Res.12, pp. 909\u2013915","DOI":"10.1101\/gr.138802"},{"key":"2023013107213892700_B6","unstructured":"Brigham, E.O. The Fast Fourier Transform and Its Applications1988, NJ Prentice-Hall"},{"key":"2023013107213892700_B7","doi-asserted-by":"crossref","unstructured":"Castelo, A.T., Martins, W., Gao, G.R. 
2002TROLL\u2014tandem repeat occurrence locator. Bioinformatics18, pp. 634\u2013636","DOI":"10.1093\/bioinformatics\/18.4.634"},{"key":"2023013107213892700_B8","unstructured":"Clote, P. and Backofen, R. Computational Molecular Biology: An Introduction2000, John Wiley and Sons"},{"key":"2023013107213892700_B9","unstructured":"Costa, L.da F. 2004Actively-induced percolation: an effective approach to multiple-object systems characterization. eprint arXiv, cond-mat\/0404310"},{"key":"2023013107213892700_B10","unstructured":"Costa, L.da F. 2004Complementary material for the present article"},{"key":"2023013107213892700_B11","unstructured":"Costa, L.da F. and Cesar, R.M., Jr. Shape Analysis and Classification: Theory and Practice2001, Boca Raton, FL CRC Press"},{"key":"2023013107213892700_B12","doi-asserted-by":"crossref","unstructured":"Grosse, I., Bernaola-Galvan, P., Carpena, P., Roman-Roldan, R., Oliver, J., Stanley, H.E. 2002Analysis of symbolic sequences using the Jensen\u2013Shannon divergence. Phys. Rev. E65, pp. 041905","DOI":"10.1103\/PhysRevE.65.041905"},{"key":"2023013107213892700_B13","doi-asserted-by":"crossref","unstructured":"Irback, A., Peterson, C., Potthast, F. 1996Evidence for nonrandom hydrophobicity structures in protein chains. Proc. Natl Acad. Sci. USA939533\u20139538","DOI":"10.1073\/pnas.93.18.9533"},{"key":"2023013107213892700_B14","doi-asserted-by":"crossref","unstructured":"Nagai, N., Kuwata, K., Hayashi, T., Kuwata, H., Era, S. 2001Evolution of the periodicity and the self-similarity in DNA sequence: a Fourier transform analysis. Jpn. J. Physiol.51159\u2013168","DOI":"10.2170\/jjphysiol.51.159"},{"key":"2023013107213892700_B15","doi-asserted-by":"crossref","unstructured":"Peng, C.K., Buldyrev, S.V., Havlin, S., Simmons, M., Stanley, H.E., Goldberger, A.L. 1994Mosaic organization of DNA nucleotides. Phys. Rev. 
E491685","DOI":"10.1103\/PhysRevE.49.1685"},{"key":"2023013107213892700_B16","unstructured":"Popov, A.V., Sitnik, N.A., Savvateeva-Popova, E.V., Wolf, R., Heisenberg, M. 2003The role of central parts of the brain in the control of sound production during courtship in Drosophila melanogaster. Neurosci. Behav. Physiol.3353\u201365"},{"key":"2023013107213892700_B17","unstructured":"Sanger Institute. 2004Protein families database of alignment and HMM"},{"key":"2023013107213892700_B18","doi-asserted-by":"crossref","unstructured":"Starck, J.L., Murtagh, F., Bijaoui, A. Image Processing and Data Analysis1998, Cambridge, MA Cambridge University Press","DOI":"10.1017\/CBO9780511564352"},{"key":"2023013107213892700_B19","unstructured":"Stauffer, D. and Aharony, A. Introduction to Percolation Theory1994 Taylor and Francis"},{"key":"2023013107213892700_B20","doi-asserted-by":"crossref","unstructured":"Surlykke, A. and Moss, C.F. 2000Echolocation behavior of big brown bats, Eptesicus fuscus, in the field and the laboratory. J. Acoust. Soc. Am.108, pp. 2419\u20132429","DOI":"10.1121\/1.1315295"},{"key":"2023013107213892700_B21","doi-asserted-by":"crossref","unstructured":"Troyanskaya, O.G., Arbell, O., Koren, Y., Landau, G.M., Bolshoy, A. 2002Sequence complexity profiles of prokaryotic genomic sequences: a fast algorithm for calculating linguistic complexity. 
Bioinformatics18679\u2013688","DOI":"10.1093\/bioinformatics\/18.5.679"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/5\/608\/48962600\/bioinformatics_21_5_608.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/5\/608\/48962600\/bioinformatics_21_5_608.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,1,14]],"date-time":"2024-01-14T13:33:06Z","timestamp":1705239186000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/21\/5\/608\/220168"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2004,10,12]]},"references-count":21,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2005,3,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bti050","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2005,3,1]]},"published":{"date-parts":[[2004,10,12]]}}}