{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T19:07:32Z","timestamp":1760728052229},"reference-count":41,"publisher":"Oxford University Press (OUP)","issue":"18","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":2220,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,9,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Solenoid proteins are emerging as a protein class with properties intermediate between structured and intrinsically unstructured proteins. Containing repeating structural units, solenoid proteins are expected to share sequence similarities. However, in many cases, the sequence similarities are weak and non-detectable. Moreover, solenoids can be degenerated and widely vary in the number of units. So that it is difficult to detect them. Recently, several solenoid repeats detection methods have been proposed, such as self-alignment of the sequence, spectral analysis and discrete Fourier transform of sequence. Although these methods have shown good performance on certain data sets, they often fail to detect repeats with weak similarities. In this article, we propose a new approach to recognize solenoid repeats and non-solenoid proteins using stationary wavelet packet transform (SWPT). Our method associates with three advantages: (i) naturally representing five main factors of protein structure and properties by wavelet analysis technique; (ii) extracting novel wavelet features that can capture hidden components from solenoid sequence similarities and distinguish them from global proteins; (iii) obtaining statistics features that capture repeating motifs of solenoid proteins.<\/jats:p>\n               <jats:p>Results: Our method analyzes the characteristics of amino acid sequence in both spectral and temporal domains using SWPT. Both global and local information of proteins are captured by SWPT coefficients. We obtain and integrate wavelet-based features and statistics-based features of amino acid sequence to improve the classification task. Our proposed method is evaluated by comparing to state-of-the-art methods such as HHrepID and REPETITA. The experimental results show that our algorithm consistently outperforms them in areas under ROC curve. At the same false positive rate, the sensitivity of our WAVELET method is higher than other methods.<\/jats:p>\n               <jats:p>Availability: \u00a0http:\/\/www.naaan.org\/anvo\/Software\/Software.htm<\/jats:p>\n               <jats:p>Contact: \u00a0anphuocnhu.vo@mavs.uta.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/btq371","type":"journal-article","created":{"date-parts":[[2010,9,7]],"date-time":"2010-09-07T17:41:46Z","timestamp":1283881306000},"page":"i467-i473","source":"Crossref","is-referenced-by-count":10,"title":["Solenoid and non-solenoid protein recognition using stationary wavelet packet transform"],"prefix":"10.1093","volume":"26","author":[{"given":"An","family":"Vo","sequence":"first","affiliation":[{"name":"1 The Feinstein Institute for Medical Research, North Shore LIJ Health System, NY, 2Department of Electrical Engineering and 3Department of Computer Science and Engineering, University of Texas at Arlington, TX, USA"}]},{"given":"Nha","family":"Nguyen","sequence":"additional","affiliation":[{"name":"1 The Feinstein Institute for Medical Research, North Shore LIJ Health System, NY, 2Department of Electrical Engineering and 3Department of Computer Science and Engineering, University of Texas at Arlington, TX, USA"}]},{"given":"Heng","family":"Huang","sequence":"additional","affiliation":[{"name":"1 The Feinstein Institute for Medical Research, North Shore LIJ Health System, NY, 2Department of Electrical Engineering and 3Department of Computer Science and Engineering, University of Texas at Arlington, TX, USA"}]}],"member":"286","published-online":{"date-parts":[[2010,9,4]]},"reference":[{"key":"2023012508264328000_B1","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023012508264328000_B2","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1038\/ng1095-115","article-title":"HEAT repeats in the Huntington's disease protein","volume":"11","author":"Andrade","year":"1995","journal-title":"Nat. Genet."},{"key":"2023012508264328000_B3","doi-asserted-by":"crossref","first-page":"6395","DOI":"10.1073\/pnas.0408677102","article-title":"Solving the protein sequence metric problem","volume":"102","author":"Atchley","year":"2005","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023012508264328000_B4","doi-asserted-by":"crossref","first-page":"220","DOI":"10.1016\/0167-4838(87)90112-9","article-title":"On the tertiary structure of the extracellular domains of the epidermal growth factor and insulin receptors","volume":"916","author":"Bajaja","year":"1987","journal-title":"Biochim. Biophys. Acta."},{"key":"2023012508264328000_B5","doi-asserted-by":"crossref","first-page":"1477","DOI":"10.1002\/pro.5560070625","article-title":"Structure and distribution of pentapeptide repeats in bacteria","volume":"7","author":"Bateman","year":"1998","journal-title":"Protein Sci."},{"key":"2023012508264328000_B6","doi-asserted-by":"crossref","first-page":"3357","DOI":"10.1002\/j.1460-2075.1993.tb06009.x","article-title":"Three-dimensional structure of the alkaline protease of Pseudomonas aeruginosa: a two-domain protein with a calcium binding parallel beta roll motif","volume":"12","author":"Baumann","year":"1993","journal-title":"EMBO J."},{"key":"2023012508264328000_B7","doi-asserted-by":"crossref","first-page":"807","DOI":"10.1093\/bioinformatics\/btn039","article-title":"De novo identification of highly diverged protein repeats by probabilistic consistency","volume":"24","author":"Biegert","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012508264328000_B8","first-page":"408","article-title":"Novel repetitive sequence motifs in the alpha and beta subunits of prenyl-protein transferases and homology of the alpha subunit to the MAD2 gene product in yeast","volume":"4","author":"Boguski","year":"1992","journal-title":"New Biol."},{"key":"2023012508264328000_B9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-8-210","article-title":"A multiresolution approach to automated classification of protein subcellular location images","volume":"8","author":"Chebira","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023012508264328000_B10","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1007\/978-1-4612-2544-7_9","article-title":"Translation invariant de-noising","volume":"103","author":"Coifman","year":"1995","journal-title":"Lecture Notes Stat."},{"key":"2023012508264328000_B11","doi-asserted-by":"crossref","first-page":"498","DOI":"10.1093\/bioinformatics\/14.6.498","article-title":"Detecting periodic patterns in biological sequences","volume":"14","author":"Coward","year":"1998","journal-title":"Bioinformatics"},{"key":"2023012508264328000_B12","first-page":"53","article-title":"Chapter 3","volume-title":"Ten Lectures on Wavelets (CBMS-NSF Regional Conference Series in Applied Mathematics).","author":"Daubechies","year":"1992"},{"key":"2023012508264328000_B13","doi-asserted-by":"crossref","first-page":"515","DOI":"10.1016\/S0968-0004(00)01643-1","article-title":"The REPRO server: finding protein internal sequence repeats through the web","volume":"25","author":"George","year":"2000","journal-title":"Trends Biochem. Sci."},{"key":"2023012508264328000_B14","doi-asserted-by":"crossref","first-page":"325","DOI":"10.1038\/35018610","article-title":"Beta-helix structure and ice-binding properties of a hyperactive antifreeze protein from an insect","volume":"406","author":"Graether","year":"2000","journal-title":"Nature"},{"key":"2023012508264328000_B15","doi-asserted-by":"crossref","first-page":"383","DOI":"10.1016\/S0959-440X(99)80052-9","article-title":"Topological characteristics of helical repeat proteins","volume":"9","author":"Groves","year":"1999","journal-title":"Curr. Opin. Struct. Biol."},{"key":"2023012508264328000_B16","doi-asserted-by":"crossref","first-page":"W239","DOI":"10.1093\/nar\/gki405","article-title":"REPPER-repeats and their periodicities in fibrous proteins","volume":"33","author":"Gruber","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023012508264328000_B17","doi-asserted-by":"crossref","first-page":"224","DOI":"10.1002\/1097-0134(20001101)41:2<224::AID-PROT70>3.0.CO;2-Z","article-title":"Rapid automatic detection and alignment of repeats in protein sequence","volume":"41","author":"Heger","year":"2000","journal-title":"Proteins Struct. Funct. Genet."},{"key":"2023012508264328000_B18","doi-asserted-by":"crossref","first-page":"1094","DOI":"10.1016\/j.jmb.2006.02.039","article-title":"Standard conformations of beta-arches in beta-solenoid protein","volume":"358","author":"Hennetin","year":"2006","journal-title":"J. Mol. Biol."},{"key":"2023012508264328000_B19","doi-asserted-by":"crossref","first-page":"S2","DOI":"10.1186\/1471-2164-9-S2-S17","article-title":"Array CGH data modeling and smoothing in stationary wavelet packet transform domain","volume":"9","author":"Huang","year":"2008","journal-title":"BMC Genomics"},{"key":"2023012508264328000_B20","doi-asserted-by":"crossref","first-page":"49791","DOI":"10.1074\/jbc.M204982200","article-title":"What curve alpha-solenoid?","volume":"227","author":"Kajava","year":"2002","journal-title":"J. Biol. Chem."},{"key":"2023012508264328000_B21","doi-asserted-by":"crossref","first-page":"519","DOI":"10.1006\/jmbi.1998.1643","article-title":"Structural diversity of leucine-rich repeat proteins","volume":"277","author":"Kajava","year":"1998","journal-title":"J. Mol. Biol."},{"key":"2023012508264328000_B22","doi-asserted-by":"crossref","first-page":"132","DOI":"10.1006\/jsbi.2000.4328","article-title":"Review: proteins with repeated sequence-structural prediction and modeling","volume":"134","author":"Kajava","year":"2001","journal-title":"J. Struct. Biol."},{"key":"2023012508264328000_B23","doi-asserted-by":"crossref","first-page":"509","DOI":"10.1016\/S0968-0004(00)01667-4","article-title":"When protein folding is simplified to protein coiling: the continuum of solenoid protein structures","volume":"25","author":"Kobe","year":"2000","journal-title":"Trends in Biochemical. Sci."},{"key":"2023012508264328000_B24","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1016\/S0004-3702(97)00043-X","article-title":"Wrappers for feature subset selection","volume":"97","author":"Kohavia","year":"1997","journal-title":"Artificial Intelligence"},{"key":"2023012508264328000_B25","first-page":"340","article-title":"Chapter 12","volume-title":"Principles of Multivariate Analysis: A User's Perspective (Oxford Statistical Science Series).","author":"Krzanowski","year":"1988"},{"key":"2023012508264328000_B26","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1093\/bioinformatics\/19.1.2","article-title":"Wavelets in bioinformatics and computational biology: state of art and perspectives","volume":"19","author":"Li","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012508264328000_B27","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1016\/S0968-0004(97)01117-1","article-title":"Self-compartmentalizing proteases","volume":"22","author":"Lupas","year":"1997","journal-title":"Trends Biochem. Sci."},{"key":"2023012508264328000_B28","doi-asserted-by":"crossref","first-page":"674","DOI":"10.1109\/34.192463","article-title":"A theory for multiresolution signal decomposition: the wavelet representation","volume":"11","author":"Mallat","year":"1989","journal-title":"IEEE Pattern Anal. Mach. Intell."},{"key":"2023012508264328000_B29","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1006\/jmbi.1999.3136","article-title":"Census of protein repeats","volume":"293","author":"Marcotte","year":"1999","journal-title":"J. Mol. Biol."},{"key":"2023012508264328000_B30","doi-asserted-by":"crossref","first-page":"i289","DOI":"10.1093\/bioinformatics\/btp232","article-title":"REPETITA: detection and discrimination of the periodicity of protein solenoid repeats by Fourier transform","volume":"25","author":"Marsella","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012508264328000_B31","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1006\/jmbi.2001.5332","article-title":"Wavelet transforms for the characterization and detection of repeating motifs","volume":"316","author":"Murray","year":"2002","journal-title":"J. Mol. Biol."},{"key":"2023012508264328000_B32","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1002\/prot.20202","article-title":"Toward the detection and validation of repeats in protein structure","volume":"57","author":"Murray","year":"2004","journal-title":"Proteins"},{"key":"2023012508264328000_B33","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1007\/978-1-4612-2544-7_17","article-title":"The stationary wavelet transform and some statistical applications","volume":"103","author":"Nason","year":"1995","journal-title":"Lecture Notes Stat."},{"key":"2023012508264328000_B34","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1089\/cmb.2009.0013","article-title":"Stationary wavelet packet transform and dependent Laplacian bivariate shrinkage estimator for array-CGH data smoothing","volume":"17","author":"Nguyen","year":"2010","journal-title":"J. Comput. Biol."},{"key":"2023012508264328000_B35","doi-asserted-by":"crossref","first-page":"452","DOI":"10.1093\/nar\/gkg062","article-title":"The CATH database: an extended protein family resource for structural and functional genomics","volume":"31","author":"Pearl","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023012508264328000_B36","doi-asserted-by":"crossref","first-page":"1181","DOI":"10.1038\/nsb1296-991","article-title":"A leucine-rich repeat variant with a novel repetitive protein structural motif","volume":"3","author":"Peters","year":"1996","journal-title":"Struct. Biol."},{"key":"2023012508264328000_B37","doi-asserted-by":"crossref","first-page":"997","DOI":"10.1126\/science.270.5238.997","article-title":"A left-handed parallel beta helix in the structure of UDP-N-acetylglucosamine acyltransferase","volume":"270","author":"Raetz","year":"1995","journal-title":"Science"},{"key":"2023012508264328000_B38","doi-asserted-by":"crossref","first-page":"2632","DOI":"10.1093\/bioinformatics\/btn488","article-title":"TESE: generating specific protein structure test set ensembles","volume":"24","author":"Sirocco","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012508264328000_B39","doi-asserted-by":"crossref","first-page":"W137","DOI":"10.1093\/nar\/gkl130","article-title":"HHrep: de novo protein repeat detection and the origin of TIM barrels","volume":"34","author":"Soding","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023012508264328000_B40","doi-asserted-by":"crossref","first-page":"i311","DOI":"10.1093\/bioinformatics\/bth911","article-title":"Tracking repeats using significance and transitivity","volume":"20","author":"Szklarczyk","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012508264328000_B41","doi-asserted-by":"crossref","first-page":"626","DOI":"10.1109\/5.488704","article-title":"A review of wavelets in biomedical applications","volume":"84","author":"Unser","year":"1996","journal-title":"Proc. IEEE"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/18\/i467\/48856608\/bioinformatics_26_18_i467.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/18\/i467\/48856608\/bioinformatics_26_18_i467.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T08:27:11Z","timestamp":1674635231000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/18\/i467\/205279"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,9,4]]},"references-count":41,"journal-issue":{"issue":"18","published-print":{"date-parts":[[2010,9,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btq371","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2010,9,15]]},"published":{"date-parts":[[2010,9,4]]}}}