{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,5,28]],"date-time":"2024-05-28T19:32:53Z","timestamp":1716924773634},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"9","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2006,5,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Most biological sequences contain compositionally biased segments in which one or more residue types are significantly overrepresented. The function and evolution of these segments are poorly understood. Usually, all types of compositionally biased segments are masked and ignored during sequence analysis. However, it has been shown for a number of proteins that biased segments that contain amino acids with similar chemical properties are involved in a variety of molecular functions and human diseases. A detailed large-scale analysis of the functional implications and evolutionary conservation of different compositionally biased segments requires a sensitive method capable of detecting user-specified types of compositional bias.<\/jats:p>\n               <jats:p>Results: We present BIAS, a novel sensitive method for the detection of compositionally biased segments composed of a user-specified set of residue types. BIAS uses the discrete scan statistics that provides a highly accurate correction for multiple tests to compute analytical estimates of the significance of each compositionally biased segment. The method can take into account global compositional bias when computing analytical estimates of the significance of local clusters. BIAS is benchmarked against SEG, SAPS and CAST programs. We also use BIAS to show that groups of proteins with the same biological function are significantly associated with particular types of compositionally biased segments.<\/jats:p>\n               <jats:p>Availability: The software is available at<\/jats:p>\n               <jats:p>Contact: \u00a0ikuznetsov@albany.edu<\/jats:p>\n               <jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btl049","type":"journal-article","created":{"date-parts":[[2006,2,25]],"date-time":"2006-02-25T01:14:23Z","timestamp":1140830063000},"page":"1055-1063","source":"Crossref","is-referenced-by-count":14,"title":["A novel sensitive method for the detection of user-defined compositional bias in biological sequences"],"prefix":"10.1093","volume":"22","author":[{"given":"Igor B.","family":"Kuznetsov","sequence":"first","affiliation":[{"name":"Gen*NY*sis Center for Excellence in Cancer Genomics, Department of Epidemiology and Biostatistics, University at Albany, State University of New York \u00a0 One Discovery Drive, Rensselaer, NY 12144, USA"}]},{"given":"Seungwoo","family":"Hwang","sequence":"additional","affiliation":[{"name":"Gen*NY*sis Center for Excellence in Cancer Genomics, Department of Epidemiology and Biostatistics, University at Albany, State University of New York \u00a0 One Discovery Drive, Rensselaer, NY 12144, USA"}]}],"member":"286","published-online":{"date-parts":[[2006,2,24]]},"reference":[{"key":"2023012409135290800_b1","doi-asserted-by":"crossref","first-page":"672","DOI":"10.1093\/bioinformatics\/18.5.672","article-title":"Detecting cryptically simple protein sequences using the SIMPLE algorithm","volume":"8","author":"Alba","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012409135290800_b2","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023012409135290800_b3","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology. The Gene Ontology consortium","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat. Genet."},{"key":"2023012409135290800_b4","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1093\/nar\/28.1.45","article-title":"The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000","volume":"28","author":"Bairoch","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023012409135290800_b5","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1006\/jsbi.1998.3965","article-title":"Supercoiled protein motifs: the collagen triple-helix and the alpha-helical coiled coil","volume":"122","author":"Beck","year":"1998","journal-title":"J. Struct. Biol."},{"key":"2023012409135290800_b6","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1093\/protein\/12.1.23","article-title":"Amino acid composition of protein termini are biased in different manners","volume":"12","author":"Berezovsky","year":"1999","journal-title":"Protein Eng."},{"key":"2023012409135290800_b7","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The protein data bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023012409135290800_b8","doi-asserted-by":"crossref","first-page":"5698","DOI":"10.1073\/pnas.86.15.5698","article-title":"Association of charge clusters with functional domains of cellular transcription factors","volume":"86","author":"Brendel","year":"1989","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012409135290800_b9","doi-asserted-by":"crossref","first-page":"2002","DOI":"10.1073\/pnas.89.6.2002","article-title":"Methods and algorithms for statistical analysis of protein sequences","volume":"89","author":"Brendel","year":"1992","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012409135290800_b10","doi-asserted-by":"crossref","first-page":"1166","DOI":"10.1110\/ps.8.6.1166","article-title":"Polymer principles and protein folding","volume":"8","author":"Dill","year":"1999","journal-title":"Protein Sci."},{"key":"2023012409135290800_b11","doi-asserted-by":"crossref","first-page":"222","DOI":"10.1006\/jmbi.1993.1576","article-title":"Refined crystal structure of the seryl-tRNA synthetase from Thermus thermophilus at 2.5\u00c5 resolution","volume":"234","author":"Fujinaga","year":"1993","journal-title":"J. Mol. Biol."},{"key":"2023012409135290800_b12","doi-asserted-by":"crossref","first-page":"1126","DOI":"10.1126\/science.282.5391.1126","article-title":"Chromosome 2 sequence of the human malaria parasite Plasmodium falciparum","volume":"282","author":"Gardner","year":"1998","journal-title":"Science"},{"key":"2023012409135290800_b13","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1007\/978-1-4757-3460-7","volume-title":"Scan Statistics","author":"Glaz","year":"2001"},{"key":"2023012409135290800_b14","doi-asserted-by":"crossref","first-page":"R40","DOI":"10.1186\/gb-2003-4-6-r40","article-title":"A method to assess compositional bias in biological sequences and its application to prion-like glutamine\/asparagine-rich domains in eukaryotic proteomes","volume":"4","author":"Harrison","year":"2003","journal-title":"Genome Biol."},{"key":"2023012409135290800_b15","doi-asserted-by":"crossref","first-page":"131","DOI":"10.1007\/s002390010073","article-title":"Evolution of simple sequence in proteins","volume":"51","author":"Huntley","year":"2000","journal-title":"J. Mol. Evol."},{"key":"2023012409135290800_b16","doi-asserted-by":"crossref","first-page":"134","DOI":"10.1002\/prot.10150","article-title":"Simple sequences are rare in the Protein Data Bank","volume":"48","author":"Huntley","year":"2002","journal-title":"Proteins"},{"key":"2023012409135290800_b17","doi-asserted-by":"crossref","first-page":"1560","DOI":"10.1073\/pnas.93.4.1560","article-title":"Trinucleotide repeats and long homopeptides in genes and proteins associated with nervous system disease and development","volume":"93","author":"Karlin","year":"1996","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012409135290800_b18","doi-asserted-by":"crossref","first-page":"388","DOI":"10.1016\/0076-6879(90)83026-6","article-title":"Identification of significant sequence patterns in proteins","volume":"183","author":"Karlin","year":"1990","journal-title":"Methods Enzymol."},{"key":"2023012409135290800_b19","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1073\/pnas.012608599","article-title":"Amino acid runs in eukaryotic proteomes and disease associations","volume":"99","author":"Karlin","year":"2002","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012409135290800_b20","doi-asserted-by":"crossref","first-page":"344","DOI":"10.1016\/S0959-440X(03)00073-3","article-title":"Genome comparisons and analysis","volume":"13","author":"Karlin","year":"2003","journal-title":"Curr. Opin. Struct. Biol."},{"key":"2023012409135290800_b21","doi-asserted-by":"crossref","first-page":"4214","DOI":"10.1093\/emboj\/20.15.4214","article-title":"The kink-turn: a new RNA secondary structure motif","volume":"20","author":"Klein","year":"2001","journal-title":"EMBO J."},{"key":"2023012409135290800_b22","doi-asserted-by":"crossref","first-page":"770","DOI":"10.1038\/nsb0901-770","article-title":"Crystal structure of the human prion protein reveals a mechanism for oligomerization","volume":"8","author":"Knaus","year":"2001","journal-title":"Nat. Struct. Biol."},{"key":"2023012409135290800_b23","doi-asserted-by":"crossref","first-page":"1672","DOI":"10.1093\/bioinformatics\/btg212","article-title":"Comparison of sequence masking algorithms and the detection of biased protein sequence regions","volume":"19","author":"Kreil","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012409135290800_b24","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1016\/S0753-3322(99)80059-6","article-title":"Trafficking of the cellular isoform of the prion protein","volume":"53","author":"Lehmann","year":"1999","journal-title":"Biomed. Pharmacother."},{"key":"2023012409135290800_b25","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1093\/bioinformatics\/18.1.77","article-title":"Tolerating some redundancy significantly speeds up clustering of large protein databases","volume":"8","author":"Li","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012409135290800_b26","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1007\/PL00006396","article-title":"Biased usages of arginines and lysines in proteins are correlated with local-scale fluctuations of the G + C content of DNA sequences","volume":"47","author":"Nishizawa","year":"1998","journal-title":"J. Mol. Evol."},{"key":"2023012409135290800_b27","doi-asserted-by":"crossref","first-page":"915","DOI":"10.1093\/bioinformatics\/16.10.915","article-title":"CAST: an iterative algorithm for the complexity analysis of sequence tracts","volume":"16","author":"Promponas","year":"2000","journal-title":"Bioinformatics"},{"key":"2023012409135290800_b28","doi-asserted-by":"crossref","first-page":"13363","DOI":"10.1073\/pnas.95.23.13363","article-title":"Prions","volume":"95","author":"Prusiner","year":"1998","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012409135290800_b29","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1080\/07391102.2005.10507013","article-title":"Underlying hydrophobic sequence periodicity of protein tertiary structure","volume":"22","author":"Silverman","year":"2005","journal-title":"J. Biomol. Struct. Dyn."},{"key":"2023012409135290800_b30","doi-asserted-by":"crossref","first-page":"1581","DOI":"10.1093\/oxfordjournals.molbev.a026257","article-title":"Nucleotide bias causes a genome wide bias in the amino acid composition of proteins","volume":"17","author":"Singer","year":"2000","journal-title":"Mol. Biol. Evol."},{"key":"2023012409135290800_b31","doi-asserted-by":"crossref","first-page":"999","DOI":"10.1021\/ar030266l","article-title":"C-type cytochrome formation: chemical and biological enigmas","volume":"37","author":"Stevens","year":"2004","journal-title":"Acc. Chem. Res."},{"key":"2023012409135290800_b32","doi-asserted-by":"crossref","first-page":"4091","DOI":"10.1002\/j.1460-2075.1989.tb08593.x","article-title":"Differential expression of two Xenopus c-myc proto-oncogenes during development","volume":"8","author":"Vriz","year":"1989","journal-title":"EMBO J."},{"key":"2023012409135290800_b33","doi-asserted-by":"crossref","first-page":"554","DOI":"10.1016\/S0076-6879(96)66035-2","article-title":"Analysis of compositionally biased regions in sequence databases","volume":"266","author":"Wootton","year":"1996","journal-title":"Methods Enzymol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/9\/1055\/48840125\/bioinformatics_22_9_1055.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/9\/1055\/48840125\/bioinformatics_22_9_1055.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,24]],"date-time":"2023-01-24T09:52:31Z","timestamp":1674553951000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/22\/9\/1055\/200151"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,2,24]]},"references-count":33,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2006,5,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btl049","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2006,5,1]]},"published":{"date-parts":[[2006,2,24]]}}}