{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T12:54:27Z","timestamp":1773838467395,"version":"3.50.1"},"reference-count":40,"publisher":"Oxford University Press (OUP)","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2015,2,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: DNA and protein patterns are usefully represented by sequence logos. However, the methods for logo generation in common use lack a proper statistical basis, and are non-optimal for recognizing functionally relevant alignment columns.<\/jats:p>\n               <jats:p>Results: We redefine the information at a logo position as a per-observation multiple alignment log-odds score. Such scores are positive or negative, depending on whether a column\u2019s observations are better explained as arising from relatedness or chance. Within this framework, we propose distinct normalized maximum likelihood and Bayesian measures of column information. We illustrate these measures on High Mobility Group B (HMGB) box proteins and a dataset of enzyme alignments. Particularly in the context of protein alignments, our measures improve the discrimination of biologically relevant positions.<\/jats:p>\n               <jats:p>Availability and implementation: Our new measures are implemented in an open-source Web-based logo generation program, which is available at http:\/\/www.ncbi.nlm.nih.gov\/CBBresearch\/Yu\/logoddslogo\/index.html . A stand-alone version of the program is also available from this site.<\/jats:p>\n               <jats:p>Contact: \u00a0altschul@ncbi.nlm.nih.gov<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btu634","type":"journal-article","created":{"date-parts":[[2014,10,8]],"date-time":"2014-10-08T00:30:13Z","timestamp":1412728213000},"page":"324-331","source":"Crossref","is-referenced-by-count":16,"title":["Log-odds sequence logos"],"prefix":"10.1093","volume":"31","author":[{"given":"Yi-Kuo","family":"Yu","sequence":"first","affiliation":[{"name":"1 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, 2 Center for Human Genetics Research and 3 Department of Biomedical Informatics, Vanderbilt University, Nashville, TN 37232, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"John A.","family":"Capra","sequence":"additional","affiliation":[{"name":"1 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, 2 Center for Human Genetics Research and 3 Department of Biomedical Informatics, Vanderbilt University, Nashville, TN 37232, USA"},{"name":"1 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, 2 Center for Human Genetics Research and 3 Department of Biomedical Informatics, Vanderbilt University, Nashville, TN 37232, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Aleksandar","family":"Stojmirovi\u0107","sequence":"additional","affiliation":[{"name":"1 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, 2 Center for Human Genetics Research and 3 Department of Biomedical Informatics, Vanderbilt University, Nashville, TN 37232, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Landsman","sequence":"additional","affiliation":[{"name":"1 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, 2 Center for Human Genetics Research and 3 Department of Biomedical Informatics, Vanderbilt University, Nashville, TN 37232, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stephen F.","family":"Altschul","sequence":"additional","affiliation":[{"name":"1 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, 2 Center for Human Genetics Research and 3 Department of Biomedical Informatics, Vanderbilt University, Nashville, TN 37232, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2014,10,6]]},"reference":[{"key":"2023020116163838700_btu634-B1","doi-asserted-by":"crossref","first-page":"555","DOI":"10.1016\/0022-2836(91)90193-A","article-title":"Amino acid substitution matrices from an information theoretic perspective","volume":"219","author":"Altschul","year":"1991","journal-title":"J. Mol. Biol."},{"key":"2023020116163838700_btu634-B2","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. Biol."},{"key":"2023020116163838700_btu634-B3","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023020116163838700_btu634-B4","doi-asserted-by":"crossref","first-page":"815","DOI":"10.1093\/nar\/gkn981","article-title":"PSI-BLAST pseudocounts and the minimum description length principle","volume":"37","author":"Altschul","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023020116163838700_btu634-B5","doi-asserted-by":"crossref","first-page":"e1000852","DOI":"10.1371\/journal.pcbi.1000852","article-title":"The construction and use of log-odds substitution scores for multiple sequence alignment","volume":"6","author":"Altschul","year":"2010","journal-title":"PLoS Comp. Biol."},{"key":"2023020116163838700_btu634-B6","doi-asserted-by":"crossref","first-page":"647","DOI":"10.1016\/0022-2836(89)90234-9","article-title":"Weights for data related by a tree","volume":"207","author":"Altschul","year":"1989","journal-title":"J. Mol. Biol."},{"key":"2023020116163838700_btu634-B7","doi-asserted-by":"crossref","first-page":"e160","DOI":"10.1371\/journal.pcbi.0030160","article-title":"Automated protein subfamily identification and classification","volume":"3","author":"Brown","year":"2007","journal-title":"PLoS Comput. Biol."},{"key":"2023020116163838700_btu634-B8","first-page":"47","article-title":"Using Dirichlet mixture priors to derive hidden Markov models for protein families","volume-title":"Proceedings of First International Conference on Intelligent System for Molecular Biology","author":"Brown","year":"1993"},{"key":"2023020116163838700_btu634-B9","doi-asserted-by":"crossref","first-page":"1875","DOI":"10.1093\/bioinformatics\/btm270","article-title":"Predicting functionally important residues from sequence conservation","volume":"23","author":"Capra","year":"2007","journal-title":"Bioinformatics"},{"key":"2023020116163838700_btu634-B10","doi-asserted-by":"crossref","first-page":"786","DOI":"10.1038\/nmeth1109-786","article-title":"Improved visualization of protein consensus sequences by iceLogo","volume":"6","author":"Colaert","year":"2009","journal-title":"Nat. Methods"},{"key":"2023020116163838700_btu634-B11","doi-asserted-by":"crossref","DOI":"10.1002\/0471200611","volume-title":"Elements of Information Theory","author":"Cover","year":"1991"},{"key":"2023020116163838700_btu634-B12","doi-asserted-by":"crossref","first-page":"1188","DOI":"10.1101\/gr.849004","article-title":"WebLogo: a sequence logo generator","volume":"14","author":"Crooks","year":"2004","journal-title":"Genome Res."},{"key":"2023020116163838700_btu634-B13","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1016\/S0097-8485(96)80004-0","article-title":"Use of receiver operating characteristic (ROC) analysis to evaluate sequence matching","volume":"20","author":"Gribskov","year":"1996","journal-title":"Comput. Chem."},{"key":"2023020116163838700_btu634-B14","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/4643.001.0001","volume-title":"The Minimum Description Length Principle","author":"Gr\u00fcnwald","year":"2007"},{"key":"2023020116163838700_btu634-B15","doi-asserted-by":"crossref","first-page":"10915","DOI":"10.1073\/pnas.89.22.10915","article-title":"Amino acid substitution matrices from protein blocks","volume":"89","author":"Henikoff","year":"1992","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023020116163838700_btu634-B16","doi-asserted-by":"crossref","first-page":"574","DOI":"10.1016\/0022-2836(94)90032-9","article-title":"Position-based sequence weights","volume":"243","author":"Henikoff","year":"1994","journal-title":"J. Mol. Biol."},{"key":"2023020116163838700_btu634-B17","doi-asserted-by":"crossref","first-page":"453","DOI":"10.1098\/rspa.1946.0056","article-title":"An invariant form of the prior probability in estimation problems","volume":"186","author":"Jeffreys","year":"1946","journal-title":"Proc. R. Soc. London Ser. A"},{"key":"2023020116163838700_btu634-B18","doi-asserted-by":"crossref","first-page":"2264","DOI":"10.1073\/pnas.87.6.2264","article-title":"Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes","volume":"87","author":"Karlin","year":"1990","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023020116163838700_btu634-B19","doi-asserted-by":"crossref","first-page":"539","DOI":"10.1002\/bies.950150807","article-title":"A signature for the HMG-1 box DNA-binding proteins","volume":"15","author":"Landsman","year":"1993","journal-title":"Bioessays"},{"key":"2023020116163838700_btu634-B20","doi-asserted-by":"crossref","first-page":"208","DOI":"10.1126\/science.8211139","article-title":"Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment","volume":"262","author":"Lawrence","year":"1993","journal-title":"Science"},{"key":"2023020116163838700_btu634-B21","doi-asserted-by":"crossref","first-page":"D348","DOI":"10.1093\/nar\/gks1243","article-title":"CDD: conserved domains and protein three-dimensional structure","volume":"41","author":"Marchler-Bauer","year":"2013","journal-title":"Nucleic Acids Res."},{"key":"2023020116163838700_btu634-B22","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1016\/S0022-2836(02)00938-5","article-title":"The \n              S. cerevisiae\n               architectural HMGB protein NHP6A complexed with DNA: DNA and protein conformational changes upon binding","volume":"323","author":"Masse","year":"2002","journal-title":"J. Mol. Biol."},{"key":"2023020116163838700_btu634-B23","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1089\/cmb.2012.0244","article-title":"Dirichlet mixtures, the Dirichlet process, and the structure of protein space","volume":"20","author":"Nguyen","year":"2013","journal-title":"J. Comput. Biol."},{"key":"2023020116163838700_btu634-B24","doi-asserted-by":"crossref","first-page":"939","DOI":"10.1093\/nar\/gkn1019","article-title":"Pseudocounts for transcription factor binding sites","volume":"37","author":"Nishida","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023020116163838700_btu634-B25","doi-asserted-by":"crossref","first-page":"1211","DOI":"10.1038\/nmeth.2646","article-title":"pLogo: a probabilistic approach to visualizing sequence motifs","volume":"10","author":"O\u2019Shea","year":"2013","journal-title":"Nat. Methods"},{"key":"2023020116163838700_btu634-B26","doi-asserted-by":"crossref","first-page":"2444","DOI":"10.1073\/pnas.85.8.2444","article-title":"Improved tools for biological sequence comparison","volume":"85","author":"Pearson","year":"1988","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023020116163838700_btu634-B27","doi-asserted-by":"crossref","first-page":"8880","DOI":"10.1073\/pnas.88.20.8880","article-title":"Distribution of glutamine and asparagine residues and their near neighbors in peptides and proteins","volume":"88","author":"Robinson","year":"1991","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023020116163838700_btu634-B28","doi-asserted-by":"crossref","first-page":"2994","DOI":"10.1093\/nar\/29.14.2994","article-title":"Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements","volume":"29","author":"Sch\u00e4ffer","year":"2001","journal-title":"Nucleic Acids Res."},{"key":"2023020116163838700_btu634-B29","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1088\/0957-4484\/5\/1\/001","article-title":"Sequence logos, machine\/channel capacity, Maxwell\u2019s demon, and molecular computers: a review of the theory of molecular machines","volume":"5","author":"Schneider","year":"1994","journal-title":"Nanotechnology"},{"key":"2023020116163838700_btu634-B30","doi-asserted-by":"crossref","first-page":"6097","DOI":"10.1093\/nar\/18.20.6097","article-title":"Sequence logos: a new way to display consensus sequences","volume":"18","author":"Schneider","year":"1990","journal-title":"Nucleic Acids Res."},{"key":"2023020116163838700_btu634-B31","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1016\/0022-2836(86)90165-8","article-title":"Information content of binding sites on nucleotide sequences","volume":"188","author":"Schneider","year":"1986","journal-title":"J. Mol. Biol."},{"key":"2023020116163838700_btu634-B32","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1186\/1471-2105-5-7","article-title":"HMM Logos for visualization of protein families","volume":"5","author":"Schuster-B\u00f6ckler","year":"2004","journal-title":"BMC Bioinformatics"},{"key":"2023020116163838700_btu634-B33","first-page":"353","article-title":"Matrices for detecting distant relationships","volume-title":"Atlas of Protein Sequence and Structure","author":"Schwartz","year":"1978"},{"key":"2023020116163838700_btu634-B34","first-page":"327","article-title":"Dirichlet mixtures: a method for improved detection of weak but significant protein sequence homology","volume":"12","author":"Sj\u00f6lander","year":"1996","journal-title":"Comput. Appl. Biosci."},{"key":"2023020116163838700_btu634-B35","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","article-title":"Identification of common molecular subsequences","volume":"147","author":"Smith","year":"1981","journal-title":"J. Mol. Biol."},{"key":"2023020116163838700_btu634-B36","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1006\/jtbi.1998.0785","article-title":"Information content and free energy in DNA\u2013protein interactions","volume":"195","author":"Stormo","year":"1998","journal-title":"J. Theor. Biol."},{"key":"2023020116163838700_btu634-B37","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1016\/j.bbagrm.2009.09.008","article-title":"HMGB proteins: interactions with DNA and chromatin","volume":"1799","author":"Stros","year":"2010","journal-title":"Biochim. Biophys. Acta"},{"key":"2023020116163838700_btu634-B38","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1093\/protein\/12.5.387","article-title":"PSIC: profile extraction from sequence alignments with position-specific counts of independent observations","volume":"12","author":"Sunyaev","year":"1999","journal-title":"Protein Eng."},{"key":"2023020116163838700_btu634-B39","doi-asserted-by":"crossref","first-page":"1536","DOI":"10.1093\/bioinformatics\/btl151","article-title":"Two Sample Logo: a graphical representation of the differences between two sets of sequence alignments","volume":"22","author":"Vacic","year":"2006","journal-title":"Bioinformatics"},{"key":"2023020116163838700_btu634-B40","doi-asserted-by":"crossref","first-page":"W389","DOI":"10.1093\/nar\/gki439","article-title":"enoLOGOS: a versatile web tool for energy normalized sequence logos","volume":"33","author":"Workman","year":"2005","journal-title":"Nucleic Acids Res."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/3\/324\/49012845\/bioinformatics_31_3_324.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/3\/324\/49012845\/bioinformatics_31_3_324.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T00:29:11Z","timestamp":1675297751000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/31\/3\/324\/2365439"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,10,6]]},"references-count":40,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2015,2,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btu634","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2015,2,1]]},"published":{"date-parts":[[2014,10,6]]}}}