{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T00:00:10Z","timestamp":1773273610389,"version":"3.50.1"},"reference-count":23,"publisher":"Oxford University Press (OUP)","issue":"7","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2005,4,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Amino acid substitution matrices play a central role in protein alignment methods. Standard log-odds matrices, such as those of the PAM and BLOSUM series, are constructed from large sets of protein alignments having implicit background amino acid frequencies. However, these matrices frequently are used to compare proteins with markedly different amino acid compositions, such as transmembrane proteins or proteins from organisms with strongly biased nucleotide compositions. It has been argued elsewhere that standard matrices are not ideal for such comparisons and, furthermore, a rationale has been presented for transforming a standard matrix for use in a non-standard compositional context.<\/jats:p>\n               <jats:p>Results: This paper presents the mathematical details underlying the compositional adjustment of amino acid or DNA substitution matrices.<\/jats:p>\n               <jats:p>Availability: Programs implementing the methods described are available from the authors upon request.<\/jats:p>\n               <jats:p>Contact: \u00a0altschul@ncbi.nlm.nih.gov<\/jats:p>","DOI":"10.1093\/bioinformatics\/bti070","type":"journal-article","created":{"date-parts":[[2004,10,28]],"date-time":"2004-10-28T00:23:01Z","timestamp":1098922981000},"page":"902-911","source":"Crossref","is-referenced-by-count":81,"title":["The construction of amino acid substitution matrices for the comparison of proteins with non-standard 
compositions"],"prefix":"10.1093","volume":"21","author":[{"given":"Yi-Kuo","family":"Yu","sequence":"first","affiliation":[{"name":"National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA"}]},{"given":"Stephen F.","family":"Altschul","sequence":"additional","affiliation":[{"name":"National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA"}]}],"member":"286","published-online":{"date-parts":[[2004,10,27]]},"reference":[{"key":"2023013107282897300_B1","unstructured":"Altschul, S.F. (1991) Amino acid substitution matrices from an information theoretic perspective. J. Mol. Biol., 219, 555\u2013565"},{"key":"2023013107282897300_B2","doi-asserted-by":"crossref","unstructured":"Altschul, S.F. (1993) A protein alignment scoring system sensitive at all evolutionary distances. J. Mol. Evol., 36, 290\u2013300","DOI":"10.1007\/BF00160485"},{"key":"2023013107282897300_B3","doi-asserted-by":"crossref","unstructured":"Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403\u2013410","DOI":"10.1016\/S0022-2836(05)80360-2"},{"key":"2023013107282897300_B4","doi-asserted-by":"crossref","unstructured":"Altschul, S.F., Madden, T.L., Sch\u00e4ffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389\u20133402","DOI":"10.1093\/nar\/25.17.3389"},{"key":"2023013107282897300_B5","unstructured":"Altschul, S.F., Bundschuh, R., Olsen, R., Hwa, T. (2001) The estimation of statistical parameters for local alignment score distributions. Nucleic Acids Res., 29, 351\u2013361"},{"key":"2023013107282897300_B6","unstructured":"Dayhoff, M.O., Schwartz, R.M., Orcutt, B.C. (1978) A model of evolutionary change in proteins. In Dayhoff, M.O. (Ed.). 
Atlas of Protein Sequence and Structure, Washington, DC, National Biomedical Research Foundation, vol. 5, Suppl. 3, pp. 345\u2013352"},{"key":"2023013107282897300_B7","doi-asserted-by":"crossref","unstructured":"Dembo, A., Karlin, S., Zeitouni, O. (1994) Limit distribution of maximal non-aligned two-sequence segmental score. Ann. Prob., 22, 2022\u20132039","DOI":"10.1214\/aop\/1176988493"},{"key":"2023013107282897300_B8","doi-asserted-by":"crossref","unstructured":"Henikoff, S. and Henikoff, J.G. (1992) Amino acid substitution matrices from protein blocks. Proc. Natl Acad. Sci., USA, 89, 10915\u201310919","DOI":"10.1073\/pnas.89.22.10915"},{"key":"2023013107282897300_B9","doi-asserted-by":"crossref","unstructured":"Karlin, S. and Altschul, S.F. (1990) Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proc. Natl Acad. Sci., USA, 87, 2264\u20132268","DOI":"10.1073\/pnas.87.6.2264"},{"key":"2023013107282897300_B10","doi-asserted-by":"crossref","unstructured":"Kapatral, V., Anderson, I., Ivanova, N., Reznik, G., Los, T., Lykidis, A., Bhattacharyya, A., Bartman, A., Gardner, W., Grechkin, G., et al. (2002) Genome sequence and analysis of the oral bacterium Fusobacterium nucleatum strain ATCC 25586. J. Bacteriol., 184, 2005\u20132018","DOI":"10.1128\/JB.184.7.2005-2018.2002"},{"key":"2023013107282897300_B11","doi-asserted-by":"crossref","unstructured":"Kim, H., Certa, U., Dobeli, H., Jakob, P., Hol, W.G. (1998) Crystal structure of fructose-1,6-bisphosphate aldolase from the human malaria parasite Plasmodium falciparum. Biochemistry, 37, 4388\u20134396","DOI":"10.2210\/pdb1a5c\/pdb"},{"key":"2023013107282897300_B12","doi-asserted-by":"crossref","unstructured":"Knight, R.D., Freeland, S.J., Landweber, L.F. (2001) A simple model based on mutation and selection explains trends in codon and amino-acid usage and GC composition within and across genomes. 
Genome Biol., 2, research0010.1\u2013research0010.13","DOI":"10.1186\/gb-2001-2-4-research0010"},{"key":"2023013107282897300_B13","doi-asserted-by":"crossref","unstructured":"Muller, T., Rahmann, S., Rehmsmeier, M. (2001) Non-symmetric score matrices and the detection of homologous transmembrane proteins. Bioinformatics, 17 (Suppl. 1), S182\u2013S189","DOI":"10.1093\/bioinformatics\/17.suppl_1.S182"},{"key":"2023013107282897300_B14","doi-asserted-by":"crossref","unstructured":"Ng, P.C., Henikoff, J.G., Henikoff, S. (2000) PHAT: a transmembrane-specific substitution matrix. Bioinformatics, 16, 760\u2013766","DOI":"10.1093\/bioinformatics\/16.9.760"},{"key":"2023013107282897300_B15","unstructured":"Pearson, W.R. and Lipman, D.J. (1988) Improved tools for biological sequence comparison. Proc. Natl Acad. Sci., USA, 85, 2444\u20132448"},{"key":"2023013107282897300_B16","unstructured":"Rump, S.M. (1979) Polynomial minimum root separation. Math. Comput., 33, 327\u2013336"},{"key":"2023013107282897300_B17","doi-asserted-by":"crossref","unstructured":"Sch\u00e4ffer, A.A., Aravind, L., Madden, T.L., Shavirin, S., Spouge, J.L., Wolf, Y.I., Koonin, E.V., Altschul, S.F. (2001) Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res., 29, 2994\u20133005","DOI":"10.1093\/nar\/29.14.2994"},{"key":"2023013107282897300_B18","unstructured":"Schwartz, R.M. and Dayhoff, M.O. (1978) Matrices for detecting distant relationships. In Dayhoff, M.O. (Ed.). Atlas of Protein Sequence and Structure, Washington, DC, National Biomedical Research Foundation, vol. 5, Suppl. 3, pp. 353\u2013358"},{"key":"2023013107282897300_B19","unstructured":"Smith, T.F. and Waterman, M.S. (1981) Identification of common molecular subsequences. J. Mol. Biol., 147, 195\u2013197"},{"key":"2023013107282897300_B20","doi-asserted-by":"crossref","unstructured":"States, D.J., Gish, W., Altschul, S.F. 
(1991) Improved sensitivity of nucleic acid database searches using application-specific scoring matrices. Methods, 3, 66\u201370","DOI":"10.1016\/S1046-2023(05)80165-3"},{"key":"2023013107282897300_B21","unstructured":"Sueoka, N. (1988) Directional mutation pressure and neutral molecular evolution. Proc. Natl Acad. Sci., USA, 85, 2653\u20132657"},{"key":"2023013107282897300_B22","doi-asserted-by":"crossref","unstructured":"Wan, H. and Wootton, J.C. (2000) A global compositional complexity measure for biological sequences: AT-rich and GC-rich genomes encode less complex proteins. Comput. Chem., 24, 71\u201394","DOI":"10.1016\/S0097-8485(00)80008-X"},{"key":"2023013107282897300_B23","doi-asserted-by":"crossref","unstructured":"Yu, Y.-K., Wootton, J.C., Altschul, S.F. (2003) The compositional adjustment of amino acid substitution matrices. Proc. Natl Acad. Sci., USA, 100, 15688\u201315693","DOI":"10.1073\/pnas.2533904100"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/7\/902\/48967269\/bioinformatics_21_7_902.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/7\/902\/48967269\/bioinformatics_21_7_902.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T10:36:46Z","timestamp":1675161406000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/21\/7\/902\/268768"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2004,10,27]]},"references-count":23,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2005,4,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bti070","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}]
,"subject":[],"published-other":{"date-parts":[[2005,4,1]]},"published":{"date-parts":[[2004,10,27]]}}}