{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,25]],"date-time":"2026-02-25T17:58:14Z","timestamp":1772042294360,"version":"3.50.1"},"reference-count":44,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":1041,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/3.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2014,2,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Although many amino acid substitution matrices have been developed, it has not been well understood which is the best for similarity searches, especially for remote homology detection. Therefore, we collected information related to existing matrices, condensed it and derived a novel matrix that can detect more remote homology than ever.<\/jats:p>\n               <jats:p>Results: Using principal component analysis with existing matrices and benchmarks, we developed a novel matrix, which we designate as MIQS. The detection performance of MIQS is validated and compared with that of existing general purpose matrices using SSEARCH with optimized gap penalties for each matrix. Results show that MIQS is able to detect more remote homology than the existing matrices on an independent dataset. In addition, the performance of our developed matrix was superior to that of CS-BLAST, which was a novel similarity search method with no amino acid matrix. We also evaluated the alignment quality of matrices and methods, which revealed that MIQS shows higher alignment sensitivity than that with the existing matrix series and CS-BLAST. Fundamentally, these results are expected to constitute good proof of the availability and\/or importance of amino acid matrices in sequence analysis. 
Moreover, with our developed matrix, sophisticated similarity search methods such as sequence\u2013profile and profile\u2013profile comparison methods can be improved further.<\/jats:p>\n               <jats:p>Availability and implementation: Newly developed matrices and datasets used for this study are available at http:\/\/csas.cbrc.jp\/Ssearch\/.<\/jats:p>\n               <jats:p>Contact: \u00a0k-tomii@aist.go.jp<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online<\/jats:p>","DOI":"10.1093\/bioinformatics\/btt694","type":"journal-article","created":{"date-parts":[[2013,11,27]],"date-time":"2013-11-27T04:14:49Z","timestamp":1385525689000},"page":"317-325","source":"Crossref","is-referenced-by-count":45,"title":["Revisiting amino acid substitution matrices for identifying distantly related proteins"],"prefix":"10.1093","volume":"30","author":[{"given":"Kazunori","family":"Yamada","sequence":"first","affiliation":[{"name":"Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan"}]},{"given":"Kentaro","family":"Tomii","sequence":"additional","affiliation":[{"name":"Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan"}]}],"member":"286","published-online":{"date-parts":[[2013,11,26]]},"reference":[{"key":"2023012710402024000_btt694-B1","doi-asserted-by":"crossref","first-page":"S19","DOI":"10.1186\/1471-2164-13-S7-S19","article-title":"The parasite specific substitution matrices improve the annotation of apicomplexan proteins","volume":"13","author":"Ali","year":"2012","journal-title":"BMC Genomics"},{"key":"2023012710402024000_btt694-B2","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of 
protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023012710402024000_btt694-B3","doi-asserted-by":"crossref","first-page":"D419","DOI":"10.1093\/nar\/gkm993","article-title":"Data growth and its impact on the SCOP database: new developments","volume":"36","author":"Andreeva","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012710402024000_btt694-B4","doi-asserted-by":"crossref","first-page":"3240","DOI":"10.1093\/bioinformatics\/bts622","article-title":"Discriminative modelling of context-specific amino acid substitution probabilities","volume":"28","author":"Angermuller","year":"2012","journal-title":"Bioinformatics"},{"key":"2023012710402024000_btt694-B5","doi-asserted-by":"crossref","first-page":"1323","DOI":"10.1093\/protein\/7.11.1323","article-title":"Amino acid substitution during functionally constrained divergent evolution of protein sequences","volume":"7","author":"Benner","year":"1994","journal-title":"Protein Eng."},{"key":"2023012710402024000_btt694-B6","doi-asserted-by":"crossref","first-page":"3770","DOI":"10.1073\/pnas.0810767106","article-title":"Sequence context-specific profiles for homology searching","volume":"106","author":"Biegert","year":"2009","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012710402024000_btt694-B7","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1186\/1471-2105-9-236","article-title":"A novel series of compositionally biased substitution matrices for comparing Plasmodium proteins","volume":"9","author":"Brick","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012710402024000_btt694-B8","doi-asserted-by":"crossref","first-page":"D189","DOI":"10.1093\/nar\/gkh034","article-title":"The ASTRAL Compendium in 2004","volume":"32","author":"Chandonia","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012710402024000_btt694-B9","doi-asserted-by":"crossref","first-page":"3704","DOI":"10.1093\/bioinformatics\/bti616","article-title":"Pairwise alignment incorporating dipeptide covariation","volume":"21","author":"Crooks","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012710402024000_btt694-B10","first-page":"345","article-title":"A model of evolutionary change in proteins","volume":"5","author":"Dayhoff","year":"1978","journal-title":"Atlas Protein Seq. Struct."},{"key":"2023012710402024000_btt694-B11","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1007\/s00239-001-2304-y","article-title":"rtREV: an amino acid substitution matrix for inference of retrovirus and reverse transcriptase phylogeny","volume":"55","author":"Dimmic","year":"2002","journal-title":"J. Mol. 
Evol."},{"key":"2023012710402024000_btt694-B12","doi-asserted-by":"crossref","first-page":"396","DOI":"10.1186\/1471-2105-10-396","article-title":"Optimizing substitution matrix choice and gap parameters for sequence alignment","volume":"10","author":"Edgar","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023012710402024000_btt694-B13","doi-asserted-by":"crossref","first-page":"3150","DOI":"10.1093\/bioinformatics\/bts565","article-title":"CD-HIT: accelerated for clustering the next-generation sequencing data","volume":"28","author":"Fu","year":"2012","journal-title":"Bioinformatics"},{"key":"2023012710402024000_btt694-B14","doi-asserted-by":"crossref","first-page":"S116","DOI":"10.1093\/bioinformatics\/18.suppl_2.S116","article-title":"Contextual alignment of biological sequences (Extended abstract)","volume":"18","author":"Gambin","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012710402024000_btt694-B15","doi-asserted-by":"crossref","first-page":"489","DOI":"10.1006\/bbrc.1994.1255","article-title":"Analysis of amino acid substitution during divergent evolution: the 400 by 400 dipeptide substitution matrix","volume":"199","author":"Gonnet","year":"1994","journal-title":"Biochem. Biophys. Res. Commun."},{"key":"2023012710402024000_btt694-B16","doi-asserted-by":"crossref","first-page":"903","DOI":"10.1006\/jmbi.2001.5080","article-title":"Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure","volume":"313","author":"Gough","year":"2001","journal-title":"J. Mol. Biol."},{"key":"2023012710402024000_btt694-B17","doi-asserted-by":"crossref","first-page":"1834","DOI":"10.1109\/JPROC.2002.805303","article-title":"Bootstrapping and normalization for enhanced evaluations of pairwise sequence comparison","volume":"90","author":"Green","year":"2002","journal-title":"Proc. 
IEEE"},{"key":"2023012710402024000_btt694-B18","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1016\/S0097-8485(96)80004-0","article-title":"Use of receiver operating characteristic (ROC) analysis to evaluate sequence matching","volume":"20","author":"Gribskov","year":"1996","journal-title":"Comput. Chem."},{"key":"2023012710402024000_btt694-B19","doi-asserted-by":"crossref","first-page":"10915","DOI":"10.1073\/pnas.89.22.10915","article-title":"Amino acid substitution matrices from protein blocks","volume":"89","author":"Henikoff","year":"1992","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012710402024000_btt694-B20","doi-asserted-by":"crossref","first-page":"2780","DOI":"10.1093\/bioinformatics\/btn507","article-title":"Searching protein structure databases with DaliLite v.3","volume":"24","author":"Holm","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012710402024000_btt694-B21","doi-asserted-by":"crossref","first-page":"863","DOI":"10.1093\/bioinformatics\/btg494","article-title":"Optimizing substitution matrices by separating score distributions","volume":"20","author":"Hourai","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012710402024000_btt694-B22","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1093\/bioinformatics\/bti828","article-title":"Improved pairwise alignments of proteins in the twilight zone using local structure predictions","volume":"22","author":"Huang","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012710402024000_btt694-B23","doi-asserted-by":"crossref","first-page":"e26400","DOI":"10.1371\/journal.pone.0026400","article-title":"Pattern of amino acid substitutions in transmembrane domains of beta-barrel membrane proteins for detecting remote homologs in bacteria and mitochondria","volume":"6","author":"Jimenez-Morales","year":"2011","journal-title":"PLoS One"},{"key":"2023012710402024000_btt694-B24","first-page":"1347","article-title":"Detecting remote homologues using 
scoring matrices calculated from the estimation of amino acid substitution rates of beta-barrel membrane proteins","volume":"2008","author":"Jimenez-Morales","year":"2008","journal-title":"Conf. Proc. IEEE Eng. Med. Biol. Soc."},{"key":"2023012710402024000_btt694-B25","doi-asserted-by":"crossref","first-page":"1576","DOI":"10.1110\/ps.9.8.1576","article-title":"Use of residue pairs in protein sequence-sequence and sequence-structure alignments","volume":"9","author":"Jung","year":"2000","journal-title":"Protein Sci."},{"key":"2023012710402024000_btt694-B26","doi-asserted-by":"crossref","first-page":"498","DOI":"10.1002\/1097-0134(20001201)41:4<498::AID-PROT70>3.0.CO;2-3","article-title":"Optimization of a new score function for the detection of remote homologs","volume":"41","author":"Kann","year":"2000","journal-title":"Proteins"},{"key":"2023012710402024000_btt694-B27","doi-asserted-by":"crossref","first-page":"296","DOI":"10.1186\/1756-0500-4-296","article-title":"Protein sequence alignment with family-specific amino acid similarity matrices","volume":"4","author":"Kuznetsov","year":"2011","journal-title":"BMC Res. 
Notes"},{"key":"2023012710402024000_btt694-B28","doi-asserted-by":"crossref","first-page":"1339","DOI":"10.1093\/bioinformatics\/btn130","article-title":"Simple is beautiful: a straightforward approach to improve the delineation of true and false positives in PSI-BLAST searches","volume":"24","author":"Lee","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012710402024000_btt694-B29","doi-asserted-by":"crossref","first-page":"457","DOI":"10.1186\/1471-2105-12-457","article-title":"A novel substitution matrix fitted to the compositional bias in Mollicutes improves the prediction of homologous relationships","volume":"12","author":"Lemaitre","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023012710402024000_btt694-B30","doi-asserted-by":"crossref","first-page":"D499","DOI":"10.1093\/nar\/gks1266","article-title":"Genome3D: a UK collaborative project to annotate genomic sequences with predicted 3D structures based on SCOP and CATH domains","volume":"41","author":"Lewis","year":"2013","journal-title":"Nucleic Acids Res."},{"key":"2023012710402024000_btt694-B31","doi-asserted-by":"crossref","first-page":"1679","DOI":"10.1089\/cmb.2008.0035","article-title":"Substitution matrices of residue triplets derived from protein blocks","volume":"17","author":"Liu","year":"2010","journal-title":"J. Comput. 
Biol."},{"key":"2023012710402024000_btt694-B32","doi-asserted-by":"crossref","first-page":"S182","DOI":"10.1093\/bioinformatics\/17.suppl_1.S182","article-title":"Non-symmetric score matrices and the detection of homologous transmembrane proteins","volume":"17","author":"Muller","year":"2001","journal-title":"Bioinformatics"},{"key":"2023012710402024000_btt694-B33","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1093\/oxfordjournals.molbev.a003985","article-title":"Estimating amino acid substitution models: a comparison of Dayhoff's estimator, the resolvent approach and a maximum likelihood method","volume":"19","author":"Muller","year":"2002","journal-title":"Mol. Biol. Evol."},{"key":"2023012710402024000_btt694-B34","doi-asserted-by":"crossref","first-page":"760","DOI":"10.1093\/bioinformatics\/16.9.760","article-title":"PHAT: a transmembrane-specific substitution matrix. Predicted hydrophobic and transmembrane","volume":"16","author":"Ng","year":"2000","journal-title":"Bioinformatics"},{"key":"2023012710402024000_btt694-B35","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1186\/1471-2105-9-531","article-title":"Fr-TM-align: a new protein structural alignment method based on fragment alignments and the TM-score","volume":"9","author":"Pandit","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012710402024000_btt694-B36","doi-asserted-by":"crossref","first-page":"635","DOI":"10.1016\/0888-7543(91)90071-L","article-title":"Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms","volume":"11","author":"Pearson","year":"1991","journal-title":"Genomics"},{"key":"2023012710402024000_btt694-B37","doi-asserted-by":"crossref","first-page":"605","DOI":"10.1002\/prot.10132","article-title":"Optimization of a new score function for the generation of accurate 
alignments","volume":"48","author":"Qian","year":"2002","journal-title":"Proteins"},{"key":"2023012710402024000_btt694-B38"},{"key":"2023012710402024000_btt694-B39","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1186\/1471-2105-7-246","article-title":"Optimizing amino acid substitution matrices with a local alignment kernel","volume":"7","author":"Saigo","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023012710402024000_btt694-B40","doi-asserted-by":"crossref","first-page":"2994","DOI":"10.1093\/nar\/29.14.2994","article-title":"Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements","volume":"29","author":"Schaffer","year":"2001","journal-title":"Nucleic Acids Res."},{"key":"2023012710402024000_btt694-B41","doi-asserted-by":"crossref","first-page":"D490","DOI":"10.1093\/nar\/gks1211","article-title":"New functional families (FunFams) in CATH to improve the mapping of conserved functional sites to 3D structures","volume":"41","author":"Sillitoe","year":"2013","journal-title":"Nucleic Acids Res."},{"key":"2023012710402024000_btt694-B42","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1093\/protein\/9.1.27","article-title":"Analysis of amino acid indices and mutation matrices for sequence comparison and structure prediction of proteins","volume":"9","author":"Tomii","year":"1996","journal-title":"Protein Eng."},{"key":"2023012710402024000_btt694-B43","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/S0022-2836(05)80006-3","article-title":"Sequence alignment and penalty choice. Review of concepts, case studies and implications","volume":"235","author":"Vingron","year":"1994","journal-title":"J. Mol. 
Biol."},{"key":"2023012710402024000_btt694-B44","doi-asserted-by":"crossref","first-page":"15688","DOI":"10.1073\/pnas.2533904100","article-title":"The compositional adjustment of amino acid substitution matrices","volume":"100","author":"Yu","year":"2003","journal-title":"Proc. Natl Acad. Sci. USA"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/3\/317\/48915258\/bioinformatics_30_3_317.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/3\/317\/48915258\/bioinformatics_30_3_317.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,27]],"date-time":"2023-01-27T10:52:09Z","timestamp":1674816729000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/30\/3\/317\/229250"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,11,26]]},"references-count":44,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2014,2,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btt694","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2014,2,1]]},"published":{"date-parts":[[2013,11,26]]}}}