{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T11:48:34Z","timestamp":1753876114139,"version":"3.41.2"},"reference-count":28,"publisher":"Oxford University Press (OUP)","license":[{"start":{"date-parts":[[2019,7,18]],"date-time":"2019-07-18T00:00:00Z","timestamp":1563408000000},"content-version":"vor","delay-in-days":198,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Key R&D Program of China","award":["2018YFC0910401","2016YFC0901604"],"award-info":[{"award-number":["2018YFC0910401","2016YFC0901604"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["31771478"],"award-info":[{"award-number":["31771478"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Iterative homology search has been widely used in identification of remotely related proteins. Our previous study has found that the query-seeded sequence iterative search can reduce homologous over-extension errors and greatly improve selectivity. However, iterative homology search remains challenging in protein functional prediction. More sensitive scoring models are highly needed to improve the predictive performance of the alignment methods, and alignment annotation with better visualization has also become imperative for result interpretation. Here we report an open-source application PSISearch2D that runs query-seeded iterative sequence search for remotely related protein detection. PSISearch2D retrieves domain annotation from Pfam, UniProtKB, CDD and PROSITE for resulting hits and demonstrates combined domain and sequence alignments in novel visualizations. A scoring model called C-value is newly defined to re-order hits with consideration of the combination of sequence and domain alignments. The benchmarking on the use of C-value indicates that PSISearch2D outperforms the original PSISearch2 tool in terms of both accuracy and specificity. PSISearch2D improves the characterization of unknown proteins in remote protein detection. Our evaluation tests show that PSISearch2D has provided annotation for 77\u2009695 of 139\u2009503 unknown bacteria proteins and 140\u2009751 of 352\u2009757 unknown virus proteins in UniProtKB, about 2.3-fold and 1.8-fold more characterization than the original PSISearch2, respectively. Together with advanced features of auto-iteration mode to handle large-scale data and optional programs for global and local sequence alignments, PSISearch2D enhances remotely related protein search.<\/jats:p>","DOI":"10.1093\/database\/baz092","type":"journal-article","created":{"date-parts":[[2019,6,16]],"date-time":"2019-06-16T11:07:32Z","timestamp":1560683252000},"source":"Crossref","is-referenced-by-count":1,"title":["Combined alignments of sequences and domains characterize unknown proteins with remotely related protein search PSISearch2D"],"prefix":"10.1093","volume":"2019","author":[{"given":"Minglei","family":"Yang","sequence":"first","affiliation":[{"name":"Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wenliang","family":"Zhang","sequence":"additional","affiliation":[{"name":"Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Guocai","family":"Yao","sequence":"additional","affiliation":[{"name":"Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haiyue","family":"Zhang","sequence":"additional","affiliation":[{"name":"Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Weizhong","family":"Li","sequence":"additional","affiliation":[{"name":"Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China"},{"name":"Center for Precision Medicine, Sun Yat-sen University, Guangzhou, China"},{"name":"Key Laboratory of Tropical Disease Control (Sun Yat-Sen University), Ministry of Education"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2019,7,17]]},"reference":[{"key":"2019071803060391400_ref1","doi-asserted-by":"crossref","first-page":"231","DOI":"10.1093\/bib\/bbw108","article-title":"A comprehensive review and comparison of different computational methods for protein remote homology detection","volume":"19","author":"Chen","year":"2018","journal-title":"Brief. Bioinform."},{"key":"2019071803060391400_ref2","doi-asserted-by":"crossref","first-page":"D158","DOI":"10.1093\/nar\/gkw1099","article-title":"UniProt: the universal protein knowledgebase","volume":"45","author":"The UniProt Consortium","year":"2017","journal-title":"Nucleic Acids Res."},{"key":"2019071803060391400_ref3","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2019071803060391400_ref4","doi-asserted-by":"crossref","first-page":"421","DOI":"10.1186\/1471-2105-10-421","article-title":"BLAST+: architecture and applications","volume":"10","author":"Camacho","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2019071803060391400_ref5","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1038\/nmeth.1818","article-title":"HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment","volume":"9","author":"Remmert","year":"2011","journal-title":"Nat. Methods"},{"key":"2019071803060391400_ref6","doi-asserted-by":"crossref","first-page":"431","DOI":"10.1186\/1471-2105-11-431","article-title":"Hidden Markov model speed heuristic and iterative HMM search procedure","volume":"11","author":"Johnson","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2019071803060391400_ref7","doi-asserted-by":"crossref","first-page":"e1002195","DOI":"10.1371\/journal.pcbi.1002195","article-title":"Accelerated profile HMM searches","volume":"7","author":"Eddy","year":"2011","journal-title":"PLoS Comput. Biol."},{"key":"2019071803060391400_ref8","doi-asserted-by":"crossref","first-page":"773","DOI":"10.1110\/ps.03328504","article-title":"Sensitivity and selectivity in protein structure comparison","volume":"13","author":"Sierk","year":"2004","journal-title":"Protein Sci."},{"key":"2019071803060391400_ref9","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1186\/1745-6150-7-12","article-title":"Domain enhanced lookup time accelerated BLAST","volume":"7","author":"Boratyn","year":"2012","journal-title":"Biol. Direct"},{"key":"2019071803060391400_ref10","doi-asserted-by":"crossref","first-page":"e454","DOI":"10.1093\/bioinformatics\/btl227","article-title":"On counting position weight matrix matches in a sequence, with application to discriminative motif finding","volume":"22","author":"Sinha","year":"2006","journal-title":"Bioinformatics"},{"key":"2019071803060391400_ref11","doi-asserted-by":"crossref","first-page":"2177","DOI":"10.1093\/nar\/gkp1219","article-title":"Homologous over-extension: a challenge for iterative similarity searches","volume":"38","author":"Gonzalez","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2019071803060391400_ref12","doi-asserted-by":"crossref","first-page":"1650","DOI":"10.1093\/bioinformatics\/bts240","article-title":"PSI-Search: iterative HOE-reduced profile SSEARCH searching","volume":"28","author":"Li","year":"2012","journal-title":"Bioinformatics"},{"key":"2019071803060391400_ref13","doi-asserted-by":"crossref","first-page":"e46","DOI":"10.1093\/nar\/gkw1207","article-title":"Query-seeded iterative sequence similarity searching improves selectivity 5\u201320-fold","volume":"45","author":"Pearson","year":"2017","journal-title":"Nucleic Acids Res."},{"key":"2019071803060391400_ref14","doi-asserted-by":"crossref","first-page":"1236","DOI":"10.1093\/bioinformatics\/btu031","article-title":"InterProScan 5: genome-scale protein function classification","volume":"30","author":"Jones","year":"2014","journal-title":"Bioinformatics"},{"key":"2019071803060391400_ref15","doi-asserted-by":"crossref","first-page":"D279","DOI":"10.1093\/nar\/gkv1344","article-title":"The Pfam protein families database: towards a more sustainable future","volume":"44","author":"Finn","year":"2016","journal-title":"Nucleic Acids Res."},{"key":"2019071803060391400_ref16","doi-asserted-by":"crossref","first-page":"D190","DOI":"10.1093\/nar\/gkw1107","article-title":"InterPro in 2017-beyond protein family and domain annotations","volume":"45","author":"Finn","year":"2017","journal-title":"Nucleic Acids Res."},{"key":"2019071803060391400_ref17","doi-asserted-by":"crossref","first-page":"D200","DOI":"10.1093\/nar\/gkw1129","article-title":"CDD\/SPARCLE: functional classification of proteins via subfamily domain architectures","volume":"45","author":"Marchler-Bauer","year":"2017","journal-title":"Nucleic Acids Res."},{"key":"2019071803060391400_ref18","doi-asserted-by":"crossref","first-page":"D289","DOI":"10.1093\/nar\/gkw1098","article-title":"CATH: an expanded resource to predict protein function through structure and sequence","volume":"45","author":"Dawson","year":"2017","journal-title":"Nucleic Acids Res."},{"key":"2019071803060391400_ref19","doi-asserted-by":"crossref","first-page":"D435","DOI":"10.1093\/nar\/gkx1069","article-title":"Gene3D: extensive prediction of globular domains in proteins","volume":"46","author":"Lewis","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"2019071803060391400_ref20","doi-asserted-by":"crossref","first-page":"D344","DOI":"10.1093\/nar\/gks1067","article-title":"New and continuing developments at PROSITE","volume":"41","author":"Sigrist","year":"2013","journal-title":"Nucleic Acids Res."},{"key":"2019071803060391400_ref21","doi-asserted-by":"crossref","first-page":"635","DOI":"10.1016\/0888-7543(91)90071-L","article-title":"Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith\u2013Waterman and FASTA algorithms","volume":"11","author":"Pearson","year":"1991","journal-title":"Genomics"},{"key":"2019071803060391400_ref22","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","article-title":"Identification of common molecular subsequences","volume":"147","author":"Smith","year":"1981","journal-title":"J. Mol. Biol."},{"key":"2019071803060391400_ref23","doi-asserted-by":"crossref","first-page":"W580","DOI":"10.1093\/nar\/gkv279","article-title":"The EMBL-EBI bioinformatics web and programmatic tools framework","volume":"43","author":"Li","year":"2015","journal-title":"Nucleic Acids Res."},{"key":"2019071803060391400_ref24","doi-asserted-by":"crossref","first-page":"W695","DOI":"10.1093\/nar\/gkq313","article-title":"A new bioinformatics analysis tools framework at EMBL-EBI","volume":"38","author":"Goujon","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2019071803060391400_ref25","doi-asserted-by":"crossref","first-page":"2361","DOI":"10.1093\/bioinformatics\/btq426","article-title":"RefProtDom: a protein database with improved domain boundaries and homology relationships","volume":"26","author":"Gonzalez","year":"2010","journal-title":"Bioinformatics"},{"key":"2019071803060391400_ref26","doi-asserted-by":"crossref","first-page":"D1186","DOI":"10.1093\/nar\/gky1036","article-title":"ECO, the evidence and conclusion ontology: community standard for evidence information","volume":"47","author":"Giglio","journal-title":"Nucleic Acids Res."},{"key":"2019071803060391400_ref27","doi-asserted-by":"crossref","first-page":"D490","DOI":"10.1093\/nar\/gky1130","article-title":"The SUPERFAMILY 2.0 database: a significant proteome update and a new webserver","volume":"47","author":"Pandurangan","year":"2019","journal-title":"Nucleic Acids Res."},{"key":"2019071803060391400_ref28","doi-asserted-by":"crossref","first-page":"D310","DOI":"10.1093\/nar\/gkt1242","article-title":"SCOP2 prototype: a new approach to protein structure mining","volume":"42","author":"Andreeva","year":"2014","journal-title":"Nucleic Acids Res."}],"container-title":["Database"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baz092\/28939954\/baz092.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,9,23]],"date-time":"2019-09-23T22:37:14Z","timestamp":1569278234000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/database\/article\/doi\/10.1093\/database\/baz092\/5532822"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,1,1]]},"references-count":28,"URL":"https:\/\/doi.org\/10.1093\/database\/baz092","relation":{},"ISSN":["1758-0463"],"issn-type":[{"type":"electronic","value":"1758-0463"}],"subject":[],"published-other":{"date-parts":[[2019]]},"published":{"date-parts":[[2019,1,1]]},"article-number":"baz092"}}