{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,26]],"date-time":"2025-10-26T14:11:44Z","timestamp":1761487904304},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"18","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2007,9,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: A recent development in sequence-based remote homologue detection is the introduction of profile\u2013profile comparison methods. These are more powerful than previous technologies and can detect potentially homologous relationships missed by structural classifications such as CATH and SCOP. As structural classifications traditionally act as the gold standard of homology this poses a challenge in benchmarking them.<\/jats:p><jats:p>Results: We present a novel approach which allows an accurate benchmark of these methods against the CATH structural classification. We then apply this approach to assess the accuracy of a range of publicly available methods for remote homology detection including several profile\u2013profile methods (COMPASS, HHSearch, PRC) from two perspectives. First, in distinguishing homologous domains from non-homologues and second, in annotating proteomes with structural domain families. PRC is shown to be the best method for distinguishing homologues. We show that SAM is the best practical method for annotating genomes, whilst using COMPASS for the most remote homologues would increase coverage. Finally, we introduce a simple approach to increase the sensitivity of remote homologue detection by up to 10 %. 
This is achieved by combining multiple methods with a jury vote.<\/jats:p><jats:p>Contact: \u00a0reid@biochem.ucl.ac.uk<\/jats:p><jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm355","type":"journal-article","created":{"date-parts":[[2007,8,21]],"date-time":"2007-08-21T00:34:49Z","timestamp":1187656489000},"page":"2353-2360","source":"Crossref","is-referenced-by-count":27,"title":["Methods of remote homology detection can be combined to increase coverage by 10% in the midnight zone"],"prefix":"10.1093","volume":"23","author":[{"given":"Adam James","family":"Reid","sequence":"first","affiliation":[{"name":"Department of Biochemistry and Molecular Biology, University College London, Gower Street, London WC1E 6BT, UK"}]},{"given":"Corin","family":"Yeats","sequence":"additional","affiliation":[{"name":"Department of Biochemistry and Molecular Biology, University College London, Gower Street, London WC1E 6BT, UK"}]},{"given":"Christine Anne","family":"Orengo","sequence":"additional","affiliation":[{"name":"Department of Biochemistry and Molecular Biology, University College London, Gower Street, London WC1E 6BT, UK"}]}],"member":"286","published-online":{"date-parts":[[2007,9,15]]},"reference":[{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"809","DOI":"10.1093\/bioinformatics\/btm034","article-title":"SCOOP: a simple method for identification of novel protein superfamily 
relationships","volume":"23","author":"Bateman","year":"2007","journal-title":"Bioinformatics"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"6073","DOI":"10.1073\/pnas.95.11.6073","article-title":"Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships","volume":"95","author":"Brenner","year":"1998","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1186\/1471-2105-7-48","article-title":"On single and multiple models of protein families for the detection of remote sequence relationships","volume":"7","author":"Casbon","year":"2006","journal-title":"BMC. Bioinformatics"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"823","DOI":"10.1002\/j.1460-2075.1986.tb04288.x","article-title":"The relation between the divergence of sequence and structure in proteins","volume":"5","author":"Chothia","year":"1986","journal-title":"EMBO J"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"361","DOI":"10.1016\/S0959-440X(96)80056-X","article-title":"Hidden Markov models","volume":"6","author":"Eddy","year":"1996","journal-title":"Curr. Opin. Struct. Biol"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"D247","DOI":"10.1093\/nar\/gkj149","article-title":"Pfam: clans, web tools and services","volume":"34","author":"Finn","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"903","DOI":"10.1006\/jmbi.2001.5080","article-title":"Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure","volume":"313","author":"Gough","year":"2001","journal-title":"J. Mol. 
Biol"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"D291","DOI":"10.1093\/nar\/gkl959","article-title":"The CATH domain structure database: new protocols and classification levels give a more comprehensive resource for exploring evolution","volume":"35","author":"Greene","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"909","DOI":"10.1016\/S0022-2836(02)00992-0","article-title":"Quantifying the similarities within fold space","volume":"323","author":"Harrison","year":"2002","journal-title":"J. Mol. Biol"},{"key":"2023041106215587300_","first-page":"3600","article-title":"The FSSP database of structurally aligned protein fold families","volume":"22","author":"Holm","year":"1994","journal-title":"Nucleic Acids Res"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"447","DOI":"10.1016\/S0969-2126(02)00750-5","article-title":"Novel sequences propel familiar folds","volume":"10","author":"Jawad","year":"2002","journal-title":"Structure"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"846","DOI":"10.1093\/bioinformatics\/14.10.846","article-title":"Hidden Markov models for detecting remote protein homologies","volume":"14","author":"Karplus","year":"1998","journal-title":"Bioinformatics"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"1173","DOI":"10.1016\/j.jmb.2004.12.032","article-title":"Comprehensive evaluation of protein structure alignment methods: scoring by geometric measures","volume":"346","author":"Kolodny","year":"2005","journal-title":"J. Mol. 
Biol"},{"key":"2023041106215587300_","unstructured":"Madera M PRC \u2013 The Profile Comparer PhD thesis 2006 University of Cambridge"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"4321","DOI":"10.1093\/nar\/gkf544","article-title":"A comparison of profile hidden Markov model procedures for remote homology detection","volume":"30","author":"Madera","year":"2002","journal-title":"Nucleic Acids Res"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"1257","DOI":"10.1006\/jmbi.1999.3233","article-title":"Benchmarking PSI-BLAST in genome annotation","volume":"293","author":"Muller","year":"1999","journal-title":"J. Mol. Biol"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"536","DOI":"10.1016\/S0022-2836(05)80134-2","article-title":"SCOP: a structural classification of proteins database for the investigation of sequences and structures","volume":"247","author":"Murzin","year":"1995","journal-title":"J. Mol. Biol"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"617","DOI":"10.1016\/S0076-6879(96)66038-8","article-title":"SSAP: sequential structure alignment program for protein structure comparison","volume":"266","author":"Orengo","year":"1996","journal-title":"Methods Enzymol"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"1201","DOI":"10.1006\/jmbi.1998.2221","article-title":"Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods","volume":"284","author":"Park","year":"1998","journal-title":"J. Mol. 
Biol"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"3836","DOI":"10.1093\/nar\/24.19.3836","article-title":"Searching databases of conserved sequence regions by aligning protein multiple-alignments","volume":"24","author":"Pietrokovski","year":"1996","journal-title":"Nucleic Acids Res"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"725","DOI":"10.1016\/j.jmb.2006.05.035","article-title":"Structural diversity of domain superfamilies in the CATH database","volume":"360","author":"Reeves","year":"2006","journal-title":"J. Mol. Biol"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1016\/S0022-2836(02)01371-2","article-title":"COMPASS: a tool for comparison of multiple protein alignments with assessment of statistical significance","volume":"326","author":"Sadreyev","year":"2003","journal-title":"J. Mol. Biol"},{"issue":"Web Server issue","key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"W653","DOI":"10.1093\/nar\/gkm293","article-title":"COMPASS server for remote homology inference","volume":"35","author":"Sadreyev","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"776","DOI":"10.1093\/bioinformatics\/16.9.776","article-title":"MaxSub: an automated measure for the assessment of protein structure prediction quality","volume":"16","author":"Siew","year":"2000","journal-title":"Bioinformatics"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"1800","DOI":"10.1110\/ps.041056105","article-title":"Assessing strategies for improved superfamily recognition","volume":"14","author":"Sillitoe","year":"2005","journal-title":"Protein Sci"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"951","DOI":"10.1093\/bioinformatics\/bti125","article-title":"Protein homology detection by HMM-HMM 
comparison","volume":"21","author":"Soding","year":"2005","journal-title":"Bioinformatics"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1016\/0960-9822(93)90255-M","article-title":"Structural similarity of DNA-binding domains of bacteriophage repressors and the globin core","volume":"3","author":"Subbiah","year":"1993","journal-title":"Curr. Biol"},{"key":"2023041106215587300_","doi-asserted-by":"crossref","first-page":"1257","DOI":"10.1006\/jmbi.2001.5293","article-title":"Within the twilight zone: a sensitive profile-profile comparison tool based on information theory","volume":"315","author":"Yona","year":"2002","journal-title":"J. Mol. Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/18\/2353\/49816675\/bioinformatics_23_18_2353.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/18\/2353\/49816675\/bioinformatics_23_18_2353.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,13]],"date-time":"2023-05-13T21:13:17Z","timestamp":1684012397000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/23\/18\/2353\/237686"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,9,15]]},"references-count":29,"journal-issue":{"issue":"18","published-print":{"date-parts":[[2007,9,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm355","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2007,9,15]]},"published":{"date-parts":[[2007,9,15]]}}}