{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T01:28:22Z","timestamp":1770514102242,"version":"3.49.0"},"reference-count":7,"publisher":"Oxford University Press (OUP)","issue":"10","license":[{"start":{"date-parts":[[2018,10,8]],"date-time":"2018-10-08T00:00:00Z","timestamp":1538956800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000268","name":"BBSRC","doi-asserted-by":"publisher","award":["BB\/L002817\/1"],"award-info":[{"award-number":["BB\/L002817\/1"]}],"id":[{"id":"10.13039\/501100000268","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,5,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Many bioinformatics areas require us to assign domain matches onto stretches of a query protein. Starting with a set of candidate matches, we want to identify the optimal subset that has limited\/no overlap between matches. This may be further complicated by discontinuous domains in the input data. Existing tools are increasingly facing very large data-sets for which they require prohibitive amounts of CPU-time and memory.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We present cath-resolve-hits (CRH), a new tool that uses a dynamic-programming algorithm implemented in open-source C++ to handle large datasets quickly (up to \u223c1 million hits\/second) and in reasonable amounts of memory. It accepts multiple input formats and provides its output in plain text, JSON or graphical HTML. We describe a benchmark against an existing algorithm, which shows CRH delivers very similar or slightly improved results and very much improved CPU\/memory performance on large datasets.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>CRH is available at https:\/\/github.com\/UCLOrengoGroup\/cath-tools; documentation is available at http:\/\/cath-tools.readthedocs.io.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty863","type":"journal-article","created":{"date-parts":[[2018,10,5]],"date-time":"2018-10-05T19:44:49Z","timestamp":1538768689000},"page":"1766-1767","source":"Crossref","is-referenced-by-count":51,"title":["cath-resolve-hits: a new tool that resolves domain matches suspiciously quickly"],"prefix":"10.1093","volume":"35","author":[{"given":"T E","family":"Lewis","sequence":"first","affiliation":[{"name":"Department of Structural and Molecular Biology, UCL, Darwin Building, London, UK"}]},{"given":"I","family":"Sillitoe","sequence":"additional","affiliation":[{"name":"Department of Structural and Molecular Biology, UCL, Darwin Building, London, UK"}]},{"given":"J G","family":"Lees","sequence":"additional","affiliation":[{"name":"Department of Biological and Medical Sciences, Faculty of Health and Life Sciences, Oxford Brookes University, Oxford, Oxfordshire, UK"}]}],"member":"286","published-online":{"date-parts":[[2018,10,8]]},"reference":[{"key":"2023013107490247300_bty863-B1","doi-asserted-by":"crossref","first-page":"D289","DOI":"10.1093\/nar\/gkw1098","article-title":"CATH: an expanded resource to predict protein function through structure and sequence","volume":"45","author":"Dawson","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023013107490247300_bty863-B2","doi-asserted-by":"crossref","first-page":"D190","DOI":"10.1093\/nar\/gkw1107","article-title":"InterPro in 2017-beyond protein family and domain annotations","volume":"45","author":"Finn","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023013107490247300_bty863-B3","doi-asserted-by":"crossref","first-page":"D404","DOI":"10.1093\/nar\/gkv1231","article-title":"Gene3D: expanding the utility of domain assignments","volume":"44","author":"Lam","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023013107490247300_bty863-B4","doi-asserted-by":"crossref","first-page":"D435","DOI":"10.1093\/nar\/gkx1069","article-title":"Gene3D: extensive prediction of globular domains in proteins","volume":"46","author":"Lewis","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2023013107490247300_bty863-B5","doi-asserted-by":"crossref","first-page":"D123","DOI":"10.1093\/nar\/gkr975","article-title":"IMG\/M: the integrated metagenome data management and comparative analysis system","volume":"40","author":"Markowitz","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023013107490247300_bty863-B6","doi-asserted-by":"crossref","first-page":"D483","DOI":"10.1093\/nar\/gks1258","article-title":"SIFTS: structure integration with function, taxonomy and sequences resource","volume":"41","author":"Velankar","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023013107490247300_bty863-B7","doi-asserted-by":"crossref","first-page":"745","DOI":"10.1093\/bioinformatics\/btq034","article-title":"A fast and automated solution for accurately resolving protein domain architectures","volume":"26","author":"Yeats","year":"2010","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/10\/1766\/48970213\/bioinformatics_35_10_1766.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/10\/1766\/48970213\/bioinformatics_35_10_1766.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T10:51:19Z","timestamp":1675162279000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/10\/1766\/5123356"}},"subtitle":[],"editor":[{"given":"John","family":"Hancock","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2018,10,8]]},"references-count":7,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2019,5,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty863","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,5,15]]},"published":{"date-parts":[[2018,10,8]]}}}