{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,30]],"date-time":"2025-12-30T09:00:34Z","timestamp":1767085234743,"version":"3.37.3"},"reference-count":37,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2019,8,30]],"date-time":"2019-08-30T00:00:00Z","timestamp":1567123200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,2,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Deep learning has become the dominant technology for protein contact prediction. However, the factors that affect the performance of deep learning in contact prediction have not been systematically investigated.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We analyzed the results of our three deep learning-based contact prediction methods (MULTICOM-CLUSTER, MULTICOM-CONSTRUCT and MULTICOM-NOVEL) in the CASP13 experiment and identified several key factors [i.e. deep learning technique, multiple sequence alignment (MSA), distance distribution prediction and domain-based contact integration] that influenced the contact prediction accuracy. We compared our convolutional neural network (CNN)-based contact prediction methods with three coevolution-based methods on 75 CASP13 targets consisting of 108 domains. We demonstrated that the CNN-based multi-distance approach was able to leverage global coevolutionary coupling patterns comprised of multiple correlated contacts for more accurate contact prediction than the local coevolution-based methods, leading to a substantial increase of precision by 19.2 percentage points. We also tested different alignment methods and domain-based contact prediction with the deep learning contact predictors. The comparison of the three methods showed deeper sequence alignments and the integration of domain-based contact prediction with the full-length contact prediction improved the performance of contact prediction. Moreover, we demonstrated that the domain-based contact prediction based on a novel ab initio approach of parsing domains from MSAs alone without using known protein structures was a simple, fast approach to improve contact prediction. Finally, we showed that predicting the distribution of inter-residue distances in multiple distance intervals could capture more structural information and improve binary contact prediction.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>https:\/\/github.com\/multicom-toolbox\/DNCON2\/.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btz679","type":"journal-article","created":{"date-parts":[[2019,8,29]],"date-time":"2019-08-29T13:17:58Z","timestamp":1567084678000},"page":"1091-1098","source":"Crossref","is-referenced-by-count":37,"title":["Analysis of several key factors influencing deep learning-based inter-residue contact prediction"],"prefix":"10.1093","volume":"36","author":[{"given":"Tianqi","family":"Wu","sequence":"first","affiliation":[{"name":"Department of Electrical Engineering and Computer Science, University of Missouri , Columbia, MO 65211, USA"}]},{"given":"Jie","family":"Hou","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering and Computer Science, University of Missouri , Columbia, MO 65211, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1547-0238","authenticated-orcid":false,"given":"Badri","family":"Adhikari","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, University of Missouri, St. Louis , MO 63121, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0305-2853","authenticated-orcid":false,"given":"Jianlin","family":"Cheng","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering and Computer Science, University of Missouri , Columbia, MO 65211, USA"}]}],"member":"286","published-online":{"date-parts":[[2019,8,30]]},"reference":[{"key":"2023013110144703800_btz679-B1","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1186\/s12859-018-2032-6","article-title":"CONFOLD2: improved contact-driven ab initio protein structure modeling","volume":"19","author":"Adhikari","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2023013110144703800_btz679-B2","doi-asserted-by":"crossref","first-page":"517.","DOI":"10.1186\/s12859-016-1404-z","article-title":"ConEVA: a toolbox for comprehensive assessment of protein contacts","volume":"17","author":"Adhikari","year":"2016","journal-title":"BMC Bioinformatics"},{"key":"2023013110144703800_btz679-B3","doi-asserted-by":"crossref","first-page":"1466","DOI":"10.1093\/bioinformatics\/btx781","article-title":"DNCON2: improved protein contact prediction using two-level deep convolutional neural networks","volume":"34","author":"Adhikari","year":"2018","journal-title":"Bioinformatics"},{"key":"2023013110144703800_btz679-B4","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1093\/protein\/2.3.193","article-title":"Coordinated amino acid changes in homologous protein families","volume":"2","author":"Altschuh","year":"1988","journal-title":"Protein Eng"},{"key":"2023013110144703800_btz679-B5","doi-asserted-by":"crossref","first-page":"905","DOI":"10.1107\/S0907444998003254","article-title":"Crystallography & NMR system: a new software suite for macromolecular structure determination","volume":"54 (Pt 5)","author":"Brunger","year":"1998","journal-title":"Acta Crystallogr. D Biol. Crystallogr"},{"key":"2023013110144703800_btz679-B6","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1002\/prot.25379","article-title":"Improved protein contact predictions with the MetaPSICOV2 server in CASP12","volume":"86 (Suppl. 1)","author":"Buchan","year":"2018","journal-title":"Proteins"},{"key":"2023013110144703800_btz679-B7","doi-asserted-by":"crossref","first-page":"2449","DOI":"10.1093\/bioinformatics\/bts475","article-title":"Deep architectures for protein contact map prediction","volume":"28","author":"Di Lena","year":"2012","journal-title":"Bioinformatics"},{"key":"2023013110144703800_btz679-B8","doi-asserted-by":"crossref","first-page":"3066","DOI":"10.1093\/bioinformatics\/bts598","article-title":"Predicting protein residue-residue contacts using deep networks and boosting","volume":"28","author":"Eickholt","year":"2012","journal-title":"Bioinformatics"},{"key":"2023013110144703800_btz679-B9","doi-asserted-by":"crossref","first-page":"012707","DOI":"10.1103\/PhysRevE.87.012707","article-title":"Improved contact prediction in proteins: using pseudolikelihoods to infer Potts models","volume":"87","author":"Ekeberg","year":"2013","journal-title":"Phys. Rev. E Stat. Nonlin. Soft Matter Phys"},{"key":"2023013110144703800_btz679-B10","doi-asserted-by":"crossref","first-page":"3514.","DOI":"10.1038\/s41598-019-40314-1","article-title":"DESTINI: a deep-learning approach to contact-driven protein structure prediction","volume":"9","author":"Gao","year":"2019","journal-title":"Sci. Rep"},{"key":"2023013110144703800_btz679-B11","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1002\/prot.340180402","article-title":"Correlated mutations and residue contacts in proteins","volume":"18","author":"Gobel","year":"1994","journal-title":"Proteins"},{"key":"2023013110144703800_btz679-B12","doi-asserted-by":"crossref","first-page":"4039","DOI":"10.1093\/bioinformatics\/bty481","article-title":"Accurate prediction of protein contact maps by coupling residual two-dimensional bidirectional long short-term memory with convolutional neural networks","volume":"34","author":"Hanson","year":"2018","journal-title":"Bioinformatics"},{"key":"2023013110144703800_btz679-B13","doi-asserted-by":"crossref","DOI":"10.1002\/prot.25697","article-title":"Protein tertiary structure modeling driven by deep learning and contact distance prediction in CASP13","author":"Hou","year":"2019","journal-title":"Proteins: Struct., Funct., Bioinf."},{"key":"2023013110144703800_btz679-B14","doi-asserted-by":"crossref","first-page":"431","DOI":"10.1186\/1471-2105-11-431","article-title":"Hidden Markov model speed heuristic and iterative HMM search procedure","volume":"11","author":"Johnson","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023013110144703800_btz679-B15","doi-asserted-by":"crossref","first-page":"3308","DOI":"10.1093\/bioinformatics\/bty341","article-title":"High precision in protein contact prediction using fully convolutional neural networks and minimal sequence features","volume":"34","author":"Jones","year":"2018","journal-title":"Bioinformatics"},{"key":"2023013110144703800_btz679-B16","doi-asserted-by":"crossref","first-page":"184","DOI":"10.1093\/bioinformatics\/btr638","article-title":"PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments","volume":"28","author":"Jones","year":"2012","journal-title":"Bioinformatics"},{"key":"2023013110144703800_btz679-B17","doi-asserted-by":"crossref","first-page":"999","DOI":"10.1093\/bioinformatics\/btu791","article-title":"MetaPSICOV: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins","volume":"31","author":"Jones","year":"2015","journal-title":"Bioinformatics"},{"key":"2023013110144703800_btz679-B18","doi-asserted-by":"crossref","first-page":"85.","DOI":"10.1186\/1471-2105-15-85","article-title":"FreeContact: fast and free software for protein contact prediction from residue co-evolution","volume":"15","author":"Kajan","year":"2014","journal-title":"BMC Bioinformatics"},{"key":"2023013110144703800_btz679-B19","doi-asserted-by":"crossref","first-page":"15674","DOI":"10.1073\/pnas.1314045110","article-title":"Assessing the utility of coevolution-based residue-residue contact predictions in a sequence- and structure-rich era","volume":"110","author":"Kamisetty","year":"2013","journal-title":"Proc. Natl. Acad. Sci. U. S. A"},{"key":"2023013110144703800_btz679-B20","first-page":"586800","article-title":"Prediction of inter-residue contacts with DeepMetaPSICOV in CASP13","author":"Kandathil","year":"2019","journal-title":"bioRxiv"},{"key":"2023013110144703800_btz679-B21","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btz291","article-title":"ResPRE: high-accuracy protein contact prediction by coupling precision matrix with deep residual neural networks","author":"Li","year":"2019","journal-title":"Bioinformatics"},{"key":"2023013110144703800_btz679-B22","doi-asserted-by":"crossref","first-page":"e28766.","DOI":"10.1371\/journal.pone.0028766","article-title":"Protein 3D structure computed from evolutionary sequence variation","volume":"6","author":"Marks","year":"2011","journal-title":"PLoS One"},{"key":"2023013110144703800_btz679-B23","doi-asserted-by":"crossref","first-page":"386.","DOI":"10.1186\/1471-2105-9-386","article-title":"The metagenomics RAST server\u2014a public resource for the automatic phylogenetic and functional analysis of metagenomes","volume":"9","author":"Meyer","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023013110144703800_btz679-B24","doi-asserted-by":"crossref","first-page":"i23","DOI":"10.1093\/bioinformatics\/btx239","article-title":"Large-scale structure prediction by improved contact predictions and model quality assessment","volume":"33","author":"Michel","year":"2017","journal-title":"Bioinformatics"},{"key":"2023013110144703800_btz679-B25","doi-asserted-by":"crossref","first-page":"138","DOI":"10.1002\/prot.24340","article-title":"Evaluation of residue\u2013residue contact prediction in CASP10","volume":"82","author":"Monastyrskyy","year":"2014","journal-title":"Funct. Bioinformatics"},{"key":"2023013110144703800_btz679-B26","doi-asserted-by":"crossref","first-page":"294","DOI":"10.1126\/science.aah4043","article-title":"Protein structure determination using metagenome sequence data","volume":"355","author":"Ovchinnikov","year":"2017","journal-title":"Science"},{"key":"2023013110144703800_btz679-B27","doi-asserted-by":"crossref","first-page":"S62","DOI":"10.1093\/bioinformatics\/18.suppl_1.S62","article-title":"Prediction of contact maps by GIOHMMs and recurrent neural networks using lateral propagation from all four cardinal corners","volume":"18 (Suppl. 1)","author":"Pollastri","year":"2002","journal-title":"Bioinformatics"},{"key":"2023013110144703800_btz679-B28","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1038\/nmeth.1818","article-title":"HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment","volume":"9","author":"Remmert","year":"2011","journal-title":"Nat. Methods"},{"key":"2023013110144703800_btz679-B29","doi-asserted-by":"crossref","first-page":"3128","DOI":"10.1093\/bioinformatics\/btu500","article-title":"CCMpred\u2014fast and precise prediction of protein residue-residue contacts from correlated mutations","volume":"30","author":"Seemayer","year":"2014","journal-title":"Bioinformatics"},{"key":"2023013110144703800_btz679-B30","doi-asserted-by":"crossref","first-page":"e1003889.","DOI":"10.1371\/journal.pcbi.1003889","article-title":"Improved contact predictions using the recognition of protein like contact patterns","volume":"10","author":"Skwark","year":"2014","journal-title":"PLoS Comput. Biol"},{"key":"2023013110144703800_btz679-B31","doi-asserted-by":"crossref","first-page":"1026","DOI":"10.1038\/nbt.3988","article-title":"MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets","volume":"35","author":"Steinegger","year":"2017","journal-title":"Nat. Biotechnol"},{"key":"2023013110144703800_btz679-B32","doi-asserted-by":"crossref","first-page":"2542.","DOI":"10.1038\/s41467-018-04964-5","article-title":"Clustering huge protein sequence sets in linear time","volume":"9","author":"Steinegger","year":"2018","journal-title":"Nat. Commun"},{"key":"2023013110144703800_btz679-B33","doi-asserted-by":"crossref","first-page":"W515","DOI":"10.1093\/nar\/gkp305","article-title":"NNcon: improved protein contact map prediction using 2D-recursive neural networks","volume":"37","author":"Tegge","year":"2009","journal-title":"Nucleic Acids Res"},{"key":"2023013110144703800_btz679-B34","doi-asserted-by":"crossref","first-page":"e1005324.","DOI":"10.1371\/journal.pcbi.1005324","article-title":"Accurate de novo prediction of protein contact map by ultra-deep learning model","volume":"13","author":"Wang","year":"2017","journal-title":"PLoS Comput. Biol"},{"key":"2023013110144703800_btz679-B35","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1073\/pnas.0805923106","article-title":"Identification of direct residue contacts in protein-protein interaction by message passing","volume":"106","author":"Weigt","year":"2009","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023013110144703800_btz679-B36","doi-asserted-by":"crossref","first-page":"D590","DOI":"10.1093\/nar\/gkv1322","article-title":"The MG-RAST metagenomics database and portal in 2015","volume":"44","author":"Wilke","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023013110144703800_btz679-B37","first-page":"624460","article-title":"Analysis of distance-based protein structure prediction by deep learning in CASP13","author":"Xu","year":"2019","journal-title":"bioRxiv"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btz679\/30076157\/btz679.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/4\/1091\/48983137\/btz679.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/4\/1091\/48983137\/btz679.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T20:15:35Z","timestamp":1675196135000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/4\/1091\/5556814"}},"subtitle":[],"editor":[{"given":"Arne","family":"Elofsson","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2019,8,30]]},"references-count":37,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,2,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btz679","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2020,2,15]]},"published":{"date-parts":[[2019,8,30]]}}}