{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:10Z","timestamp":1772138050656,"version":"3.50.1"},"reference-count":39,"publisher":"Oxford University Press (OUP)","issue":"7","license":[{"start":{"date-parts":[[2022,2,1]],"date-time":"2022-02-01T00:00:00Z","timestamp":1643673600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000923","name":"Australia Research Council","doi-asserted-by":"crossref","award":["DP210101875"],"award-info":[{"award-number":["DP210101875"]}],"id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Shenzhen Science and Technology Program","award":["KQTD20170330155106581"],"award-info":[{"award-number":["KQTD20170330155106581"]}]},{"name":"Major Program of Shenzhen Bay Laboratory","award":["S201101001"],"award-info":[{"award-number":["S201101001"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,3,28]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Accurate prediction of protein contact-map is essential for accurate protein structure and function prediction. As a result, many methods have been developed for protein contact map prediction. However, most methods rely on protein-sequence-evolutionary information, which may not exist for many proteins due to lack of naturally occurring homologous sequences. Moreover, generating evolutionary profiles is computationally intensive. Here, we developed a contact-map predictor utilizing the output of a pre-trained language model ESM-1b as an input along with a large training set and an ensemble of residual neural networks.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We showed that the proposed method makes a significant improvement over a single-sequence-based predictor SSCpred with 15% improvement in the F1-score for the independent CASP14-FM test set. It also outperforms evolutionary-profile-based methods trRosetta and SPOT-Contact with 48.7% and 48.5% respective improvement in the F1-score on the proteins without homologs (Neff\u2009=\u20091) in the independent SPOT-2018 set. The new method provides a much faster and reasonably accurate alternative to evolution-based methods, useful for large-scale prediction.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Stand-alone-version of SPOT-Contact-LM is available at https:\/\/github.com\/jas-preet\/SPOT-Contact-Single. Direct prediction can also be made at https:\/\/sparks-lab.org\/server\/spot-contact-single. The datasets used in this research can also be downloaded from the GitHub.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btac053","type":"journal-article","created":{"date-parts":[[2022,1,26]],"date-time":"2022-01-26T07:17:48Z","timestamp":1643181468000},"page":"1888-1894","source":"Crossref","is-referenced-by-count":51,"title":["SPOT-Contact-LM: improving single-sequence-based prediction of protein contact map using a transformer language model"],"prefix":"10.1093","volume":"38","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4974-5188","authenticated-orcid":false,"given":"Jaspreet","family":"Singh","sequence":"first","affiliation":[{"name":"Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University , Brisbane, QLD 4111, Australia"}]},{"given":"Thomas","family":"Litfin","sequence":"additional","affiliation":[{"name":"Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University , Brisbane, QLD 4111, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0478-5533","authenticated-orcid":false,"given":"Jaswinder","family":"Singh","sequence":"additional","affiliation":[{"name":"Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University , Brisbane, QLD 4111, Australia"}]},{"given":"Kuldip","family":"Paliwal","sequence":"additional","affiliation":[{"name":"Signal Processing Laboratory, School of Engineering and Built Environment, Griffith University , Brisbane, QLD 4111, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9958-5699","authenticated-orcid":false,"given":"Yaoqi","family":"Zhou","sequence":"additional","affiliation":[{"name":"Institute for Glycomics, Griffith University , Southport, QLD 4222, Australia"},{"name":"Institute of Systems and Physical Biology, Shenzhen Bay Laboratory , Shenzhen 518055, China"},{"name":"Peking University Shenzhen Graduate School , Shenzhen 518055, China"}]}],"member":"286","published-online":{"date-parts":[[2022,2,1]]},"reference":[{"key":"2023020109015331000_btac053-B1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-019-2932-0","article-title":"ProteinNet: a standardized data set for machine learning of protein structure","volume":"20","author":"AlQuraishi","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2023020109015331000_btac053-B2","doi-asserted-by":"crossref","first-page":"D138","DOI":"10.1093\/nar\/gkh121","article-title":"The pfam protein families database","volume":"32","author":"Bateman","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023020109015331000_btac053-B3","doi-asserted-by":"crossref","first-page":"3295","DOI":"10.1021\/acs.jcim.9b01207","article-title":"SSCpred: single-sequence-based protein contact prediction using deep fully convolutional network","volume":"60","author":"Chen","year":"2020","journal-title":"J. Chem. Inf. Model"},{"key":"2023020109015331000_btac053-B4","doi-asserted-by":"crossref","first-page":"1361","DOI":"10.1002\/prot.25767","article-title":"Estimation of model accuracy in CASP13","volume":"87","author":"Cheng","year":"2019","journal-title":"Proteins Struct. Funct. Bioinf"},{"key":"2023020109015331000_btac053-B5","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12864-019-6413-7","article-title":"The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation","volume":"21","author":"Chicco","year":"2020","journal-title":"BMC Genomics"},{"key":"2023020109015331000_btac053-B6","author":"Choromanski","year":"2020"},{"key":"2023020109015331000_btac053-B7","author":"Duta","year":"2021"},{"key":"2023020109015331000_btac053-B8","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1016\/j.jcp.2014.07.024","article-title":"Fast pseudolikelihood maximization for direct-coupling analysis of protein structure from many homologous amino-acid sequences","volume":"276","author":"Ekeberg","year":"2014","journal-title":"J. Comput. Phys"},{"key":"2023020109015331000_btac053-B9","author":"Elnaggar","year":"2021"},{"key":"2023020109015331000_btac053-B10","doi-asserted-by":"crossref","first-page":"196","DOI":"10.1002\/prot.22554","article-title":"Assessment of domain boundary predictions and the prediction of intramolecular contacts in CASP8","volume":"77","author":"Ezkurdia","year":"2009","journal-title":"Proteins Struct. Funct. Bioinf"},{"key":"2023020109015331000_btac053-B11","doi-asserted-by":"crossref","first-page":"592","DOI":"10.1002\/prot.25487","article-title":"MUFOLD-SS: new deep inception-inside-inception networks for protein secondary structure prediction","volume":"86","author":"Fang","year":"2018","journal-title":"Proteins Struct. Funct. Bioinf"},{"key":"2023020109015331000_btac053-B12","doi-asserted-by":"crossref","first-page":"4039","DOI":"10.1093\/bioinformatics\/bty481","article-title":"Accurate prediction of protein contact maps by coupling residual two-dimensional bidirectional long short-term memory with convolutional neural networks","volume":"34","author":"Hanson","year":"2018","journal-title":"Bioinformatics"},{"key":"2023020109015331000_btac053-B13","doi-asserted-by":"crossref","first-page":"2403","DOI":"10.1093\/bioinformatics\/bty1006","article-title":"Improving prediction of protein secondary structure, backbone angles, solvent accessibility and contact numbers by using predicted contact maps and an ensemble of recurrent and residual convolutional neural networks","volume":"35","author":"Hanson","year":"2019","journal-title":"Bioinformatics"},{"key":"2023020109015331000_btac053-B14","doi-asserted-by":"crossref","first-page":"796","DOI":"10.1089\/cmb.2019.0193","article-title":"Getting to know your neighbor: protein structure prediction comes of age with contextual machine learning","volume":"27","author":"Hanson","year":"2020","journal-title":"J. Comput. Biol"},{"key":"2023020109015331000_btac053-B15","doi-asserted-by":"crossref","first-page":"2210","DOI":"10.1002\/jcc.25534","article-title":"Single-sequence-based prediction of protein secondary structures and solvent accessibility by deep whole-sequence learning","volume":"39","author":"Heffernan","year":"2018","journal-title":"J. Comput. Chem"},{"key":"2023020109015331000_btac053-B16","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-019-3220-8","article-title":"Modeling aspects of the language of life through transfer-learning protein sequences","volume":"20","author":"Heinzinger","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2023020109015331000_btac053-B17","doi-asserted-by":"crossref","first-page":"3308","DOI":"10.1093\/bioinformatics\/bty341","article-title":"High precision in protein contact prediction using fully convolutional neural networks and minimal sequence features","volume":"34","author":"Jones","year":"2018","journal-title":"Bioinformatics"},{"key":"2023020109015331000_btac053-B18","doi-asserted-by":"crossref","first-page":"999","DOI":"10.1093\/bioinformatics\/btu791","article-title":"MetaPSICOV: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins","volume":"31","author":"Jones","year":"2015","journal-title":"Bioinformatics"},{"key":"2023020109015331000_btac053-B19","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-15-85","article-title":"FreeContact: fast and free software for protein contact prediction from residue co-evolution","volume":"15","author":"Kaj\u00e1n","year":"2014","journal-title":"BMC Bioinformatics"},{"key":"2023020109015331000_btac053-B20","doi-asserted-by":"crossref","first-page":"1082","DOI":"10.1002\/prot.25798","article-title":"Ensembling multiple raw coevolutionary features with deep residual neural networks for contact-map prediction in CASP13","volume":"87","author":"Li","year":"2019","journal-title":"Proteins Struct. Funct. Bioinf"},{"key":"2023020109015331000_btac053-B21","first-page":"58","author":"Liu","year":"2022"},{"key":"2023020109015331000_btac053-B22","doi-asserted-by":"crossref","first-page":"D170","DOI":"10.1093\/nar\/gkw1081","article-title":"Uniclust databases of clustered and deeply annotated protein sequences and alignments","volume":"45","author":"Mirdita","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023020109015331000_btac053-B23","doi-asserted-by":"crossref","first-page":"e02030","DOI":"10.7554\/eLife.02030","article-title":"Robust and accurate prediction of residue\u2013residue interactions across protein interfaces using evolutionary information","volume":"3","author":"Ovchinnikov","year":"2014","journal-title":"Elife"},{"key":"2023020109015331000_btac053-B24","doi-asserted-by":"crossref","first-page":"294","DOI":"10.1126\/science.aah4043","article-title":"Protein structure determination using metagenome sequence data","volume":"355","author":"Ovchinnikov","year":"2017","journal-title":"Science"},{"key":"2023020109015331000_btac053-B25","first-page":"9689","article-title":"Evaluating protein transfer learning with tape","volume":"32","author":"Rao","year":"2019","journal-title":"Adv. Neural Inf. Process. Syst"},{"key":"2023020109015331000_btac053-B26","author":"Rao","year":"2020"},{"key":"2023020109015331000_btac053-B27","doi-asserted-by":"crossref","first-page":"3128","DOI":"10.1093\/bioinformatics\/btu500","article-title":"CCMpred\u2014fast and precise prediction of protein residue\u2013residue contacts from correlated mutations","volume":"30","author":"Seemayer","year":"2014","journal-title":"Bioinformatics"},{"key":"2023020109015331000_btac053-B28","first-page":"021022","author":"Sheridan","year":"2015"},{"key":"2023020109015331000_btac053-B29","doi-asserted-by":"crossref","first-page":"2589","DOI":"10.1093\/bioinformatics\/btab165","article-title":"Improved RNA secondary structure and tertiary base-pairing prediction using evolutionary profile, mutational coupling and two-dimensional transfer learning","volume":"37","author":"Singh","year":"2021","journal-title":"Bioinformatics"},{"key":"2023020109015331000_btac053-B30","doi-asserted-by":"crossref","first-page":"3464","DOI":"10.1093\/bioinformatics\/btab316","article-title":"SPOT-1D-Single: improving the single-sequence-based prediction of protein secondary structure, backbone angles, solvent accessibility and half-sphere exposures using a large training set and ensembled deep learning","volume":"37","author":"Singh","year":"2021","journal-title":"Bioinformatics"},{"key":"2023020109015331000_btac053-B31","doi-asserted-by":"crossref","first-page":"1026","DOI":"10.1038\/nbt.3988","article-title":"MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets","volume":"35","author":"Steinegger","year":"2017","journal-title":"Nat. Biotechnol"},{"key":"2023020109015331000_btac053-B32","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-018-04964-5","article-title":"Clustering huge protein sequence sets in linear time","volume":"9","author":"Steinegger","year":"2018","journal-title":"Nat. Commun"},{"key":"2023020109015331000_btac053-B33","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-019-3019-7","article-title":"HH-suite3 for fast remote homology detection and deep protein annotation","volume":"20","author":"Steinegger","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2023020109015331000_btac053-B34","doi-asserted-by":"crossref","first-page":"603","DOI":"10.1038\/s41592-019-0437-4","article-title":"Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold","volume":"16","author":"Steinegger","year":"2019","journal-title":"Nat. Methods"},{"key":"2023020109015331000_btac053-B35","doi-asserted-by":"crossref","first-page":"1282","DOI":"10.1093\/bioinformatics\/btm098","article-title":"UniRef: comprehensive and non-redundant UniProt reference clusters","volume":"23","author":"Suzek","year":"2007","journal-title":"Bioinformatics"},{"key":"2023020109015331000_btac053-B36","author":"Vaswani","year":"2017"},{"key":"2023020109015331000_btac053-B37","first-page":"1","article-title":"Protein secondary structure prediction using deep convolutional neural fields","volume":"6","author":"Wang","year":"2016","journal-title":"Sci. Rep"},{"key":"2023020109015331000_btac053-B38","doi-asserted-by":"crossref","first-page":"e1005324","DOI":"10.1371\/journal.pcbi.1005324","article-title":"Accurate de novo prediction of protein contact map by ultra-deep learning model","volume":"13","author":"Wang","year":"2017","journal-title":"PLoS Comput. Biol"},{"key":"2023020109015331000_btac053-B39","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1093\/bioinformatics\/btz477","article-title":"Protein contact prediction using metagenome sequence data and residual neural networks","volume":"36","author":"Wu","year":"2020","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btac053\/42466318\/btac053.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/7\/1888\/49010510\/btac053.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/7\/1888\/49010510\/btac053.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T15:48:36Z","timestamp":1675266516000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/38\/7\/1888\/6519147"}},"subtitle":[],"editor":[{"given":"Pier Luigi","family":"Martelli","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2022,2,1]]},"references-count":39,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2022,3,28]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btac053","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.06.19.449089","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,4,1]]},"published":{"date-parts":[[2022,2,1]]}}}