{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T01:18:13Z","timestamp":1775697493298,"version":"3.50.1"},"reference-count":38,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,3,16]],"date-time":"2022-03-16T00:00:00Z","timestamp":1647388800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,3,16]],"date-time":"2022-03-16T00:00:00Z","timestamp":1647388800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100003725","name":"National Research Foundation of Korea","doi-asserted-by":"publisher","award":["2018R1C1B600543513"],"award-info":[{"award-number":["2018R1C1B600543513"]}],"id":[{"id":"10.13039\/501100003725","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003725","name":"National Research Foundation of Korea","doi-asserted-by":"publisher","award":["2019M3E5D4066898"],"award-info":[{"award-number":["2019M3E5D4066898"]}],"id":[{"id":"10.13039\/501100003725","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>The accuracy of protein 3D structure prediction has been dramatically improved with the help of advances in deep learning. In the recent CASP14, Deepmind demonstrated that their new version of AlphaFold (AF) produces highly accurate 3D models almost close to experimental structures. The success of AF shows that the multiple sequence alignment of a sequence contains rich evolutionary information, leading to accurate 3D models. Despite the success of AF, only the prediction code is open, and training a similar model requires a vast amount of computational resources. Thus, developing a lighter prediction model is still necessary.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>In this study, we propose a new protein 3D structure modeling method, A-Prot, using MSA Transformer, one of the state-of-the-art protein language models. An MSA feature tensor and row attention maps are extracted and converted into 2D residue-residue distance and dihedral angle predictions for a given MSA. We demonstrated that A-Prot predicts long-range contacts better than the existing methods. Additionally, we modeled the 3D structures of the free modeling and hard template-based modeling targets of CASP14. The assessment shows that the A-Prot models are more accurate than most top server groups of CASP14.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusion<\/jats:title>\n                    <jats:p>These results imply that A-Prot accurately captures the evolutionary and structural information of proteins with relatively low computational cost. Thus, A-Prot can provide a clue for the development of other protein property prediction methods.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1186\/s12859-022-04628-8","type":"journal-article","created":{"date-parts":[[2022,3,16]],"date-time":"2022-03-16T08:03:15Z","timestamp":1647417795000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["A-Prot: protein structure modeling using MSA transformer"],"prefix":"10.1186","volume":"23","author":[{"given":"Yiyu","family":"Hong","sequence":"first","affiliation":[]},{"given":"Juyong","family":"Lee","sequence":"additional","affiliation":[]},{"given":"Junsu","family":"Ko","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,3,16]]},"reference":[{"key":"4628_CR1","doi-asserted-by":"publisher","first-page":"1940","DOI":"10.1002\/prot.26192","volume":"89","author":"S Kwon","year":"2021","unstructured":"Kwon S, Won J, Kryshtafovych A, Seok C. Assessment of protein model structure accuracy estimation in CASP14: old and new challenges. Proteins Struct Funct Bioinform. 2021;89:1940\u20138. https:\/\/doi.org\/10.1002\/prot.26192.","journal-title":"Proteins Struct Funct Bioinform"},{"key":"4628_CR2","doi-asserted-by":"publisher","first-page":"1687","DOI":"10.1002\/prot.26171","volume":"89","author":"J Pereira","year":"2021","unstructured":"Pereira J, Simpkin AJ, Hartmann MD, Rigden DJ, Keegan RM, Lupas AN. High-accuracy protein structure prediction in CASP14. Proteins Struct Funct Bioinform. 2021;89:1687\u201399. https:\/\/doi.org\/10.1002\/prot.26171.","journal-title":"Proteins Struct Funct Bioinform"},{"key":"4628_CR3","doi-asserted-by":"publisher","DOI":"10.1038\/s41586-021-03819-2","author":"J Jumper","year":"2021","unstructured":"Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021. https:\/\/doi.org\/10.1038\/s41586-021-03819-2.","journal-title":"Nature"},{"key":"4628_CR4","doi-asserted-by":"publisher","first-page":"1861","DOI":"10.1101\/gr.092452.109","volume":"19","author":"ERM Tillier","year":"2009","unstructured":"Tillier ERM, Charlebois RL. The human protein coevolution network. Genome Res. 2009;19:1861\u201371.","journal-title":"Genome Res"},{"key":"4628_CR5","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1073\/pnas.0805923106","volume":"106","author":"M Weigt","year":"2009","unstructured":"Weigt M, White RA, Szurmant H, Hoch JA, Hwa T. Identification of direct residue contacts in protein\u2013protein interaction by message passing. Proc Natl Acad Sci. 2009;106:67\u201372.","journal-title":"Proc Natl Acad Sci"},{"key":"4628_CR6","doi-asserted-by":"crossref","unstructured":"Lunt B, Szurmant H, Procaccini A, Hoch JA, Hwa T, Weigt M. Inference of direct residue contacts in two-component signaling. In: Methods in enzymology. 2010. pp. 17\u201341.","DOI":"10.1016\/S0076-6879(10)71002-8"},{"key":"4628_CR7","first-page":"1","volume":"2014","author":"S Ovchinnikov","year":"2014","unstructured":"Ovchinnikov S, Kamisetty H, Baker D. Robust and accurate prediction of residue-residue interactions across protein interfaces using evolutionary information. eLife. 2014;2014:1\u201321.","journal-title":"eLife"},{"key":"4628_CR8","doi-asserted-by":"publisher","first-page":"249","DOI":"10.1038\/nrg3414","volume":"14","author":"D de Juan","year":"2013","unstructured":"de Juan D, Pazos F, Valencia A. Emerging methods in protein co-evolution. Nat Rev Genet. 2013;14:249\u201361.","journal-title":"Nat Rev Genet"},{"key":"4628_CR9","doi-asserted-by":"publisher","DOI":"10.1038\/2419","author":"DS Marks","year":"2012","unstructured":"Marks DS, Hopf TA, Sander C. Protein structure prediction from sequence variation. Nat Publ Group. 2012. https:\/\/doi.org\/10.1038\/2419.","journal-title":"Nat Publ Group"},{"key":"4628_CR10","doi-asserted-by":"publisher","first-page":"3128","DOI":"10.1093\/bioinformatics\/btu500","volume":"30","author":"S Seemayer","year":"2014","unstructured":"Seemayer S, Gruber M, S\u00f6ding J. CCMpred: fast and precise prediction of protein residue-residue contacts from correlated mutations. Bioinformatics. 2014;30:3128\u201330.","journal-title":"Bioinformatics"},{"key":"4628_CR11","doi-asserted-by":"publisher","first-page":"999","DOI":"10.1093\/bioinformatics\/btu791","volume":"31","author":"DT Jones","year":"2015","unstructured":"Jones DT, Singh T, Kosciolek T, Tetchner S. MetaPSICOV: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins. Bioinformatics. 2015;31:999\u20131006.","journal-title":"Bioinformatics"},{"key":"4628_CR12","doi-asserted-by":"crossref","unstructured":"Hopf TA, Sch\u00e4rfe CPI, Rodrigues JPGLM, Green AG, Kohlbacher O, Sander C, et al. Sequence co-evolution gives 3D contacts and structures of protein complexes. eLife. 2014; 3.","DOI":"10.7554\/eLife.03430"},{"key":"4628_CR13","doi-asserted-by":"publisher","first-page":"706","DOI":"10.1038\/s41586-019-1923-7","volume":"577","author":"AW Senior","year":"2020","unstructured":"Senior AW, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T, et al. Improved protein structure prediction using potentials from deep learning. Nature. 2020;577:706\u201310.","journal-title":"Nature"},{"key":"4628_CR14","doi-asserted-by":"publisher","first-page":"1069","DOI":"10.1002\/prot.25810","volume":"87","author":"J Xu","year":"2019","unstructured":"Xu J, Wang S. Analysis of distance-based protein structure prediction by deep learning in CASP13. Proteins Struct Funct Bioinform. 2019;87:1069\u201381.","journal-title":"Proteins Struct Funct Bioinform"},{"key":"4628_CR15","doi-asserted-by":"publisher","first-page":"16856","DOI":"10.1073\/pnas.1821309116","volume":"116","author":"J Xu","year":"2019","unstructured":"Xu J. Distance-based protein folding powered by deep learning. Proc Natl Acad Sci USA. 2019;116:16856\u201365.","journal-title":"Proc Natl Acad Sci USA"},{"key":"4628_CR16","doi-asserted-by":"publisher","first-page":"654","DOI":"10.1016\/j.cels.2021.05.017","volume":"12","author":"T Bepler","year":"2021","unstructured":"Bepler T, Berger B. Learning the protein language: evolution, structure, and function. Cell Syst. 2021;12:654-669.e3.","journal-title":"Cell Syst"},{"key":"4628_CR17","doi-asserted-by":"publisher","first-page":"2401","DOI":"10.1093\/bioinformatics\/btaa003","volume":"36","author":"N Strodthoff","year":"2020","unstructured":"Strodthoff N, Wagner P, Wenzel M, Samek W. UDSMProt: Universal deep sequence models for protein classification. Bioinformatics. 2020;36:2401\u20139.","journal-title":"Bioinformatics"},{"key":"4628_CR18","doi-asserted-by":"crossref","unstructured":"Vig J, Madani A, Varshney LR, Xiong C, Socher R, Rajani NF. BERTology meets biology: interpreting attention in protein language models. 2020.","DOI":"10.1101\/2020.06.26.174417"},{"key":"4628_CR19","doi-asserted-by":"publisher","unstructured":"Madani A, McCann B, Naik N, Keskar NS, Anand N, Eguchi RR, et al. ProGen: language modeling for protein generation. 2020. https:\/\/doi.org\/10.1101\/2020.03.07.982272.","DOI":"10.1101\/2020.03.07.982272"},{"key":"4628_CR20","doi-asserted-by":"publisher","first-page":"e2016239118","DOI":"10.1073\/pnas.2016239118","volume":"118","author":"A Rives","year":"2021","unstructured":"Rives A, Meier J, Sercu T, Goyal S, Lin Z, Liu J, et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc Natl Acad Sci. 2021;118:e2016239118.","journal-title":"Proc Natl Acad Sci"},{"key":"4628_CR21","doi-asserted-by":"crossref","unstructured":"Rao R, Meier J, Sercu T, Ovchinnikov S, Rives A. Transformer protein language models are unsupervised structure learners. In: ICLR 2021 conference. 2021.","DOI":"10.1101\/2020.12.15.422761"},{"key":"4628_CR22","doi-asserted-by":"crossref","unstructured":"Rao R, Liu J, Verkuil R, Meier J, Canny JF, Abbeel P, et al. MSA transformer. 2021.","DOI":"10.1101\/2021.02.12.430858"},{"key":"4628_CR23","doi-asserted-by":"publisher","first-page":"1618","DOI":"10.1002\/prot.26202","volume":"89","author":"LN Kinch","year":"2021","unstructured":"Kinch LN, Schaeffer RD, Kryshtafovych A, Grishin NV. Target classification in the 14th round of the critical assessment of protein structure prediction (CASP14). Proteins Struct Funct Bioinform. 2021;89:1618\u201332. https:\/\/doi.org\/10.1002\/prot.26202.","journal-title":"Proteins Struct Funct Bioinform"},{"key":"4628_CR24","doi-asserted-by":"publisher","first-page":"1021","DOI":"10.1002\/prot.25775","volume":"87","author":"LN Kinch","year":"2019","unstructured":"Kinch LN, Kryshtafovych A, Monastyrskyy B, Grishin NV. CASP13 target classification into tertiary structure prediction categories. Proteins Struct Funct Bioinform. 2019;87:1021\u201336.","journal-title":"Proteins Struct Funct Bioinform"},{"key":"4628_CR25","doi-asserted-by":"crossref","unstructured":"Yang J, Anishchenko I, Park H, Peng Z, Ovchinnikov S, Baker D. Improved protein structure prediction using predicted interresidue orientations. Proc Natl Acad Sci USA 2020;117.","DOI":"10.1101\/846279"},{"key":"4628_CR26","doi-asserted-by":"publisher","first-page":"1002195","DOI":"10.1371\/journal.pcbi.1002195","volume":"7","author":"SR Eddy","year":"2011","unstructured":"Eddy SR. Accelerated profile HMM searches. PLOS Comput Biol. 2011;7:1002195.","journal-title":"PLOS Comput Biol"},{"key":"4628_CR27","doi-asserted-by":"crossref","unstructured":"Steinegger M, Meier M, Mirdita M, V\u00f6hringer H, Haunsberger SJ, S\u00f6ding J. HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinform. 2019; 20.","DOI":"10.1186\/s12859-019-3019-7"},{"key":"4628_CR28","doi-asserted-by":"publisher","first-page":"D170","DOI":"10.1093\/nar\/gkw1081","volume":"45","author":"M Mirdita","year":"2017","unstructured":"Mirdita M, von den Driesch L, Galiez C, Martin MJ, S\u00f6ding J, Steinegger M. Uniclust databases of clustered and deeply annotated protein sequences and alignments. Nucleic Acids Res. 2017;45:D170\u20136.","journal-title":"Nucleic Acids Res"},{"key":"4628_CR29","doi-asserted-by":"publisher","first-page":"603","DOI":"10.1038\/s41592-019-0437-4","volume":"16","author":"M Steinegger","year":"2019","unstructured":"Steinegger M, Mirdita M, S\u00f6ding J. Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold. Nat Methods. 2019;16:603\u20136.","journal-title":"Nat Methods"},{"key":"4628_CR30","doi-asserted-by":"publisher","first-page":"1026","DOI":"10.1038\/nbt.3988","volume":"35","author":"M Steinegger","year":"2017","unstructured":"Steinegger M, S\u00f6ding J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat Biotechnol. 2017;35:1026\u20138.","journal-title":"Nat Biotechnol"},{"key":"4628_CR31","unstructured":"Lin M, Chen Q, Yan S. Network in network. 2013; arXiv:1312.4400."},{"key":"4628_CR32","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J. Identity mappings in deep residual networks. In: Leibe Bastian and Matas J and SN and WM, editor. Computer vision\u2014ECCV 2016. Cham: Springer International Publishing; 2016. p. 630\u201345.","DOI":"10.1007\/978-3-319-46493-0_38"},{"key":"4628_CR33","unstructured":"Liu L, Jiang H, He P, Chen W, Liu X, Gao J, et al. On the variance of the adaptive learning rate and beyond. In: Eighth international conference on learning representations (ICLR). 2020."},{"key":"4628_CR34","doi-asserted-by":"crossref","unstructured":"Wu T, Guo Z, Hou J, Cheng J. DeepDist: real-value inter-residue distance prediction with deep residual convolutional network. BMC Bioinform. 2021; 22.","DOI":"10.1186\/s12859-021-03960-9"},{"key":"4628_CR35","doi-asserted-by":"publisher","first-page":"2105","DOI":"10.1093\/bioinformatics\/btz863","volume":"36","author":"C Zhang","year":"2020","unstructured":"Zhang C, Zheng W, Mortuza SM, Li Y, Zhang Y. DeepMSA: Constructing deep multiple sequence alignment to improve contact prediction and fold-recognition for distant-homology proteins. Bioinformatics. 2020;36:2105\u201312.","journal-title":"Bioinformatics"},{"key":"4628_CR36","doi-asserted-by":"publisher","first-page":"1870","DOI":"10.1002\/prot.26161","volume":"89","author":"L Heo","year":"2021","unstructured":"Heo L, Janson G, Feig M. Physics-based protein structure refinement in the era of artificial intelligence. Proteins Struct Funct Bioinform. 2021;89:1870\u201387. https:\/\/doi.org\/10.1002\/prot.26161.","journal-title":"Proteins Struct Funct Bioinform"},{"key":"4628_CR37","doi-asserted-by":"publisher","first-page":"1722","DOI":"10.1002\/prot.26194","volume":"89","author":"I Anishchenko","year":"2021","unstructured":"Anishchenko I, Baek M, Park H, Hiranuma N, Kim DE, Dauparas J, et al. Protein tertiary structure prediction and refinement using deep learning and Rosetta in CASP14. Proteins Struct Funct Bioinform. 2021;89:1722\u201333. https:\/\/doi.org\/10.1002\/prot.26194.","journal-title":"Proteins Struct Funct Bioinform"},{"key":"4628_CR38","doi-asserted-by":"publisher","first-page":"1734","DOI":"10.1002\/prot.26193","volume":"89","author":"W Zheng","year":"2021","unstructured":"Zheng W, Li Y, Zhang C, Zhou X, Pearce R, Bell EW, et al. Protein structure prediction using deep learning distance and hydrogen-bonding restraints in CASP14. Proteins Struct Funct Bioinform. 2021;89:1734\u201351. https:\/\/doi.org\/10.1002\/prot.26193.","journal-title":"Proteins Struct Funct Bioinform"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-022-04628-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-022-04628-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-022-04628-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,3,16]],"date-time":"2022-03-16T08:04:21Z","timestamp":1647417861000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-022-04628-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,16]]},"references-count":38,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["4628"],"URL":"https:\/\/doi.org\/10.1186\/s12859-022-04628-8","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.09.10.459866","asserted-by":"object"}]},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,3,16]]},"assertion":[{"value":"11 September 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 March 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 March 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare no conflict of interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"93"}}