{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T17:05:34Z","timestamp":1773162334110,"version":"3.50.1"},"reference-count":45,"publisher":"Springer Science and Business Media LLC","issue":"S19","license":[{"start":{"date-parts":[[2019,12,1]],"date-time":"2019-12-01T00:00:00Z","timestamp":1575158400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2019,12,24]],"date-time":"2019-12-24T00:00:00Z","timestamp":1577145600000},"content-version":"vor","delay-in-days":23,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2019,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Protein comparative analysis and similarity searches play essential roles in structural bioinformatics. A couple of algorithms for protein structure alignments have been developed in recent years. However, facing the rapid growth of protein structure data, improving overall comparison performance and running efficiency with massive sequences is still challenging.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Here, we propose MADOKA, an ultra-fast approach for massive structural neighbor searching using a novel two-phase algorithm. Initially, we apply a fast alignment between pairwise structures. Then, we employ a score to select pairs with more similarity to carry out a more accurate fragment-based residue-level alignment. MADOKA performs about 6\u2013100 times faster than existing methods, including TM-align and SAL, in massive alignments. Moreover, the quality of structural alignment of MADOKA is better than the existing algorithms in terms of TM-score and number of aligned residues. We also develop a web server to search structural neighbors in PDB database (About 360,000 protein chains in total), as well as additional features such as 3D structure alignment visualization. The MADOKA web server is freely available at:<jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"http:\/\/madoka.denglab.org\/\">http:\/\/madoka.denglab.org\/<\/jats:ext-link><\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>MADOKA is an efficient approach to search for protein structure similarity. In addition, we provide a parallel implementation of MADOKA which exploits massive power of multi-core CPUs.<\/jats:p><\/jats:sec>","DOI":"10.1186\/s12859-019-3235-1","type":"journal-article","created":{"date-parts":[[2019,12,24]],"date-time":"2019-12-24T02:02:43Z","timestamp":1577152963000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":23,"title":["MADOKA: an ultra-fast approach for large-scale protein structure similarity searching"],"prefix":"10.1186","volume":"20","author":[{"given":"Lei","family":"Deng","sequence":"first","affiliation":[]},{"given":"Guolun","family":"Zhong","sequence":"additional","affiliation":[]},{"given":"Chenzhe","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Judong","family":"Luo","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7158-913X","authenticated-orcid":false,"given":"Hui","family":"Liu","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2019,12,24]]},"reference":[{"issue":"7421","key":"3235_CR1","doi-asserted-by":"publisher","first-page":"556","DOI":"10.1038\/nature11503","volume":"490","author":"QC Zhang","year":"2012","unstructured":"Zhang QC, Petrey D, Deng L, Qiang L, Shi Y, Thu CA, Bisikirska B, Lefebvre C, Accili D, Hunter T, et al.Structure-based prediction of protein\u2013protein interactions on a genome-wide scale. Nature. 2012; 490(7421):556.","journal-title":"Nature"},{"issue":"4","key":"3235_CR2","doi-asserted-by":"publisher","first-page":"339","DOI":"10.1109\/TNB.2014.2352454","volume":"14","author":"L Wei","year":"2014","unstructured":"Wei L, Liao M, Gao X, Zou Q. An improved protein structural classes prediction method by incorporating both sequence and structure information. IEEE Trans Nanobiosc. 2014; 14(4):339\u201349.","journal-title":"IEEE Trans Nanobiosc"},{"key":"3235_CR3","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1016\/j.sbi.2015.01.007","volume":"32","author":"D Petrey","year":"2015","unstructured":"Petrey D, Chen TS, Deng L, Garzon JI, Hwang H, Lasso G, Lee H, Silkov A, Honig B. Template-based prediction of protein function. Curr Opin Struct Biol. 2015; 32:33\u20138.","journal-title":"Curr Opin Struct Biol"},{"issue":"4","key":"3235_CR4","doi-asserted-by":"publisher","first-page":"902","DOI":"10.1109\/TCBB.2015.2389213","volume":"12","author":"L Deng","year":"2015","unstructured":"Deng L, Chen Z. An integrated framework for functional annotation of protein structural domains. IEEE\/ACM Trans Comput Biol Bioinform (TCBB). 2015; 12(4):902\u201313.","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform (TCBB)"},{"key":"3235_CR5","doi-asserted-by":"publisher","first-page":"18715","DOI":"10.7554\/eLife.18715","volume":"5","author":"JI Garz\u00f3n","year":"2016","unstructured":"Garz\u00f3n JI, Deng L, Murray D, Shapira S, Petrey D, Honig B. A computational interactome and functional annotation for the human proteome. Elife. 2016; 5:18715.","journal-title":"Elife"},{"key":"3235_CR6","first-page":"8","volume":"1","author":"S Minami","year":"2018","unstructured":"Minami S, Sawada K, Ota M, Chikenji G. Mican-sq: A sequential protein structure alignment program that is applicable to monomers and all types of oligomers. Bioinformatics. 2018; 1:8.","journal-title":"Bioinformatics"},{"key":"3235_CR7","doi-asserted-by":"publisher","unstructured":"Zeng C, Zhan W, Deng L. Sdadb: a functional annotation database of protein structural domains. Database. 2018; 2018. https:\/\/doi.org\/10.1093\/database\/bay064.","DOI":"10.1093\/database\/bay064"},{"issue":"1","key":"3235_CR8","doi-asserted-by":"publisher","first-page":"123","DOI":"10.1006\/jmbi.1993.1489","volume":"233","author":"L Holm","year":"1993","unstructured":"Holm L, Sander C. Protein structure comparison by alignment of distance matrices. J Mol Biol. 1993; 233(1):123\u201338.","journal-title":"J Mol Biol"},{"issue":"9","key":"3235_CR9","doi-asserted-by":"publisher","first-page":"739","DOI":"10.1093\/protein\/11.9.739","volume":"11","author":"IN Shindyalov","year":"1998","unstructured":"Shindyalov IN, Bourne PE. Protein structure alignment by incremental combinatorial extension (ce) of the optimal path. Protein Eng. 1998; 11(9):739\u201347.","journal-title":"Protein Eng"},{"issue":"4","key":"3235_CR10","doi-asserted-by":"publisher","first-page":"793","DOI":"10.1016\/j.jmb.2003.10.027","volume":"334","author":"D Kihara","year":"2003","unstructured":"Kihara D, Skolnick J. The pdb is a covering set of small protein structures. J Mol Biol. 2003; 334(4):793.","journal-title":"J Mol Biol"},{"issue":"suppl_2","key":"3235_CR11","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1093\/bioinformatics\/btg1086","volume":"19","author":"Y Ye","year":"2003","unstructured":"Ye Y, Godzik A. Flexible structure alignment by chaining aligned fragment pairs allowing twists. Bioinformatics. 2003; 19(suppl_2):246\u201355.","journal-title":"Bioinformatics"},{"issue":"7","key":"3235_CR12","doi-asserted-by":"publisher","first-page":"2302","DOI":"10.1093\/nar\/gki524","volume":"33","author":"Y Zhang","year":"2005","unstructured":"Zhang Y, Skolnick J. Tm-align: a protein structure alignment algorithm based on the tm-score. Nucleic Acids Res. 2005; 33(7):2302\u20139.","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"3235_CR13","doi-asserted-by":"publisher","first-page":"531","DOI":"10.1186\/1471-2105-9-531","volume":"9","author":"SB Pandit","year":"2008","unstructured":"Pandit SB, Skolnick J. Fr-tm-align: a new protein structural alignment method based on fragment alignments and the tm-score. Bmc Bioinformatics. 2008; 9(1):531.","journal-title":"Bmc Bioinformatics"},{"issue":"3","key":"3235_CR14","doi-asserted-by":"publisher","first-page":"618","DOI":"10.1002\/prot.20331","volume":"58","author":"J Zhu","year":"2005","unstructured":"Zhu J, Weng Z. Fast: a novel protein structure alignment algorithm. Proteins Struct Funct Bioinform. 2005; 58(3):618\u201327.","journal-title":"Proteins Struct Funct Bioinform"},{"key":"3235_CR15","doi-asserted-by":"publisher","first-page":"334","DOI":"10.1007\/978-3-642-38865-1_34","volume-title":"Computer Networks","author":"Dariusz Mrozek","year":"2013","unstructured":"Mrozek D, Ma\u0142ysiak-Mrozek B. Cassert: a two-phase alignment algorithm for matching 3d structures of proteins. In: International Conference on Computer Networks. Springer: 2013. p. 334\u201343. https:\/\/doi.org\/10.1007\/978-3-642-38865-1_34."},{"key":"3235_CR16","doi-asserted-by":"publisher","first-page":"1448","DOI":"10.1038\/srep01448","volume":"3","author":"S Wang","year":"2013","unstructured":"Wang S, Ma J, Peng J, Xu J. Protein structure alignment beyond spatial proximity. Sci Rep. 2013; 3:1448.","journal-title":"Sci Rep"},{"issue":"1","key":"3235_CR17","doi-asserted-by":"publisher","first-page":"617","DOI":"10.1016\/S0076-6879(96)66038-8","volume":"266","author":"CA Orengo","year":"1996","unstructured":"Orengo CA, Taylor WR. Ssap: sequential structure alignment program for protein structure comparison. Methods Enzymol. 1996; 266(1):617\u201335.","journal-title":"Methods Enzymol"},{"issue":"11","key":"3235_CR18","doi-asserted-by":"publisher","first-page":"2606","DOI":"10.1110\/ps.0215902","volume":"11","author":"AR Ortiz","year":"2009","unstructured":"Ortiz AR, Strauss CEM, Olmea O. Mammoth (matching molecular models obtained from theory): An automated method for model comparison. Protein Sci. 2009; 11(11):2606\u201321.","journal-title":"Protein Sci"},{"issue":"15","key":"3235_CR19","doi-asserted-by":"publisher","first-page":"2475","DOI":"10.1093\/bioinformatics\/btv177","volume":"31","author":"Q Zou","year":"2015","unstructured":"Zou Q, Hu Q, Guo M, Wang G. Halign: Fast multiple similar dna\/rna sequence alignment based on the centre star strategy. Bioinformatics. 2015; 31(15):2475\u201381.","journal-title":"Bioinformatics"},{"key":"3235_CR20","doi-asserted-by":"publisher","first-page":"380","DOI":"10.1093\/nar\/gkx1013","volume":"46","author":"R Dong","year":"2018","unstructured":"Dong R, Pan S, Peng Z, Zhang Y, Yang J. mtm-align: a server for fast protein structure database search and multiple protein structure alignment. Nucleic Acids Res. 2018; 46:380\u20136.","journal-title":"Nucleic Acids Res"},{"issue":"4","key":"3235_CR21","doi-asserted-by":"publisher","first-page":"702","DOI":"10.1002\/prot.20264","volume":"57","author":"Y Zhang","year":"2004","unstructured":"Zhang Y, Skolnick J. Scoring function for automated assessment of protein structure template quality. Proteins. 2004; 57(4):702\u201310.","journal-title":"Proteins"},{"issue":"9","key":"3235_CR22","doi-asserted-by":"publisher","first-page":"776","DOI":"10.1093\/bioinformatics\/16.9.776","volume":"16","author":"N Siew","year":"2000","unstructured":"Siew N, Elofsson A, Rychlewski L, Fischer D. Maxsub: an automated measure for the assessment of protein structure prediction quality. Bioinformatics. 2000; 16(9):776\u201385.","journal-title":"Bioinformatics"},{"issue":"17","key":"3235_CR23","doi-asserted-by":"publisher","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","volume":"25","author":"SF Altschul","year":"1997","unstructured":"Altschul SF, Madden TL, Sch\u00e4ffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped blast and psi-blast: a new generation of protein database search programs. Nucleic Acids Res. 1997; 25(17):3389\u2013402.","journal-title":"Nucleic Acids Res"},{"issue":"2","key":"3235_CR24","doi-asserted-by":"publisher","first-page":"2067","DOI":"10.1007\/s00894-014-2067-1","volume":"20","author":"D Mrozek","year":"2014","unstructured":"Mrozek D, Bro\u017bek M, Ma\u0142ysiak-Mrozek B. Parallel implementation of 3d protein structure similarity searches using a gpu and the cuda. J Mol Model. 2014; 20(2):2067.","journal-title":"J Mol Model"},{"issue":"1","key":"3235_CR25","doi-asserted-by":"publisher","first-page":"116","DOI":"10.1186\/1756-0500-5-116","volume":"5","author":"B Pang","year":"2012","unstructured":"Pang B, Zhao N, Becchi M, Korkin D, Shyu C-R. Accelerating large-scale protein structure alignments with graphics processing units. BMC Res Notes. 2012; 5(1):116.","journal-title":"BMC Res Notes"},{"issue":"3","key":"3235_CR26","doi-asserted-by":"publisher","first-page":"665","DOI":"10.1006\/jmbi.2000.3973","volume":"301","author":"A-S Yang","year":"2000","unstructured":"Yang A-S, Honig B. An integrated approach to the analysis and modeling of protein sequences and structures. i. protein structural alignment and a quantitative measure for protein structural distance1. J Mol Biol. 2000; 301(3):665\u201378.","journal-title":"J Mol Biol"},{"issue":"1","key":"3235_CR27","doi-asserted-by":"publisher","first-page":"88","DOI":"10.1107\/S0108767307035623","volume":"64","author":"HM Berman","year":"2008","unstructured":"Berman HM. The protein data bank: a historical perspective. Acta Crystallogr A. 2008; 64(1):88\u201395.","journal-title":"Acta Crystallogr A"},{"issue":"8","key":"3235_CR28","doi-asserted-by":"publisher","first-page":"1093","DOI":"10.1016\/S0969-2126(97)00260-8","volume":"5","author":"CA Orengo","year":"1997","unstructured":"Orengo CA, Michie AD, Jones S, Jones DT, Swindells MB, Thornton JM. Cath \u2013 a hierarchic classification of protein domain structures. Structure. 1997; 5(8):1093\u2013108.","journal-title":"Structure"},{"issue":"4","key":"3235_CR29","doi-asserted-by":"publisher","first-page":"1162","DOI":"10.1002\/prot.21783","volume":"70","author":"H Cheng","year":"2010","unstructured":"Cheng H, Kim BH, Grishin NV. Malidup: a database of manually constructed structure alignments for duplicated domain pairs. Proteins Struct Funct Bioinform. 2010; 70(4):1162\u20136.","journal-title":"Proteins Struct Funct Bioinform"},{"issue":"Database issue","key":"3235_CR30","first-page":"211","volume":"36","author":"H Cheng","year":"2008","unstructured":"Cheng H, Kim BH, Grishin NV. Malisam: a database of structurally analogous motifs in proteins. Nucleic Acids Res. 2008; 36(Database issue):211\u20137.","journal-title":"Nucleic Acids Res"},{"issue":"4","key":"3235_CR31","first-page":"536","volume":"247","author":"AG Murzin","year":"1995","unstructured":"Murzin AG, Brenner SE, Hubbard T, Chothia C. Scop: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol. 1995; 247(4):536\u201340.","journal-title":"J Mol Biol"},{"key":"3235_CR32","unstructured":"Stroustrup B. The C++ Programming Language, 4th Edition; 2013."},{"issue":"3","key":"3235_CR33","first-page":"1448","volume":"3","author":"S Wang","year":"2012","unstructured":"Wang S, Ma J, Peng J, Xu J. Protein structure alignment beyond spatial proximity. Sci Rep. 2012; 3(3):1448.","journal-title":"Sci Rep"},{"issue":"1","key":"3235_CR34","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1371\/journal.pcbi.0040010","volume":"4","author":"M Menke","year":"2008","unstructured":"Menke M, Berger B, Cowen L. Matt: local flexibility aids protein multiple structure alignment. PloS Comput Biol. 2008; 4(1):10.","journal-title":"PloS Comput Biol"},{"issue":"1","key":"3235_CR35","doi-asserted-by":"publisher","first-page":"259","DOI":"10.1186\/1471-2105-13-259","volume":"13","author":"NM Daniels","year":"2012","unstructured":"Daniels NM, Shilpa N, Cowen LJ. Formatt: Correcting protein multiple structural alignments by incorporating sequence alignment. BMC Bioinformatics. 2012; 13(1):259.","journal-title":"BMC Bioinformatics"},{"issue":"3","key":"3235_CR36","doi-asserted-by":"publisher","first-page":"370","DOI":"10.1093\/bioinformatics\/btv580","volume":"32","author":"P Brown","year":"2016","unstructured":"Brown P, Pullan W, Yang Y, Zhou Y. Fast and accurate non-sequential protein structure alignment using a new asymmetric linear sum assignment heuristic. Bioinformatics. 2016; 32(3):370.","journal-title":"Bioinformatics"},{"issue":"3","key":"3235_CR37","doi-asserted-by":"publisher","first-page":"502","DOI":"10.1002\/prot.20106","volume":"56","author":"J Skolnick","year":"2004","unstructured":"Skolnick J, Kihara D, Zhang Y. Development and large scale benchmark testing of the prospector_3 threading algorithm. Proteins-Struct Funct Bioinform. 2004; 56(3):502\u201318.","journal-title":"Proteins-Struct Funct Bioinform"},{"issue":"9","key":"3235_CR38","doi-asserted-by":"publisher","first-page":"1059","DOI":"10.1093\/protein\/7.9.1059","volume":"7","author":"RH Lathrop","year":"1994","unstructured":"Lathrop RH. The protein threading problem with sequence amino acid interaction preferences is np-complete. Protein Eng. 1994; 7(9):1059.","journal-title":"Protein Eng"},{"issue":"13","key":"3235_CR39","doi-asserted-by":"publisher","first-page":"465","DOI":"10.1186\/s12859-017-1879-2","volume":"18","author":"Y Tang","year":"2017","unstructured":"Tang Y, Liu D, Wang Z, Wen T, Deng L. A boosting approach for prediction of protein-rna binding residues. BMC Bioinformatics. 2017; 18(13):465.","journal-title":"BMC Bioinformatics"},{"issue":"3","key":"3235_CR40","doi-asserted-by":"publisher","first-page":"177","DOI":"10.2174\/1389200219666180829121038","volume":"20","author":"N Zheng","year":"2019","unstructured":"Zheng N, Wang K, Zhan W, Deng L. Targeting virus-host protein interactions: Feature extraction and machine learning approaches. Curr Drug Metab. 2019; 20(3):177\u201384.","journal-title":"Curr Drug Metab"},{"issue":"9","key":"3235_CR41","doi-asserted-by":"publisher","first-page":"1473","DOI":"10.1093\/bioinformatics\/btx822","volume":"34","author":"Y Pan","year":"2018","unstructured":"Pan Y, Wang Z, Zhan W, Deng L. Computational identification of binding energy hot spots in protein\u2013rna complexes using an ensemble approach. Bioinformatics. 2018; 34(9):1473\u201380.","journal-title":"Bioinformatics"},{"issue":"1","key":"3235_CR42","doi-asserted-by":"publisher","first-page":"14285","DOI":"10.1038\/s41598-018-32511-1","volume":"8","author":"H Wang","year":"2018","unstructured":"Wang H, Liu C, Deng L. Enhanced prediction of hot spots at protein-protein interfaces using extreme gradient boosting. Sci Rep. 2018; 8(1):14285.","journal-title":"Sci Rep"},{"issue":"12","key":"3235_CR43","doi-asserted-by":"publisher","first-page":"2577","DOI":"10.1002\/bip.360221211","volume":"22","author":"W Kabsch","year":"1983","unstructured":"Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 1983; 22(12):2577\u2013637.","journal-title":"Biopolymers"},{"key":"3235_CR44","unstructured":"Cormen TH, Leiserson CE, Rivest RL, Stein C. Introduction to Algorithms, Third Edition; 2009."},{"issue":"5","key":"3235_CR45","first-page":"922","volume":"32","author":"W Kabsch","year":"1976","unstructured":"Kabsch W. A discussion of the solution for the best rotation to relate two sets of vectors. Acta Crystallogr Section Found Crystallogr. 1976; 32(5):922\u20133.","journal-title":"Acta Crystallogr Section Found Crystallogr"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-019-3235-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s12859-019-3235-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-019-3235-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,24]],"date-time":"2023-09-24T09:57:07Z","timestamp":1695549427000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-019-3235-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,12]]},"references-count":45,"journal-issue":{"issue":"S19","published-print":{"date-parts":[[2019,12]]}},"alternative-id":["3235"],"URL":"https:\/\/doi.org\/10.1186\/s12859-019-3235-1","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,12]]},"assertion":[{"value":"11 November 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 November 2019","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 December 2019","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"662"}}