{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T11:43:40Z","timestamp":1753875820076,"version":"3.41.2"},"reference-count":66,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2025,3,31]],"date-time":"2025-03-31T00:00:00Z","timestamp":1743379200000},"content-version":"vor","delay-in-days":30,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62206192"],"award-info":[{"award-number":["62206192"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Major Scientific Instruments and Equipments Development Project of National Natural Science Foundation of China","award":["62427820"],"award-info":[{"award-number":["62427820"]}]},{"DOI":"10.13039\/501100018542","name":"Natural Science Foundation of Sichuan Province","doi-asserted-by":"publisher","award":["2023NSFSC1408","2024NSFTD0048"],"award-info":[{"award-number":["2023NSFSC1408","2024NSFTD0048"]}],"id":[{"id":"10.13039\/501100018542","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"publisher","award":["1082204112364"],"award-info":[{"award-number":["1082204112364"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Sichuan Province Engineering Technology Research Center of Broadband Electronics Intelligent Manufacturing"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,3,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Structure-based drug design aims to generate molecules that fill the cavity of the protein pocket with a high binding 
affinity. Many contemporary studies employ sequential generative models. Their standard training method is to sequentialize molecular graphs into ordered sequences and then maximize the likelihood of the resulting sequences. However, the exact likelihood is computationally intractable because it involves a sum over all possible sequential orders: molecular graphs lack an inherent order, and the number of orders is factorial in the graph size. To avoid the intractable full space of factorially many orders, existing works predefine a fixed node ordering scheme, such as depth-first search, to sequentialize the 3D molecular graphs. In these cases, the training objectives are loose lower bounds of the exact likelihoods, which are suboptimal for generation. To address these challenges, we propose a unified generative framework named MolEM to learn the 3D molecular graphs and corresponding sequential orders jointly. We derive a tight lower bound of the likelihood and maximize it via a variational expectation-maximization algorithm, opening a new line of research in learning-based ordering schemes for 3D molecular graph generation. In addition, we first incorporate the molecular docking method QuickVina 2 to manipulate the binding poses, leading to accurate and flexible ligand conformations. Experimental results demonstrate that MolEM significantly outperforms baseline models in generating molecules with high binding affinities and realistic structures. 
Our approach efficiently approximates the true marginal graph likelihood and identifies reasonable orderings for 3D molecular graphs, aligning well with relevant chemical priors.<\/jats:p>","DOI":"10.1093\/bib\/bbaf094","type":"journal-article","created":{"date-parts":[[2025,3,31]],"date-time":"2025-03-31T22:21:39Z","timestamp":1743459699000},"source":"Crossref","is-referenced-by-count":0,"title":["MolEM: a unified generative framework for molecular graphs and sequential orders"],"prefix":"10.1093","volume":"26","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-4760-2338","authenticated-orcid":false,"given":"Hanwen","family":"Zhang","sequence":"first","affiliation":[{"name":"College of Computer Science, Sichuan University , No.24 South Section 1, Yihuan Road, Chengdu 610065 ,","place":["China"]},{"name":"Engineering Research Center of Machine Learning and Industry Intelligence , Ministry of Education, No.24 South Section 1, Yihuan Road, Chengdu 610065 ,","place":["China"]}]},{"given":"Deng","family":"Xiong","sequence":"additional","affiliation":[{"name":"Department of Mechanical Engineering , Stevens Institute of Technology, 1 Castle Point Terrace, Hoboken, NJ 07030 ,","place":["United States"]}]},{"given":"Xianggen","family":"Liu","sequence":"additional","affiliation":[{"name":"College of Computer Science, Sichuan University , No.24 South Section 1, Yihuan Road, Chengdu 610065 ,","place":["China"]},{"name":"Engineering Research Center of Machine Learning and Industry Intelligence , Ministry of Education, No.24 South Section 1, Yihuan Road, Chengdu 610065 ,","place":["China"]}]},{"given":"Jiancheng","family":"Lv","sequence":"additional","affiliation":[{"name":"College of Computer Science, Sichuan University , No.24 South Section 1, Yihuan Road, Chengdu 610065 ,","place":["China"]},{"name":"Engineering Research Center of Machine Learning and Industry Intelligence , Ministry of Education, No.24 South Section 1, Yihuan Road, Chengdu 610065 
,","place":["China"]}]}],"member":"286","published-online":{"date-parts":[[2025,3,31]]},"reference":[{"key":"2025033118431917300_ref1","doi-asserted-by":"publisher","first-page":"bbab072","DOI":"10.1093\/bib\/bbab072","article-title":"DeepDTAF: a deep learning method to predict protein\u2013ligand binding affinity","volume":"22","author":"Wang","year":"2021","journal-title":"Brief Bioinform"},{"key":"2025033118431917300_ref2","doi-asserted-by":"publisher","first-page":"1033","DOI":"10.1038\/s42256-021-00409-9","article-title":"A geometric deep learning approach to predict binding conformations of bioactive molecules","volume":"3","author":"M\u00e9ndez-Lucio","year":"2021","journal-title":"Nat Mach Intell"},{"key":"2025033118431917300_ref3","doi-asserted-by":"publisher","first-page":"10565","DOI":"10.1080\/07391102.2023.2257800","article-title":"The binding mechanism of failed, in processing and succeed inhibitors target SARS-CoV-2 main protease","volume":"42","author":"Hongyu","year":"2023","journal-title":"J Biomol Struct Dyn"},{"author":"Zhang","key":"2025033118431917300_ref4","article-title":"A survey on graph diffusion models: generative AI in science for molecule, protein and material"},{"key":"2025033118431917300_ref5","doi-asserted-by":"publisher","first-page":"12166","DOI":"10.1039\/D3SC04091G","article-title":"A flexible data-free framework for structure-based de novo drug design with reinforcement learning","volume":"14","author":"Hongyan","year":"2023","journal-title":"Chem Sci"},{"key":"2025033118431917300_ref6","article-title":"3D molecular generation via virtual dynamics","volume":"2024","author":"Shuqi","year":"2024","journal-title":"Transact Mach Learn Res"},{"article-title":"MolCRAFT: structure-based drug design in continuous parameter space","volume-title":"Proceedings of the 41st International Conference on Machine Learning, volume 235 of Proceedings of Machine Learning 
Research","author":"Yanru","key":"2025033118431917300_ref7"},{"author":"Masuda","key":"2025033118431917300_ref8","article-title":"Generating 3D molecular structures conditional on a receptor binding site with deep generative models"},{"author":"Ragoza","key":"2025033118431917300_ref9","article-title":"Learning a continuous representation of 3D molecular structures with deep generative models"},{"key":"2025033118431917300_ref10","doi-asserted-by":"publisher","first-page":"2701","DOI":"10.1039\/D1SC05976A","article-title":"Generating 3D molecules conditional on receptor binding sites with deep generative models","volume":"13","author":"Ragoza","year":"2022","journal-title":"Chem Sci"},{"article-title":"Structure-based drug design with equivariant diffusion models","author":"Schneuing","key":"2025033118431917300_ref11","doi-asserted-by":"crossref","DOI":"10.1038\/s43588-024-00737-x"},{"key":"2025033118431917300_ref12","article-title":"3D equivariant diffusion for target-aware molecule generation and affinity prediction","volume-title":"International Conference on Learning Representations","author":"Guan","year":"2023"},{"key":"2025033118431917300_ref13","doi-asserted-by":"publisher","first-page":"2657","DOI":"10.1038\/s41467-024-46569-1","article-title":"A dual diffusion model enables 3D molecule generation and lead optimization based on target pockets","volume":"15","author":"Huang","year":"2024","journal-title":"Nat Commun"},{"key":"2025033118431917300_ref14","article-title":"Overcoming order in autoregressive graph generation","volume":"2024","author":"Cohen-Karlik","year":"2024","journal-title":"Transact Mach Learn Res"},{"key":"2025033118431917300_ref15","doi-asserted-by":"publisher","first-page":"417","DOI":"10.1038\/s42256-024-00815-9","article-title":"Equivariant 3D-conditional diffusion model for molecular linker design","volume":"6","author":"Igashov","year":"2024","journal-title":"Nat Mach 
Intell"},{"key":"2025033118431917300_ref16","first-page":"17391","article-title":"Autoregressive diffusion model for graph generation","volume-title":"International Conference on Machine Learning","author":"Kong","year":"2023"},{"author":"Irwin","key":"2025033118431917300_ref17","article-title":"Efficient 3D molecular generation with flow matching and scale optimal transport"},{"key":"2025033118431917300_ref18","first-page":"1630","article-title":"Order matters: probabilistic modeling of node sequence for graph generation","volume-title":"International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research","author":"Chen","year":"2021"},{"key":"2025033118431917300_ref19","article-title":"Autoregressive diffusion models with non-uniform generation order","volume-title":"International Conference on Machine Learning","author":"Kelvinius","year":"2023"},{"author":"Li","key":"2025033118431917300_ref20","article-title":"Learning deep generative models of graphs"},{"key":"2025033118431917300_ref21","doi-asserted-by":"crossref","first-page":"1253","DOI":"10.1145\/3366423.3380201","article-title":"GraphGen: a scalable approach to domain-agnostic labeled graph generation","volume-title":"Proceedings of the Web Conference 2020","author":"Goyal","year":"2020"},{"key":"2025033118431917300_ref22","first-page":"9559","article-title":"Permutation-invariant variational autoencoder for graph-level representation learning","volume":"34","author":"Winter","year":"2021","journal-title":"Adv Neural Inf Process Syst"},{"key":"2025033118431917300_ref23","doi-asserted-by":"publisher","first-page":"102548","DOI":"10.1016\/j.sbi.2023.102548","article-title":"Structure-based drug design with geometric deep learning","volume":"79","author":"Isert","year":"2023","journal-title":"Curr Opin Struct Biol"},{"key":"2025033118431917300_ref24","doi-asserted-by":"publisher","first-page":"13664","DOI":"10.1039\/D1SC04444C","article-title":"Structure-based de novo drug 
design using 3D deep generative models","volume":"12","author":"Li","year":"2021","journal-title":"Chem Sci"},{"key":"2025033118431917300_ref25","first-page":"17644","article-title":"Pocket2Mol: efficient molecular sampling based on 3D protein pockets","volume-title":"International Conference on Machine Learning","author":"Peng","year":"2022"},{"key":"2025033118431917300_ref26","doi-asserted-by":"publisher","first-page":"849","DOI":"10.1038\/s43588-023-00530-2","article-title":"Learning on topological surface and geometric structure for 3D molecular generation","volume":"3","author":"Zhang","year":"2023","journal-title":"Nat Comput Sci"},{"key":"2025033118431917300_ref27","first-page":"1","article-title":"Fitting autoregressive graph generative models through maximum likelihood estimation","volume":"24","author":"Han","year":"2023","journal-title":"J Mach Learn Res"},{"key":"2025033118431917300_ref28","doi-asserted-by":"publisher","DOI":"10.1038\/s41467-024-47011-2","article-title":"3D molecular generative framework for interaction-guided drug design","volume":"15","author":"Zhung","year":"2024","journal-title":"Nat Commun"},{"key":"2025033118431917300_ref29","doi-asserted-by":"publisher","first-page":"326","DOI":"10.1038\/s42256-024-00808-8","article-title":"Pocketflow is a data-and-knowledge-driven structure-based molecular generative model","volume":"6","author":"Jiang","year":"2024","journal-title":"Nat Mach Intell"},{"key":"2025033118431917300_ref30","doi-asserted-by":"publisher","first-page":"18209","DOI":"10.1021\/acs.jmedchem.1c01830","article-title":"InteractionGraphNet: a novel and efficient deep graph representation learning framework for accurate protein\u2013ligand interaction predictions","volume":"64","author":"Jiang","year":"2021","journal-title":"J Med Chem"},{"key":"2025033118431917300_ref31","doi-asserted-by":"publisher","DOI":"10.1093\/bib\/bbad323","article-title":"3D based generative protac linker design with reinforcement 
learning","volume":"24","author":"Li","year":"2023","journal-title":"Brief Bioinform"},{"key":"2025033118431917300_ref32","doi-asserted-by":"publisher","first-page":"62","DOI":"10.1038\/s42256-023-00775-6","article-title":"Generation of 3D molecules in pockets via a language model","volume":"6","author":"Feng","year":"2024","journal-title":"Nat Mach Intell"},{"key":"2025033118431917300_ref33","first-page":"24240","article-title":"Torsional diffusion for molecular conformer generation","volume":"35","author":"Jing","year":"2022","journal-title":"Adv Neural Inf Process Syst"},{"key":"2025033118431917300_ref34","first-page":"41382","article-title":"Learning subpocket prototypes for generalizable structure-based drug design","volume-title":"International Conference on Machine Learning","author":"Zhang","year":"2023"},{"key":"2025033118431917300_ref35","first-page":"23894","article-title":"Zero-shot 3D drug design by sketching and generating","volume":"35","author":"Long","year":"2022","journal-title":"Adv Neural Inf Process Syst"},{"author":"Yuejiang","key":"2025033118431917300_ref36","article-title":"Do deep learning models really outperform traditional approaches in molecular docking?"},{"key":"2025033118431917300_ref37","doi-asserted-by":"publisher","first-page":"789","DOI":"10.1038\/s43588-023-00511-5","article-title":"Efficient and accurate large library ligand docking with KarmaDock","volume":"3","author":"Zhang","year":"2023","journal-title":"Nat Comput Sci"},{"key":"2025033118431917300_ref38","doi-asserted-by":"publisher","first-page":"7926","DOI":"10.1039\/D3SC06803J","article-title":"DiffBindFR: an SE (3) equivariant network for flexible protein\u2013ligand docking","volume":"15","author":"Zhu","year":"2024","journal-title":"Chem Sci"},{"key":"2025033118431917300_ref39","doi-asserted-by":"publisher","first-page":"2785","DOI":"10.1002\/jcc.21256","article-title":"AutoDock4 and AutoDockTools4: automated docking with selective receptor 
flexibility","volume":"30","author":"Morris","year":"2009","journal-title":"J Comput Chem"},{"key":"2025033118431917300_ref40","doi-asserted-by":"publisher","first-page":"3891","DOI":"10.1021\/acs.jcim.1c00203","article-title":"AutoDock Vina 1.2. 0: new docking methods, expanded force field, and Python bindings","volume":"61","author":"Eberhardt","year":"2021","journal-title":"J Chem Inf Model"},{"key":"2025033118431917300_ref41","doi-asserted-by":"publisher","first-page":"2214","DOI":"10.1093\/bioinformatics\/btv082","article-title":"Fast, accurate, and reliable molecular docking with QuickVina 2","volume":"31","author":"Alhossary","year":"2015","journal-title":"Bioinformatics"},{"key":"2025033118431917300_ref42","doi-asserted-by":"publisher","first-page":"1739","DOI":"10.1021\/jm0306430","article-title":"Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy","volume":"47","author":"Friesner","year":"2004","journal-title":"J Med Chem"},{"key":"2025033118431917300_ref43","doi-asserted-by":"publisher","first-page":"727","DOI":"10.1006\/jmbi.1996.0897","article-title":"Development and validation of a genetic algorithm for flexible docking","volume":"267","author":"Jones","year":"1997","journal-title":"J Mol Biol"},{"key":"2025033118431917300_ref44","article-title":"DiffDock: diffusion steps, twists, and turns for molecular docking","volume-title":"International Conference on Learning Representations","author":"Corso","year":"2023"},{"key":"2025033118431917300_ref45","doi-asserted-by":"publisher","first-page":"4642","DOI":"10.1021\/acs.jcim.2c01057","article-title":"DENVIS: scalable and high-throughput virtual screening using graph neural networks with atomic and surface protein pocket features","volume":"62","author":"Krasoulis","year":"2022","journal-title":"J Chem Inf Model"},{"author":"Zhou","key":"2025033118431917300_ref46","article-title":"Do deep learning methods really perform better in molecular 
conformation generation?"},{"key":"2025033118431917300_ref47","doi-asserted-by":"publisher","DOI":"10.1093\/bib\/bbac520","article-title":"A fully differentiable ligand pose optimization framework guided by deep learning and a traditional scoring function","volume":"24","author":"Wang","year":"2023","journal-title":"Brief Bioinform"},{"key":"2025033118431917300_ref48","doi-asserted-by":"publisher","first-page":"4536","DOI":"10.1038\/s41467-024-48837-6","article-title":"Structure prediction of protein-ligand complexes from sequence information with Umol","volume":"15","author":"Bryant","year":"2024","journal-title":"Nat Commun"},{"key":"2025033118431917300_ref49","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","article-title":"Maximum likelihood from incomplete data via the EM algorithm","volume":"39","author":"Dempster","year":"1977","journal-title":"J R Stat Soc B Methodol"},{"key":"2025033118431917300_ref50","doi-asserted-by":"publisher","first-page":"2257","DOI":"10.1021\/acscentsci.3c00572","volume-title":"Geometric Deep Learning for Structure-Based Ligand Design","author":"Powers","year":"2023"},{"key":"2025033118431917300_ref51","doi-asserted-by":"publisher","first-page":"103439","DOI":"10.1016\/j.drudis.2022.103439","article-title":"Docking-based generative approaches in the search for new drug candidates","volume":"28","author":"Danel","year":"2023","journal-title":"Drug Discov Today"},{"volume-title":"Markov Chains. 
Cambridge Series in Statistical and Probabilistic Mathematics","year":"1998","author":"Norris","key":"2025033118431917300_ref52"},{"author":"Borman","key":"2025033118431917300_ref53","article-title":"The expectation maximization algorithm-a short tutorial"},{"key":"2025033118431917300_ref54","first-page":"5241","article-title":"GMNN: graph Markov neural networks","volume-title":"International Conference on Machine Learning","author":"Meng","year":"2019"},{"key":"2025033118431917300_ref55","first-page":"9249","article-title":"An EM approach to non-autoregressive conditional sequence generation","volume-title":"International Conference on Machine Learning","author":"Sun","year":"2020"},{"key":"2025033118431917300_ref56","first-page":"3218","article-title":"VigDet: knowledge informed neural temporal point process for coordination detection on social media","volume":"34","author":"Zhang","year":"2021","journal-title":"Adv Neural Inf Process Syst"},{"author":"Min","key":"2025033118431917300_ref57","article-title":"Transformer for graphs: an overview from architecture perspective"},{"key":"2025033118431917300_ref58","article-title":"Molecule generation for target protein binding with structural motifs","volume-title":"International Conference on Learning Representations","author":"Zhang","year":"2023"},{"key":"2025033118431917300_ref59","doi-asserted-by":"publisher","first-page":"4200","DOI":"10.1021\/acs.jcim.0c00411","article-title":"Three-dimensional convolutional neural networks and a cross-docked data set for structure-based drug design","volume":"60","author":"Francoeur","year":"2020","journal-title":"J Chem Inf Model"},{"key":"2025033118431917300_ref60","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1021\/cc9800071","article-title":"A knowledge-based approach in designing combinatorial or medicinal chemistry libraries for drug discovery. 1. 
A qualitative and quantitative characterization of known drug databases","volume":"1","author":"Ghose","year":"1999","journal-title":"J Comb Chem"},{"key":"2025033118431917300_ref61","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1016\/j.addr.2012.09.019","article-title":"Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings","volume":"64","author":"Lipinski","year":"2012","journal-title":"Adv Drug Deliv Rev"},{"key":"2025033118431917300_ref62","doi-asserted-by":"publisher","first-page":"2615","DOI":"10.1021\/jm020017n","article-title":"Molecular properties that influence the oral bioavailability of drug candidates","volume":"45","author":"Veber","year":"2002","journal-title":"J Med Chem"},{"key":"2025033118431917300_ref63","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-015-0069-3","article-title":"Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations?","volume":"7","author":"Bajusz","year":"2015","journal-title":"J Chem"},{"volume-title":"Elementary Mathematical Theory of Classification and Prediction","year":"1958","author":"Tanimoto","key":"2025033118431917300_ref64"},{"key":"2025033118431917300_ref65","doi-asserted-by":"publisher","first-page":"6891","DOI":"10.1038\/s41467-022-34692-w","article-title":"Generative deep learning enables the discovery of a potent and selective RIPK1 inhibitor","volume":"13","author":"Li","year":"2022","journal-title":"Nat Commun"},{"key":"2025033118431917300_ref66","doi-asserted-by":"crossref","DOI":"10.1038\/s43588-024-00627-2","article-title":"MISATO: machine learning dataset of protein\u2013ligand complexes for structure-based drug discovery","volume":"4","author":"Siebenmorgen","year":"2024","journal-title":"Nat Comput Sci"}],"container-title":["Briefings in 
Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/26\/2\/bbaf094\/62820177\/bbaf094.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/26\/2\/bbaf094\/62820177\/bbaf094.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,31]],"date-time":"2025-03-31T22:23:18Z","timestamp":1743459798000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbaf094\/8101319"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3]]},"references-count":66,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2025,3,4]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbaf094","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"type":"print","value":"1467-5463"},{"type":"electronic","value":"1477-4054"}],"subject":[],"published-other":{"date-parts":[[2025,3]]},"published":{"date-parts":[[2025,3]]},"article-number":"bbaf094"}}