{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,2]],"date-time":"2026-03-02T22:06:28Z","timestamp":1772489188899,"version":"3.50.1"},"reference-count":48,"publisher":"Oxford University Press (OUP)","issue":"9","license":[{"start":{"date-parts":[[2022,2,18]],"date-time":"2022-02-18T00:00:00Z","timestamp":1645142400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"National Science Foundation grants","award":["DBI-1922969"],"award-info":[{"award-number":["DBI-1922969"]}]},{"name":"National Science Foundation grants","award":["IIS-1908198"],"award-info":[{"award-number":["IIS-1908198"]}]},{"name":"National Science Foundation grants","award":["IIS-1955189"],"award-info":[{"award-number":["IIS-1955189"]}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["#1933212"],"award-info":[{"award-number":["#1933212"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"NSF CAREER","award":["1844403"],"award-info":[{"award-number":["1844403"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,4,28]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Properties of molecules are indicative of their functions and thus are useful in many applications. With the advances of deep-learning methods, computational approaches for predicting molecular properties are gaining increasing momentum. However, there lacks customized and advanced methods and comprehensive tools for this task currently.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Here, we develop a suite of comprehensive machine-learning methods and tools spanning different computational models, molecular representations and loss functions for molecular property prediction and drug discovery. Specifically, we represent molecules as both graphs and sequences. Built on these representations, we develop novel deep models for learning from molecular graphs and sequences. In order to learn effectively from highly imbalanced datasets, we develop advanced loss functions that optimize areas under precision\u2013recall curves (PRCs) and receiver operating characteristic (ROC) curves. Altogether, our work not only serves as a comprehensive tool, but also contributes toward developing novel and advanced graph and sequence-learning methodologies. Results on both online and offline antibiotics discovery and molecular property prediction tasks show that our methods achieve consistent improvements over prior methods. In particular, our methods achieve #1 ranking in terms of both ROC-AUC (area under curve) and PRC-AUC on the AI Cures open challenge for drug discovery related to COVID-19.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>Our source code is released as part of the MoleculeX library (https:\/\/github.com\/divelab\/MoleculeX) under AdvProp.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btac112","type":"journal-article","created":{"date-parts":[[2022,2,16]],"date-time":"2022-02-16T20:13:44Z","timestamp":1645042424000},"page":"2579-2586","source":"Crossref","is-referenced-by-count":84,"title":["Advanced graph and sequence neural networks for molecular property prediction and drug discovery"],"prefix":"10.1093","volume":"38","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5146-2884","authenticated-orcid":false,"given":"Zhengyang","family":"Wang","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, Texas A&M University , College Station, TX 77843, USA"}]},{"given":"Meng","family":"Liu","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Texas A&M University , College Station, TX 77843, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3763-0239","authenticated-orcid":false,"given":"Youzhi","family":"Luo","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Texas A&M University , College Station, TX 77843, USA"}]},{"given":"Zhao","family":"Xu","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Texas A&M University , College Station, TX 77843, USA"}]},{"given":"Yaochen","family":"Xie","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Texas A&M University , College Station, TX 77843, USA"}]},{"given":"Limei","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Texas A&M University , College Station, TX 77843, USA"}]},{"given":"Lei","family":"Cai","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Texas A&M University , College Station, TX 77843, USA"}]},{"given":"Qi","family":"Qi","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Iowa , Iowa City, IA 52242, USA"}]},{"given":"Zhuoning","family":"Yuan","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Iowa , Iowa City, IA 52242, USA"}]},{"given":"Tianbao","family":"Yang","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Iowa , Iowa City, IA 52242, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4205-4563","authenticated-orcid":false,"given":"Shuiwang","family":"Ji","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Texas A&M University , College Station, TX 77843, USA"}]}],"member":"286","published-online":{"date-parts":[[2022,2,18]]},"reference":[{"key":"2023041605542312900_","article-title":"Layer normalization","author":"Ba","year":"2016","journal-title":"arXiv"},{"key":"2023041605542312900_","first-page":"33, 12449\u201312460","article-title":"wav2vec 2.0: a framework for self-supervised learning of speech representations","author":"Baevski","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"136403","DOI":"10.1103\/PhysRevLett.104.136403","article-title":"Gaussian approximation potentials: the accuracy of quantum mechanics, without the electrons","volume":"104","author":"Bart\u00f3k","year":"2010","journal-title":"Phys. Rev. Lett"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"184115","DOI":"10.1103\/PhysRevB.87.184115","article-title":"On representing chemical environments","volume":"87","author":"Bart\u00f3k","year":"2013","journal-title":"Phys. Rev. B"},{"key":"2023041605542312900_","article-title":"Relational inductive biases, deep learning, and graph networks","author":"Battaglia","year":"2018","journal-title":"arXiv"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"146401","DOI":"10.1103\/PhysRevLett.98.146401","article-title":"Generalized neural-network representation of high-dimensional potential-energy surfaces","volume":"98","author":"Behler","year":"2007","journal-title":"Phys. Rev. Lett"},{"key":"2023041605542312900_","article-title":"SMILES enumeration as data augmentation for neural network modeling of molecules","author":"Bjerrum","year":"2017","journal-title":"arXiv"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1080\/1062936X.2011.645874","article-title":"In silico toxicity prediction by support vector machine and smiles representation-based string kernel","volume":"23","author":"Cao","year":"2012","journal-title":"SAR QSAR Environ. Res"},{"key":"2023041605542312900_","author":"Chen","year":"2020"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"1757","DOI":"10.1021\/acs.jcim.6b00601","article-title":"Convolutional embedding of attributed molecular graphs for physical property prediction","volume":"57","author":"Coley","year":"2017","journal-title":"J. Chem. Inf. Model"},{"key":"2023041605542312900_","author":"Cures","year":"2020"},{"key":"2023041605542312900_","first-page":"4171","author":"Devlin","year":"2019"},{"key":"2023041605542312900_","first-page":"2224","author":"Duvenaud","year":"2015"},{"key":"2023041605542312900_","article-title":"Benchmarking graph neural networks","author":"Dwivedi","year":"2020","journal-title":"arXiv"},{"key":"2023041605542312900_","author":"Fey","year":"2020"},{"key":"2023041605542312900_","first-page":"1263","author":"Gilmer","year":"2017"},{"key":"2023041605542312900_","first-page":"1735","author":"Hadsell","year":"2006"},{"key":"2023041605542312900_","first-page":"770","author":"He","year":"2016"},{"key":"2023041605542312900_","first-page":"9729","author":"He","year":"2020"},{"key":"2023041605542312900_","article-title":"Gaussian error linear units (GELUs)","author":"Hendrycks","year":"2016","journal-title":"arXiv"},{"key":"2023041605542312900_","article-title":"SMILES Transformer: pre-trained molecular fingerprint for low data drug discovery","author":"Honda","year":"2019","journal-title":"arXiv"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"241745","DOI":"10.1063\/1.5024797","article-title":"Predicting molecular properties with covariant compositional networks","volume":"148","author":"Hy","year":"2018","journal-title":"J. Chem. Phys"},{"key":"2023041605542312900_","first-page":"448","author":"Ioffe","year":"2015"},{"key":"2023041605542312900_","first-page":"2323","author":"Jin","year":"2018"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"595","DOI":"10.1007\/s10822-016-9938-8","article-title":"Molecular graph convolutions: moving beyond fingerprints","volume":"30","author":"Kearnes","year":"2016","journal-title":"J. Comput. Aided Mol. Des"},{"key":"2023041605542312900_","author":"Landrum","year":"2006"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"2023041605542312900_","first-page":"419","article-title":"Text classification using string kernels","volume":"2","author":"Lodhi","year":"2002","journal-title":"J. Mach. Learn. Res"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1021\/ci500747n","article-title":"Deep neural nets as a method for quantitative structure\u2013activity relationships","volume":"55","author":"Ma","year":"2015","journal-title":"J. Chem. Inf. Model"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"5441","DOI":"10.1039\/C8SC00148K","article-title":"Large-scale comparison of machine learning methods for drug target prediction on ChEMBL","volume":"9","author":"Mayr","year":"2018","journal-title":"Chem. Sci"},{"key":"2023041605542312900_","first-page":"145","author":"Neglur","year":"2005"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"1253","DOI":"10.1073\/pnas.1219097111","article-title":"Ranking and combining multiple predictors without labeled data","volume":"111","author":"Parisi","year":"2014","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023041605542312900_","article-title":"Stochastic optimization of area under precision-recall curve for deep learning with provable convergence","author":"Qi","year":"2021"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1023\/A:1008068904628","article-title":"Feature trees: a new molecular similarity measure based on tree matching","volume":"12","author":"Rarey","year":"1998","journal-title":"J. Comput. Aided Mol. Des"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1016\/S0079-6107(98)00026-1","article-title":"Artificial neural networks for computer-based molecular design","volume":"70","author":"Schneider","year":"1998","journal-title":"Prog. Biophys. Mol. Biol"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"13890","DOI":"10.1038\/ncomms13890","article-title":"Quantum-chemical insights from deep tensor neural networks","volume":"8","author":"Sch\u00fctt","year":"2017","journal-title":"Nat. Commun"},{"key":"2023041605542312900_","first-page":"30","author":"Shaham","year":"2016"},{"key":"2023041605542312900_","first-page":"2539","article-title":"Weisfeiler-Lehman graph kernels","volume":"12","author":"Shervashidze","year":"2011","journal-title":"J. Mach. Learn. Res"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"688","DOI":"10.1016\/j.cell.2020.01.021","article-title":"A deep learning approach to antibiotic discovery","volume":"180","author":"Stokes","year":"2020","journal-title":"Cell"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"3678","DOI":"10.1021\/acs.jctc.9b00181","article-title":"PhysNet: a neural network for predicting energies, forces, dipole moments, and partial charges","volume":"15","author":"Unke","year":"2019","journal-title":"J. Chem. Theory Comput"},{"key":"2023041605542312900_","first-page":"1","author":"Unterthiner","year":"2014"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"1413","DOI":"10.1021\/ci200409x","article-title":"Machine learning methods for property prediction in chemoinformatics: Quo Vadis?","volume":"52","author":"Varnek","year":"2012","journal-title":"J. Chem. Inf. Model"},{"key":"2023041605542312900_","first-page":"5998","author":"Vaswani","year":"2017"},{"key":"2023041605542312900_","first-page":"429","author":"Wang","year":"2019"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1021\/ci00057a005","article-title":"SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules","volume":"28","author":"Weininger","year":"1988","journal-title":"J. Chem. Inf. Comput. Sci"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"513","DOI":"10.1039\/C7SC02664A","article-title":"MoleculeNet: a benchmark for molecular machine learning","volume":"9","author":"Wu","year":"2018","journal-title":"Chem. Sci"},{"key":"2023041605542312900_","doi-asserted-by":"crossref","first-page":"3370","DOI":"10.1021\/acs.jcim.9b00237","article-title":"Analyzing learned molecular representations for property prediction","volume":"59","author":"Yang","year":"2019","journal-title":"J. Chem. Inf. Model"},{"key":"2023041605542312900_","author":"Yuan","year":"2021"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btac112\/42628951\/btac112.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/9\/2579\/49874063\/btac112.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/9\/2579\/49874063\/btac112.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,18]],"date-time":"2024-09-18T19:52:21Z","timestamp":1726689141000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/38\/9\/2579\/6531963"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2022,2,18]]},"references-count":48,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2022,4,28]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btac112","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,5,1]]},"published":{"date-parts":[[2022,2,18]]}}}