{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T23:45:29Z","timestamp":1776123929825,"version":"3.50.1"},"reference-count":26,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2021,10,13]],"date-time":"2021-10-13T00:00:00Z","timestamp":1634083200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Project supported by Shanghai Municipal Science and Technology Major Project"},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["81773634"],"award-info":[{"award-number":["81773634"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Tencent AI Lab Rhino-Bird Focused Research Program","award":["JR202002"],"award-info":[{"award-number":["JR202002"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,1,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>The acid dissociation constant (pKa) is a critical parameter to reflect the ionization ability of chemical compounds and is widely applied in a variety of industries. However, the experimental determination of pKa is intricate and time-consuming, especially for the exact determination of micro-pKa information at the atomic level. Hence, a fast and accurate prediction of pKa values of chemical compounds is of broad interest.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Here, we compiled a large-scale pKa dataset containing 16\u00a0595 compounds with 17\u00a0489 pKa values. Based on this dataset, a novel pKa prediction model, named Graph-pKa, was established using graph neural networks. Graph-pKa performed well on the prediction of macro-pKa values, with a mean absolute error around 0.55 and a coefficient of determination around 0.92 on the test dataset. Furthermore, combining multi-instance learning, Graph-pKa was also able to automatically deconvolute the predicted macro-pKa into discrete micro-pKa values.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The Graph-pKa model is now freely accessible via a web-based interface (https:\/\/pka.simm.ac.cn\/).<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab714","type":"journal-article","created":{"date-parts":[[2021,10,11]],"date-time":"2021-10-11T14:36:22Z","timestamp":1633962982000},"page":"792-798","source":"Crossref","is-referenced-by-count":43,"title":["Multi-instance learning of graph neural networks for aqueous p<i>K<\/i>a prediction"],"prefix":"10.1093","volume":"38","author":[{"given":"Jiacheng","family":"Xiong","sequence":"first","affiliation":[{"name":"Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences , Shanghai 201203, China"},{"name":"College of Pharmacy, University of Chinese Academy of Sciences , Beijing 100049, China"}]},{"given":"Zhaojun","family":"Li","sequence":"additional","affiliation":[{"name":"Development Department, Suzhou Alphama Biotechnology Co., Ltd , Suzhou City 215000, China"}]},{"given":"Guangchao","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Computer and Information Engineering, Dezhou University , Dezhou City 253023, China"}]},{"given":"Zunyun","family":"Fu","sequence":"additional","affiliation":[{"name":"Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences , Shanghai 201203, China"}]},{"given":"Feisheng","family":"Zhong","sequence":"additional","affiliation":[{"name":"Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences , Shanghai 201203, China"},{"name":"College of Pharmacy, University of Chinese Academy of Sciences , Beijing 100049, China"}]},{"given":"Tingyang","family":"Xu","sequence":"additional","affiliation":[{"name":"Tencent AI Lab , Tencent, Shenzhen 518057, China"}]},{"given":"Xiaomeng","family":"Liu","sequence":"additional","affiliation":[{"name":"Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences , Shanghai 201203, China"},{"name":"College of Pharmacy, University of Chinese Academy of Sciences , Beijing 100049, China"}]},{"given":"Ziming","family":"Huang","sequence":"additional","affiliation":[{"name":"Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences , Shanghai 201203, China"},{"name":"College of Pharmacy, University of Chinese Academy of Sciences , Beijing 100049, China"}]},{"given":"Xiaohong","family":"Liu","sequence":"additional","affiliation":[{"name":"Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences , Shanghai 201203, China"},{"name":"Development Department, Suzhou Alphama Biotechnology Co., Ltd , Suzhou City 215000, China"},{"name":"Shanghai Institute for Advanced Immunochemical Studies, and School of Life Science and Technology, ShanghaiTech University , Shanghai 200031, China"}]},{"given":"Kaixian","family":"Chen","sequence":"additional","affiliation":[{"name":"Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences , Shanghai 201203, China"},{"name":"College of Pharmacy, University of Chinese Academy of Sciences , Beijing 100049, China"}]},{"given":"Hualiang","family":"Jiang","sequence":"additional","affiliation":[{"name":"Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences , Shanghai 201203, China"},{"name":"College of Pharmacy, University of Chinese Academy of Sciences , Beijing 100049, China"},{"name":"Shanghai Institute for Advanced Immunochemical Studies, and School of Life Science and Technology, ShanghaiTech University , Shanghai 200031, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3323-3092","authenticated-orcid":false,"given":"Mingyue","family":"Zheng","sequence":"additional","affiliation":[{"name":"Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences , Shanghai 201203, China"},{"name":"College of Pharmacy, University of Chinese Academy of Sciences , Beijing 100049, China"}]}],"member":"286","published-online":{"date-parts":[[2021,10,13]]},"reference":[{"key":"2023020108473633400_btab714-B1","author":"Bartmess","year":"2010"},{"key":"2023020108473633400_btab714-B2","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1016\/j.patcog.2017.10.009","article-title":"Multiple instance learning: a survey of problem characteristics and applications","volume":"77","author":"Carbonneau","year":"2018","journal-title":"Pattern Recognit"},{"key":"2023020108473633400_btab714-B3","doi-asserted-by":"crossref","first-page":"9701","DOI":"10.1021\/jm501000a","article-title":"Acidic and basic drugs in medicinal chemistry: a perspective","volume":"57","author":"Charifson","year":"2014","journal-title":"J. Med. Chem"},{"key":"2023020108473633400_btab714-B4","first-page":"3844","author":"Defferrard","year":"2016"},{"key":"2023020108473633400_btab714-B5","first-page":"2224","author":"Duvenaud","year":"2015"},{"key":"2023020108473633400_btab714-B6","first-page":"1050","author":"Gal","year":"2016"},{"key":"2023020108473633400_btab714-B7","doi-asserted-by":"crossref","first-page":"D945","DOI":"10.1093\/nar\/gkw1074","article-title":"The ChEMBL database in 2017","volume":"45","author":"Gaulton","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023020108473633400_btab714-B8","doi-asserted-by":"crossref","first-page":"665","DOI":"10.1038\/s42256-020-00257-z","article-title":"Shortcut learning in deep neural networks","volume":"2","author":"Geirhos","year":"2020","journal-title":"Nat. Mach. Intell"},{"key":"2023020108473633400_btab714-B9","author":"Gilmer","year":"2017"},{"key":"2023020108473633400_btab714-B10","doi-asserted-by":"crossref","first-page":"2989","DOI":"10.1021\/acs.jcim.0c00105","article-title":"Predicting pKa using a combination of semi-empirical quantum mechanics and radial basis function methods","volume":"60","author":"Hunt","year":"2020","journal-title":"J. Chem Inf. Model"},{"key":"2023020108473633400_btab714-B11","doi-asserted-by":"crossref","first-page":"1117","DOI":"10.1007\/s10822-018-0168-0","article-title":"pka measurements for the sampl6 prediction challenge for a set of kinase inhibitor-like fragments","volume":"32","author":"I\u015f\u0131k","year":"2018","journal-title":"J. Comput. Aided Mol. Des"},{"key":"2023020108473633400_btab714-B12","first-page":"25","article-title":"The pKa distribution of drugs: application to drug discovery","volume":"1","author":"Manallack","year":"2007","journal-title":"Perspect. Med. Chem"},{"key":"2023020108473633400_btab714-B13","doi-asserted-by":"crossref","first-page":"485","DOI":"10.1039\/C2CS35348B","article-title":"The significance of acid\/base properties in drug discovery","volume":"42","author":"Manallack","year":"2013","journal-title":"Chem. Soc. Rev"},{"key":"2023020108473633400_btab714-B14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13321-019-0384-1","article-title":"Open-source QSAR models for pKa prediction using multiple machine learning approaches","volume":"11","author":"Mansouri","year":"2019","journal-title":"J. Cheminf"},{"key":"2023020108473633400_btab714-B15","author":"Niepert","year":"2016"},{"key":"2023020108473633400_btab714-B16","doi-asserted-by":"crossref","first-page":"17142","DOI":"10.1021\/jacs.9b05895","article-title":"Rapid and accurate prediction of p K a values of C\u2013H acids using graph convolutional neural networks","volume":"141","author":"Roszak","year":"2019","journal-title":"J. Am. Chem. Soc"},{"key":"2023020108473633400_btab714-B17","doi-asserted-by":"crossref","first-page":"307","DOI":"10.2174\/138620711795508403","article-title":"Predicting the pKa of small molecules","volume":"14","author":"Rupp","year":"2011","journal-title":"Comb. Chem. High Throughput Screen"},{"key":"2023020108473633400_btab714-B18","doi-asserted-by":"crossref","first-page":"919","DOI":"10.1093\/bib\/bbz042","article-title":"Graph convolutional networks for computational drug development and discovery","volume":"21","author":"Sun","year":"2020","journal-title":"Brief. Bioinform"},{"key":"2023020108473633400_btab714-B19","doi-asserted-by":"crossref","first-page":"101549","DOI":"10.1016\/j.media.2019.101549","article-title":"RMDL: recalibrated multi-instance deep learning for whole slide gastric image classification","volume":"58","author":"Wang","year":"2019","journal-title":"Med. Image Anal"},{"key":"2023020108473633400_btab714-B20","doi-asserted-by":"crossref","first-page":"D1074","DOI":"10.1093\/nar\/gkx1037","article-title":"DrugBank 5.0: a major update to the DrugBank database for 2018","volume":"46","author":"Wishart","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2023020108473633400_btab714-B21","doi-asserted-by":"crossref","first-page":"8749","DOI":"10.1021\/acs.jmedchem.9b00959","article-title":"Pushing the boundaries of molecular representation for drug discovery with the graph attention mechanism","volume":"63","author":"Xiong","year":"2020","journal-title":"J. Med. Chem"},{"key":"2023020108473633400_btab714-B22","doi-asserted-by":"crossref","first-page":"19282","DOI":"10.1002\/anie.202008528","article-title":"Holistic prediction of pKa in diverse solvents based on machine-learning approach","volume":"59","author":"Yang","year":"2020","journal-title":"Angew. Chem. Int. Ed"},{"key":"2023020108473633400_btab714-B23","doi-asserted-by":"crossref","first-page":"1466","DOI":"10.1002\/jcc.21707","article-title":"PaDEL-descriptor: an open source software to calculate molecular descriptors and fingerprints","volume":"32","author":"Yap","year":"2011","journal-title":"J. Comput. Chem"},{"key":"2023020108473633400_btab714-B24","doi-asserted-by":"crossref","first-page":"2981","DOI":"10.1093\/bioinformatics\/btab195","article-title":"FraGAT: a fragment-oriented multi-scale graph attention model for molecular property prediction","volume":"37","author":"Zhang","year":"2021","journal-title":"Bioinformatics"},{"key":"2023020108473633400_btab714-B25","first-page":"318","author":"Zhou","year":"2017"},{"key":"2023020108473633400_btab714-B26","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1093\/nsr\/nwx106","article-title":"A brief introduction to weakly supervised learning","volume":"5","author":"Zhou","year":"2018","journal-title":"Natl. Sci. Rev"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab714\/41089809\/btab714.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/3\/792\/49007282\/btab714.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/3\/792\/49007282\/btab714.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T20:04:08Z","timestamp":1675281848000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/38\/3\/792\/6395352"}},"subtitle":[],"editor":[{"given":"Zhiyong","family":"Lu","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,10,13]]},"references-count":26,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,1,12]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab714","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,2,1]]},"published":{"date-parts":[[2021,10,13]]}}}