{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,28]],"date-time":"2026-02-28T09:52:52Z","timestamp":1772272372042,"version":"3.50.1"},"reference-count":48,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2022,8,26]],"date-time":"2022-08-26T00:00:00Z","timestamp":1661472000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61872256"],"award-info":[{"award-number":["61872256"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004377","name":"Hong Kong Polytechnic University","doi-asserted-by":"publisher","award":["P0036200"],"award-info":[{"award-number":["P0036200"]}],"id":[{"id":"10.13039\/501100004377","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,9,20]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Anatomical Therapeutic Chemical (ATC) classification for compounds\/drugs plays an important role in drug development and basic research. However, previous methods depend on interactions extracted from the STITCH dataset, which may make them dependent on lab experiments. We present a pilot study to explore the possibility of conducting the ATC prediction solely based on the molecular structures. The motivation is to eliminate the reliance on the costly lab experiments so that the characteristics of a drug can be pre-assessed for better decision-making and effort-saving before the actual development. To this end, we construct a new benchmark consisting of 4545 compounds, which is larger in scale than the one used in the previous study. 
A light-weight prediction model is proposed. The model offers better explainability in the sense that it consists of a straightforward tokenization that extracts and embeds statistically and physicochemically meaningful tokens, and a deep network backed by a set of pyramid kernels to capture multi-resolution chemical structural characteristics. Its efficacy has been validated in experiments, where it outperforms the state-of-the-art methods by 15.53% in accuracy and by 69.66% in efficiency. We make the benchmark dataset, source code and web server openly available to ease the reproduction of this study.<\/jats:p>","DOI":"10.1093\/bib\/bbac346","type":"journal-article","created":{"date-parts":[[2022,8,26]],"date-time":"2022-08-26T20:38:00Z","timestamp":1661546280000},"source":"Crossref","is-referenced-by-count":9,"title":["Identifying the kind behind SMILES\u2014anatomical therapeutic chemical classification using structure-only representations"],"prefix":"10.1093","volume":"23","author":[{"given":"Yi","family":"Cao","sequence":"first","affiliation":[{"name":"Department of Computer Science, Sichuan University , 610065, Chengdu, China"}]},{"given":"Zhen-Qun","family":"Yang","sequence":"additional","affiliation":[{"name":"Department of Biomedical Engineering, Chinese University of Hong Kong , Street, Shatin, Hong Kong"}]},{"given":"Xu-Lu","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Sichuan University , 610065, Chengdu, China"}]},{"given":"Wenqi","family":"Fan","sequence":"additional","affiliation":[{"name":"Department of Computing, Hong Kong Polytechnic University , Kowloon, Hong Kong"}]},{"given":"Yaowei","family":"Wang","sequence":"additional","affiliation":[{"name":"Peng Cheng Laboratory , 518000, Shenzhen, China"}]},{"given":"Jiajun","family":"Shen","sequence":"additional","affiliation":[{"name":"TCL AI Research Institute , Hong 
Kong"}]},{"given":"Dong-Qing","family":"Wei","sequence":"additional","affiliation":[{"name":"School of Life Sciences and Biotechnology, Shanghai Jiao Tong University , Shanghai, China"}]},{"given":"Qing","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Computing, Hong Kong Polytechnic University , Kowloon, Hong Kong"}]},{"given":"Xiao-Yong","family":"Wei","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Sichuan University , 610065, Chengdu, China"},{"name":"Department of Computing, Hong Kong Polytechnic University , Kowloon, Hong Kong"}]}],"member":"286","published-online":{"date-parts":[[2022,8,26]]},"reference":[{"issue":"suppl_2","key":"2022092013233202000_ref1","doi-asserted-by":"crossref","first-page":"W55","DOI":"10.1093\/nar\/gkn307","article-title":"Superpred: drug classification and target prediction","volume":"36","author":"Dunkel","year":"2008","journal-title":"Nucleic Acids Res"},{"issue":"10","key":"2022092013233202000_ref2","doi-asserted-by":"crossref","first-page":"1317","DOI":"10.1093\/bioinformatics\/btt158","article-title":"Network predicting drug\u2019s anatomical therapeutic chemical code","volume":"29","author":"Wang","year":"2013","journal-title":"Bioinformatics"},{"issue":"W1","key":"2022092013233202000_ref3","doi-asserted-by":"crossref","first-page":"W26","DOI":"10.1093\/nar\/gku477","article-title":"Superpred: update on drug classification and target prediction","volume":"42","author":"Nickel","year":"2014","journal-title":"Nucleic Acids Res"},{"issue":"4","key":"2022092013233202000_ref4","article-title":"Predicting anatomical therapeutic chemical (atc) classification of drugs by integrating chemical-chemical interactions and similarities","volume":"7","author":"Chen","year":"2012","journal-title":"PloS one"},{"issue":"3","key":"2022092013233202000_ref5","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1093\/bioinformatics\/btw644","article-title":"iatc-misf: a multi-label 
classifier for predicting the classes of anatomical therapeutic chemicals","volume":"33","author":"Cheng","year":"2017","journal-title":"Bioinformatics"},{"issue":"35","key":"2022092013233202000_ref6","doi-asserted-by":"crossref","first-page":"58494","DOI":"10.18632\/oncotarget.17028","article-title":"iatc-mhyb: a hybrid multi-label classifier for predicting the classification of anatomical therapeutic chemicals","volume":"8","author":"Cheng","year":"2017","journal-title":"Oncotarget"},{"issue":"18","key":"2022092013233202000_ref7","doi-asserted-by":"crossref","first-page":"2837","DOI":"10.1093\/bioinformatics\/btx278","article-title":"Multi-label classifier based on histogram of gradients for predicting the anatomical therapeutic chemical class\/classes of a given compound","volume":"33","author":"Nanni","year":"2017","journal-title":"Bioinformatics"},{"issue":"6","key":"2022092013233202000_ref8","doi-asserted-by":"crossref","first-page":"2228","DOI":"10.1016\/j.bbadis.2017.12.019","article-title":"Inferring anatomical therapeutic chemical (atc) class of drugs using shortest path and random walk with restart algorithms","volume":"1864","author":"Chen","year":"2018","journal-title":"Biochimica et Biophysica Acta (BBA)-Molecular Basis of Disease"},{"issue":"34","key":"2022092013233202000_ref9","doi-asserted-by":"crossref","first-page":"4007","DOI":"10.2174\/1381612824666181112113438","article-title":"Convolutional neural networks for atc classification","volume":"24","author":"Lumini","year":"2018","journal-title":"Curr Pharm Des"},{"key":"2022092013233202000_ref10","doi-asserted-by":"crossref","first-page":"971","DOI":"10.3389\/fphar.2019.00971","article-title":"ATC-NLSP: prediction of the classes of anatomical therapeutic chemicals using a network-based label space partition method","volume":"10","author":"Wang","year":"2019","journal-title":"Front 
Pharmacol"},{"issue":"5","key":"2022092013233202000_ref11","doi-asserted-by":"crossref","first-page":"1391","DOI":"10.1093\/bioinformatics\/btz757","article-title":"iatc-nrakel: an efficient multi-label classifier for recognizing anatomical therapeutic chemical classes of drugs","volume":"36","author":"Zhou","year":"2020","journal-title":"Bioinformatics"},{"issue":"11","key":"2022092013233202000_ref12","doi-asserted-by":"crossref","first-page":"3568","DOI":"10.1093\/bioinformatics\/btaa166","article-title":"iatc-frakel: a simple multi-label web server for recognizing anatomical therapeutic chemical classes of drugs with their fingerprints only","volume":"36","author":"Zhou","year":"2020","journal-title":"Bioinformatics"},{"issue":"5","key":"2022092013233202000_ref13","doi-asserted-by":"crossref","first-page":"153","DOI":"10.4236\/abb.2020.115012","article-title":"iatc_deep-misf: a multi-label classifier for predicting the classes of anatomical therapeutic chemicals by deep learning","volume":"11","author":"Zhe","year":"2020","journal-title":"Advances in Bioscience and Biotechnology"},{"key":"2022092013233202000_ref14","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1007\/978-981-13-9282-5_12","volume-title":"Smart Intelligent Computing and Applications","author":"Nanni","year":"2020"},{"issue":"18","key":"2022092013233202000_ref15","doi-asserted-by":"crossref","first-page":"2841","DOI":"10.1093\/bioinformatics\/btab204","article-title":"A convolutional neural network and graph convolutional network-based method for predicting the classification of anatomical therapeutic chemicals","volume":"37","author":"Zhao","year":"2021","journal-title":"Bioinformatics"},{"issue":"6","key":"2022092013233202000_ref16","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbab289","article-title":"Deep fusion learning facilitates anatomical therapeutic chemical recognition in drug repurposing and discovery","volume":"22","author":"Wang","year":"2021","journal-title":"Brief 
Bioinform"},{"key":"2022092013233202000_ref17","article-title":"Gated recurrent units and temporal convolutional network for multilabel classification","author":"Nanni","year":"2021"},{"key":"2022092013233202000_ref18","doi-asserted-by":"crossref","DOI":"10.1108\/ACI-11-2021-0301","article-title":"Neural networks for anatomical therapeutic chemical (atc) classification","author":"Nanni","year":"2022","journal-title":"Applied Computing and Informatics"},{"issue":"6","key":"2022092013233202000_ref19","doi-asserted-by":"crossref","first-page":"1092","DOI":"10.1039\/c3mb25555g","article-title":"Some remarks on predicting multi-label attributes in molecular biosystems","volume":"9","author":"Chou","year":"2013","journal-title":"Mol Biosyst"},{"issue":"4","key":"2022092013233202000_ref20","doi-asserted-by":"crossref","first-page":"868","DOI":"10.1039\/c3mb70490d","article-title":"A hybrid method for prediction and repositioning of drug anatomical therapeutic chemical classes","volume":"10","author":"Chen","year":"2014","journal-title":"Mol Biosyst"},{"key":"2022092013233202000_ref21","volume-title":"Computational and mathematical methods in medicine","author":"Zixin","year":"2022"},{"issue":"6","key":"2022092013233202000_ref22","doi-asserted-by":"crossref","first-page":"2529","DOI":"10.1021\/acs.jcim.9b00286","article-title":"Rdchiral: An rdkit wrapper for handling stereochemistry in retrosynthetic template extraction and application","volume":"59","author":"Coley","year":"2019","journal-title":"J Chem Inf Model"},{"issue":"D1","key":"2022092013233202000_ref23","doi-asserted-by":"crossref","first-page":"D380","DOI":"10.1093\/nar\/gkv1277","article-title":"
Stitch 5: augmenting protein\u2013chemical interaction networks with tissue and affinity data","volume":"44","author":"Szklarczyk","year":"2016","journal-title":"Nucleic Acids Res"},{"issue":"2","key":"2022092013233202000_ref24","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1021\/c160017a018","article-title":"The generation of a unique machine description for chemical structures-a technique developed at chemical abstracts service","volume":"5","author":"Morgan","year":"1965","journal-title":"J Chem Doc"},{"issue":"6","key":"2022092013233202000_ref25","doi-asserted-by":"crossref","first-page":"1273","DOI":"10.1021\/ci010132r","article-title":"Reoptimization of mdl keys for use in drug discovery","volume":"42","author":"Durant","year":"2002","journal-title":"J Chem Inf Comput Sci"},{"issue":"1","key":"2022092013233202000_ref26","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1021\/ci00057a005","article-title":"Smiles, a chemical language and information system. 1. introduction to methodology and encoding rules","volume":"28","author":"Weininger","year":"1988","journal-title":"J Chem Inf Comput Sci"},{"issue":"1","key":"2022092013233202000_ref27","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1016\/j.jtbi.2010.12.024","article-title":"Some remarks on protein attribute prediction and pseudo amino acid composition","volume":"273","author":"Chou","year":"2011","journal-title":"J Theor Biol"},{"issue":"D1","key":"2022092013233202000_ref28","doi-asserted-by":"crossref","first-page":"D545","DOI":"10.1093\/nar\/gkaa970","article-title":"Kegg: integrating viruses and cellular organisms","volume":"49","author":"Kanehisa","year":"2021","journal-title":"Nucleic Acids Res"},{"issue":"11","key":"2022092013233202000_ref29","doi-asserted-by":"crossref","first-page":"1422","DOI":"10.1093\/bioinformatics\/btp163","article-title":"Biopython: freely available python tools for computational molecular biology and 
bioinformatics","volume":"25","author":"Cock","year":"2009","journal-title":"Bioinformatics"},{"issue":"D1","key":"2022092013233202000_ref30","doi-asserted-by":"crossref","first-page":"D1388","DOI":"10.1093\/nar\/gkaa971","article-title":"Pubchem in 2021: new data content and improved web interfaces","volume":"49","author":"Kim","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2022092013233202000_ref31","article-title":"Smiles2vec: An interpretable general-purpose deep neural network for predicting chemical properties","author":"Goh","year":"2017"},{"key":"2022092013233202000_ref32","doi-asserted-by":"crossref","first-page":"895","DOI":"10.3389\/fchem.2019.00895","article-title":"Spvec: a word2vec-inspired feature representation method for drug-target interaction prediction","volume":"7","author":"Zhang","year":"2020","journal-title":"Front Chem"},{"issue":"11","key":"2022092013233202000_ref33","doi-asserted-by":"crossref","first-page":"1022","DOI":"10.1145\/182.358466","article-title":"Extended boolean information retrieval","volume":"26","author":"Salton","year":"1983","journal-title":"Communications of the ACM"},{"issue":"5","key":"2022092013233202000_ref34","doi-asserted-by":"crossref","first-page":"513","DOI":"10.1016\/0306-4573(88)90021-0","article-title":"Term-weighting approaches in automatic text retrieval","volume":"24","author":"Salton","year":"1988","journal-title":"Inf Process Manag"},{"key":"2022092013233202000_ref35","volume-title":"Department of Computer Science","author":"Ramos","year":"2000"},{"key":"2022092013233202000_ref36","article-title":"Distributed representations of words and phrases and their compositionality","volume":"26","author":"Mikolov","year":"2013","journal-title":"Advances in neural information processing systems"},{"issue":"2","key":"2022092013233202000_ref37","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1006\/csla.1993.1007","article-title":"The sphinx-ii speech recognition system: an 
overview","volume":"7","author":"Huang","year":"1993","journal-title":"Computer Speech & Language"},{"key":"2022092013233202000_ref38","volume-title":"Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Kim","year":"2014"},{"issue":"2","key":"2022092013233202000_ref39","doi-asserted-by":"crossref","first-page":"914","DOI":"10.1021\/acs.jcim.8b00803","article-title":"Identifying structure\u2013property relationships through smiles syntax analysis with self-attention mechanism","volume":"59","author":"Zheng","year":"2019","journal-title":"J Chem Inf Model"},{"issue":"1","key":"2022092013233202000_ref40","first-page":"1","article-title":"Randomized smiles strings improve the quality of molecular generative models","volume":"11","author":"Ar\u00fas-Pous","year":"2019","journal-title":"J Chem"},{"issue":"3","key":"2022092013233202000_ref41","article-title":"Advances and challenges in deep generative models for de novo molecule generation","volume":"9","author":"Xue","year":"2019","journal-title":"Wiley Interdisciplinary Reviews: Computational Molecular Science"},{"issue":"6","key":"2022092013233202000_ref42","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbab327","article-title":"Learning to smiles: Ban-based strategies to improve latent representation learning from molecules","volume":"22","author":"Wu","year":"2021","journal-title":"Brief Bioinform"},{"key":"2022092013233202000_ref43","article-title":"Smiles transformer: Pre-trained molecular fingerprint for low data drug discovery","author":"Honda","year":"2019"},{"issue":"9","key":"2022092013233202000_ref44","doi-asserted-by":"crossref","first-page":"1572","DOI":"10.1021\/acscentsci.9b00576","article-title":"Molecular transformer: a model for uncertainty-calibrated chemical reaction prediction","volume":"5","author":"Schwaller","year":"2019","journal-title":"ACS central 
science"},{"key":"2022092013233202000_ref45","doi-asserted-by":"crossref","first-page":"429","DOI":"10.1145\/3307339.3342186","volume-title":"Proceedings of the 10th ACM international conference on bioinformatics, computational biology and health informatics","author":"Wang","year":"2019"},{"issue":"81","key":"2022092013233202000_ref46","doi-asserted-by":"crossref","first-page":"12152","DOI":"10.1039\/C9CC05122H","article-title":"Molecular transformer unifies reaction prediction and retrosynthesis across pharma chemical space","volume":"55","author":"Yang","year":"2019","journal-title":"Chem Commun"},{"issue":"4","key":"2022092013233202000_ref47","doi-asserted-by":"crossref","first-page":"275","DOI":"10.3109\/10409239509083488","article-title":"Prediction of protein structural classes","volume":"30","author":"Chou","year":"1995","journal-title":"Crit Rev Biochem Mol Biol"},{"issue":"1","key":"2022092013233202000_ref48","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1016\/j.jtbi.2010.12.024","article-title":"Some remarks on protein attribute prediction and pseudo amino acid composition","volume":"273","author":"Chou","year":"2011","journal-title":"J Theor Biol"}],"container-title":["Briefings in 
Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/5\/bbac346\/45939608\/bbac346.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/5\/bbac346\/45939608\/bbac346.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,26]],"date-time":"2023-11-26T04:15:20Z","timestamp":1700972120000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbac346\/6677124"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,26]]},"references-count":48,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2022,9,20]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbac346","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,9]]},"published":{"date-parts":[[2022,8,26]]},"article-number":"bbac346"}}