{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,2]],"date-time":"2025-08-02T16:25:09Z","timestamp":1754151909266,"version":"3.41.2"},"reference-count":51,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2025,7,21]],"date-time":"2025-07-21T00:00:00Z","timestamp":1753056000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Bioinform."],"abstract":"<jats:sec><jats:title>Introduction<\/jats:title><jats:p>Diabetes Mellitus (DM) constitutes a global epidemic and is one of the top ten leading causes of mortality (WHO, 2019), projected to rank seventh by 2030. The US National Diabetes Statistics Report (2021) states that 38.4 million Americans have diabetes. Dipeptidyl Peptidase-4 (DPP-4) is an FDA-approved target for the treatment of type 2 diabetes mellitus (T2DM). However, current DPP-4 inhibitors may cause adverse effects, including gastrointestinal issues, severe joint pain (FDA safety warning), nasopharyngitis, hypersensitivity, and nausea. Moreover, the development of novel drugs and the <jats:italic>in vivo<\/jats:italic> assessment of DPP-4 inhibition are both costly and often impractical. These challenges highlight the urgent need for efficient <jats:italic>in-silico<\/jats:italic> approaches to facilitate the discovery and optimization of safer and more effective DPP-4 inhibitors.<\/jats:p><\/jats:sec><jats:sec><jats:title>Methodology<\/jats:title><jats:p>Quantitative Structure-Activity Relationship (QSAR) modeling is a widely used computational approach for evaluating the properties of chemical substances. 
In this study, we employed a Neuro-symbolic (NeSy) approach, specifically the Logic Tensor Network (LTN), to develop a DPP-4 QSAR model capable of identifying potential small-molecule inhibitors and predicting bioactivity classification. For comparison, we also implemented baseline models using Deep Neural Networks (DNNs) and Transformers. A total of 6,563 bioactivity records (SMILES-based compounds with IC<jats:sub>50<\/jats:sub> values) were collected from ChEMBL, PubChem, BindingDB, and GTP. Feature sets used for model training included descriptors (CDK Extended\u2013PaDEL), fingerprints (Morgan), chemical language model embeddings (ChemBERTa-2), LLaMa 3.2 embedding features, and physicochemical properties.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Among all tested configurations, the Neuro-symbolic QSAR model (NeSyDPP-4) performed best using a combination of CDK extended and Morgan fingerprints. The model achieved an accuracy of 0.9725, an F1-score of 0.9723, an ROC AUC of 0.9719, and a Matthews correlation coefficient (MCC) of 0.9446. These results outperformed the baseline DNN and Transformer models, as well as existing state-of-the-art (SOTA) methods. To further validate the robustness of the model, we conducted an external evaluation using the Drug Target Common (DTC) dataset, where NeSyDPP-4 also demonstrated strong performance, with an accuracy of 0.9579, an AUC-ROC of 0.9565, a Matthews Correlation Coefficient (MCC) of 0.9171, and an F1-score of 0.9577.<\/jats:p><\/jats:sec><jats:sec><jats:title>Discussion<\/jats:title><jats:p>These findings suggest that the NeSyDPP-4 model not only delivered high predictive performance but also demonstrated generalizability to external datasets. 
This approach presents a cost-effective and reliable alternative to traditional <jats:italic>in vivo<\/jats:italic> screening, offering valuable support for the identification and classification of biologically active DPP-4 inhibitors in the treatment of type 2 diabetes mellitus (T2DM).<\/jats:p><\/jats:sec>","DOI":"10.3389\/fbinf.2025.1603133","type":"journal-article","created":{"date-parts":[[2025,7,21]],"date-time":"2025-07-21T08:59:58Z","timestamp":1753088398000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["NeSyDPP-4: discovering DPP-4 inhibitors for diabetes treatment with a neuro-symbolic AI approach"],"prefix":"10.3389","volume":"5","author":[{"given":"Delower","family":"Hossain","sequence":"first","affiliation":[]},{"given":"Ehsan","family":"Saghapour","sequence":"additional","affiliation":[]},{"given":"Jake Y.","family":"Chen","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2025,7,21]]},"reference":[{"key":"B52","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2209.01712","article-title":"ChemBERTa-2: towards chemical foundation models","author":"Ahmad","year":"2022","journal-title":"arXiv"},{"key":"B1","doi-asserted-by":"publisher","first-page":"14","DOI":"10.1007\/s42452-021-04880-2","article-title":"Screening of potential antidiabetic phytochemicals from Gongronema latifolium leaf against therapeutic targets of type 2 diabetes mellitus: multi-targets drug design","volume":"4","author":"Ajiboye","year":"2021","journal-title":"SN Appl. Sci."},{"key":"B2","doi-asserted-by":"publisher","first-page":"279","DOI":"10.48550\/arxiv.2006.11524","article-title":"Neuro-symbolic visual reasoning: disentangling \u201cvisual\u201d from \u201creasoning\u201d","volume":"1","author":"Amizadeh","year":"2020","journal-title":"Int. Conf. Mach. 
Learn."},{"key":"B3","doi-asserted-by":"publisher","first-page":"4902","DOI":"10.1609\/aaai.v35i6.16623","article-title":"Conversational Neuro-Symbolic commonsense reasoning","volume":"35","author":"Arabshahi","year":"2021","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"B4","doi-asserted-by":"publisher","first-page":"103649","DOI":"10.1016\/j.artint.2021.103649","article-title":"Logic tensor networks","volume":"303","author":"Badreddine","year":"2021","journal-title":"Artif. Intell."},{"key":"B5","doi-asserted-by":"publisher","first-page":"74","DOI":"10.1186\/s40537-021-00465-3","article-title":"Artificial intelligence paradigm for ligand-based virtual screening on the drug discovery of type 2 diabetes mellitus","volume":"8","author":"Bustamam","year":"2021","journal-title":"J. Big Data"},{"key":"B6","doi-asserted-by":"publisher","first-page":"393","DOI":"10.1007\/s10822-017-0009-6","article-title":"Predicting DPP-IV inhibitors with machine learning approaches","volume":"31","author":"Cai","year":"2017","journal-title":"J. Computer-Aided Mol. 
Des."},{"journal-title":"Diabetes","article-title":"Methods for the national diabetes statistics report","year":"2024","key":"B7"},{"key":"B8","first-page":"471","article-title":"Fuzzy symbolic dynamics for neurodynamical systems","volume-title":"Lecture notes in computer science","author":"Dobosz","year":"2008"},{"key":"B53","doi-asserted-by":"publisher","DOI":"10.48550\/arxiv.2407.21783","article-title":"The Llama 3 Herd of Models","author":"Ettaleb","year":"2024","journal-title":"Arxiv"},{"key":"B10","unstructured":"FDA approved dipeptidyl peptidase IV (DPP IV) inhibitors\n          \n          \n          2023"},{"key":"B11","doi-asserted-by":"publisher","first-page":"D1100","DOI":"10.1093\/nar\/gkr777","article-title":"ChEMBL: a large-scale bioactivity database for drug discovery","volume":"40","author":"Gaulton","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"B12","doi-asserted-by":"publisher","first-page":"D1045","DOI":"10.1093\/nar\/gkv1072","article-title":"BindingDB in 2015: a public database for medicinal chemistry, computational chemistry and systems pharmacology","volume":"44","author":"Gilson","year":"2015","journal-title":"Nucleic Acids Res."},{"key":"B13","doi-asserted-by":"publisher","first-page":"1832","DOI":"10.1109\/jbhi.2020.3022806","article-title":"MulTiPREDGO: deep Multi-Modal Protein function prediction by amalgamating protein structure, sequence, and interaction information","volume":"25","author":"Giri","year":"2020","journal-title":"IEEE J. Biomed. 
Health Inf."},{"key":"B48","doi-asserted-by":"publisher","first-page":"1375","DOI":"10.1007\/s11030-021-10204-8","article-title":"A novel artificial intelligence protocol to investigate potential leads for diabetes mellitus","volume":"25","author":"Gong","year":"2021","journal-title":"Mol Divers"},{"key":"B14","doi-asserted-by":"publisher","first-page":"101","DOI":"10.1109\/icacsis51025.2020.9263204","article-title":"Predicting the molecular structure relationship and the biological activity of DPP-4 inhibitor using deep neural network with CatBoost method as feature selection","author":"Hamzah","year":"2020","journal-title":"IEEEXplore"},{"article-title":"Neuro-symbolic learning: principles and applications in ophthalmology","year":"2022","author":"Hassan","key":"B15"},{"key":"B16","doi-asserted-by":"publisher","DOI":"10.21203\/rs.2.22282\/v2","article-title":"Virtual screening of DPP-4 inhibitors using QSAR-Based artificial intelligence and molecular docking of HIT compounds to DPP-8 and DPP-9 enzymes","author":"Hermansyah","year":"2020","journal-title":"Res. Square Res. Square"},{"key":"B17","doi-asserted-by":"publisher","first-page":"107597","DOI":"10.1016\/j.compbiolchem.2021.107597","article-title":"Virtual screening of dipeptidyl peptidase-4 inhibitors using quantitative structure\u2013activity relationship-based artificial intelligence and molecular docking of hit compounds","volume":"95","author":"Hermansyah","year":"2021","journal-title":"Comput. Biol. Chem."},{"volume-title":"Identification of DPP-4 inhibitor active compounds using machine learning classification","year":"2023","author":"Hermansyah","key":"B18"},{"key":"B19","doi-asserted-by":"publisher","DOI":"10.1613\/jair.1.11661","article-title":"Ontology reasoning with deep neural networks","volume":"68","author":"Hohenecker","year":"2020","journal-title":"J. Artif. Intell. 
Res."},{"article-title":"A study on neuro-symbolic artificial intelligence: healthcare perspectives","year":"2025","author":"Hossain","key":"B20"},{"key":"B21","first-page":"2025","article-title":"hERG-LTN: a new paradigm","volume-title":"hERG cardiotoxicity assessment using neuro-symbolic and generative AI embedding (MegaMolBART, Llama3. 2, gemini, DeepSeek) approach","author":"Hossain","year":"2025"},{"key":"B22","doi-asserted-by":"publisher","first-page":"68","DOI":"10.1186\/s40360-020-00447-w","article-title":"Adverse event profiles of dipeptidyl peptidase-4 inhibitors: data mining of the public version of the FDA adverse event reporting system","volume":"21","author":"Huang","year":"2020","journal-title":"BMC Pharmacol. Toxicol."},{"key":"B23","unstructured":"The RDKit 2024.09.6 documentation\n          \n          \n          2024"},{"key":"B24","first-page":"104","volume-title":"Explainable diabetic Retinopathy classification based on neural-symbolic learning","author":"Jang","year":"2021"},{"key":"B25","doi-asserted-by":"publisher","first-page":"100257","DOI":"10.1016\/j.imu.2019.100257","article-title":"Detection of Cardiac arrhythmia using fuzzy logic","volume":"17","author":"Kora","year":"2019","journal-title":"Inf. Med. Unlocked"},{"key":"B26","first-page":"49","article-title":"Neuro-Symbolic neurodegenerative disease modeling as probabilistic programmed deep kernels","volume-title":"Studies in computational intelligence","author":"Lavin","year":"2022"},{"key":"B27","first-page":"249","article-title":"Refining algorithms with knowledge-based neural networks: improving the Chou-Fasman algorithm for protein folding","author":"Maclin","year":"1994"},{"key":"B28","doi-asserted-by":"publisher","DOI":"10.48550\/arxiv.1904.12584","article-title":"The Neuro-Symbolic Concept Learner: interpreting scenes, words, and sentences from natural supervision","author":"Mao","year":"2019","journal-title":"Int. Conf. Learn. 
Represent"},{"key":"B29","doi-asserted-by":"publisher","first-page":"100720","DOI":"10.1016\/j.imu.2021.100720","article-title":"Elucidating the interactions of compounds identified from Aframomum melegueta seeds as promising candidates for the management of diabetes mellitus: a computational approach","volume":"26","author":"Ojo","year":"2021","journal-title":"Inf. Med. Unlocked"},{"key":"B31","doi-asserted-by":"publisher","first-page":"1666","DOI":"10.1897\/01-171","article-title":"Quantitative structure\u2010activity relationship methods: perspectives on drug discovery and toxicology","volume":"22","author":"Perkins","year":"2003","journal-title":"Environ. Toxicol. Chem."},{"key":"B32","doi-asserted-by":"publisher","DOI":"10.48550\/arxiv.2006.13155","article-title":"Logical neural networks","author":"Riegel","year":"2020","journal-title":"arXiv Cornell Univ."},{"key":"B33","doi-asserted-by":"publisher","first-page":"974","DOI":"10.29207\/resti.v6i6.4470","article-title":"DPP IV inhibitors activities prediction as an anti-diabetic agent using particle swarm optimization-support vector machine method","volume":"6","author":"Septiawan","year":"2022","journal-title":"J. RESTI Rekayasa Sist. Dan. Teknol. Inf."},{"key":"B34","first-page":"8368","article-title":"Explainable and explicit visual reasoning over scene graphs","author":"Shi","year":"2019"},{"key":"B35","doi-asserted-by":"publisher","first-page":"9448","DOI":"10.48550\/arxiv.1911.06962","article-title":"Inductive relation prediction by subgraph reasoning","volume":"1","author":"Teru","year":"2020","journal-title":"Int. Conf. Mach. Learn."},{"key":"B36","doi-asserted-by":"publisher","first-page":"977","DOI":"10.5555\/2986916.2987036","article-title":"Interpretation of artificial neural networks: mapping knowledge-based neural networks into rules","volume":"4","author":"Towell","year":"1991","journal-title":"Neural Inf. Process. 
Syst."},{"key":"B37","doi-asserted-by":"publisher","first-page":"119","DOI":"10.1016\/0004-3702(94)90105-8","article-title":"Knowledge-based artificial neural networks","volume":"70","author":"Towell","year":"1994","journal-title":"Artif. Intell."},{"key":"B38","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/aims52415.2021.9466083","article-title":"Model QSAR classification using Conv1D-LSTM of dipeptidyl peptidase-4 inhibitors","author":"Ulfa","year":"2021","journal-title":"IEEExplore"},{"key":"B39","doi-asserted-by":"publisher","first-page":"btae057","DOI":"10.1093\/bioinformatics\/btae057","article-title":"StructuralDPPIV: a novel deep learning model based on atom structure for predicting dipeptidyl peptidase-IV inhibitory peptides","volume":"40","author":"Wang","year":"2024","journal-title":"Bioinformatics"},{"key":"B40","doi-asserted-by":"publisher","first-page":"6046","DOI":"10.3390\/ijms20236046","article-title":"DEEPMIR2GO: inferring functions of human MicroRNAs using a deep Multi-Label Classification model","volume":"20","author":"Wang","year":"2019","journal-title":"Int. J. Mol. Sci."},{"key":"B41","doi-asserted-by":"publisher","first-page":"878","DOI":"10.1109\/tpami.2024.3483273","article-title":"Towards data-and knowledge-driven AI: a survey on neuro-symbolic computing","volume":"47","author":"Wang","year":"2024","journal-title":"IEEE Trans. Pattern Analysis Mach. 
Intell."},{"key":"B42","unstructured":"Wikipedia DPP-4 inhibitors\n          \n          \n          2018"},{"key":"B43","unstructured":"World health statistics 2020: monitoring health for the SDGs, sustainable development goals\n          \n          \n          2020"},{"key":"B44","unstructured":"The top 10 causes of death\n          \n          \n          2024"},{"volume-title":"A semantic loss function for deep learning with symbolic knowledge","year":"2018","author":"Xu","key":"B45"},{"volume-title":"Differentiable learning of logical rules for knowledge base reasoning","year":"2017","author":"Yang","key":"B46"},{"key":"B47","first-page":"1755","article-title":"NeurASP: embracing neural networks into answer set programming","author":"Yang","year":"2020"},{"key":"B51","doi-asserted-by":"publisher","first-page":"1466","DOI":"10.1002\/jcc.21707","article-title":"PaDEL\u2010descriptor: An open source software to calculate molecular descriptors and fingerprints","volume":"32","author":"Yap","year":"2011","journal-title":"J. Comput. 
Chem."},{"key":"B49","first-page":"1031","article-title":"Neural-symbolic VQA: disentangling reasoning from vision and language understanding","volume":"31","author":"Yi","year":"2018","journal-title":"arXiv Cornell Univ."},{"key":"B50","doi-asserted-by":"publisher","first-page":"105","DOI":"10.1016\/j.neunet.2023.06.028","article-title":"A survey on neural-symbolic learning systems","volume":"166","author":"Yu","year":"2023","journal-title":"Neural Netw."}],"container-title":["Frontiers in Bioinformatics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2025.1603133\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,21]],"date-time":"2025-07-21T09:00:05Z","timestamp":1753088405000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2025.1603133\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,21]]},"references-count":51,"alternative-id":["10.3389\/fbinf.2025.1603133"],"URL":"https:\/\/doi.org\/10.3389\/fbinf.2025.1603133","relation":{},"ISSN":["2673-7647"],"issn-type":[{"type":"electronic","value":"2673-7647"}],"subject":[],"published":{"date-parts":[[2025,7,21]]},"article-number":"1603133"}}