{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,2]],"date-time":"2025-12-02T22:40:56Z","timestamp":1764715256226,"version":"3.41.2"},"reference-count":30,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2025,2,4]],"date-time":"2025-02-04T00:00:00Z","timestamp":1738627200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Digit. Health"],"abstract":"<jats:sec><jats:title>Background<\/jats:title><jats:p>Chronic Kidney Disease (CKD) is a global health concern and is frequently underdiagnosed due to its subtle initial symptoms, contributing to increasing morbidity and mortality. A comprehensive understanding of CKD comorbidities could lead to the identification of risk-groups, more effective treatment and improved patient outcomes. Our research presents a two-fold objective: developing an effective machine learning (ML) workflow for text classification and entity relation extraction and assembling a broad list of diseases influencing CKD development and progression.<\/jats:p><\/jats:sec><jats:sec><jats:title>Methods<\/jats:title><jats:p>We analysed 39,680 abstracts with CKD in the title from the Embase library. Abstracts about a disease affecting CKD development and\/or progression were selected by multiple ML classifiers trained on a human-labelled sample. The best classifier was further trained with active learning. Disease names in question were extracted from the selected abstracts using a novel entity relation extraction methodology. The resulting disease list and their corresponding abstracts were manually checked and a final disease list was created.<\/jats:p><\/jats:sec><jats:sec><jats:title>Findings<\/jats:title><jats:p>The SVM model gave the best results and was chosen for further training with active learning. 
This optimised ML workflow enabled us to discern 68 comorbidities across 15 ICD-10 disease groups contributing to CKD progression or development. The reading of the ML-selected abstracts showed that some diseases have a direct causal effect on CKD, while others, like schizophrenia, have an indirect causal effect on CKD.<\/jats:p><\/jats:sec><jats:sec><jats:title>Interpretation<\/jats:title><jats:p>These findings have the potential to guide future CKD investigations, by facilitating the inclusion of a broader array of comorbidities in CKD prognostic models. Ultimately, our study enhances understanding of prognostic comorbidities and supports clinical practice by enabling improved patient monitoring, preventive strategies, and early detection for individuals at higher CKD development or progression risk.<\/jats:p><\/jats:sec>","DOI":"10.3389\/fdgth.2025.1495879","type":"journal-article","created":{"date-parts":[[2025,2,4]],"date-time":"2025-02-04T06:35:10Z","timestamp":1738650910000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["A novel machine learning methodology for the systematic extraction of chronic kidney disease comorbidities from abstracts"],"prefix":"10.3389","volume":"7","author":[{"given":"Eszter","family":"S\u00e1ghy","sequence":"first","affiliation":[]},{"given":"Mostafa","family":"Elsharkawy","sequence":"additional","affiliation":[]},{"given":"Frank","family":"Moriarty","sequence":"additional","affiliation":[]},{"given":"S\u00e1ndor","family":"Kov\u00e1cs","sequence":"additional","affiliation":[]},{"given":"Istv\u00e1n","family":"Wittmann","sequence":"additional","affiliation":[]},{"given":"Antal","family":"Zempl\u00e9nyi","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2025,2,4]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"71","DOI":"10.1136\/bmj.312.7023.71","article-title":"Evidence based medicine: what it is 
and what it isn't","volume":"312","author":"Sackett","year":"1996","journal-title":"Br Med J"},{"key":"B2","doi-asserted-by":"publisher","first-page":"2420","DOI":"10.1001\/jama.1992.03490170092032","article-title":"Evidence-based medicine: a new approach to teaching the practice of medicine","volume":"268","author":"Guyatt","year":"1992","journal-title":"JAMA"},{"key":"B3","doi-asserted-by":"publisher","first-page":"e1000326","DOI":"10.1371\/journal.pmed.1000326","article-title":"Seventy-five trials and eleven systematic reviews a day: how will we ever keep up?","volume":"7","author":"Bastian","year":"2010","journal-title":"PLoS Med"},{"key":"B4","doi-asserted-by":"publisher","first-page":"e012545","DOI":"10.1136\/bmjopen-2016-012545","article-title":"Analysis of the time and workers needed to conduct systematic reviews of medical interventions using data from the PROSPERO registry","volume":"7","author":"Borah","year":"2017","journal-title":"BMJ Open"},{"key":"B5","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1186\/s13643-021-01640-6","article-title":"Iterative guided machine learning-assisted systematic literature reviews: a diabetes case study","volume":"10","author":"Zimmerman","year":"2021","journal-title":"Syst Rev"},{"key":"B6","doi-asserted-by":"publisher","first-page":"163","DOI":"10.1186\/s13643-019-1074-9","article-title":"Toward systematic review automation: a practical guide to using machine learning tools in research synthesis","volume":"8","author":"Marshall","year":"2019","journal-title":"Syst Rev"},{"key":"B7","doi-asserted-by":"publisher","first-page":"293","DOI":"10.1186\/s13643-020-01520-5","article-title":"Aligning text mining and machine learning algorithms with best practices for study selection in systematic literature reviews","volume":"9","author":"Popoff","year":"2020","journal-title":"Syst Rev"},{"key":"B8","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/505282.505283","article-title":"Machine learning in 
automated text categorization","volume":"34","author":"Sebastiani","year":"2002","journal-title":"ACM Comput Surv"},{"volume-title":"Information Extraction: Algorithms and Prospects in a Retrieval Context","year":"2006","author":"Moens","key":"B9"},{"key":"B10","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1145\/3445965","article-title":"Named entity recognition and relation extraction: state-of-the-art","volume":"54","author":"Nasar","year":"2021","journal-title":"ACM Comput Surv"},{"key":"B11","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-031-01560-1","volume-title":"Active Learning","author":"Settles","year":"2012"},{"key":"B12","doi-asserted-by":"publisher","first-page":"668","DOI":"10.1016\/j.future.2019.07.013","article-title":"Breast mass detection from the digitized x-ray mammograms based on the combination of deep active learning and self-paced learning","volume":"101","author":"Shen","year":"2019","journal-title":"Future Gener Comput Syst"},{"key":"B13","doi-asserted-by":"publisher","first-page":"860","DOI":"10.1093\/cid\/ciad633","article-title":"Black box warning: large language models and the future of infectious diseases consultation","volume":"78","author":"Schwartz","year":"2024","journal-title":"Clin Infect Dis"},{"key":"B14","doi-asserted-by":"publisher","first-page":"709","DOI":"10.1016\/S0140-6736(20)30045-3","article-title":"Global, regional, and national burden of chronic kidney disease, 1990\u20132017: a systematic analysis for the global burden of disease study 2017","volume":"395","author":"Bikbov","year":"2020","journal-title":"Lancet"},{"key":"B15","doi-asserted-by":"publisher","first-page":"1258","DOI":"10.1038\/ki.2011.368","article-title":"The contribution of chronic kidney disease to the global burden of major noncommunicable diseases","volume":"80","author":"Couser","year":"2011","journal-title":"Kidney 
Int"},{"key":"B16","doi-asserted-by":"publisher","first-page":"iii73","DOI":"10.1093\/ndt\/gfs269","article-title":"Estimating the financial cost of chronic kidney disease to the NHS in England","volume":"27","author":"Kerr","year":"2012","journal-title":"Nephrol Dial Transplant"},{"key":"B17","doi-asserted-by":"publisher","first-page":"193","DOI":"10.1186\/s12882-015-0189-z","article-title":"The burden of comorbidity in people with chronic kidney disease stage 3: a cohort study","volume":"16","author":"Fraser","year":"2015","journal-title":"BMC Nephrol"},{"key":"B18","doi-asserted-by":"publisher","first-page":"493","DOI":"10.3390\/jcm7120493","article-title":"The number of comorbidities predicts renal outcomes in patients with stage 3\u20135 chronic kidney disease","volume":"7","author":"Lee","year":"2018","journal-title":"J Clin Med"},{"key":"B19","doi-asserted-by":"publisher","first-page":"859","DOI":"10.1038\/ki.2015.228","article-title":"Comorbidity as a driver of adverse outcomes in people with chronic kidney disease","volume":"88","author":"Tonelli","year":"2015","journal-title":"Kidney Int"},{"year":"2023","key":"B20","article-title":"The NCBI Disease Corpus"},{"key":"B21","doi-asserted-by":"publisher","first-page":"1892","DOI":"10.1093\/jamia\/ocab090","article-title":"Biomedical and clinical english model packages for the stanza python NLP library","volume":"28","author":"Zhang","year":"2021","journal-title":"J Am Med Inform Assoc"},{"key":"B22","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-319-73531-3","volume-title":"Machine Learning for Text","author":"Aggarwal","year":"2018"},{"key":"B23","doi-asserted-by":"crossref","first-page":"431","DOI":"10.1016\/B978-0-12-809633-8.20474-3","article-title":"Data mining: accuracy and error measures for classification and prediction","volume-title":"Encyclopedia of Bioinformatics and Computational Biology","author":"Galdi","year":"2019"},{"key":"B24","first-page":"2825","article-title":"Scikit-learn: machine 
learning in python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J Mach Learn Res"},{"key":"B25","doi-asserted-by":"publisher","first-page":"e0180446","DOI":"10.1371\/journal.pone.0180446","article-title":"Increased risk of chronic kidney disease in patients with rosacea: a nationwide population-based matched cohort study","volume":"12","author":"Chiu","year":"2017","journal-title":"PLoS One"},{"key":"B26","doi-asserted-by":"publisher","first-page":"e429","DOI":"10.1097\/MD.0000000000000429","article-title":"Nonapnea sleep disorders and incident chronic kidney disease: a population-based retrospective cohort study","volume":"94","author":"Huang","year":"2015","journal-title":"Medicine (United States)"},{"key":"B27","doi-asserted-by":"publisher","first-page":"e019868","DOI":"10.1136\/bmjopen-2017-019868","article-title":"Second-generation antipsychotic medications and risk of chronic kidney disease in schizophrenia: population-based nested case\u2013control study","volume":"8","author":"Wang","year":"2018","journal-title":"BMJ Open"},{"key":"B28","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1017\/S0266462319000825","article-title":"The \u2018top 10\u2019 challenges for health technology assessment: iNAHTA viewpoint","volume":"36","author":"Huang","year":"2020","journal-title":"Int J Technol Assess Health Care"},{"year":"2023","key":"B29","article-title":"GPT-4 is OpenAI's most advanced system, producing safer and more useful responses"},{"key":"B30","doi-asserted-by":"publisher","first-page":"671","DOI":"10.1038\/d41586-023-02366-2","article-title":"ChatGPT is a black box: how AI research can break it open","volume":"619","author":"Editorial","year":"2023","journal-title":"Nature."}],"container-title":["Frontiers in Digital 
Health"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fdgth.2025.1495879\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,4]],"date-time":"2025-02-04T06:35:13Z","timestamp":1738650913000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fdgth.2025.1495879\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,4]]},"references-count":30,"alternative-id":["10.3389\/fdgth.2025.1495879"],"URL":"https:\/\/doi.org\/10.3389\/fdgth.2025.1495879","relation":{},"ISSN":["2673-253X"],"issn-type":[{"type":"electronic","value":"2673-253X"}],"subject":[],"published":{"date-parts":[[2025,2,4]]},"article-number":"1495879"}}