{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,24]],"date-time":"2026-02-24T00:59:55Z","timestamp":1771894795055,"version":"3.50.1"},"reference-count":45,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2025,6,10]],"date-time":"2025-06-10T00:00:00Z","timestamp":1749513600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100018227","name":"National Research Foundation of Ukraine","doi-asserted-by":"publisher","award":["2023.04\/0094"],"award-info":[{"award-number":["2023.04\/0094"]}],"id":[{"id":"10.13039\/100018227","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Academy of Sciences of Ukraine","award":["2023.04\/0094"],"award-info":[{"award-number":["2023.04\/0094"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computation"],"abstract":"<jats:p>This study explores the potential of unsupervised machine learning algorithms to identify latent cardiac risk profiles by analyzing ECG-derived parameters from two general groups: clinically healthy individuals (Norm dataset, n = 14,863) and patients hospitalized with heart failure (patients\u2019 dataset, n = 8220). Each dataset includes 153 ECG and heart rate variability (HRV) features, including both conventional and novel diagnostic parameters obtained using a Universal Scoring System. The study aims to apply unsupervised clustering algorithms to ECG data to detect latent risk profiles related to heart failure, based on distinctive ECG features. The focus is on identifying patterns that correlate with cardiac health risks, potentially aiding in early detection and personalized care. We applied a combination of Uniform Manifold Approximation and Projection (UMAP) for dimensionality reduction and Hierarchical Density-Based Spatial Clustering (HDBSCAN) for unsupervised clustering. 
Models trained on one dataset were applied to the other to explore structural differences and detect latent predispositions to cardiac disorders. Both Euclidean and Manhattan distance metrics were evaluated. Features such as the QRS angle in the frontal plane, Detrended Fluctuation Analysis (DFA), High-Frequency power (HF), and others were analyzed for their ability to distinguish different patient clusters. In the Norm dataset, Euclidean distance clustering identified two main clusters, with Cluster 0 indicating a lower risk of heart failure. Key discriminative features included the \u201cALPHA QRS ANGLE IN THE FRONTAL PLANE\u201d and DFA. In the patients\u2019 dataset, three clusters emerged, with Cluster 1 identified as potentially high-risk. Manhattan distance clustering provided additional insights, highlighting features like \u201cST DISLOCATION\u201d and \u201cT AMP NORMALIZED\u201d as significant for distinguishing between clusters. The analysis revealed distinct clusters that correspond to varying levels of heart failure risk. In the Norm dataset, two main clusters were identified, with one associated with a lower risk profile. In the patients\u2019 dataset, a three-cluster structure emerged, with one subgroup displaying markedly elevated risk indicators such as high-frequency power (HF) and altered QRS angle values. Cross-dataset clustering confirmed consistent feature shifts between groups. These findings demonstrate the feasibility of ECG-based unsupervised clustering for early risk stratification. The results offer a non-invasive tool for personalized cardiac monitoring and merit further clinical validation. These findings emphasize the potential for clustering techniques to contribute to early heart failure detection and personalized monitoring. 
Future research should aim to validate these results in other populations and integrate these methods into clinical decision-making frameworks.<\/jats:p>","DOI":"10.3390\/computation13060144","type":"journal-article","created":{"date-parts":[[2025,6,10]],"date-time":"2025-06-10T12:53:16Z","timestamp":1749559996000},"page":"144","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Scalable Clustering of Complex ECG Health Data: Big Data Clustering Analysis with UMAP and HDBSCAN"],"prefix":"10.3390","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6940-579X","authenticated-orcid":false,"given":"Vladislav","family":"Kaverinskiy","sequence":"first","affiliation":[{"name":"Glushkov Institute of Cybernetics of the National Academy of Sciences of Ukraine, 03187 Kyiv, Ukraine"}]},{"given":"Illya","family":"Chaikovsky","sequence":"additional","affiliation":[{"name":"Glushkov Institute of Cybernetics of the National Academy of Sciences of Ukraine, 03187 Kyiv, Ukraine"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5448-4045","authenticated-orcid":false,"given":"Anton","family":"Mnevets","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering, Igor Sikorsky Kyiv Polytechnic Institute, 03056 Kyiv, Ukraine"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1935-774X","authenticated-orcid":false,"given":"Tatiana","family":"Ryzhenko","sequence":"additional","affiliation":[{"name":"Glushkov Institute of Cybernetics of the National Academy of Sciences of Ukraine, 03187 Kyiv, Ukraine"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9198-3855","authenticated-orcid":false,"given":"Mykhailo","family":"Bocharov","sequence":"additional","affiliation":[{"name":"Department of Moral and Psychological Support of the Activity of the Troops (Forces), National Defense University of Ukraine Named After Ivan Cherniakhovskyi, 03049 Kyiv, 
Ukraine"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3223-9844","authenticated-orcid":false,"given":"Kyrylo","family":"Malakhov","sequence":"additional","affiliation":[{"name":"Glushkov Institute of Cybernetics of the National Academy of Sciences of Ukraine, 03187 Kyiv, Ukraine"}]}],"member":"1968","published-online":{"date-parts":[[2025,6,10]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1080\/17434440.2020.1754795","article-title":"Electrocardiogram scoring beyond the routine analysis: Subtle changes matters","volume":"17","author":"Chaikovsky","year":"2020","journal-title":"Expert Rev. Med. Devices"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"105753","DOI":"10.1016\/j.ijmedinf.2024.105753","article-title":"A Systematic Review on the Impact of Artificial Intelligence on Electrocardiograms in Cardiology","volume":"195","author":"Oke","year":"2025","journal-title":"Int. J. Med. Inform."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"105742","DOI":"10.1016\/j.ijmedinf.2024.105742","article-title":"Universal Representations in Cardiovascular ECG Assessment: A Self-Supervised Learning Approach","volume":"195","author":"Liu","year":"2025","journal-title":"Int. J. Med. Inform."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Mondal, A., Manikandan, M.S., and Pachori, R.B. (2025). Automatic ECG Signal Quality Assessment Using Convolutional Neural Networks and Derivative ECG Signal for False Alarm Reduction in Wearable Vital Signs Monitoring Devices. Biomed. Signal Process. Control, 108.","DOI":"10.1016\/j.bspc.2025.107876"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"101375","DOI":"10.1016\/j.imu.2023.101375","article-title":"Simple, Efficient, and Generalized ECG Signal Quality Assessment Method for Telemedicine Applications","volume":"42","author":"Kuetche","year":"2023","journal-title":"Inform. Med. 
Unlocked"},{"key":"ref_6","unstructured":"Chaikovsky, I., Starynska, G., and Budnyk, M. (2020). Method of ECG Evaluating Based on Universal Scoring System. (US10512412B2)."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"164","DOI":"10.1109\/TBME.2012.2212278","article-title":"Machine Learning-Based Method for Personalized and Cost-Effective Detection of Alzheimer\u2019s Disease","volume":"60","author":"Escudero","year":"2013","journal-title":"IEEE Trans. Biomed. Eng."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1180","DOI":"10.1212\/WNL.0000000000003734","article-title":"Polygenic Risk Scores in Familial Alzheimer Disease","volume":"88","author":"Tosto","year":"2017","journal-title":"Neurology"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1959","DOI":"10.1093\/brain\/awx118","article-title":"Clinical Criteria for Subtyping Parkinson\u2019s Disease: Biomarkers and Longitudinal Progression","volume":"140","author":"Fereshtehnejad","year":"2017","journal-title":"Brain"},{"key":"ref_10","first-page":"4","article-title":"Data-Driven Stratification of Parkinson\u2019s Disease Patients Based on the Progression of Motor and Cognitive Disease Markers","volume":"17","author":"Krasniqi","year":"2021","journal-title":"Ger. Med. Sci."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"2291","DOI":"10.1016\/j.jacc.2008.02.068","article-title":"Long QT Syndrome","volume":"51","author":"Goldenberg","year":"2008","journal-title":"J. Am. Coll. Cardiol."},{"key":"ref_12","unstructured":"Chhabra, L., Goyal, A., and Benham, M.D. (2023, August 07). Wolff-Parkinson-White Syndrome. StatPearls, 7, Available online: https:\/\/pubmed.ncbi.nlm.nih.gov\/32119324\/."},{"key":"ref_13","first-page":"2525","article-title":"Task Force for the Redefinition of Myocardial Infarction","volume":"28","author":"Thygesen","year":"2007","journal-title":"Eur. 
Heart J."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"645","DOI":"10.1109\/TNN.2005.845141","article-title":"Survey of Clustering Algorithms","volume":"16","author":"Xu","year":"2005","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1038\/s41591-018-0316-z","article-title":"A Guide to Deep Learning in Healthcare","volume":"25","author":"Esteva","year":"2019","journal-title":"Nat. Med."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Murali, L., Gopakumar, G., and Viswanathan, D.M. (2023). Towards Electronic Health Record-Based Medical Knowledge Graph Construction, Completion, and Applications: A Literature Study. J. Biomed. Inform., 143.","DOI":"10.1016\/j.jbi.2023.104403"},{"key":"ref_17","unstructured":"McInnes, L., Healy, J., and Melville, J. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv, 2020."},{"key":"ref_18","unstructured":"Ester, M., Kriegel, H.-P., Sander, J., and Xu, X. (1996, January 2\u20134). A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD\u201996), Portland, OR, USA. Available online: https:\/\/file.biolab.si\/papers\/1996-DBSCAN-KDD.pdf."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2733381","article-title":"Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection","volume":"10","author":"Campello","year":"2013","journal-title":"ACM Trans. Knowl. Discov. Data"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1017\/S1431927621013696","article-title":"Strategies for EELS Data Analysis: Introducing UMAP and HDBSCAN for Dimensionality Reduction and Clustering","volume":"28","year":"2022","journal-title":"Microsc. 
Microanal."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"94","DOI":"10.7250\/csimq.2024-40.04","article-title":"Machine Learning Analysis of Arterial Oscillograms for Depression Level Diagnosis in Cardiovascular Health","volume":"40","author":"Kaverinsky","year":"2024","journal-title":"Complex Syst. Inform. Model. Q."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1016\/j.cjco.2024.10.012","article-title":"Comparing ECG Lead Subsets for Heart Arrhythmia\/ECG Pattern Classification: Convolutional Neural Networks and Random Forest","volume":"7","author":"Reznichenko","year":"2025","journal-title":"CJC Open"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Lanerolle, G.D., Roberts, E.S., Haroon, A., and Shetty, A. (2024). Chapter 7\u2014Neuropsychiatry and Mental Health. Quality Assurance Management, Academic Press.","DOI":"10.1016\/B978-0-12-822732-9.00007-2"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"100366","DOI":"10.1016\/j.health.2024.100366","article-title":"An Electrocardiogram Signal Classification Using a Hybrid Machine Learning and Deep Learning Approach","volume":"6","author":"Zabihi","year":"2024","journal-title":"Healthc. Anal."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Cascianelli, S., and Masseroli, M. (2025). Biological and Medical Ontologies: Introduction. Encyclopedia of Bioinformatics and Computational Biology, Elsevier.","DOI":"10.1016\/B978-0-323-95502-7.00061-0"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"e6635","DOI":"10.5195\/ijt.2024.6635","article-title":"Innovative Hybrid Cloud Solutions for Physical Medicine and Telerehabilitation Research","volume":"16","author":"Malakhov","year":"2024","journal-title":"Int. J. 
Telerehabil."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"183","DOI":"10.24061\/2413-4260.XIV.4.54.2024.25","article-title":"Telerehabilitation: Current Opportunities And Problems of Remote Patient Monitoring","volume":"14","author":"Romanchuk","year":"2024","journal-title":"Neonatol. H\u00ecr. Perinat. Med."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Vakulenko, D., and Vakulenko, L. (2024). Information System Telerehabilitation: Needs, Tasks and Way Optimisation with AI. Arterial Oscillography: New Capabilities of the Blood Pressure Monitor with the Oranta-AO Information System, Nova Science Publishers.","DOI":"10.52305\/XFFR7057"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Vladymyrov, O.A., Semykopna, T.V., Vakulenko, D.V., Syvak, O.V., and Budnyk, M.M. (2024). Telerehabilitation Guidelines for Patients with Breast Cancer. Int. J. Telerehabil., 1\u201376.","DOI":"10.5195\/ijt.2024.6640"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"736","DOI":"10.1007\/s10559-024-00711-5","article-title":"Algorithmization and Optimization Models of Patient-Centric Rehabilitation Programs","volume":"60","author":"Vakulenko","year":"2024","journal-title":"Cybern. Syst. Anal."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1007\/s40860-022-00191-4","article-title":"Components of Oranta-AO Software Expert System for Innovative Application of Blood Pressure Monitors","volume":"9","author":"Vakulenko","year":"2023","journal-title":"J. Reliab. Intell. Environ."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"105356","DOI":"10.1016\/j.nsa.2024.105356","article-title":"Indicators of Somatized PTSD in Ukrainian Active Military Personnel Undergoing Rehabilitation after TBI Treatment","volume":"3","author":"Khaustova","year":"2024","journal-title":"Neurosci. 
Appl."},{"key":"ref_33","first-page":"119","article-title":"Natural Language-Driven Dialogue Systems for Support in Physical Medicine and Rehabilitation","volume":"35","author":"Kaverinsky","year":"2023","journal-title":"S. Afr. Comput. J."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Chaikovsky, I., Dziuba, D., Kryvova, O., Marushko, K., Vakulenko, J., Malakhov, K., and Loskutov, O. (2025). Subtle changes on electrocardiogram in severe patients with COVID-19 may be predictors of treatment outcome. Front. Artif. Intell., 8.","DOI":"10.3389\/frai.2025.1561079"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"112461","DOI":"10.1016\/j.optlastec.2025.112461","article-title":"DBSCAN Clustering Model for Parameter Inversion Using Laser Cutting Edge Morphology Characteristic in Zr-4 Alloy","volume":"184","author":"Tu","year":"2025","journal-title":"Opt. Laser Technol."},{"key":"ref_36","first-page":"1","article-title":"NJmat 2.0: User Instructions of Data-Driven Machine Learning Interface for Materials Science","volume":"83","author":"Zhang","year":"2025","journal-title":"Comput. Mater. Contin."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Chebanyuk, O. (2018, January 23\u201324). An Approach of Text to Model Transformation of Software Models. 
Proceedings of the 13th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE 2018), Funchal, Portugal.","DOI":"10.5220\/0006804504320439"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"144","DOI":"10.1007\/978-3-031-44668-9_11","article-title":"Software Reuse Approach Based on Review and Analysis of Reuse Risks from Projects Uploaded to GitHub","volume":"Volume 514","author":"Chebanyuk","year":"2023","journal-title":"Computer Science and Education in Computer Science"},{"key":"ref_39","first-page":"514","article-title":"Investigation of Drawbacks of the Software Development Artifacts Reuse Approaches Based on Semantic Analysis","volume":"Volume 181","author":"Chebanyuk","year":"2023","journal-title":"Advances in Computer Science for Engineering and Education VI"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"105699","DOI":"10.1016\/j.ijmedinf.2024.105699","article-title":"A Method and Validation for Auditing E-Health Applications Based on Reusable Software Security Requirements Specifications","volume":"194","year":"2025","journal-title":"Int. J. Med. Inform."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"e42822","DOI":"10.2196\/42822","article-title":"A Data Transformation Methodology to Create Findable, Accessible, Interoperable, and Reusable Health Data: Software Design, Development, and Evaluation Study","volume":"25","author":"Sinaci","year":"2023","journal-title":"J. Med. Internet Res."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"618","DOI":"10.1007\/BF02366417","article-title":"Processor Structure Design","volume":"31","author":"Kurgaev","year":"1995","journal-title":"Cybern. Syst. Anal."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Opanasenko, V., Palagin, O., and Zavyalov, S. (2019, January 18\u201321). The FPGA-Based Problem-Oriented On-Board Processor. 
Proceedings of the 2019 10th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), Metz, France.","DOI":"10.1109\/IDAACS.2019.8924360"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"675","DOI":"10.1007\/s10559-007-0093-z","article-title":"Reconfigurable-Computing Technology","volume":"43","author":"Palagin","year":"2007","journal-title":"Cybern. Syst. Anal."},{"key":"ref_45","unstructured":"Palagin, O., Petrenko, M., Litvin, A., and Boyko, M. (2024, January 14\u201315). Method of Developing an Ontological System with Automatic Formation of a Knowledge Base and User Queries. Proceedings of the 14th International Scientific and Practical Programming Conference, UkrPROG 2024, Kyiv, Ukraine. Available online: https:\/\/ceur-ws.org\/Vol-3806\/S_2_Palagin.pdf."}],"container-title":["Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2079-3197\/13\/6\/144\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:49:39Z","timestamp":1760032179000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2079-3197\/13\/6\/144"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,10]]},"references-count":45,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2025,6]]}},"alternative-id":["computation13060144"],"URL":"https:\/\/doi.org\/10.3390\/computation13060144","relation":{},"ISSN":["2079-3197"],"issn-type":[{"value":"2079-3197","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,10]]}}}