{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,28]],"date-time":"2025-10-28T00:09:15Z","timestamp":1761610155625,"version":"build-2065373602"},"reference-count":29,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T00:00:00Z","timestamp":1761523200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T00:00:00Z","timestamp":1761523200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"VSB-Technical University Ostrava","award":["SP2025\/018"],"award-info":[{"award-number":["SP2025\/018"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Appl Netw Sci"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Visual representation of data mining results is essential for accurate assessment by domain experts without extensive knowledge of complex relationships in multi-label data. Similarity networks are a commonly used tool for analyzing such data, utilizing data clustering and community detection and visualizing the relationships between individual objects in an understandable form. In clinical data mining, a patient similarity network (PSN) can help clinicians identify patient clusters with representative labels and interpret their relationships. This article demonstrates the use of the Matthews correlation coefficient (MCC) to analyze cluster-class relationships to complement the PSN visualization in several synthetic datasets. We then discuss the limitations of MCC for this application and propose a modification in the form of a rescaled MCC (rMCC). Furthermore, we introduce a novel measure, Connection Purity, that complements rMCC in an informative way. We propose an augmented visualization of patient similarity networks utilizing both measures. We demonstrate this approach on several real-world datasets, showing how clinical intuition may be biased and how our method helps to rectify it.<\/jats:p>","DOI":"10.1007\/s41109-025-00741-8","type":"journal-article","created":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T11:53:53Z","timestamp":1761566033000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Augmented visualization of class\u2013cluster match in patient similarity networks"],"prefix":"10.1007","volume":"10","author":[{"given":"Ondrej","family":"Janca","sequence":"first","affiliation":[]},{"given":"Arootin","family":"Gharibian","sequence":"additional","affiliation":[]},{"given":"Eliska","family":"Ochodkova","sequence":"additional","affiliation":[]},{"given":"Eva","family":"Kriegova","sequence":"additional","affiliation":[]},{"given":"Milos","family":"Kudelka","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,10,27]]},"reference":[{"key":"741_CR1","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1016\/j.chemolab.2017.12.004","volume":"174","author":"D Ballabio","year":"2018","unstructured":"Ballabio D, Grisoni F, Todeschini R (2018) Multivariate comparison of classification performance measures. Chemom Intell Lab Syst 174:33\u201344","journal-title":"Chemom Intell Lab Syst"},{"issue":"10","key":"741_CR2","doi-asserted-by":"publisher","first-page":"10008","DOI":"10.1088\/1742-5468\/2008\/10\/P10008","volume":"2008","author":"VD Blondel","year":"2008","unstructured":"Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008(10):10008","journal-title":"J Stat Mech Theory Exp"},{"issue":"6","key":"741_CR3","doi-asserted-by":"publisher","first-page":"0177678","DOI":"10.1371\/journal.pone.0177678","volume":"12","author":"S Boughorbel","year":"2017","unstructured":"Boughorbel S, Jarray F, El-Anbari M (2017) Optimal classifier for imbalanced data using Matthews correlation coefficient metric. PLoS ONE 12(6):0177678","journal-title":"PLoS ONE"},{"issue":"1\u20132","key":"741_CR4","doi-asserted-by":"publisher","first-page":"1700127","DOI":"10.1002\/minf.201700127","volume":"37","author":"J Brown","year":"2018","unstructured":"Brown J (2018) Classifiers and their metrics quantified. Mol Inf 37(1\u20132):1700127","journal-title":"Mol Inf"},{"issue":"3","key":"741_CR5","doi-asserted-by":"publisher","first-page":"441","DOI":"10.15388\/21-INFOR457","volume":"32","author":"V Bulavas","year":"2021","unstructured":"Bulavas V, Marcinkevi\u010dius V, Rumi\u0144ski J (2021) Study of multi-class classification algorithms\u2019 performance on highly imbalanced network intrusion datasets. Informatica 32(3):441\u2013475","journal-title":"Informatica"},{"key":"741_CR6","doi-asserted-by":"publisher","unstructured":"Charytanowicz M, Jerzy N, Piotr K, Piotr K, Szymon L et al (2012) Seeds. UCI Mach Learn Repos. https:\/\/doi.org\/10.24432\/C5H30K","DOI":"10.24432\/C5H30K"},{"key":"741_CR7","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12864-019-6413-7","volume":"21","author":"D Chicco","year":"2020","unstructured":"Chicco D, Jurman G (2020) The advantages of the Matthews correlation coefficient (mcc) over f1 score and accuracy in binary classification evaluation. BMC Genom 21:1\u201313","journal-title":"BMC Genom"},{"issue":"1","key":"741_CR8","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1186\/s13040-023-00322-4","volume":"16","author":"D Chicco","year":"2023","unstructured":"Chicco D, Jurman G (2023) The Matthews correlation coefficient (mcc) should replace the roc auc as the standard metric for assessing binary classification. BioData Min 16(1):4","journal-title":"BioData Min"},{"issue":"2","key":"741_CR9","doi-asserted-by":"publisher","first-page":"481","DOI":"10.1007\/s00778-023-00815-y","volume":"33","author":"JE d\u2019Hondt","year":"2024","unstructured":"d\u2019Hondt JE, Minartz K, Papapetrou O (2024) Efficient detection of multivariate correlations with different correlation measures. VLDB J 33(2):481\u2013505. https:\/\/doi.org\/10.1007\/s00778-023-00815-y","journal-title":"VLDB J"},{"issue":"2","key":"741_CR10","doi-asserted-by":"publisher","first-page":"179","DOI":"10.1111\/j.1469-1809.1936.tb02137.x","volume":"7","author":"RA Fisher","year":"1936","unstructured":"Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann Eugen 7(2):179\u2013188","journal-title":"Ann Eugen"},{"issue":"5\u20136","key":"741_CR11","doi-asserted-by":"publisher","first-page":"367","DOI":"10.1016\/j.compbiolchem.2004.09.006","volume":"28","author":"J Gorodkin","year":"2004","unstructured":"Gorodkin J (2004) Comparing two k-category assignments by a k-category correlation coefficient. Comput Biol Chem 28(5\u20136):367\u2013374","journal-title":"Comput Biol Chem"},{"key":"741_CR12","unstructured":"Grandini M, Bagli E, Visani G (2020) Metrics for multi-class classification: an overview. arXiv preprint arXiv:2008.05756"},{"key":"741_CR13","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2024.109189","volume":"137","author":"CQ Gull","year":"2024","unstructured":"Gull CQ, Aguilar J (2024) A semi-supervised learning algorithm for multi-label classification and multi-assignment clustering problems based on a multivariate data analysis. Eng Appl Artif Intell 137:109189. https:\/\/doi.org\/10.1016\/j.engappai.2024.109189","journal-title":"Eng Appl Artif Intell"},{"key":"741_CR14","doi-asserted-by":"publisher","DOI":"10.1016\/j.rineng.2024.103836","volume":"25","author":"N Gupta","year":"2025","unstructured":"Gupta N, Kubicek J, Penhaker M, Derawi M (2025) A novel diagnostic framework for breast cancer: combining deep learning with mammogram-DBT feature fusion. Res Eng 25:103836. https:\/\/doi.org\/10.1016\/j.rineng.2024.103836","journal-title":"Res Eng"},{"key":"741_CR15","doi-asserted-by":"crossref","unstructured":"Hemphill JF (2003) Interpreting the magnitudes of correlation coefficients","DOI":"10.1037\/0003-066X.58.1.78"},{"key":"741_CR16","doi-asserted-by":"publisher","unstructured":"Higuera C, Gardiner K, Cios K (2015) Mice protein expression. UCI Mach Learn Repos. https:\/\/doi.org\/10.24432\/C50S3Z","DOI":"10.24432\/C50S3Z"},{"key":"741_CR17","unstructured":"Horton P, Nakai K (1996) A probabilistic classification system for predicting the cellular localization sites of proteins. In: Ismb 4:109\u2013115 St. Louis, Missouri, USA"},{"issue":"1","key":"741_CR18","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1007\/s41109-023-00582-3","volume":"8","author":"O Janca","year":"2023","unstructured":"Janca O, Ochodkova E, Kriegova E, Horak P, Skacelova M, Kudelka M (2023) Real-world data in rheumatoid arthritis: patient similarity networks as a tool for clinical evaluation of disease activity. Appl Netw Sci 8(1):57","journal-title":"Appl Netw Sci"},{"key":"741_CR19","doi-asserted-by":"crossref","unstructured":"Janca O, Gharibian A, Ochodkova E, Kriegova E, Kudelka M (2024) Class dominancy profiles in multi-class and multi-cluster similarity networks. In: International Conference on Complex Networks and Their Applications pp. 15\u201326. Springer","DOI":"10.1007\/978-3-031-82439-5_2"},{"key":"741_CR20","doi-asserted-by":"publisher","first-page":"9920","DOI":"10.7717\/peerj.9920","volume":"8","author":"K-M Kuo","year":"2020","unstructured":"Kuo K-M, Talley P, Kao Y, Huang CH (2020) A multi-class classification model for supporting the diagnosis of type II diabetes mellitus. Peer J 8:9920","journal-title":"Peer J"},{"issue":"2","key":"741_CR21","doi-asserted-by":"publisher","first-page":"442","DOI":"10.1016\/0005-2795(75)90109-9","volume":"405","author":"BW Matthews","year":"1975","unstructured":"Matthews BW (1975) Comparison of the predicted and observed secondary structure of t4 phage lysozyme. Biochim Biophys Acta (BBA)-Protein Struct 405(2):442\u2013451","journal-title":"Biochim Biophys Acta (BBA)-Protein Struct"},{"key":"741_CR22","doi-asserted-by":"crossref","unstructured":"Ochodkova E, Zehnalova S, Kudelka M (2017) Graph construction based on local representativeness. In: Cao Y, Chen J (eds) Computing and Combinatorics. Springer, Cham, pp 654\u2013665","DOI":"10.1007\/978-3-319-62389-4_54"},{"key":"741_CR23","unstructured":"Opitz J (2022) From bias and prevalence to macro f1, kappa, and mcc: a structured overview of metrics for multi-class evaluation. Heidelberg University"},{"issue":"5","key":"741_CR24","doi-asserted-by":"publisher","first-page":"1763","DOI":"10.1213\/ANE.0000000000002864","volume":"126","author":"P Schober","year":"2018","unstructured":"Schober P, Boer C, Schwarte LA (2018) Correlation coefficients: appropriate use and interpretation. Anesth Analg 126(5):1763\u20131768","journal-title":"Anesth Analg"},{"issue":"11","key":"741_CR25","doi-asserted-by":"publisher","first-page":"2422","DOI":"10.3390\/v14112422","volume":"14","author":"M Sova","year":"2022","unstructured":"Sova M, Kudelka M, Raska M, Mizera J, Mikulkova Z, Trajerova M, Ochodkova E, Genzor S, Jakubec P, Borikova A et al (2022) Network analysis for uncovering the relationship between host response and clinical factors to virus pathogen: lessons from SARS-COV-2. Viruses 14(11):2422","journal-title":"Viruses"},{"issue":"7","key":"741_CR26","doi-asserted-by":"publisher","first-page":"70364","DOI":"10.1002\/ctm2.70364","volume":"15","author":"D Starostka","year":"2025","unstructured":"Starostka D, Dolezilek R, Kvasnicka HM, Kudelka M, Miczkova P, Kriegova E, Kolacek D, Sotkovska B, Anlauf T, Juranova J, Chasakova K, Kolarova S, Paprota M, Buffa D, Kovac P, Zmatlo V (2025) The utility of automated artificial intelligence-assisted digital cytomorphology for bone marrow analysis in diagnostic haemato-oncology. Clin Transl Med 15(7):70364. https:\/\/doi.org\/10.1002\/ctm2.70364","journal-title":"Clin Transl Med"},{"key":"741_CR27","doi-asserted-by":"publisher","DOI":"10.1016\/j.sigpro.2024.109511","volume":"222","author":"P Stoica","year":"2024","unstructured":"Stoica P, Babu P (2024) Pearson\u2013Matthews correlation coefficients for binary and multinary classification. Signal Process 222:109511. https:\/\/doi.org\/10.1016\/j.sigpro.2024.109511","journal-title":"Signal Process"},{"issue":"12","key":"741_CR28","doi-asserted-by":"publisher","first-page":"1583","DOI":"10.1016\/j.joca.2022.08.019","volume":"30","author":"M Trajerov\u00e1","year":"2022","unstructured":"Trajerov\u00e1 M, Kriegov\u00e1 E, Mikulkov\u00e1 Z, Savara J, Kudelka M, Gallo J (2022) Knee osteoarthritis phenotypes based on synovial fluid immune cells correlate with clinical outcome trajectories. Osteoarthr Cartil 30(12):1583\u20131592","journal-title":"Osteoarthr Cartil"},{"issue":"1","key":"741_CR29","doi-asserted-by":"publisher","first-page":"15264","DOI":"10.1038\/s41598-018-33654-x","volume":"8","author":"J Zheng","year":"2018","unstructured":"Zheng J, Zhang X, Zhao X, Tong X, Hong X, Xie J, Liu S (2018) Deep-RBPPred: predicting RNA binding proteins in the proteome scale based on deep learning. Sci Rep 8(1):15264. https:\/\/doi.org\/10.1038\/s41598-018-33654-x","journal-title":"Sci Rep"}],"container-title":["Applied Network Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41109-025-00741-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s41109-025-00741-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41109-025-00741-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,28]],"date-time":"2025-10-28T00:03:13Z","timestamp":1761609793000},"score":1,"resource":{"primary":{"URL":"https:\/\/appliednetsci.springeropen.com\/articles\/10.1007\/s41109-025-00741-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,27]]},"references-count":29,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["741"],"URL":"https:\/\/doi.org\/10.1007\/s41109-025-00741-8","relation":{},"ISSN":["2364-8228"],"issn-type":[{"type":"electronic","value":"2364-8228"}],"subject":[],"published":{"date-parts":[[2025,10,27]]},"assertion":[{"value":"31 March 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 September 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 October 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"The authors declare no conflict of interest.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"52"}}