{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T02:22:38Z","timestamp":1760235758905,"version":"build-2065373602"},"reference-count":36,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2021,9,28]],"date-time":"2021-09-28T00:00:00Z","timestamp":1632787200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>The problems of gene regulatory network (GRN) reconstruction and the creation of disease diagnostic effective systems based on genes expression data are some of the current directions of modern bioinformatics. In this manuscript, we present the results of the research focused on the evaluation of the effectiveness of the most used metrics to estimate the gene expression profiles\u2019 proximity, which can be used to extract the groups of informative gene expression profiles while taking into account the states of the investigated samples. Symmetry is very important in the field of both genes\u2019 and\/or proteins\u2019 interaction since it undergirds essentially all interactions between molecular components in the GRN and extraction of gene expression profiles, which allows us to identify how the investigated biological objects (disease, state of patients, etc.) contribute to the further reconstruction of GRN in terms of both the symmetry and understanding the mechanism of molecular element interaction in a biological organism. Within the framework of our research, we have investigated the following metrics: Mutual information maximization (MIM) using various methods of Shannon entropy calculation, Pearson\u2019s \u03c72 test and correlation distance. The accuracy of the investigated samples classification was used as the main quality criterion to evaluate the appropriate metric effectiveness. The random forest classifier (RF) was used during the simulation process. The research results have shown that results of the use of various methods of Shannon entropy within the framework of the MIM metric disagree with each other. As a result, we have proposed the modified mutual information maximization (MMIM) proximity metric based on the joint use of various methods of Shannon entropy calculation and the Harrington desirability function. The results of the simulation have also shown that the correlation proximity metric is less effective in comparison to both the MMIM metric and Pearson\u2019s \u03c72 test. Finally, we propose the hybrid proximity metric (HPM) that considers both the MMIM metric and Pearson\u2019s \u03c72 test. The proposed metric was investigated within the framework of one-cluster structure effectiveness evaluation. To our mind, the main benefit of the proposed HPM is in increasing the objectivity of mutually similar gene expression profiles extraction due to the joint use of the various effective proximity metrics that can contradict with each other when they are used alone.<\/jats:p>","DOI":"10.3390\/sym13101812","type":"journal-article","created":{"date-parts":[[2021,9,28]],"date-time":"2021-09-28T21:39:29Z","timestamp":1632865169000},"page":"1812","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Comparison Analysis of Gene Expression Profiles Proximity Metrics"],"prefix":"10.3390","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6797-1467","authenticated-orcid":false,"given":"Sergii","family":"Babichev","sequence":"first","affiliation":[{"name":"Department of Physics, Kherson State University, 73000 Kherson, Ukraine"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8629-8658","authenticated-orcid":false,"given":"Lyudmyla","family":"Yasinska-Damri","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Information Technology, Ukrainian Academy of Printing, 79000 Lviv, Ukraine"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5417-9403","authenticated-orcid":false,"given":"Igor","family":"Liakh","sequence":"additional","affiliation":[{"name":"Department of Informatics, Phisical and Mathematical Disciplines, Uzhhorod National University, 88000 Uzhhorod, Ukraine"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1526-9005","authenticated-orcid":false,"given":"Bohdan","family":"Durnyak","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Information Technology, Ukrainian Academy of Printing, 79000 Lviv, Ukraine"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,9,28]]},"reference":[{"key":"ref_1","unstructured":"(2014, May 01). ArrayExpress\u2014Functional Genomics Data. Available online: https:\/\/www.ebi.ac.uk\/arrayexpress\/."},{"key":"ref_2","first-page":"62","article-title":"Current state of the problem of gene expression data processing and extraction to solve the reverse engineering tasks in the field of bioinformatics","volume":"2853","author":"Babichev","year":"2021","journal-title":"Ceur Workshop Proc."},{"key":"ref_3","first-page":"100754","article-title":"Comparative microRNAs expression profiles analysis during embryonic development of common carp, Cyprinus carpio","volume":"37","author":"Wang","year":"2021","journal-title":"Comp. Biochem. Physiol.\u2014Part Genom. Proteom."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"953","DOI":"10.1001\/jamadermatol.2020.1731","article-title":"Performance of Gene Expression Profile Tests for Prognosis in Patients with Localized Cutaneous Melanoma: A Systematic Review and Meta-Analysis","volume":"156","author":"Marchetti","year":"2020","journal-title":"JAMA Dermatol."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"78533","DOI":"10.1109\/ACCESS.2019.2922987","article-title":"A survey on hybrid feature selection methods in microarray gene expression data for cancer classification","volume":"7","author":"Almugren","year":"2019","journal-title":"IEEE Access"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1016\/j.neucom.2016.07.080","article-title":"A hybrid feature selection algorithm for gene expression data classification","volume":"256","author":"Lu","year":"2017","journal-title":"Neurocomputing"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1007\/s10916-018-0910-0","article-title":"Fuzzy expert system based on a novel hybrid stem cell (HSC) algorithm for classification of micro array data","volume":"42","author":"Vijay","year":"2018","journal-title":"J. Med. Syst."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"208","DOI":"10.1016\/j.asoc.2009.11.010","article-title":"A novel hybrid feature selection method for microarray data analysis","volume":"11","author":"Lee","year":"2011","journal-title":"Appl. Soft Comput."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"228","DOI":"10.1016\/j.compbiomed.2011.02.004","article-title":"A hybrid feature selection method for DNA microarray data","volume":"41","author":"Chuang","year":"2011","journal-title":"Comput. Biol. Med."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1016\/j.asoc.2017.09.038","article-title":"Correlation feature selection based improved-binary particle swarm optimization for gene selection and cancer classification","volume":"62","author":"Jain","year":"2018","journal-title":"Appl. Soft Comput."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1016\/j.ygeno.2017.01.004","article-title":"Gene selection for microarray cancer classification using a new evolutionary method employing artificial intelligence concepts","volume":"109","author":"Dashtban","year":"2017","journal-title":"Genomics"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"124","DOI":"10.1016\/j.asoc.2016.11.026","article-title":"Classification of human cancer diseases by gene expression profiles","volume":"50","author":"Salem","year":"2017","journal-title":"Appl. Soft Comput."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"231","DOI":"10.1016\/j.ygeno.2016.05.001","article-title":"A hybrid gene selection approach for microarray data classification using cellular learning automata and ant colony optimization","volume":"107","author":"Sharbaf","year":"2016","journal-title":"Genomics"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1016\/j.ygeno.2017.07.010","article-title":"Gene selection for tumor classification using a novel bio-inspired multi-objective approach","volume":"110","author":"Dashtban","year":"2018","journal-title":"Genomics"},{"key":"ref_15","first-page":"604910","article-title":"mRMR-ABC: A hybrid gene selection algorithm for cancer classification using microarray gene expression profiling","volume":"2015","author":"Alshamlan","year":"2018","journal-title":"Biomed. Res. Int."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1016\/j.compbiolchem.2015.03.001","article-title":"Genetic bee colony (GBC) algorithm: A new gene selection method for microarray cancer classification","volume":"56","author":"Alshamlan","year":"2015","journal-title":"Comput. Biol. Chem."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1016\/j.asoc.2016.01.044","article-title":"A hybrid particle swarm optimization for feature subset selection by integrating a novel local search strategy","volume":"43","author":"Moradi","year":"2016","journal-title":"Appl. Soft Comput."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"343","DOI":"10.1109\/TNB.2013.2294716","article-title":"Multiobjective binary biogeography based optimization for feature selection using gene expression data","volume":"12","author":"Li","year":"2013","journal-title":"IEEE Trans. Nanobiosci."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1312","DOI":"10.1080\/00207721.2014.924600","article-title":"Hybrid feature selection algorithm using symmetrical uncertainty and a harmony search algorithm","volume":"47","author":"Shreem","year":"2016","journal-title":"Int. J. Syst. Sci."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"521","DOI":"10.1016\/j.procs.2019.11.054","article-title":"Recovery of Incomplete IoT Sensed Data using High-Performance Extended-Input Neural-Like Structure","volume":"160","author":"Izonin","year":"2019","journal-title":"Procedia Comput. Sci."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/j.procs.2019.08.006","article-title":"An Approach towards Missing Data Recovery within IoT Smart System","volume":"155","author":"Izonin","year":"2019","journal-title":"Procedia Comput. Sci."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1007\/978-3-030-54215-3_2","article-title":"Technique of gene expression profiles selection based on SOTA clustering algorithm using statistical criteria and Shannon entropy","volume":"1246","author":"Babichev","year":"2021","journal-title":"Adv. Intell. Syst. Comput."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Babichev, S., and \u0160kvor, J. (2020). Technique of Gene Expression Profiles Extraction Based on the Complex Use of Clustering and Classification Methods. Diagnostics, 10.","DOI":"10.20944\/preprints202008.0241.v1"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Babichev, S., Barilla, J., Fi\u0161er, J., and \u0160kvor, J. (2020, January 9\u201313). A hybrid model of gene expression profiles reducing based on the complex use of fuzzy inference system and clustering quality criteria. Proceedings of the 11th Conference of the European Society for Fuzzy Logic and Technology, EUSFLAT 2019, Prague, Czech Republic.","DOI":"10.2991\/eusflat-19.2019.20"},{"key":"ref_25","unstructured":"Thomas, M.C., and Joy, A.T. (2006). Elements of Information Theory, John Wiley & Sons. [2nd ed.]."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1002\/j.1538-7305.1948.tb01338.x","article-title":"A mathematical theory of communication","volume":"27","author":"Shannon","year":"1948","journal-title":"Bell Syst. Tech. J."},{"key":"ref_27","first-page":"1469","article-title":"Entropy inference and the James-Stein estimator, with application to nonlinear gene association networks","volume":"10","author":"Hausser","year":"2009","journal-title":"J. Mach. Learn. Res."},{"key":"ref_28","unstructured":"Miller, G. (2021, August 10). Note on the Bias of Information Estimates. Information Theory in Psychology. Available online: https:\/\/www.scienceopen.com\/document?vid=357d299f-62fa-4bda-8dd2-e4d5b5abde5d."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"663","DOI":"10.1080\/01621459.1952.10483446","article-title":"A generalization of sampling without replacement from a finite universe","volume":"47","author":"Horvitz","year":"1952","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"427","DOI":"10.1126\/science.1088284","article-title":"Always Good Turing: Asymptotically optimal probability estimation","volume":"302","author":"Orlitsky","year":"2003","journal-title":"Science"},{"key":"ref_31","first-page":"2833","article-title":"Bayesian Entropy Estimation for Countable Discrete Distributions","volume":"15","author":"Archer","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"ref_32","first-page":"494","article-title":"The desirability function","volume":"21","author":"Harrington","year":"1965","journal-title":"Ind. Qual. Control"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1080\/10618600.1996.10474713","article-title":"R: A language for data analysis and graphics","volume":"5","author":"Ihaka","year":"1996","journal-title":"J. Comput. Graph. Stat."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Hou, J., Aerts, J., den Hamer, B., van Ijcken, W., den Bakker, M., Riegman, P., Van Der Leest, C., Van Der Spek, P., Foekens, J.A., and Hoogsteden, H.C. (2010). Gene expression-based classification of non-small cell lung carcinomas and survival prediction. PLoS ONE, 5.","DOI":"10.1371\/journal.pone.0010312"},{"key":"ref_35","first-page":"5","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Breiman"},{"key":"ref_36","unstructured":"Kuhn, M., Wing, J., and Weston, S. (2020, May 18). Classification and Regression Training. Available online: https:\/\/github.com\/topepo\/caret\/."}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/13\/10\/1812\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:06:33Z","timestamp":1760166393000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/13\/10\/1812"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,28]]},"references-count":36,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2021,10]]}},"alternative-id":["sym13101812"],"URL":"https:\/\/doi.org\/10.3390\/sym13101812","relation":{},"ISSN":["2073-8994"],"issn-type":[{"type":"electronic","value":"2073-8994"}],"subject":[],"published":{"date-parts":[[2021,9,28]]}}}