{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T17:28:03Z","timestamp":1773250083230,"version":"3.50.1"},"reference-count":25,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2021,12,15]],"date-time":"2021-12-15T00:00:00Z","timestamp":1639526400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Without assuming any functional or distributional structure, we select collections of major factors embedded within response-versus-covariate (Re-Co) dynamics via selection criteria [C1: confirmable] and [C2: irreplaceable], which are based on information theoretic measurements. The two criteria are constructed based on the computing paradigm called Categorical Exploratory Data Analysis (CEDA) and linked to Wiener\u2013Granger causality. All the information theoretical measurements, including conditional mutual information and entropy, are evaluated through the contingency table platform, which primarily rests on the categorical nature within all involved features of any data types: quantitative or qualitative. Our selection task identifies one chief collection, together with several secondary collections of major factors of various orders underlying the targeted Re-Co dynamics. Each selected collection is checked with algorithmically computed reliability against the finite sample phenomenon, and so is each member\u2019s major factor individually. The developments of our selection protocol are illustrated in detail through two experimental examples: a simple one and a complex one. We then apply this protocol on two data sets pertaining to two somewhat related but distinct pitching dynamics of two pitch types: slider and fastball. 
In particular, we refer to a specific Major League Baseball (MLB) pitcher and we consider data of multiple seasons.<\/jats:p>","DOI":"10.3390\/e23121684","type":"journal-article","created":{"date-parts":[[2021,12,15]],"date-time":"2021-12-15T09:02:19Z","timestamp":1639558939000},"page":"1684","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Categorical Nature of Major Factor Selection via Information Theoretic Measurements"],"prefix":"10.3390","volume":"23","author":[{"given":"Ting-Li","family":"Chen","sequence":"first","affiliation":[{"name":"Institute of Statistical Science, Academia Sinica, Taipei 11529, Taiwan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8983-142X","authenticated-orcid":false,"given":"Elizabeth P.","family":"Chou","sequence":"additional","affiliation":[{"name":"Department of Statistics, National Chengchi University, Taipei 11605, Taiwan"}]},{"given":"Hsieh","family":"Fushing","sequence":"additional","affiliation":[{"name":"Department of Statistics, University of California, Davis, CA 95616, USA"}]}],"member":"1968","published-online":{"date-parts":[[2021,12,15]]},"reference":[{"key":"ref_1","first-page":"16","article-title":"What is complexity?","volume":"1","year":"1995","journal-title":"Complexity"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1085","DOI":"10.1002\/bies.10192","article-title":"What is Complexity?","volume":"24","author":"Adami","year":"2002","journal-title":"BioEssays"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"393","DOI":"10.1126\/science.177.4047.393","article-title":"More is different","volume":"177","author":"Anderson","year":"1972","journal-title":"Science"},{"key":"ref_4","unstructured":"Child, D. (2006). The Essentials of Factor Analysis, Continuum International Publishing. [3rd ed.]."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Fushing, H., and Chou, E.P. (2021). 
Categorical Exploratory Data Analysis: From Multiclass Classification and Response Manifold Analytics perspectives of baseball pitching dynamics. Entropy, 23.","DOI":"10.3390\/e23070792"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Fushing, H., Chou, E.P., and Chen, T.-L. (2021). Mimicking complexity of structured data matrix\u2019s information content: Categorical Exploratory Data Analysis. Entropy, 23.","DOI":"10.3390\/e23050594"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"589","DOI":"10.1119\/1.1934921","article-title":"Effect of Spin and Speed on the Lateral Deflection (Curve) of a Baseball; and the Magnus Effect for Smooth Spheres","volume":"27","author":"Briggs","year":"1959","journal-title":"Am. J. Phys."},{"key":"ref_8","unstructured":"Tukey, J.W. (1977). Exploratory Data Analysis, Addison\u2013Wesley."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"171026","DOI":"10.1098\/rsos.171026","article-title":"Complexity of Possibly-gapped Histogram and Analysis of Histogram (ANOHT)","volume":"5","author":"Fushing","year":"2018","journal-title":"R. Soc. Open Sci."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Fushing, H., Liu, S.-Y., Hsieh, Y.-C., and McCowan, B. (2018). From patterned response dependency to structured covariate dependency: Categorical-pattern-matching. PLoS ONE, 13.","DOI":"10.1371\/journal.pone.0198253"},{"key":"ref_11","first-page":"27","article-title":"Conditional likelihood maximisation: A unifying framework for information theoretic feature selection","volume":"13","author":"Brown","year":"2012","journal-title":"J. Mach. Learn. Res."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1007\/s00521-013-1368-0","article-title":"A review of feature selection methods based on mutual information","volume":"24","author":"Vergara","year":"2014","journal-title":"Neural Comput. Appl."},{"key":"ref_13","unstructured":"Cover, T.M., and Thomas, J.A. (1991). 
Elements of Information Theory, Wiley."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"477","DOI":"10.1007\/s11071-016-3254-7","article-title":"Mutual-information matrix analysis for nonlinear interactions of multivariate time series","volume":"88","author":"Zhao","year":"2017","journal-title":"Nonlin. Dyn."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"8520","DOI":"10.1016\/j.eswa.2015.07.007","article-title":"Feature selection using Joint Mutual Information Maximisation","volume":"42","author":"Bennasar","year":"2015","journal-title":"Expert Syst. Appl."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"421","DOI":"10.1257\/0002828041464669","article-title":"Time series analysis, cointegration, and applications","volume":"94","author":"Granger","year":"2004","journal-title":"Am. Econ. Rev."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Wibral, M., Vicente, R., and Lizier, J. (2014). Conditional Entropy-Based Evaluation of Information Dynamics in Physiological Systems. Directed Information Measures in Neuroscience, Springer.","DOI":"10.1007\/978-3-642-54474-3"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.physrep.2006.12.004","article-title":"Causality detection based on information-theoretic approaches in time series analysis","volume":"441","author":"Palus","year":"2007","journal-title":"Phys. Rep."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Contreras-Reyes, J.E., and Hernandez-Santoro, C. (2020). Assessing Granger-Causality in the Southern Humboldt Current Ecosystem Using Cross-Spectral Methods. Entropy, 22.","DOI":"10.3390\/e22101071"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1103\/PhysRevLett.85.461","article-title":"Measuring information transfer","volume":"85","author":"Schreiber","year":"2000","journal-title":"Phys. Rev. 
Lett."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Abdul Razak, F., and Jensen, H.J. (2014). Quantifying \u201cCausality\u201d in Complex Systems: Understanding Transfer Entropy. PLoS ONE, 9.","DOI":"10.1371\/journal.pone.0099462"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"238701","DOI":"10.1103\/PhysRevLett.103.238701","article-title":"Granger Causality and Transfer Entropy Are Equivalent for Gaussian Variables","volume":"103","author":"Barnett","year":"2009","journal-title":"Phys. Rev. Lett."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"365001","DOI":"10.1088\/1751-8121\/aba028","article-title":"Replica analysis of overfitting in generalized linear regression models","volume":"53","author":"Coolen","year":"2020","journal-title":"J. Phys. Math. Theor."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"033117","DOI":"10.1063\/1.5145005","article-title":"Wavelet entropy-based evaluation of intrinsic predictability of time series","volume":"30","author":"Guntu","year":"2020","journal-title":"Chaos Interdiscip. J. Nonlinear Sci."},{"key":"ref_25","unstructured":"Pearl, J. (2000). 
Models, Reasoning and Inference, Cambridge University Press."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/23\/12\/1684\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:48:23Z","timestamp":1760168903000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/23\/12\/1684"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,15]]},"references-count":25,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2021,12]]}},"alternative-id":["e23121684"],"URL":"https:\/\/doi.org\/10.3390\/e23121684","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,12,15]]}}}