{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,7]],"date-time":"2026-03-07T18:40:16Z","timestamp":1772908816512,"version":"3.50.1"},"reference-count":60,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2019,8,12]],"date-time":"2019-08-12T00:00:00Z","timestamp":1565568000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>Statistical hypothesis testing is among the most misunderstood quantitative analysis methods from data science. Despite its seeming simplicity, it has complex interdependencies between its procedural components. In this paper, we discuss the underlying logic behind statistical hypothesis testing, the formal meaning of its components and their connections. Our presentation is applicable to all statistical hypothesis tests as generic backbone and, hence, useful across all application domains in data science and artificial intelligence.<\/jats:p>","DOI":"10.3390\/make1030054","type":"journal-article","created":{"date-parts":[[2019,8,13]],"date-time":"2019-08-13T04:31:21Z","timestamp":1565670681000},"page":"945-961","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":69,"title":["Understanding Statistical Hypothesis Testing: The Logic of Statistical Inference"],"prefix":"10.3390","volume":"1","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0745-5641","authenticated-orcid":false,"given":"Frank","family":"Emmert-Streib","sequence":"first","affiliation":[{"name":"Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, 33100 Tampere, Finland"},{"name":"Institute of Biosciences and Medical Technology, Tampere University, 33520 Tampere, 
Finland"}]},{"given":"Matthias","family":"Dehmer","sequence":"additional","affiliation":[{"name":"Institute for Intelligent Production, Faculty for Management, University of Applied Sciences Upper Austria, Steyr Campus, 4040 Steyr, Austria"},{"name":"Department of Mechatronics and Biomedical Computer Science, University for Health Sciences, Medical Informatics and Technology (UMIT), 6060 Hall, Tyrol, Austria"},{"name":"College of Computer and Control Engineering, Nankai University, Tianjin 300000, China"}]}],"member":"1968","published-online":{"date-parts":[[2019,8,12]]},"reference":[{"key":"ref_1","unstructured":"Helbing, D. (2019, June 01). The Automation of Society Is Next: How to Survive the Digital Revolution. Available online: https:\/\/ssrn.com\/abstract=2694312."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Hacking, I. (2016). Logic of Statistical Inference, Cambridge University Press.","DOI":"10.1017\/CBO9781316534960"},{"key":"ref_3","unstructured":"Gigerenzer, G. (1993). The Superego, the Ego, and the id in Statistical Reasoning. A Handbook for Data Analysis in the Behavioral Sciences: Methodological Issues, Lawrence Erlbaum Associates, Inc."},{"key":"ref_4","unstructured":"Fisher, R.A. (1925). Statistical Methods for Research Workers, Genesis Publishing Pvt Ltd."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Fisher, R.A. (1992). The Arrangement of Field Experiments (1926). Breakthroughs in Statistics, Springer.","DOI":"10.1007\/978-1-4612-4380-9_8"},{"key":"ref_6","first-page":"189","article-title":"The statistical method in psychical research","volume":"39","author":"Fisher","year":"1929","journal-title":"Proc. Soc. Psych. 
Res."},{"key":"ref_7","first-page":"1","article-title":"On the use and interpretation of certain test criteria for purposes of statistical inference: Part I","volume":"20","author":"Neyman","year":"1967","journal-title":"Biometrika"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1098\/rsta.1933.0009","article-title":"On the Problem of the Most Efficient Tests of Statistical Hypotheses","volume":"231","author":"Neyman","year":"1933","journal-title":"Philos. Trans. R. Soc. Lond."},{"key":"ref_9","unstructured":"Lehmann, E. (2005). Testing Statistical Hypotheses, Springer."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1214\/ss\/1056397487","article-title":"Multiple hypothesis testing in microarray experiments","volume":"18","author":"Dudoit","year":"2003","journal-title":"Stat. Sci."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Tripathi, S., and Emmert-Streib, F. (2012). Assessment Method for a Power Analysis to Identify Differentially Expressed Pathways. PLoS ONE, 7.","DOI":"10.1371\/journal.pone.0037510"},{"key":"ref_12","first-page":"e53354","article-title":"Ensuring the statistical soundness of competitive gene set approaches: Gene filtering and genome-scale coverage are essential","volume":"6","author":"Tripathi","year":"2013","journal-title":"Nucleic Acids Res."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"306","DOI":"10.1093\/bioinformatics\/btl599","article-title":"Extensions to gene set enrichment","volume":"23","author":"Jiang","year":"2007","journal-title":"Bioinformatics"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"961","DOI":"10.1089\/cmb.2007.0041","article-title":"The Chronic Fatigue Syndrome: A Comparative Pathway Analysis","volume":"14","year":"2007","journal-title":"J. Comput. Biol."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Siroker, D., and Koomen, P. (2013). 
A\/B Testing: The Most Powerful Way to Turn Clicks into Customers, John Wiley & Sons.","DOI":"10.1002\/9781119176459"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1020","DOI":"10.1056\/NEJMoa067731","article-title":"Stent thrombosis in randomized clinical trials of drug-eluting stents","volume":"356","author":"Mauri","year":"2007","journal-title":"N. Engl. J. Med."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"896","DOI":"10.1056\/NEJMoa060281","article-title":"A randomized trial of deep-brain stimulation for Parkinson\u2019s disease","volume":"355","author":"Deuschl","year":"2006","journal-title":"N. Engl. J. Med."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1899","DOI":"10.1056\/NEJMoa1313122","article-title":"Randomized trial of posaconazole and benznidazole for chronic Chagas\u2019 disease","volume":"370","author":"Molina","year":"2014","journal-title":"N. Engl. J. Med."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1440","DOI":"10.4088\/JCP.v64n1207","article-title":"Randomized placebo-controlled trial of baclofen for cocaine dependence: Preliminary effects for individuals with chronic patterns of cocaine use","volume":"64","author":"Shoptaw","year":"2003","journal-title":"J. Clin. Psychiatry"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1139","DOI":"10.1037\/a0028168","article-title":"The psychological effects of meditation: A meta-analysis","volume":"138","author":"Sedlmeier","year":"2012","journal-title":"Psychol. Bull."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"999","DOI":"10.1056\/NEJM197811022991808","article-title":"Interpretation by Physicians of Clinical Laboratory Results","volume":"299","author":"Casscells","year":"1978","journal-title":"N. Engl. J. Med."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Ioannidis, J.P.A. (2005). Why Most Published Research Findings Are False. 
PLoS Med., 2.","DOI":"10.1371\/journal.pmed.0020124"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"127","DOI":"10.4103\/0972-6748.62274","article-title":"Self-medication practice among undergraduate medical students in a tertiary care medical college, West Bengal","volume":"18","author":"Banerjee","year":"2009","journal-title":"Ind. Psychiatry J."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"e32","DOI":"10.1016\/j.forsciint.2015.11.013","article-title":"Statistical hypothesis testing and common misinterpretations: Should we abandon p-values in forensic science applications?","volume":"259","author":"Taroni","year":"2016","journal-title":"Forensic Sci. Int."},{"key":"ref_25","first-page":"235","article-title":"Defining Data Science by a Data-Driven Quantification of the Community","volume":"1","author":"Dehmer","year":"2019","journal-title":"Mach. Learn. Knowl. Extr."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Sheskin, D.J. (2004). Handbook of Parametric and Nonparametric Statistical Procedures, CRC Press. [3rd ed.].","DOI":"10.1201\/9781420036268"},{"key":"ref_27","unstructured":"Chernick, M.R., and LaBudde, R.A. (2014). An Introduction to Bootstrap Methods with Applications to R, John Wiley & Sons."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1093\/ije\/dyr178","article-title":"What should the genome-wide significance threshold be? Empirical replication of borderline genetic associations","volume":"41","author":"Panagiotou","year":"2011","journal-title":"Int. J. Epidemiol."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"242","DOI":"10.1198\/000313008X332421","article-title":"p-values are random variables","volume":"62","author":"Murdoch","year":"2008","journal-title":"Am. Stat."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Emmert-Streib, F., Moutari, S., and Dehmer, M. (2019). 
A comprehensive survey of error measures for evaluating binary decision making in data science. Wiley Interdiscip. Rev. Data Min. Knowl. Discov., e1303.","DOI":"10.1002\/widm.1303"},{"key":"ref_31","unstructured":"Breiman, L. (1973). Statistics: With a View Toward Applications, Houghton Mifflin Co."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Baron, M. (2013). Probability and Statistics for Computer Scientists, Chapman and Hall\/CRC.","DOI":"10.1201\/b14800"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Efron, B., and Tibshirani, R. (1994). An Introduction to the Bootstrap, Chapman and Hall\/CRC.","DOI":"10.1201\/9780429246593"},{"key":"ref_34","unstructured":"R Development Core Team (2008). R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing."},{"key":"ref_35","first-page":"3","article-title":"The data analysis dilemma: Ban or abandon. A review of null hypothesis significance testing","volume":"5","author":"Nix","year":"1998","journal-title":"Res. Sch."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"390","DOI":"10.3389\/fnhum.2017.00390","article-title":"When null hypothesis significance testing is unsuitable for research: A reassessment","volume":"11","author":"Szucs","year":"2017","journal-title":"Front. Hum. Neurosci."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"885","DOI":"10.1007\/s11999-009-1164-4","article-title":"P value and the theory of hypothesis testing: An explanation for new researchers","volume":"468","author":"Biau","year":"2010","journal-title":"Clin. Orthop. Relat. Res.\u00ae"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"1242","DOI":"10.1080\/01621459.1993.10476404","article-title":"The Fisher, Neyman-Pearson theories of testing hypotheses: One theory or two?","volume":"88","author":"Lehmann","year":"1993","journal-title":"J. Am. stat. 
Assoc."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"223","DOI":"10.3389\/fpsyg.2015.00223","article-title":"Fisher, Neyman-Pearson or NHST? A tutorial for teaching data testing","volume":"6","author":"Perezgonzalez","year":"2015","journal-title":"Front. Psychol."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1007\/s10654-016-0149-3","article-title":"Statistical tests, P values, confidence intervals, and power: A guide to misinterpretations","volume":"31","author":"Greenland","year":"2016","journal-title":"Eur. J. Epidemiol."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1053\/j.seminhematol.2008.04.003","article-title":"A Dirty Dozen: Twelve p-value Misconceptions","volume":"45","author":"Goodman","year":"2008","journal-title":"Seminars in Hematology"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1080\/00031305.2016.1154108","article-title":"The ASA\u2019s statement on p-values: Context, process, and purpose","volume":"70","author":"Wasserstein","year":"2016","journal-title":"Am. Stat."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1080\/00031305.2019.1583913","article-title":"Moving to a World Beyond p < 0.05","volume":"73","author":"Wasserstein","year":"2019","journal-title":"Am. 
Stat."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1038\/d41586-019-00969-2","article-title":"Retiring significance: A free pass to bias","volume":"567","author":"Ioannidis","year":"2019","journal-title":"Nature"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"305","DOI":"10.1038\/d41586-019-00857-9","article-title":"Scientists rise up against statistical significance","volume":"567","author":"Amrhein","year":"2019","journal-title":"Nature"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"186","DOI":"10.1080\/00031305.2018.1543135","article-title":"Three Recommendations for Improving the Use of p-values","volume":"73","author":"Benjamin","year":"2019","journal-title":"Am. Stat."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1111\/j.1539-6053.2008.00033.x","article-title":"Helping doctors and patients make sense of health statistics","volume":"8","author":"Gigerenzer","year":"2007","journal-title":"Psychol. Sci. Public Interest"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"360","DOI":"10.1093\/bioinformatics\/btt687","article-title":"Gene Sets Net Correlations Analysis (GSNCA): A multivariate differential coexpression test for gene sets","volume":"30","author":"Rahmatallah","year":"2014","journal-title":"Bioinformatics"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"De Matos Simoes, R., and Emmert-Streib, F. (2012). Bagging statistical network inference from large-scale gene expression data. PLoS ONE, 7.","DOI":"10.1371\/journal.pone.0033624"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Rahmatallah, Y., Zybailov, B., Emmert-Streib, F., and Glazko, G. (2017). GSAR: Bioconductor package for Gene Set analysis in R. 
BMC Bioinform., 18.","DOI":"10.1186\/s12859-017-1482-6"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1037\/1082-989X.2.2.161","article-title":"On the logic and purpose of significance testing","volume":"2","author":"Cortina","year":"1997","journal-title":"Psychol. Methods"},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"545","DOI":"10.1177\/0959354397074006","article-title":"The spread of statistical significance testing in psychology: The case of the Journal of Applied Psychology, 1917\u20131994","volume":"7","author":"Hubbard","year":"1997","journal-title":"Theory Psychol."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"149","DOI":"10.3390\/make1010009","article-title":"A Machine Learning Perspective on Personalized Medicine: An Automatized, Comprehensive Knowledge Base with Ontology for Pattern Recognition","volume":"1","author":"Dehmer","year":"2018","journal-title":"Mach. Learn. Knowl. Extr."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1037\/1082-989X.5.2.241","article-title":"Null hypothesis significance testing: A review of an old and continuing controversy","volume":"5","author":"Nickerson","year":"2000","journal-title":"Psychol. Methods"},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"122","DOI":"10.1177\/002224378302000203","article-title":"The significance of statistical significance tests in marketing research","volume":"20","author":"Sawyer","year":"1983","journal-title":"J. Mark. Res."},{"key":"ref_56","first-page":"125","article-title":"Controlling the false discovery rate: A practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J. R. Stat. Soc. Ser. B (Methodol.)"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Efron, B. (2010). 
Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction, Cambridge University Press.","DOI":"10.1017\/CBO9780511761362"},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"653","DOI":"10.3390\/make1020039","article-title":"Large-Scale Simultaneous Inference with Hypothesis Testing: Multiple Testing Procedures in Practice","volume":"1","author":"Dehmer","year":"2019","journal-title":"Mach. Learn. Knowl. Extr."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"347","DOI":"10.1177\/0962280206079046","article-title":"A review of modern multiple hypothesis testing, with particular attention to the false discovery proportion","volume":"17","author":"Farcomeni","year":"2008","journal-title":"Stat. Methods Med. Res."},{"key":"ref_60","first-page":"1","article-title":"Neural correlates of interspecies perspective taking in the post-mortem atlantic salmon: An argument for proper multiple comparisons correction","volume":"1","author":"Bennett","year":"2011","journal-title":"J. Serendipitous Unexpect. Results"}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/1\/3\/54\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T13:10:32Z","timestamp":1760188232000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/1\/3\/54"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,8,12]]},"references-count":60,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2019,9]]}},"alternative-id":["make1030054"],"URL":"https:\/\/doi.org\/10.3390\/make1030054","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,8,12]]}}}