{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T18:29:35Z","timestamp":1776104975720,"version":"3.50.1"},"reference-count":87,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2022,1,7]],"date-time":"2022-01-07T00:00:00Z","timestamp":1641513600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Comput.-Hum. Interact."],"published-print":{"date-parts":[[2022,2,28]]},"abstract":"<jats:p>\n            Data analysis requires translating higher level questions and hypotheses into computable statistical models. We present a mixed-methods study aimed at identifying the steps, considerations, and challenges involved in operationalizing hypotheses into statistical models, a process we refer to as\n            <jats:italic>hypothesis formalization<\/jats:italic>\n            . In a formative content analysis of 50 research papers, we find that researchers highlight decomposing a hypothesis into sub-hypotheses, selecting proxy variables, and formulating statistical models based on data collection design as key steps. In a lab study, we find that analysts fixated on implementation and shaped their analyses to fit familiar approaches, even if sub-optimal. In an analysis of software tools, we find that tools provide inconsistent, low-level abstractions that may limit the statistical models analysts use to formalize hypotheses. 
Based on these observations, we characterize hypothesis formalization as a dual-search process balancing conceptual and statistical considerations constrained by data and computation and discuss implications for future tools.\n          <\/jats:p>","DOI":"10.1145\/3476980","type":"journal-article","created":{"date-parts":[[2022,1,7]],"date-time":"2022-01-07T14:50:16Z","timestamp":1641567016000},"page":"1-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":18,"title":["Hypothesis Formalization: Empirical Findings, Software Limitations, and Design Implications"],"prefix":"10.1145","volume":"29","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4050-4284","authenticated-orcid":false,"given":"Eunice","family":"Jun","sequence":"first","affiliation":[{"name":"University of Washington, Seattle, WA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6119-0544","authenticated-orcid":false,"given":"Melissa","family":"Birchfield","sequence":"additional","affiliation":[{"name":"University of Washington, Seattle, WA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2521-5126","authenticated-orcid":false,"given":"Nicole","family":"De Moura","sequence":"additional","affiliation":[{"name":"Eastlake High School, Seattle, WA"}]},{"given":"Jeffrey","family":"Heer","sequence":"additional","affiliation":[{"name":"University of Washington, Seattle, WA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5982-275X","authenticated-orcid":false,"given":"Ren\u00e9","family":"Just","sequence":"additional","affiliation":[{"name":"University of Washington, Seattle, WA"}]}],"member":"320","published-online":{"date-parts":[[2022,1,7]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2018.2865040"},{"key":"e_1_3_2_3_2","article-title":"Fitting linear mixed-effects models using lme4","author":"Bates Douglas","year":"2014","unstructured":"Douglas Bates, Martin M\u00e4chler, Ben Bolker, and Steve Walker. 2014. 
Fitting linear mixed-effects models using lme4. arXiv:1406.5823 (2014). Retrieved from https:\/\/arxiv.org\/abs\/1701.00133.","journal-title":"arXiv:1406.5823"},{"key":"e_1_3_2_4_2","article-title":"Package \u2018lme4\u2018","author":"Bates Douglas","year":"2019","unstructured":"Douglas Bates, Martin M\u00e4chler, Ben Bolker, Steve Walker, Rune H. B. Christensen, Henrik Singmann, Bin Dai, Fabian Scheipl, Gabor Grothendieck, Peter Green, and John Fox. 2019. Package \u2018lme4\u2018. CRAN (2019). Retrieved from https:\/\/cran.r-project.org\/web\/packages\/lme4\/lme4.pdf.","journal-title":"CRAN"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.13678"},{"key":"e_1_3_2_6_2","volume-title":"Graphics and Graphic Information Processing","author":"Bertin Jacques","year":"2011","unstructured":"Jacques Bertin. 2011. Graphics and Graphic Information Processing. Walter de Gruyter."},{"key":"e_1_3_2_7_2","doi-asserted-by":"crossref","unstructured":"Michael Betancourt. 2020. Towards a Principled Bayesian Workflow. Psychological Methods 26 1 (2020) 103-126. Retrieved from https:\/\/betanalpha.github.io\/assets\/case_studies\/principled_bayesian_workflow.html.","DOI":"10.1037\/met0000275"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.5555\/3322706.3322734"},{"key":"e_1_3_2_9_2","article-title":"Top 15 python libr aries for data science in 2017","author":"Bobriakov Igor","year":"2017","unstructured":"Igor Bobriakov. 2017. Top 15 python libr aries for data science in 2017. ActiveWizards in Medium (2017). Retrieved January 7, 2020 from https:\/\/medium.com\/activewizards-machine-learning-company\/top-15-python-libraries-for-data-science-in-in-2017-ab61b4f9b4a7.","journal-title":"ActiveWizards in Medium"},{"key":"e_1_3_2_10_2","article-title":"Top 20 python libraries for data science in 2018","author":"Bobriakov Igor","year":"2018","unstructured":"Igor Bobriakov. 2018. Top 20 python libraries for data science in 2018. 
ActiveWizards in Medium (2018). Retrieved December 16, 2019 from https:\/\/medium.com\/activewizards-machine-learning-company\/top-20-python-libraries-for-data-science-in-2018-2ae7d1db8049.","journal-title":"ActiveWizards in Medium"},{"key":"e_1_3_2_11_2","unstructured":"Leo Breiman Adele Cutler Andy Liaw and Matthew Wiener. 2018. Package \u201crandomForest\u201d. (2018). Retrieved September 16 2020 from https:\/\/cran.r-project.org\/web\/packages\/randomForest\/randomForest.pdf."},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.32614\/RJ-2017-066"},{"key":"e_1_3_2_13_2","article-title":"API design for machine learning software: Experiences from the scikit-learn project","author":"Buitinck Lars","year":"2013","unstructured":"Lars Buitinck, Gilles Louppe, Mathieu Blondel, Fabian Pedregosa, Andreas Mueller, Olivier Grisel, Vlad Niculae, Peter Prettenhofer, Alexandre Gramfort, Jaques Grobler, Robert Layton, Jake Vanderplas, Arnaud Joly, Brian Holt, and Ga\u00ebl Varoquaux. 2013. API design for machine learning software: Experiences from the scikit-learn project. arXiv:1309.0238. Retrieved from https:\/\/arxiv.org\/abs\/1309.0238.","journal-title":"arXiv:1309.0238"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.18637\/jss.v080.i01"},{"key":"e_1_3_2_15_2","article-title":"Stan : A probabilistic programming language","volume":"76","author":"Carpenter Bob","year":"2017","unstructured":"Bob Carpenter, Andrew Gelman, Matthew D. Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. 2017. Stan : A probabilistic programming language. Journal of Statistical Software 76, 1 (2017), 1\u201332. DOI:https:\/\/doi.org\/10.18637\/jss.v076.i01","journal-title":"Journal of Statistical Software"},{"key":"e_1_3_2_16_2","unstructured":"Robert Carver Michelle Everson John Gabrosek Nicholas Horton Robin Lock Megan Mocko Allan Rossman Ginger Holmes Roswell Paul Velleman Jeffrey Witmer and Beverly Wood. 
2016. Guidelines for assessment and instruction in statistics education (GAISE) college report 2016. AMSTAT (2016)."},{"key":"e_1_3_2_17_2","unstructured":"Yunshun Chen Aaron T. L. Lun Davis J. McCarthy Matthew E. Ritchie Belinda Phipson Yifang Hu Xiaobei Zhou Mark D. Robinson and Gordon K. Smyth. 2020. Empirical analysis of digital gene expression data in R (v3.30.3). (2020). Retrieved September 16 2020 from https:\/\/bioconductor.org\/packages\/release\/bioc\/html\/edgeR.html."},{"key":"e_1_3_2_18_2","unstructured":"Yunshun Chen David McCarthy Matthew Ritchie Mark Robinson and Gordon Smyth. 2020. edgeR: Differential analysis of sequence read count data. (2020). Retrieved September 16 2020 from https:\/\/bioconductor.org\/packages\/release\/bioc\/vignettes\/edgeR\/inst\/doc\/edgeRUsersGuide.pdf."},{"key":"e_1_3_2_19_2","article-title":"Keras","author":"Chollet Fran\u00e7ois","year":"2015","unstructured":"Fran\u00e7ois Chollet. 2015. Keras. keras.io Retrieved from https:\/\/keras.io.","journal-title":"Retrieved from https:\/\/keras.io"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/3290605.3300447"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPC.2012.6240511"},{"key":"e_1_3_2_22_2","unstructured":"Jerome Friedman Trevor Hastie Rob Tibshirani Balasubramanian Narasimhan Kenneth Tay Noah Simon and Junyang Qian. 2020. Package \u201cglmnet\u201d. (2020). 
Retrieved September 16 2020 from https:\/\/cran.r-project.org\/web\/packages\/glmnet\/index.html."},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1111\/rssa.12378"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1201\/b16018"},{"key":"e_1_3_2_25_2","article-title":"The garden of forking paths: Why multiple comparisons can be a problem, even when there is no \u201cfishing expedition\u201d or \u201cp-hacking\u201d and the research hypothesis was posited ahead of time","author":"Gelman Andrew","year":"2013","unstructured":"Andrew Gelman and Eric Loken. 2013. The garden of forking paths: Why multiple comparisons can be a problem, even when there is no \u201cfishing expedition\u201d or \u201cp-hacking\u201d and the research hypothesis was posited ahead of time. Department of Statistics, Columbia University (2013).","journal-title":"Department of Statistics, Columbia University"},{"key":"e_1_3_2_26_2","article-title":"Bayesian workflow","author":"Gelman Andrew","year":"2020","unstructured":"Andrew Gelman, Aki Vehtari, Daniel Simpson, Charles C. Margossian, Bob Carpenter, Yuling Yao, Lauren Kennedy, Jonah Gabry, Paul-Christian B\u00fcrkner, and Martin Modr\u00e1k. 2020. Bayesian workflow. arXiv:2011.01808. Retrieved from https:\/\/arxiv.org\/abs\/2011.01808.","journal-title":"arXiv:2011.01808"},{"key":"e_1_3_2_27_2","unstructured":"LLC. GraphPad Software. 2020. GraphPad prism 8 user guide. (2020). Retrieved from https:\/\/www.graphpad.com\/guides\/prism\/8\/user-guide\/index.htm."},{"key":"e_1_3_2_28_2","article-title":"Quick list of useful R packages","author":"Grolemund Garrett","year":"2019","unstructured":"Garrett Grolemund. 2019. Quick list of useful R packages. R Studio Support (2019). 
Retrieved April 28, 2020 from https:\/\/support.rstudio.com\/hc\/en-us\/articles\/201057987-Quick-list-of-useful-R-packages.","journal-title":"R Studio Support"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1111\/insr.12028"},{"key":"e_1_3_2_30_2","unstructured":"Jarrod Hadfield. 2020. Package \u201cMCMCglmm\u201d. (2020). Retrieved September 16 2020 from https:\/\/cran.r-project.org\/web\/packages\/MCMCglmm\/MCMCglmm.pdf."},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.18637\/jss.v033.i02"},{"key":"e_1_3_2_32_2","unstructured":"Trevor Hastie and Junyang Qian. 2014. Glmnet vignette. (2014). Retrieved September 16 2020 from https:\/\/web.stanford.edu\/hastie\/glmnet\/glmnet_alpha.html."},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1145\/3332165.3347925"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.5555\/575777"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1002\/wics.162"},{"key":"e_1_3_2_36_2","unstructured":"Eric Jones Travis Oliphant and Pearu Peterson. 2001\u20132020. SciPy: Open source scientific tools for Python. Retrieved August 6 2020 from http:\/\/www.scipy.org\/."},{"key":"e_1_3_2_37_2","unstructured":"Eric Jones Travis Oliphant and Pearu Peterson. 2001\u20132020. Statistical functions (scipy.stats). Retrieved September 14 2020 from https:\/\/docs.scipy.org\/doc\/scipy\/reference\/stats.html."},{"key":"e_1_3_2_38_2","unstructured":"Eric Jones Travis Oliphant and Pearu Peterson. 2001\u20132020. Optimization and root finding (scipy.optimize). 
Retrieved September 14 2020 from https:\/\/docs.scipy.org\/doc\/scipy\/reference\/optimize.html."},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3332165.3347940"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/3290605.3300432"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2012.219"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1207\/s15327957pspr0203_4"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1207\/s15516709cog1201_1"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1037\/0033-2909.125.5.524"},{"key":"e_1_3_2_45_2","first-page":"118","volume-title":"Expertise out of Context","author":"Klein Gary","year":"2007","unstructured":"Gary Klein, Jennifer K. Phillips, Erica L. Rall, and Deborah A. Peluso. 2007. A data\u2013frame theory of sensemaking. In Proceedings of theExpertise out of Context. Psychology Press, 118\u2013160."},{"key":"e_1_3_2_46_2","article-title":"\u201cANOVA\u2019s three types of estimating sums of squares: Don\u2019t make the wrong choice!","author":"Korstanje Joos","year":"2019","unstructured":"Joos Korstanje. 2019. \u201cANOVA\u2019s three types of estimating sums of squares: Don\u2019t make the wrong choice! Towards Data Science, Medium (2019). Retrieved September 14, 2020 from https:\/\/towardsdatascience.com\/anovas-three-types-of-estimating-sums-of-squares-don-t-make-the-wrong-choice-91107c77a27a.","journal-title":"Towards Data Science, Medium"},{"key":"e_1_3_2_47_2","unstructured":"Max Kuhn Davis Vaughan and RStudio. 2020. parsnip: A Common API to Modeling and Analysis Functions. Retrieved from https:\/\/parsnip.tidymodels.org\/."},{"key":"e_1_3_2_48_2","volume-title":"Tidymodels: A Collection of Packages for Modeling and Machine Learning Using Tidyverse Principles.","author":"Kuhn Max","year":"2020","unstructured":"Max Kuhn and Hadley Wickham. 2020. 
Tidymodels: A Collection of Packages for Modeling and Machine Learning Using Tidyverse Principles. Retrieved from https:\/\/www.tidymodels.org."},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1007\/s42113-019-00029-y"},{"issue":"1","key":"e_1_3_2_50_2","first-page":"66","article-title":"Understanding the role of alternatives in data analysis practices","volume":"26","author":"Liu Jiali","year":"2019","unstructured":"Jiali Liu, Nadia Boukhelifa, and James R. Eagan. 2019. Understanding the role of alternatives in data analysis practices. IEEE Transactions on Visualization and Computer Graphics 26, 1 (2019), 66\u201376.","journal-title":"IEEE Transactions on Visualization and Computer Graphics"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1145\/3313831.3376533"},{"key":"e_1_3_2_52_2","unstructured":"StataCorp LLC. 2020. Language syntax. Retrieved September 16 2020 fromhttps:\/\/www.stata.com\/manuals13\/u11.pdf."},{"key":"e_1_3_2_53_2","unstructured":"StataCorp LLC. 2020. Stata 16 Documentation. Retrieved fromhttps:\/\/www.stata.com\/features\/documentation\/."},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1008929526011"},{"key":"e_1_3_2_55_2","unstructured":"Arni Magnusson Hans Skaug Anders Nielsen Casper Berg Kasper Kristensen Martin Maechler Koen van Bentham Ben Bolker Nafis Sadat Daniel L\u00fcdecke Russ Lenth Joseph O\u2019Brien and Mollie Brooks. 2020. Package \u201cglmmTMB\u201d. (2020). Retrieved September 16 2020 from https:\/\/cran.r-project.org\/web\/packages\/glmmTMB\/index.html."},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1201\/9780429029608"},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.5555\/1095704"},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1177\/0956797619879441"},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","DOI":"10.5555\/576915"},{"key":"e_1_3_2_60_2","unstructured":"University of Amsterdam. 2020. 
JASP: A Fresh Way to do Statistics. Retrieved September 16 2020 from https:\/\/jasp-stats.org\/."},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2078195"},{"key":"e_1_3_2_62_2","unstructured":"Josef Perktold Skipper Seabold Jonathan Taylor and statsmodels developers. 2020. Statsmodels v0.10.2 reference guide. (2020). Retrieved April 1 2021 from https:\/\/www.statsmodels.org\/stable."},{"key":"e_1_3_2_63_2","first-page":"171","article-title":"Statistical thinking: One statistician\u2019s perspective","author":"Pfannkuch M.","year":"1997","unstructured":"M. Pfannkuch. 1997. Statistical thinking: One statistician\u2019s perspective. Research Papers on Stochastics Education (1997), 171\u2013178.","journal-title":"Research Papers on Stochastics Education"},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.1214\/ss\/1009212754"},{"key":"e_1_3_2_65_2","unstructured":"Jos\u00e9 Pinheiro Douglas Bates Saikat DebRoy Deepayan Sarkar EISPACK authors Siem Heisterkamp Bert Van Willigen and R-core. 2020. Package \u201cnlme\u201d. (2020). Retrieved September 16 2020 from https:\/\/cran.r-project.org\/web\/packages\/nlme\/nlme.pdf."},{"key":"e_1_3_2_66_2","first-page":"2","volume-title":"Proceedings of International Conference on Intelligence Analysis","volume":"5","author":"Pirolli Peter","year":"2005","unstructured":"Peter Pirolli and Stuart Card. 2005. The sensemaking process and leverage points for analyst technology as identified through cognitive task analysis. In Proceedings of International Conference on Intelligence Analysis, Vol. 5. McLean, VA, 2\u20134."},{"key":"e_1_3_2_67_2","article-title":"Top python libraries used in data science","author":"Prabhu Tanu N.","year":"2019","unstructured":"Tanu N. Prabhu. 2019. Top python libraries used in data science. Towards Data Science, Medium (2019). 
Retrieved December 16, 2019 from https:\/\/towardsdatascience.com\/top-python-libraries-used-in-data-science-a58e90f1b4ba.","journal-title":"Towards Data Science, Medium"},{"key":"e_1_3_2_68_2","unstructured":"Brian Ripley Bill Venables Douglas M. Bates Kurt Hornik Albrecht Gebhardt and David Firth. 2020. Package \u201cMASS\u201d. (2020). Retrieved September 16 2020 from https:\/\/cran.r-project.org\/web\/packages\/MASS\/MASS.pdf."},{"key":"e_1_3_2_69_2","doi-asserted-by":"publisher","DOI":"10.1145\/169059.169209"},{"key":"e_1_3_2_70_2","doi-asserted-by":"publisher","DOI":"10.7717\/peerj-cs.55"},{"key":"e_1_3_2_71_2","unstructured":"SAS. 2020. JMP. Retrieved September 16 2020 fromhttps:\/\/www.jmp.com\/en_us\/home.html."},{"key":"e_1_3_2_72_2","first-page":"106","volume-title":"Proceedings of the 17th Annual Conference of the Cognitive Science Society","author":"Schunn Christian D.","year":"1995","unstructured":"Christian D. Schunn and David Klahr. 1995. A 4-space model of scientific discovery. In Proceedings of the 17th Annual Conference of the Cognitive Science Society. 106\u2013111."},{"key":"e_1_3_2_73_2","first-page":"25","volume-title":"Proceedings of the 18th Annual Conference of the Cognitive Science Society: July 12\u201315, 1996, University of California, San Diego","volume":"18","author":"Schunn Christian D.","year":"1996","unstructured":"Christian D. Schunn and David Klahr. 1996. When and how to go beyond a 2-space model of scientific discovery. In Proceedings of the 18th Annual Conference of the Cognitive Science Society: July 12\u201315, 1996, University of California, San Diego, Vol. 18. Psychology Press, 25."},{"key":"e_1_3_2_74_2","unstructured":"scikit-learn developers. 2020. Scikit-learn v0.23.2 documentation. (2020). Retrieved November 20 2020 from https:\/\/scikit-learn.org\/stable\/."},{"key":"e_1_3_2_75_2","doi-asserted-by":"publisher","DOI":"10.25080\/Majora-92bf1922-011"},{"key":"e_1_3_2_76_2","unstructured":"IBM SPSS. [n.d.]. 
SPSS Software. Retrieved August 18 2020 from https:\/\/www.ibm.com\/analytics\/spss-statistics-software."},{"key":"e_1_3_2_77_2","doi-asserted-by":"crossref","unstructured":"Stata. [n.d.]. Stata Software. Retrieved September 14 2020 from https:\/\/www.stata.com\/.","DOI":"10.4324\/9781003149286-3"},{"key":"e_1_3_2_78_2","unstructured":"Michael Suh. 2014. Higher Education Gender & Work Dataset. Retrieved September 16 2020 from https:\/\/www.pewsocialtrends.org\/category\/datasets\/?download=20041."},{"key":"e_1_3_2_79_2","article-title":"Package \u2018stats\u2019 v4.1.0","author":"Team R Core","year":"2020","unstructured":"R Core Team and contributors worldwide. 2020. Package \u2018stats\u2019 v4.1.0. CRAN (2020). Retrieved September 16, 2020 from https:\/\/stat.ethz.ch\/R-manual\/R-devel\/library\/stats\/html\/00Index.html.","journal-title":"CRAN"},{"key":"e_1_3_2_80_2","unstructured":"Inc. The MathWorks. 2020. Matlab. Retrieved from https:\/\/www.mathworks.com\/."},{"key":"e_1_3_2_81_2","unstructured":"Inc. The MathWorks. 2020. Statistics and machine learning toolbox. (2020). Retrieved from https:\/\/www.mathworks.com\/help\/stats\/index.html."},{"key":"e_1_3_2_82_2","doi-asserted-by":"publisher","DOI":"10.1145\/3242587.3242663"},{"key":"e_1_3_2_83_2","doi-asserted-by":"publisher","DOI":"10.1145\/2702123.2702347"},{"issue":"1","key":"e_1_3_2_84_2","first-page":"1","article-title":"Tidy data","volume":"59","author":"Wickham Hadley","year":"2014","unstructured":"Hadley Wickham. 2014. Tidy data. 
Journal of Statistical Software 59, 1 (2014), 1\u201323.","journal-title":"Journal of Statistical Software"},{"key":"e_1_3_2_85_2","doi-asserted-by":"publisher","DOI":"10.21105\/joss.01686"},{"key":"e_1_3_2_86_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.1751-5823.1999.tb00442.x"},{"key":"e_1_3_2_87_2","article-title":"Goals, process, and challenges of exploratory data analysis: An interview study","author":"Wongsuphasawat Kanit","year":"2019","unstructured":"Kanit Wongsuphasawat, Yang Liu, and Jeffrey Heer. 2019. Goals, process, and challenges of exploratory data analysis: An interview study. arXiv:1911.00568. Retrieved from https:\/\/arxiv.org\/abs\/1911.00568.","journal-title":"arXiv:1911.00568"},{"key":"e_1_3_2_88_2","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1901326117"}],"container-title":["ACM Transactions on Computer-Human Interaction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3476980","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3476980","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:30:46Z","timestamp":1750188646000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3476980"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,7]]},"references-count":87,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,2,28]]}},"alternative-id":["10.1145\/3476980"],"URL":"https:\/\/doi.org\/10.1145\/3476980","relation":{},"ISSN":["1073-0516","1557-7325"],"issn-type":[{"value":"1073-0516","type":"print"},{"value":"1557-7325","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,1,7]]},"assertion":[{"value":"2021-02-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2021-07-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-01-07","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}