{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,14]],"date-time":"2025-11-14T21:36:31Z","timestamp":1763156191698,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":85,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,8,18]],"date-time":"2021-08-18T00:00:00Z","timestamp":1629244800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100005801","name":"Facebook","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100005801","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100004318","name":"Microsoft","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100004318","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"NSF (National Science Foundation)","doi-asserted-by":"publisher","award":["CCF-1846354,CCF-1956374,CCF-2008883,CCF-2028861"],"award-info":[{"award-number":["CCF-1846354,CCF-1956374,CCF-2008883,CCF-2028861"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,8,20]]},"DOI":"10.1145\/3468264.3468615","type":"proceedings-article","created":{"date-parts":[[2021,8,19]],"date-time":"2021-08-19T01:40:37Z","timestamp":1629337237000},"page":"603-614","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":25,"title":["FLEX: fixing flaky tests in machine learning projects by updating assertion bounds"],"prefix":"10.1145","author":[{"given":"Saikat","family":"Dutta","sequence":"first","affiliation":[{"name":"University of Illinois at Urbana-Champaign, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"August","family":"Shi","sequence":"additional","affiliation":[{"name":"University of Texas at Austin, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sasa","family":"Misailovic","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,8,18]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"2020. https:\/\/anaconda.org\/  2020. https:\/\/anaconda.org\/"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"crossref","unstructured":"Andrea Arcuri and Lionel Briand. 2011. A Practical Guide for Using Statistical Tests to Assess Randomized Algorithms in Software Engineering. In ICSE.  Andrea Arcuri and Lionel Briand. 2011. A Practical Guide for Using Statistical Tests to Assess Randomized Algorithms in Software Engineering. In ICSE.","DOI":"10.1145\/1985793.1985795"},{"key":"e_1_3_2_1_3_1","unstructured":"Andrea Arcuri and Lionel Briand. 2014. A hitchhiker\u2019s guide to statistical tests for assessing randomized algorithms in software engineering. STVR.  Andrea Arcuri and Lionel Briand. 2014. A hitchhiker\u2019s guide to statistical tests for assessing randomized algorithms in software engineering. STVR."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1214\/17-AOAS1092"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"crossref","unstructured":"August A Balkema and Laurens De Haan. 1978. Limit distributions for order statistics. I. Theory of Probability & Its Applications.  August A Balkema and Laurens De Haan. 1978. Limit distributions for order statistics. I. Theory of Probability & Its Applications.","DOI":"10.1137\/1123006"},{"key":"e_1_3_2_1_6_1","unstructured":"2021. https:\/\/bazel.build\/  2021. https:\/\/bazel.build\/"},{"key":"e_1_3_2_1_7_1","article-title":"Pyro: Deep universal probabilistic programming","author":"Bingham Eli","year":"2019","unstructured":"Eli Bingham , Jonathan P Chen , Martin Jankowiak , Fritz Obermeyer , Neeraj Pradhan , Theofanis Karaletsos , Rohit Singh , Paul Szerlip , Paul Horsfall , and Noah D Goodman . 2019 . Pyro: Deep universal probabilistic programming . The Journal of Machine Learning Research. Eli Bingham, Jonathan P Chen, Martin Jankowiak, Fritz Obermeyer, Neeraj Pradhan, Theofanis Karaletsos, Rohit Singh, Paul Szerlip, Paul Horsfall, and Noah D Goodman. 2019. Pyro: Deep universal probabilistic programming. The Journal of Machine Learning Research.","journal-title":"The Journal of Machine Learning Research."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.2517-6161.1964.tb00553.x"},{"key":"e_1_3_2_1_9_1","volume-title":"Stan: A probabilistic programming language. JSTATSOFT.","author":"Carpenter Bob","year":"2016","unstructured":"Bob Carpenter , Andrew Gelman , Matt Hoffman , Daniel Lee , Ben Goodrich , Michael Betancourt , Michael A Brubaker , Jiqiang Guo , Peter Li , and Allen Riddell . 2016 . Stan: A probabilistic programming language. JSTATSOFT. Bob Carpenter, Andrew Gelman, Matt Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Michael A Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. 2016. Stan: A probabilistic programming language. JSTATSOFT."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"crossref","unstructured":"Vartan Choulakian and Michael A Stephens. 2001. Goodness-of-fit tests for the generalized Pareto distribution. Technometrics.  Vartan Choulakian and Michael A Stephens. 2001. Goodness-of-fit tests for the generalized Pareto distribution. Technometrics.","DOI":"10.1198\/00401700152672573"},{"key":"e_1_3_2_1_11_1","unstructured":"2020. https:\/\/github.com\/microsoft\/coax\/pull\/13  2020. https:\/\/github.com\/microsoft\/coax\/pull\/13"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"crossref","unstructured":"Brett Daniel Tihomir Gvero and Darko Marinov. 2010. On test repair using symbolic execution. In ISSTA.  Brett Daniel Tihomir Gvero and Darko Marinov. 2010. On test repair using symbolic execution. In ISSTA.","DOI":"10.1145\/1831708.1831734"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"crossref","unstructured":"Brett Daniel Vilas Jagannath Danny Dig and Darko Marinov. 2009. ReAssert: Suggesting repairs for broken unit tests. In ASE.  Brett Daniel Vilas Jagannath Danny Dig and Darko Marinov. 2009. ReAssert: Suggesting repairs for broken unit tests. In ASE.","DOI":"10.1109\/ASE.2009.17"},{"volume-title":"Extreme value theory: an introduction","author":"Haan Laurens De","key":"e_1_3_2_1_14_1","unstructured":"Laurens De Haan and Ana Ferreira . 2007. Extreme value theory: an introduction . Springer Science & Business Media . Laurens De Haan and Ana Ferreira. 2007. Extreme value theory: an introduction. Springer Science & Business Media."},{"key":"e_1_3_2_1_15_1","unstructured":"2020. https:\/\/github.com\/deepchem\/deepchem\/pull\/2408  2020. https:\/\/github.com\/deepchem\/deepchem\/pull\/2408"},{"key":"e_1_3_2_1_16_1","unstructured":"Joshua V Dillon Ian Langmore Dustin Tran Eugene Brevdo Srinivas Vasudevan Dave Moore Brian Patton Alex Alemi Matt Hoffman and Rif A Saurous. 2017. Tensorflow distributions. arXiv preprint arXiv:1711.10604.  Joshua V Dillon Ian Langmore Dustin Tran Eugene Brevdo Srinivas Vasudevan Dave Moore Brian Patton Alex Alemi Matt Hoffman and Rif A Saurous. 2017. Tensorflow distributions. arXiv preprint arXiv:1711.10604."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"crossref","unstructured":"Saikat Dutta Owolabi Legunsen Zixin Huang and Sasa Misailovic. 2018. Testing probabilistic programming systems. In ESEC\/FSE.  Saikat Dutta Owolabi Legunsen Zixin Huang and Sasa Misailovic. 2018. Testing probabilistic programming systems. In ESEC\/FSE.","DOI":"10.1145\/3236024.3236057"},{"key":"e_1_3_2_1_18_1","volume-title":"TERA: Optimizing Stochastic Regression Tests in Machine Learning Projects. In ISSTA.","author":"Dutta Saikat","year":"2021","unstructured":"Saikat Dutta , Jeeva Selvam , Aryaman Jain , and Sasa Misailovic . 2021 . TERA: Optimizing Stochastic Regression Tests in Machine Learning Projects. In ISSTA. Saikat Dutta, Jeeva Selvam, Aryaman Jain, and Sasa Misailovic. 2021. TERA: Optimizing Stochastic Regression Tests in Machine Learning Projects. In ISSTA."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"crossref","unstructured":"Saikat Dutta August Shi Rutvik Choudhary Zhekun Zhang Aryaman Jain and Sasa Misailovic. 2020. Detecting flaky tests in probabilistic and machine learning applications. In ISSTA.  Saikat Dutta August Shi Rutvik Choudhary Zhekun Zhang Aryaman Jain and Sasa Misailovic. 2020. Detecting flaky tests in probabilistic and machine learning applications. In ISSTA.","DOI":"10.1145\/3395363.3397366"},{"key":"e_1_3_2_1_20_1","volume-title":"Storm: Program Reduction for Testing and Debugging Probabilistic Programming Systems. In FSE.","author":"Dutta Saikat","year":"2019","unstructured":"Saikat Dutta , Wenxian Zhang , Zixin Huang , and Sasa Misailovic . 2019 . Storm: Program Reduction for Testing and Debugging Probabilistic Programming Systems. In FSE. Saikat Dutta, Wenxian Zhang, Zixin Huang, and Sasa Misailovic. 2019. Storm: Program Reduction for Testing and Debugging Probabilistic Programming Systems. In FSE."},{"key":"e_1_3_2_1_21_1","volume-title":"Neville Dubash, and Sanjay Podder.","author":"Dwarakanath Anurag","year":"2018","unstructured":"Anurag Dwarakanath , Manish Ahuja , Samarth Sikand , Raghotham M Rao , RP Jagadeesh Chandra Bose , Neville Dubash, and Sanjay Podder. 2018 . Identifying implementation bugs in machine learning based image classifiers using metamorphic testing. In ISSTA. Anurag Dwarakanath, Manish Ahuja, Samarth Sikand, Raghotham M Rao, RP Jagadeesh Chandra Bose, Neville Dubash, and Sanjay Podder. 2018. Identifying implementation bugs in machine learning based image classifiers using metamorphic testing. In ISSTA."},{"volume-title":"An introduction to the bootstrap","author":"Efron Bradley","key":"e_1_3_2_1_22_1","unstructured":"Bradley Efron and Robert J Tibshirani . 1994. An introduction to the bootstrap . CRC press . Bradley Efron and Robert J Tibshirani. 1994. An introduction to the bootstrap. CRC press."},{"key":"e_1_3_2_1_23_1","unstructured":"2021. https:\/\/rdrr.io\/cran\/eva\/man\/eva.html  2021. https:\/\/rdrr.io\/cran\/eva\/man\/eva.html"},{"key":"e_1_3_2_1_24_1","unstructured":"2020. https:\/\/github.com\/fastnlp\/fastNLP\/pull\/352  2020. https:\/\/github.com\/fastnlp\/fastNLP\/pull\/352"},{"key":"e_1_3_2_1_25_1","unstructured":"2019. https:\/\/github.com\/box\/flaky  2019. https:\/\/github.com\/box\/flaky"},{"key":"e_1_3_2_1_26_1","unstructured":"Maurice Fr\u00e9chet. 1927. Sur la loi de probabilit\u00e9 de l\u2019\u00e9cart maximum. Ann. Soc. Math. Polon..  Maurice Fr\u00e9chet. 1927. Sur la loi de probabilit\u00e9 de l\u2019\u00e9cart maximum. Ann. Soc. Math. Polon.."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"crossref","unstructured":"Alessio Gambi Jonathan Bell and Andreas Zeller. 2018. Practical Test Dependency Detection. In ICST.  Alessio Gambi Jonathan Bell and Andreas Zeller. 2018. Practical Test Dependency Detection. In ICST.","DOI":"10.1109\/ICST.2018.00011"},{"key":"e_1_3_2_1_28_1","unstructured":"2020. https:\/\/github.com\/rlworkgroup\/garage\/pull\/2242  2020. https:\/\/github.com\/rlworkgroup\/garage\/pull\/2242"},{"key":"e_1_3_2_1_29_1","unstructured":"2021. https:\/\/github.com\/RaRe-Technologies\/gensim\/pull\/3050  2021. https:\/\/github.com\/RaRe-Technologies\/gensim\/pull\/3050"},{"key":"e_1_3_2_1_30_1","unstructured":"2020. https:\/\/github.com\/RaRe-Technologies\/gensim\/pull\/3059  2020. https:\/\/github.com\/RaRe-Technologies\/gensim\/pull\/3059"},{"volume-title":"Deep learning","author":"Goodfellow Ian","key":"e_1_3_2_1_31_1","unstructured":"Ian Goodfellow , Yoshua Bengio , Aaron Courville , and Yoshua Bengio . 2016. Deep learning . MIT Press Cambridge . Ian Goodfellow, Yoshua Bengio, Aaron Courville, and Yoshua Bengio. 2016. Deep learning. MIT Press Cambridge."},{"key":"e_1_3_2_1_32_1","unstructured":"Noah D Goodman Vikash K Mansinghka Daniel Roy Keith Bonawitz and Joshua B Tenenbaum. 2008. Church: a language for generative models. In UAI.  Noah D Goodman Vikash K Mansinghka Daniel Roy Keith Bonawitz and Joshua B Tenenbaum. 2008. Church: a language for generative models. In UAI."},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"crossref","unstructured":"Andrew D Gordon Thomas A Henzinger Aditya V Nori and Sriram K Rajamani. 2014. Probabilistic programming. In FoSE.  Andrew D Gordon Thomas A Henzinger Aditya V Nori and Sriram K Rajamani. 2014. Probabilistic programming. In FoSE.","DOI":"10.1145\/2593882.2593900"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"crossref","unstructured":"Christian Gourieroux Alberto Holly and Alain Monfort. 1982. Likelihood ratio test Wald test and Kuhn-Tucker test in linear models with inequality constraints on the regression parameters. Econometrica: journal of the Econometric Society.  Christian Gourieroux Alberto Holly and Alain Monfort. 1982. Likelihood ratio test Wald test and Kuhn-Tucker test in linear models with inequality constraints on the regression parameters. Econometrica: journal of the Econometric Society.","DOI":"10.2307\/1912529"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3324884.3416571"},{"volume-title":"Statistical methods in water resources","author":"Helsel Dennis R","key":"e_1_3_2_1_36_1","unstructured":"Dennis R Helsel and Robert M Hirsch . 1992. Statistical methods in water resources . Elsevier . Dennis R Helsel and Robert M Hirsch. 1992. Statistical methods in water resources. Elsevier."},{"key":"e_1_3_2_1_37_1","unstructured":"Qiang Hu Lei Ma Xiaofei Xie Bing Yu Yang Liu and Jianjun Zhao. 2019. DeepMutation++: A mutation testing framework for deep learning systems. In ASE.  Qiang Hu Lei Ma Xiaofei Xie Bing Yu Yang Liu and Jianjun Zhao. 2019. DeepMutation++: A mutation testing framework for deep learning systems. In ASE."},{"key":"e_1_3_2_1_38_1","volume-title":"Psense: Automatic sensitivity analysis for probabilistic programs. In ATVA.","author":"Huang Zixin","year":"2018","unstructured":"Zixin Huang , Zhenbang Wang , and Sasa Misailovic . 2018 . Psense: Automatic sensitivity analysis for probabilistic programs. In ATVA. Zixin Huang, Zhenbang Wang, and Sasa Misailovic. 2018. Psense: Automatic sensitivity analysis for probabilistic programs. In ATVA."},{"key":"e_1_3_2_1_39_1","unstructured":"2021. https:\/\/github.com\/microsoft\/hummingbird\/pull\/449  2021. https:\/\/github.com\/microsoft\/hummingbird\/pull\/449"},{"key":"e_1_3_2_1_40_1","unstructured":"2021. https:\/\/github.com\/microsoft\/hummingbird\/pull\/450  2021. https:\/\/github.com\/microsoft\/hummingbird\/pull\/450"},{"key":"e_1_3_2_1_41_1","unstructured":"2021. https:\/\/github.com\/microsoft\/hummingbird\/pull\/451  2021. https:\/\/github.com\/microsoft\/hummingbird\/pull\/451"},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"crossref","unstructured":"Keyur Joshi Vimuth Fernando and Sasa Misailovic. 2019. Statistical algorithmic profiling for randomized approximate programs. In ICSE.  Keyur Joshi Vimuth Fernando and Sasa Misailovic. 2019. Statistical algorithmic profiling for randomized approximate programs. In ICSE.","DOI":"10.1109\/ICSE.2019.00071"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.5555\/1622737.1622748"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"crossref","unstructured":"Wing Lam Patrice Godefroid Suman Nath Anirudh Santhiar and Suresh Thummalapenta. 2019. Root Causing Flaky Tests in a Large-Scale Industrial Setting. In ISSTA.  Wing Lam Patrice Godefroid Suman Nath Anirudh Santhiar and Suresh Thummalapenta. 2019. Root Causing Flaky Tests in a Large-Scale Industrial Setting. In ISSTA.","DOI":"10.1145\/3293882.3330570"},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"crossref","unstructured":"Wing Lam K\u0131van\u00e7 Mu\u015flu Hitesh Sajnani and Suresh Thummalapenta. 2020. A Study on the Lifecycle of Flaky Tests. In ICSE.  Wing Lam K\u0131van\u00e7 Mu\u015flu Hitesh Sajnani and Suresh Thummalapenta. 2020. A Study on the Lifecycle of Flaky Tests. In ICSE.","DOI":"10.1145\/3377811.3381749"},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"crossref","unstructured":"Wing Lam Reed Oei August Shi Darko Marinov and Tao Xie. 2019. iDFlakies: A Framework for Detecting and Partially Classifying Flaky Tests. In ICST.  Wing Lam Reed Oei August Shi Darko Marinov and Tao Xie. 2019. iDFlakies: A Framework for Detecting and Partially Classifying Flaky Tests. In ICST.","DOI":"10.1109\/ICST.2019.00038"},{"key":"e_1_3_2_1_47_1","unstructured":"Xiangyu Li Marcelo d\u2019Amorim and Alessandro Orso. 2019. Intent-Preserving Test Repair. In ICST.  Xiangyu Li Marcelo d\u2019Amorim and Alessandro Orso. 2019. Intent-Preserving Test Repair. In ICST."},{"key":"e_1_3_2_1_48_1","unstructured":"Yamilet R Serrano Llerena Marcel B\u00f6hme Marc Br\u00fcnink Guoxin Su and David S Rosenblum. 2018. Verifying the long-run behavior of probabilistic system models in the presence of uncertainty. In ESEC\/FSE.  Yamilet R Serrano Llerena Marcel B\u00f6hme Marc Br\u00fcnink Guoxin Su and David S Rosenblum. 2018. Verifying the long-run behavior of probabilistic system models in the presence of uncertainty. In ESEC\/FSE."},{"key":"e_1_3_2_1_49_1","unstructured":"Qingzhou Luo Farah Hariri Lamyaa Eloussi and Darko Marinov. 2014. An empirical analysis of flaky tests. In FSE.  Qingzhou Luo Farah Hariri Lamyaa Eloussi and Darko Marinov. 2014. An empirical analysis of flaky tests. In FSE."},{"key":"e_1_3_2_1_50_1","unstructured":"2020. https:\/\/github.com\/plasticityai\/magnitude\/pull\/84  2020. https:\/\/github.com\/plasticityai\/magnitude\/pull\/84"},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"crossref","unstructured":"Claudio Mandrioli and Martina Maggio. 2020. Testing self-adaptive software with probabilistic guarantees on performance metrics. In ESEC\/FSE.  Claudio Mandrioli and Martina Maggio. 2020. Testing self-adaptive software with probabilistic guarantees on performance metrics. In ESEC\/FSE.","DOI":"10.1145\/3368089.3409685"},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"crossref","unstructured":"Mehdi Mirzaaghaei Fabrizio Pastore and Mauro Pezz\u00e8. 2012. Supporting test suite evolution through test case adaptation. In ICST.  Mehdi Mirzaaghaei Fabrizio Pastore and Mauro Pezz\u00e8. 2012. Supporting test suite evolution through test case adaptation. In ICST.","DOI":"10.1109\/ICST.2012.103"},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0022-2496(02)00028-7"},{"key":"e_1_3_2_1_54_1","doi-asserted-by":"crossref","unstructured":"Mahdi Nejadgholi and Jinqiu Yang. 2019. A Study of Oracle Approximations in Testing Deep Learning Libraries. In ASE.  Mahdi Nejadgholi and Jinqiu Yang. 2019. A Study of Oracle Approximations in Testing Deep Learning Libraries. In ASE.","DOI":"10.1109\/ASE.2019.00078"},{"key":"e_1_3_2_1_55_1","unstructured":"2021. https:\/\/github.com\/IntelLabs\/nlp-architect\/pull\/207  2021. https:\/\/github.com\/IntelLabs\/nlp-architect\/pull\/207"},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"crossref","unstructured":"Bernard Nongpoh Rajarshi Ray Saikat Dutta and Ansuman Banerjee. 2017. AutoSense: A framework for automated sensitivity analysis of program data. TSE.  Bernard Nongpoh Rajarshi Ray Saikat Dutta and Ansuman Banerjee. 2017. AutoSense: A framework for automated sensitivity analysis of program data. TSE.","DOI":"10.1109\/TSE.2017.2654251"},{"key":"e_1_3_2_1_57_1","unstructured":"2020. NumPyro. https:\/\/github.com\/pyro-ppl\/numpyro  2020. NumPyro. https:\/\/github.com\/pyro-ppl\/numpyro"},{"key":"e_1_3_2_1_58_1","unstructured":"Felix Boakye Oppong and Senyo Yao Agbedra. 2016. Assessing univariate and multivariate normality. a guide for non-statisticians. Math. Theory Modeling.  Felix Boakye Oppong and Senyo Yao Agbedra. 2016. Assessing univariate and multivariate normality. a guide for non-statisticians. Math. Theory Modeling."},{"key":"e_1_3_2_1_59_1","volume-title":"Atilla Halil Elhan, and Ers\u00f6z T\u00fcccar","author":"\u00d6ztuna Derya","year":"2006","unstructured":"Derya \u00d6ztuna , Atilla Halil Elhan, and Ers\u00f6z T\u00fcccar . 2006 . Investigation of four different normality tests in terms of type 1 error rate and power under different distributions. Turkish Journal of Medical Sciences . Derya \u00d6ztuna, Atilla Halil Elhan, and Ers\u00f6z T\u00fcccar. 2006. Investigation of four different normality tests in terms of type 1 error rate and power under different distributions. Turkish Journal of Medical Sciences."},{"key":"e_1_3_2_1_60_1","unstructured":"2021. https:\/\/github.com\/facebookresearch\/ParlAI\/pull\/3467  2021. https:\/\/github.com\/facebookresearch\/ParlAI\/pull\/3467"},{"key":"e_1_3_2_1_61_1","unstructured":"Adam Paszke Sam Gross Francisco Massa Adam Lerer James Bradbury Gregory Chanan Trevor Killeen Zeming Lin Natalia Gimelshein and Luca Antiga. 2019. PyTorch: An imperative style high-performance deep learning library. In NeurIPS.  Adam Paszke Sam Gross Francisco Massa Adam Lerer James Bradbury Gregory Chanan Trevor Killeen Zeming Lin Natalia Gimelshein and Luca Antiga. 2019. PyTorch: An imperative style high-performance deep learning library. In NeurIPS."},{"key":"e_1_3_2_1_62_1","unstructured":"2020. https:\/\/github.com\/pgmpy\/pgmpy\/pull\/1380  2020. https:\/\/github.com\/pgmpy\/pgmpy\/pull\/1380"},{"key":"e_1_3_2_1_63_1","unstructured":"Hung Viet Pham Thibaud Lutellier Weizhen Qi and Lin Tan. 2019. CRADLE: cross-backend validation to detect and localize bugs in deep learning libraries. In ICSE.  Hung Viet Pham Thibaud Lutellier Weizhen Qi and Lin Tan. 2019. CRADLE: cross-backend validation to detect and localize bugs in deep learning libraries. In ICSE."},{"key":"e_1_3_2_1_64_1","article-title":"Statistical inference using extreme order statistics","author":"James Pickands","year":"1975","unstructured":"James Pickands III. 1975 . Statistical inference using extreme order statistics . Annals of statistics. James Pickands III. 1975. Statistical inference using extreme order statistics. Annals of statistics.","journal-title":"Annals of statistics."},{"key":"e_1_3_2_1_65_1","unstructured":"2020. https:\/\/github.com\/ICB-DCM\/pyPESTO  2020. https:\/\/github.com\/ICB-DCM\/pyPESTO"},{"key":"e_1_3_2_1_66_1","unstructured":"2021. https:\/\/github.com\/ICB-DCM\/pyPESTO\/pull\/570  2021. https:\/\/github.com\/ICB-DCM\/pyPESTO\/pull\/570"},{"key":"e_1_3_2_1_67_1","unstructured":"2020. Pyro. http:\/\/pyro.ai  2020. Pyro. http:\/\/pyro.ai"},{"key":"e_1_3_2_1_68_1","unstructured":"2020. https:\/\/docs.pytest.org\/en\/stable  2020. https:\/\/docs.pytest.org\/en\/stable"},{"key":"e_1_3_2_1_69_1","unstructured":"2021. https:\/\/github.com\/tristandeleu\/pytorch-meta\/pull\/117  2021. https:\/\/github.com\/tristandeleu\/pytorch-meta\/pull\/117"},{"key":"e_1_3_2_1_70_1","unstructured":"2020. https:\/\/github.com\/refnx\/refnx\/pull\/540  2020. https:\/\/github.com\/refnx\/refnx\/pull\/540"},{"key":"e_1_3_2_1_71_1","doi-asserted-by":"crossref","unstructured":"John Salvatier Thomas V Wiecki and Christopher Fonnesbeck. 2016. Probabilistic programming in Python using PyMC3. PeerJ Computer Science.  John Salvatier Thomas V Wiecki and Christopher Fonnesbeck. 2016. Probabilistic programming in Python using PyMC3. PeerJ Computer Science.","DOI":"10.7287\/peerj.preprints.1686v1"},{"key":"e_1_3_2_1_72_1","unstructured":"Koushik Sen Mahesh Viswanathan and Gul Agha. 2005. On statistical model checking of stochastic systems. In CAV.  Koushik Sen Mahesh Viswanathan and Gul Agha. 2005. On statistical model checking of stochastic systems. In CAV."},{"key":"e_1_3_2_1_73_1","unstructured":"August Shi Alex Gyori Owolabi Legunsen and Darko Marinov. 2016. Detecting Assumptions on Deterministic Implementations of Non-deterministic Specifications. In ICST.  August Shi Alex Gyori Owolabi Legunsen and Darko Marinov. 2016. Detecting Assumptions on Deterministic Implementations of Non-deterministic Specifications. In ICST."},{"key":"e_1_3_2_1_74_1","unstructured":"August Shi Wing Lam Reed Oei Tao Xie and Darko Marinov. 2019. iFixFlakies: A framework for automatically fixing order-dependent flaky tests. In FSE.  August Shi Wing Lam Reed Oei Tao Xie and Darko Marinov. 2019. iFixFlakies: A framework for automatically fixing order-dependent flaky tests. In FSE."},{"key":"e_1_3_2_1_75_1","unstructured":"2020. https:\/\/github.com\/stellargraph\/stellargraph\/pull\/1880  2020. https:\/\/github.com\/stellargraph\/stellargraph\/pull\/1880"},{"key":"e_1_3_2_1_76_1","unstructured":"2020. TensorFlow. https:\/\/www.tensorflow.org  2020. TensorFlow. https:\/\/www.tensorflow.org"},{"key":"e_1_3_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1239\/jap\/1082552200"},{"key":"e_1_3_2_1_78_1","unstructured":"2020. https:\/\/github.com\/lmcinnes\/umap\/pull\/600  2020. https:\/\/github.com\/lmcinnes\/umap\/pull\/600"},{"key":"e_1_3_2_1_79_1","doi-asserted-by":"crossref","unstructured":"Peixin Wang Hongfei Fu Krishnendu Chatterjee Yuxin Deng and Ming Xu. 2019. Proving Expected Sensitivity of Probabilistic Programs with Randomized Variable-Dependent Termination Time. POPL.  Peixin Wang Hongfei Fu Krishnendu Chatterjee Yuxin Deng and Ming Xu. 2019. Proving Expected Sensitivity of Probabilistic Programs with Randomized Variable-Dependent Termination Time. POPL.","DOI":"10.1145\/3371093"},{"key":"e_1_3_2_1_80_1","unstructured":"Tsui-Wei Weng Huan Zhang Pin-Yu Chen Jinfeng Yi Dong Su Yupeng Gao Cho-Jui Hsieh and Luca Daniel. 2018. Evaluating the Robustness of Neural Networks: An Extreme Value Theory Approach. In ICLR.  Tsui-Wei Weng Huan Zhang Pin-Yu Chen Jinfeng Yi Dong Su Yupeng Gao Cho-Jui Hsieh and Luca Daniel. 2018. Evaluating the Robustness of Neural Networks: An Extreme Value Theory Approach. In ICLR."},{"key":"e_1_3_2_1_81_1","doi-asserted-by":"crossref","unstructured":"Guowei Yang Sarfraz Khurshid and Miryung Kim. 2012. Specification-based test repair using a lightweight formal method. In FM.  Guowei Yang Sarfraz Khurshid and Miryung Kim. 2012. Specification-based test repair using a lightweight formal method. In FM.","DOI":"10.1007\/978-3-642-32759-9_37"},{"key":"e_1_3_2_1_82_1","unstructured":"2021. https:\/\/github.com\/zfit\/zfit\/pull\/288  2021. https:\/\/github.com\/zfit\/zfit\/pull\/288"},{"key":"e_1_3_2_1_83_1","unstructured":"2021. https:\/\/github.com\/zfit\/zfit\/pull\/290  2021. https:\/\/github.com\/zfit\/zfit\/pull\/290"},{"key":"e_1_3_2_1_84_1","doi-asserted-by":"crossref","unstructured":"Peilun Zhang Yangjie Jiang Anjiang Wei Victoria Stodden Darko Marinov and August Shi. 2021. Domain-Specific Fixes for Flaky Tests with Wrong Assumptions on Underdetermined Specifications. In ICSE.  Peilun Zhang Yangjie Jiang Anjiang Wei Victoria Stodden Darko Marinov and August Shi. 2021. Domain-Specific Fixes for Flaky Tests with Wrong Assumptions on Underdetermined Specifications. In ICSE.","DOI":"10.1109\/ICSE43902.2021.00018"},{"key":"e_1_3_2_1_85_1","doi-asserted-by":"crossref","unstructured":"Yuhao Zhang Luyao Ren Liqian Chen Yingfei Xiong Shing-Chi Cheung and Tao Xie. 2020. Detecting numerical bugs in neural network architectures. In ESEC\/FSE.  Yuhao Zhang Luyao Ren Liqian Chen Yingfei Xiong Shing-Chi Cheung and Tao Xie. 2020. Detecting numerical bugs in neural network architectures. In ESEC\/FSE.","DOI":"10.1145\/3368089.3409720"}],"event":{"name":"ESEC\/FSE '21: 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering","sponsor":["SIGSOFT ACM Special Interest Group on Software Engineering"],"location":"Athens Greece","acronym":"ESEC\/FSE '21"},"container-title":["Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3468264.3468615","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3468264.3468615","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3468264.3468615","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:17:22Z","timestamp":1750191442000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3468264.3468615"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,18]]},"references-count":85,"alternative-id":["10.1145\/3468264.3468615","10.1145\/3468264"],"URL":"https:\/\/doi.org\/10.1145\/3468264.3468615","relation":{},"subject":[],"published":{"date-parts":[[2021,8,18]]},"assertion":[{"value":"2021-08-18","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}