{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T17:18:54Z","timestamp":1775063934743,"version":"3.50.1"},"reference-count":76,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2023,4,28]],"date-time":"2023-04-28T00:00:00Z","timestamp":1682640000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,4,28]],"date-time":"2023-04-28T00:00:00Z","timestamp":1682640000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","award":["EP\/R513313\/1"],"award-info":[{"award-number":["EP\/R513313\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Empir Software Eng"],"published-print":{"date-parts":[[2023,5]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>A <jats:italic>flaky test<\/jats:italic> is a test case whose outcome changes without modification to the code of the test case or the program under test. These tests disrupt continuous integration, cause a loss of developer productivity, and limit the efficiency of testing. Many flaky test detection techniques are <jats:italic>rerunning-based<\/jats:italic>, meaning they require repeated test case executions at a considerable time cost, or are <jats:italic>machine learning-based<\/jats:italic>, and thus they are fast but offer only an approximate solution with variable detection performance. These two extremes leave developers with a stark choice. 
This paper introduces <jats:sc>CANNIER<\/jats:sc>, an approach for reducing the time cost of rerunning-based detection techniques by combining them with machine learning models. The empirical evaluation involving 89,668 test cases from 30 Python projects demonstrates that <jats:sc>CANNIER<\/jats:sc> can reduce the time cost of existing rerunning-based techniques by an order of magnitude while maintaining a detection performance that is significantly better than machine learning models alone. Furthermore, the comprehensive study extends existing work on machine learning-based detection and reveals a number of additional findings, including (1) the performance of machine learning models for detecting polluter test cases; (2) using the mean values of dynamic test case features from repeated measurements can slightly improve the detection performance of machine learning models; and (3) correlations between various test case features and the probability of the test case being flaky.<\/jats:p>","DOI":"10.1007\/s10664-023-10307-w","type":"journal-article","created":{"date-parts":[[2023,4,28]],"date-time":"2023-04-28T08:02:18Z","timestamp":1682668938000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["Empirically evaluating flaky test detection techniques combining test case rerunning and machine learning models"],"prefix":"10.1007","volume":"28","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0917-1274","authenticated-orcid":false,"given":"Owain","family":"Parry","sequence":"first","affiliation":[]},{"given":"Gregory M.","family":"Kapfhammer","sequence":"additional","affiliation":[]},{"given":"Michael","family":"Hilton","sequence":"additional","affiliation":[]},{"given":"Phil","family":"McMinn","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,4,28]]},"reference":[{"key":"10307_CR1","unstructured":"(2022). 
Python Package Index, https:\/\/pypi.org\/"},{"key":"10307_CR2","doi-asserted-by":"crossref","unstructured":"Al-Qutaish R, Abran A (2010) Halstead metrics: analysis of their design. Wiley, pp 145\u2013159","DOI":"10.1002\/9780470606834.ch7"},{"key":"10307_CR3","doi-asserted-by":"crossref","unstructured":"Alshammari A, Morris C, Hilton M, Bell J (2021) FlakeFlagger: predicting flakiness without rerunning tests. In: Proceedings of the international conference on software engineering (ICSE)","DOI":"10.1109\/ICSE43902.2021.00140"},{"key":"10307_CR4","doi-asserted-by":"crossref","unstructured":"Bell J, Kaiser G, Melski E, Dattatreya M (2015) Efficient dependency detection for safe Java test acceleration. In: Proceedings of the joint meeting of the European software engineering conference and the symposium on the foundations of software engineering (ESEC\/FSE), pp 770\u2013781","DOI":"10.1145\/2786805.2786823"},{"key":"10307_CR5","doi-asserted-by":"crossref","unstructured":"Bell J, Legunsen O, Hilton M, Eloussi L, Yung T, Marinov D (2018) DeFlaker: automatically detecting flaky tests. In: Proceedings of the international conference on software engineering (ICSE), pp 433\u2013444","DOI":"10.1145\/3180155.3180164"},{"key":"10307_CR6","doi-asserted-by":"publisher","first-page":"76119","DOI":"10.1109\/ACCESS.2021.3082424","volume":"9","author":"A Bertolino","year":"2021","unstructured":"Bertolino A, Cruciani E, Miranda B, Verdecchia R (2021) Know your neighbor: fast static prediction of test flakiness. IEEE Access 9:76119\u201376134","journal-title":"IEEE Access"},{"key":"10307_CR7","doi-asserted-by":"crossref","unstructured":"Biagiola M, Stocco A, Mesbah A, Ricca F, Tonella P (2019) Web test dependency detection. 
In: Proceedings of the joint meeting on European software engineering conference and symposium on the foundations of software engineering (ESEC\/FSE), pp 154\u2013164","DOI":"10.1145\/3338906.3338948"},{"issue":"1","key":"10307_CR8","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman L (2001) Random forests. Mach Learn 45(1):5\u201332","journal-title":"Mach Learn"},{"key":"10307_CR9","unstructured":"CANNIER experiment (2022) https:\/\/github.com\/flake-it\/cannier-experiment"},{"key":"10307_CR10","unstructured":"CANNIER framework (2022) https:\/\/github.com\/flake-it\/cannier-framework"},{"key":"10307_CR11","doi-asserted-by":"crossref","unstructured":"Camara B, Silva M, Endo A, S. V (2021) On the use of test smells for prediction of flaky tests. In: Proceedings of the Brazilian symposium on systematic and automated software testing (SAST), pp 46\u201354","DOI":"10.1145\/3482909.3482916"},{"key":"10307_CR12","doi-asserted-by":"crossref","unstructured":"Camara B, Silva M, Endo A, S. V (2021) What is the vocabulary of flaky tests? An extended replication. In: Proceedings of the international conference on program comprehension (ICPC), pp 444\u2013454","DOI":"10.1109\/ICPC52881.2021.00052"},{"key":"10307_CR13","doi-asserted-by":"crossref","unstructured":"Candido J, Melo L, D\u2019Amorim M (2017) Test suite parallelization in open-source projects: a study on its usage and impact. In: Proceedings of the international conference on automated software engineering (ASE), pp 153\u2013158","DOI":"10.1109\/ASE.2017.8115695"},{"key":"10307_CR14","doi-asserted-by":"publisher","first-page":"321","DOI":"10.1613\/jair.953","volume":"16","author":"NV Chawla","year":"2002","unstructured":"Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) SMOTE: synthetic minority over-sampling technique. 
J Artif Intell Res 16:321\u2013357","journal-title":"J Artif Intell Res"},{"issue":"6","key":"10307_CR15","first-page":"1471","volume":"21","author":"D Chicco","year":"2020","unstructured":"Chicco D, Jurman G (2020) The advantages of the matthews correlation coefficient (mcc) over f1 score and accuracy in binary classification evaluation. BMC Genomics 21(6):1471\u20132164","journal-title":"BMC Genomics"},{"key":"10307_CR16","unstructured":"Coverage.py (2022) \u2014 Coverage.py 6.4.1 documentation. https:\/\/coverage.readthedocs.io\/en\/stable\/"},{"key":"10307_CR17","unstructured":"Dillon E, LaRiviere J, Lundberg S, Roth J, Syrgkanis V (2021) Be careful when interpreting predictive models in search of causal insights, https:\/\/towardsdatascience.com\/be-careful-when-interpreting-predictive-models-in-search-of-causalinsights-e68626e664b6"},{"key":"10307_CR18","unstructured":"Docker documentation (2022) https:\/\/docs.docker.com\/"},{"key":"10307_CR19","doi-asserted-by":"crossref","unstructured":"Durieux T, Goues CL, Hilton M, Abreu R (2020) Empirical study of restarted and flaky builds on Travis CI. In: Proceedings of the international conference on mining software repositories (MSR), pp 254\u2013264","DOI":"10.1145\/3379597.3387460"},{"key":"10307_CR20","doi-asserted-by":"crossref","unstructured":"Eck M, Palomba F, Castelluccio M, Bacchelli A (2019) Understanding flaky tests: the developer\u2019s perspective. In: Proceedings of the joint meeting of the European software engineering conference and the symposium on the foundations of software engineering (ESEC\/FSE), pp 830\u2013840","DOI":"10.1145\/3338906.3338945"},{"key":"10307_CR21","doi-asserted-by":"crossref","unstructured":"Gambi A, Bell J, Zeller A (2018) Practical test dependency detection. 
In: Proceedings of the international conference on software testing, verification and validation (ICST), pp 1\u201311","DOI":"10.1109\/ICST.2018.00011"},{"key":"10307_CR22","doi-asserted-by":"publisher","first-page":"52","DOI":"10.1016\/j.jss.2017.12.013","volume":"138","author":"V Garousi","year":"2018","unstructured":"Garousi V, K\u00fc\u00e7\u00fck B (2018) Smells in software test code: a survey of knowledge in industry and academia. J Syst Softw 138:52\u201381","journal-title":"J Syst Softw"},{"issue":"1","key":"10307_CR23","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1007\/s10994-006-6226-1","volume":"63","author":"P Geurts","year":"2006","unstructured":"Geurts P, Ernst D, Wehenkel L (2006) Extremely randomized trees. Mach Learn 63(1):3\u201342","journal-title":"Mach Learn"},{"issue":"12","key":"10307_CR24","doi-asserted-by":"publisher","first-page":"1284","DOI":"10.1109\/32.106988","volume":"17","author":"GK Gill","year":"1991","unstructured":"Gill GK, Kemerer CF (1991) Cyclomatic complexity density and software maintenance productivity. Trans Softw Eng 17(12):1284","journal-title":"Trans Softw Eng"},{"key":"10307_CR25","unstructured":"Glossary (2022) \u2014 Python 3.10.4 documentation. https:\/\/docs.python.org\/3\/glossary.html#term-global-interpreter-lock"},{"key":"10307_CR26","doi-asserted-by":"crossref","unstructured":"Gruber M, Lukasczyk S, Kroi\u00df F, Fraser G (2021) An empirical study of flaky tests in Python. In: Proceedings of the international conference on software testing, verification and validation (ICST)","DOI":"10.1109\/ICST49551.2021.00026"},{"key":"10307_CR27","doi-asserted-by":"crossref","unstructured":"Haben G, Habchi S, Papadakis M, Cordy M, Le Traon Y (2021) A replication study on the usability of code vocabulary in predicting flaky tests. 
In: Proceedings of the international conference on mining software repositories (MSR)","DOI":"10.1109\/MSR52588.2021.00034"},{"key":"10307_CR28","doi-asserted-by":"crossref","unstructured":"Harman M, O\u2019hearn P (2018) From start-ups to scale-ups: opportunities and open problems for static and dynamic program analysis. In: Proceedings of the international working conference on source code analysis and manipulation (SCAM), pp 1\u201323","DOI":"10.1109\/SCAM.2018.00009"},{"key":"10307_CR29","doi-asserted-by":"crossref","unstructured":"Hilton M, Bell J, Marinov D (2018) A large-scale study of test coverage evolution. In: Proceedings of the international conference on automated software engineering (ASE), pp 53\u201363","DOI":"10.1145\/3238147.3238183"},{"key":"10307_CR30","unstructured":"I\/O statistics fields (2022) https:\/\/www.kernel.org\/doc\/Documentation\/iostats.txt"},{"issue":"4","key":"10307_CR31","doi-asserted-by":"publisher","first-page":"580","DOI":"10.1109\/TSMC.1985.6313426","volume":"15","author":"JM Keller","year":"1985","unstructured":"Keller JM, Gray MR, Givens JA (1985) A fuzzy k-nearest neighbor algorithm. Trans Syst Man Cybernet 15(4):580\u2013585","journal-title":"Trans Syst Man Cybernet"},{"key":"10307_CR32","doi-asserted-by":"crossref","unstructured":"Lam W, Godefroid P, Nath S, Santhiar A, Thummalapenta S (2019) Root causing flaky tests in a large-scale industrial setting. In: Proceedings of the international symposium on software testing and analysis (ISSTA), pp 204\u2013215","DOI":"10.1145\/3293882.3330570"},{"key":"10307_CR33","doi-asserted-by":"crossref","unstructured":"Lam W, Mu\u015flu K, Sajnani H, Thummalapenta S (2020) A study on the lifecycle of flaky tests. 
In: Proceedings of the international conference on software engineering (ICSE), pp 1471\u20131482","DOI":"10.1145\/3377811.3381749"},{"key":"10307_CR34","doi-asserted-by":"crossref","unstructured":"Lam W, Oei R, Shi A, Marinov D, Xie T (2019) IDFlakies: a framework for detecting and partially classifying flaky tests. In: Proceedings of the international conference on software testing, verification and validation (ICST), pp 312\u2013322","DOI":"10.1109\/ICST.2019.00038"},{"key":"10307_CR35","doi-asserted-by":"crossref","unstructured":"Lam W, Shi A, Oei R, Zhang S, Ernst MD, Xie T (2020) Dependent-test-aware regression testing techniques. In: Proceedings of the international symposium on software testing and analysis (ISSTA), pp 298\u2013311","DOI":"10.1145\/3395363.3397364"},{"issue":"1","key":"10307_CR36","doi-asserted-by":"publisher","first-page":"2522","DOI":"10.1038\/s42256-019-0138-9","volume":"2","author":"SM Lundberg","year":"2020","unstructured":"Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, Katz R, Himmelfarb J, Bansal N, Lee S (2020) From local explanations to global understanding with explainable AI for trees. Nat Mach Intell 2 (1):2522\u20135839","journal-title":"Nat Mach Intell"},{"key":"10307_CR37","doi-asserted-by":"crossref","unstructured":"Luo Q, Hariri F, Eloussi L, Marinov D (2014) An empirical analysis of flaky tests. In: Proceedings of the symposium on the foundations of software engineering (FSE), pp 643\u2013653","DOI":"10.1145\/2635868.2635920"},{"key":"10307_CR38","doi-asserted-by":"crossref","unstructured":"Machalica M, Samylkin A, Porth M, Chandra S (2019) Predictive test selection. 
In: Proceedings of the international conference on software engineering: software engineering in practice (ICSE-SEIP), pp 91\u2013100","DOI":"10.1109\/ICSE-SEIP.2019.00018"},{"key":"10307_CR39","doi-asserted-by":"crossref","unstructured":"Memon A, Gao Z, Nguyen B, Dhanda S, Nickell E, Siemborski R, Micco J (2017) Taming Google-scale continuous testing. In: Proceedings of the international conference on software engineering: software engineering in practice (ICSE-SEIP), pp 233\u2013242","DOI":"10.1109\/ICSE-SEIP.2017.16"},{"key":"10307_CR40","unstructured":"New EC2 M5zn instances (2022) \u2014 Fastest Intel Xeon scalable CPU in the cloud \u2014 AWS news blog. https:\/\/aws.amazon.com\/blogs\/aws\/new-ec2-m5zn-instances-fastest-intel-xeon-scalable-cpu-in-the-cloud\/"},{"key":"10307_CR41","unstructured":"Open source project criticality score (beta) (2022) https:\/\/github.com\/ossf\/criticality_score"},{"key":"10307_CR42","doi-asserted-by":"crossref","unstructured":"Parry O, Kapfhammer GM, Hilton M, McMinn P (2020) Flake it \u2018till you make it: using automated repair to induce and fix latent test flakiness. In: Proceedings of the international workshop on automated program repair (APR), pp 11\u201312","DOI":"10.1145\/3387940.3392177"},{"issue":"1","key":"10307_CR43","first-page":"1","volume":"31","author":"O Parry","year":"2021","unstructured":"Parry O, Kapfhammer GM, Hilton M, McMinn P (2021) A survey of flaky tests. Trans Softw Eng Methodol 31(1):1\u201374","journal-title":"Trans Softw Eng Methodol"},{"key":"10307_CR44","doi-asserted-by":"crossref","unstructured":"Parry O, Kapfhammer GM, Hilton M, McMinn P (2022) Evaluating features for machine learning detection of order- and non-order-dependent flaky tests. 
In: Proceedings of the international conference on software testing, verification and validation (ICST), pp 93\u2013104","DOI":"10.1109\/ICST53961.2022.00021"},{"key":"10307_CR45","doi-asserted-by":"crossref","unstructured":"Parry O, Kapfhammer GM, Hilton M, McMinn P (2022) Surveying the developer experience of flaky tests. In: Proceedings of the international conference on software engineering: software engineering in practice (ICSE-SEIP)","DOI":"10.1145\/3510457.3513037"},{"key":"10307_CR46","doi-asserted-by":"crossref","unstructured":"Peitek N, Apel S, Parnin C, Brechmann A, Siegmund J (2021) Program comprehension and code complexity metrics: an fMRI study. In: Proceedings of the international conference on software engineering (ICSE), pp 524\u2013536","DOI":"10.1109\/ICSE43902.2021.00056"},{"key":"10307_CR47","doi-asserted-by":"crossref","unstructured":"Pinto G, Miranda B, Dissanayake S, Amorim MD, Treude C, Bertolino A, D\u2019amorim M (2020) What is the vocabulary of flaky tests? In: Proceedings of the international conference on mining software repositories (MSR), pp 492\u2013502","DOI":"10.1145\/3379597.3387482"},{"key":"10307_CR48","doi-asserted-by":"crossref","unstructured":"Pontillo V, Palomba F, Ferrucci F (2021) Toward static test flakiness prediction: a feasibility study. In: Proceedings of the international workshop on machine learning techniques for software quality evolution, pp 19\u201324","DOI":"10.1145\/3472674.3473981"},{"key":"10307_CR49","doi-asserted-by":"crossref","unstructured":"Pontillo V, Palomba F, Ferrucci F (2022) Static test flakiness prediction: how far can we go?","DOI":"10.1007\/s10664-022-10227-1"},{"key":"10307_CR50","unstructured":"Psutil documentation (2022) \u2014 Psutil 5.7.3 documentation. https:\/\/psutil.readthedocs.io\/en\/stable\/"},{"key":"10307_CR51","unstructured":"Pytest (2022) Helps you write better programs \u2014 Pytest documentation. 
https:\/\/docs.pytest.org\/en\/7.1.x\/"},{"key":"10307_CR52","doi-asserted-by":"crossref","unstructured":"Romano A, Song Z, Grandhi S, Yang W, Wang W (2021) An empirical analysis of UI-based flaky tests. In: Proceedings of the international conference on software engineering (ICSE)","DOI":"10.1109\/ICSE43902.2021.00141"},{"issue":"3","key":"10307_CR53","doi-asserted-by":"publisher","first-page":"660","DOI":"10.1109\/21.97458","volume":"21","author":"SR Safavian","year":"1991","unstructured":"Safavian SR, Landgrebe D (1991) A survey of decision tree classifier methodology. Trans Syst Man Cybernet 21(3):660\u2013674","journal-title":"Trans Syst Man Cybernet"},{"key":"10307_CR54","unstructured":"Scikit-learn (2022) Machine learning in Python \u2014 Scikit-learn 1.1.1 documentation. https:\/\/scikit-learn.org\/stable\/"},{"key":"10307_CR55","doi-asserted-by":"crossref","unstructured":"Shi A, Bell J, Marinov D (2019) Mitigating the effects of flaky tests on mutation testing. In: Proceedings of the international symposium on software testing and analysis (ISSTA), pp 296\u2013306","DOI":"10.1145\/3293882.3330568"},{"key":"10307_CR56","doi-asserted-by":"crossref","unstructured":"Shi A, Gyori A, Legunsen O, Marinov D (2016) Detecting assumptions on deterministic implementations of non-deterministic specifications. In: Proceedings of the international conference on software testing, verification and validation (ICST), pp 80\u201390","DOI":"10.1109\/ICST.2016.40"},{"issue":"1","key":"10307_CR57","doi-asserted-by":"publisher","first-page":"118","DOI":"10.1198\/106186006X94072","volume":"15","author":"T Shi","year":"2006","unstructured":"Shi T, Horvath S (2006) Unsupervised learning with random forest predictors. J Comput Graph Stat 15(1):118\u2013138","journal-title":"J Comput Graph Stat"},{"key":"10307_CR58","doi-asserted-by":"crossref","unstructured":"Shi A, Lam W, Oei R, Xie T, Marinov D (2019) iFixFlakies: a framework for automatically fixing order-dependent flaky tests. 
In: Proceedings of the joint meeting on European software engineering conference and symposium on the foundations of software engineering (ESEC\/FSE), pp 545\u2013555","DOI":"10.1145\/3338906.3338925"},{"key":"10307_CR59","doi-asserted-by":"crossref","unstructured":"Terragni V, Salza P, Ferrucci F (2020) A container-based infrastructure for fuzzy-driven root causing of flaky tests. In: Proceedings of the international conference on software engineering: new ideas and emerging results (ICSE-NIER), pp 69\u201372","DOI":"10.1145\/3377816.3381742"},{"key":"10307_CR60","first-page":"769","volume":"6","author":"I Tomek","year":"1976","unstructured":"Tomek I (1976) Two modifications of CNN. Trans Syst Man Cybernet 6:769\u2013772","journal-title":"Trans Syst Man Cybernet"},{"key":"10307_CR61","unstructured":"Unittest (2022) \u2014 Unit testing framework \u2014 Python 3.10.4 documentation. https:\/\/docs.python.org\/3\/library\/unittest.html"},{"key":"10307_CR62","unstructured":"Virtual environments and packages (2022) \u2014 Python 3.10.4 documentation. https:\/\/docs.python.org\/3\/tutorial\/venv.html"},{"key":"10307_CR63","unstructured":"Vysali S, Mcintosh S, Adams B (2020) Quantifying, characterizing, and mitigating flakily covered program elements. Transactions on Software Engineering"},{"key":"10307_CR64","doi-asserted-by":"crossref","unstructured":"Wei A, Yi P, Li Z, Xie T, Marinov D, Lam W (2022) Preempting flaky tests via non-idempotent-outcome tests. In: Proceedings of the international conference on tools and algorithms for the construction and analysis of systems (TACAS)","DOI":"10.1145\/3510003.3510170"},{"key":"10307_CR65","unstructured":"Welcome to radon\u2019s documentation! (2022) \u2014 Radon 4.1.0 documentation. https:\/\/radon.readthedocs.io\/en\/stable\/index.html"},{"key":"10307_CR66","unstructured":"Welcome to the SHAP documentation! (2022) \u2014 SHAP latest documentation. 
https:\/\/shap.readthedocs.io\/en\/stable\/index.html"},{"key":"10307_CR67","first-page":"18","volume":"14","author":"KD Welker","year":"2001","unstructured":"Welker KD (2001) The software maintainability index revisited. CrossTalk 14:18\u201321","journal-title":"CrossTalk"},{"issue":"5","key":"10307_CR68","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3444944","volume":"15","author":"L Yao","year":"2021","unstructured":"Yao L, Chu Z, Li S, Li Y, Gao J, Zhang A (2021) A survey on causal inference. Trans Knowl Discov Data (TKDD) 15(5):1\u201346","journal-title":"Trans Knowl Discov Data (TKDD)"},{"issue":"10","key":"10307_CR69","doi-asserted-by":"publisher","first-page":"2627","DOI":"10.1016\/j.automatica.2012.06.066","volume":"48","author":"VM Zavala","year":"2012","unstructured":"Zavala VM, Flores-Tlacuahuac A (2012) Stability of multiobjective predictive control: a utopia-tracking approach. Automatica 48(10):2627\u20132632","journal-title":"Automatica"},{"issue":"2","key":"10307_CR70","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1109\/32.988498","volume":"28","author":"A Zeller","year":"2002","unstructured":"Zeller A, Hildebrandt R (2002) Simplifying and isolating failure-inducing input. Trans Softw Eng 28(2):183\u2013200","journal-title":"Trans Softw Eng"},{"key":"10307_CR71","doi-asserted-by":"crossref","unstructured":"Zhang S, Jalali D, Wuttke J, Mu\u015flu K, Lam W, Ernst MD, Notkin D (2014) Empirically revisiting the test independence assumption. In: Proceedings of the international symposium on software testing and analysis (ISSTA), pp 385\u2013396","DOI":"10.1145\/2610384.2610404"},{"key":"10307_CR72","doi-asserted-by":"crossref","unstructured":"Zhang P, Jiang Y, Wei A, Stodden V, Marinov D, Shi A (2021) Domain-specific fixes for flaky tests with wrong assumptions on underdetermined specifications. 
In: Proceedings of the international conference on software engineering (ICSE), pp 50\u201361","DOI":"10.1109\/ICSE43902.2021.00018"},{"key":"10307_CR73","unstructured":"airflow\/test (2022) airflow\/test_local_client.py at c743b95. https:\/\/github.com\/apache\/airflow\/blob\/c743b95a02ba1ec04013635a56ad042ce98823d2\/tests\/api\/client\/test_local_client.py#L127"},{"key":"10307_CR74","unstructured":"apache\/airflow at c743b95 (2022) https:\/\/github.com\/apache\/airflow\/tree\/c743b95a02ba1ec04013635a56ad042ce98823d2"},{"key":"10307_CR75","unstructured":"ipython\/test (2022) ipython\/test_async_helpers.py at 95d2b79. https:\/\/github.com\/ipython\/ipython\/blob\/95d2b79a2bd889da7a29e7c3cf5f49c1d25ff43d\/IPython\/core\/tests\/test_async_helpers.py#L135"},{"key":"10307_CR76","unstructured":"pytest-CANNIER (2022) https:\/\/github.com\/flake-it\/pytest-cannier"}],"container-title":["Empirical Software Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10664-023-10307-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10664-023-10307-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10664-023-10307-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,26]],"date-time":"2023-05-26T21:04:10Z","timestamp":1685135050000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10664-023-10307-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,4,28]]},"references-count":76,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2023,5]]}},"alternative-id":["10307"],"URL":"https:\/\/doi.org\/10.1007\/s10664-023-10307-w","relation":{},"ISSN":["1382-3256
","1573-7616"],"issn-type":[{"value":"1382-3256","type":"print"},{"value":"1573-7616","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,4,28]]},"assertion":[{"value":"9 February 2023","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 April 2023","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The research leading to these results received funding from the Engineering and Physical Sciences Research Council (EPSRC) (Award Number: EP\/R513313\/1).","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}}],"article-number":"72"}}