{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,20]],"date-time":"2025-06-20T04:08:55Z","timestamp":1750392535406,"version":"3.41.0"},"reference-count":67,"publisher":"Association for Computing Machinery (ACM)","issue":"FSE","funder":[{"name":"National Science Foundation","award":["1910264, 2150217, 233977"],"award-info":[{"award-number":["1910264, 2150217, 233977"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Softw. Eng."],"published-print":{"date-parts":[[2025,6,19]]},"abstract":"<jats:p>Software testing is difficult, tedious, and may consume 28%\u201350% of software engineering labor. Automatic test generators aim to ease this burden but have important trade-offs. Fuzzers use an implicit oracle that can detect obviously invalid results, but the oracle problem has no general solution, and an implicit oracle cannot automatically evaluate correctness. Test suite generators like EvoSuite use the program under test as the oracle and therefore cannot evaluate correctness. Property-based testing tools evaluate correctness, but users have difficulty coming up with properties to test and understanding whether their properties are correct. Consequently, practitioners create many test suites manually and often use an example-based oracle to tediously specify correct input and output examples. To help bridge the gaps among various oracle and tool types, we present the Composite Oracle, which organizes various oracle types into a hierarchy and renders a single test result per example execution. To understand the Composite Oracle\u2019s practical properties, we built TerzoN, a test suite generator that includes a particular instantiation of the Composite Oracle. TerzoN displays all the test results in an integrated view composed from the results of three types of oracles and finds some types of test assertion inconsistencies that might otherwise lead to misleading test results. We evaluated TerzoN in a randomized controlled trial with 14 professional software engineers with a popular industry tool, fast-check, as the control. Participants using TerzoN elicited 72% more bugs (p &lt; 0.01), accurately described more than twice the number of bugs (p &lt; 0.01) and tested 16% more quickly (p &lt; 0.05) relative to fast-check.<\/jats:p>","DOI":"10.1145\/3729359","type":"journal-article","created":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T15:15:34Z","timestamp":1750346134000},"page":"1983-2005","source":"Crossref","is-referenced-by-count":0,"title":["TerzoN: Human-in-the-Loop Software Testing with a Composite Oracle"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2366-8436","authenticated-orcid":false,"given":"Matthew C.","family":"Davis","sequence":"first","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-6046-1815","authenticated-orcid":false,"given":"Amy","family":"Wei","sequence":"additional","affiliation":[{"name":"University of Michigan, Ann Arbor, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4769-0219","authenticated-orcid":false,"given":"Brad A.","family":"Myers","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9672-5297","authenticated-orcid":false,"given":"Joshua","family":"Sunshine","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh, USA"}]}],"member":"320","published-online":{"date-parts":[[2025,6,19]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183440.3195001"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2013.02.061"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-017-9570-9"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-021-10072-8"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2014.2372785"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2017.2776152"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3477132.3483540"},{"key":"e_1_2_1_8_1","volume-title":"Using thematic analysis in psychology. Qualitative research in psychology, 3, 2","author":"Braun Virginia","year":"2006","unstructured":"Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative research in psychology, 3, 2 (2006), 77\u2013101."},{"key":"e_1_2_1_9_1","volume-title":"SUS-A quick and dirty usability scale. Usability evaluation in industry, 189, 194","author":"Brooke John","year":"1996","unstructured":"John Brooke. 1996. SUS-A quick and dirty usability scale. Usability evaluation in industry, 189, 194 (1996), 4\u20137."},{"key":"e_1_2_1_10_1","unstructured":"Stephen Cass. 2024. Top Programming Languages 2024 - IEEE Spectrum. https:\/\/spectrum.ieee.org\/top-programming-languages-2024"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/351240.351266"},{"key":"e_1_2_1_12_1","unstructured":"Anastasia Danilova. 2022. How to Conduct Security Studies with Software Developers. Ph. D. Dissertation. Universit\u00e4ts-und Landesbibliothek Bonn."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3587157"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3611643.3616327"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","unstructured":"Matthew C Davis and Amy Wei. 2025. Reproduction Package for Article \u201cTerzoN: Human-in-the-Loop Software Testing with a Composite Oracle\u201d. https:\/\/doi.org\/10.1145\/3580446 10.1145\/3580446","DOI":"10.1145\/3580446"},{"key":"e_1_2_1_16_1","volume-title":"NaNofuzz - Visual Studio Marketplace. https:\/\/marketplace.visualstudio.com\/items?itemName=penrose.nanofuzz [Online","author":"Davis Matthew C.","year":"2024","unstructured":"Matthew C. Davis, Amy Wei, Sangheon Choi, and Sam Estep. 2024. NaNofuzz - Visual Studio Marketplace. https:\/\/marketplace.visualstudio.com\/items?itemName=penrose.nanofuzz [Online; accessed 2024-09-01]"},{"key":"e_1_2_1_17_1","unstructured":"Nicolas Dubien. 2024. dubzzz\/fast-check. https:\/\/github.com\/dubzzz\/fast-check original-date: 2017-10-30T23:41:11Z"},{"key":"e_1_2_1_18_1","unstructured":"Nicolas Dubien. 2024. fast-check. https:\/\/www.npmjs.com\/package\/fast-check"},{"key":"e_1_2_1_19_1","unstructured":"Nicolas Dubien. 2024. fast-check official documentation | fast-check. https:\/\/fast-check.dev\/"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/CHASE52884.2021.00026"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/581339.581359"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1131421.1131423"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2012.14"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2699688"},{"key":"e_1_2_1_25_1","volume-title":"https:\/\/github.com\/features\/codespaces [Online","author":"Codespaces GitHub","year":"2023","unstructured":"GitHub. 2023. GitHub Codespaces. https:\/\/github.com\/features\/codespaces [Online; accessed 2023-07-21]"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3597503.3639581"},{"key":"e_1_2_1_27_1","volume-title":"Proc. Workshop on the Human Aspects of Types and Reasoning Assistants (HATRA).","author":"Goldstein Harrison","year":"2022","unstructured":"Harrison Goldstein, Joseph W Cutler, Adam Stein, Benjamin C Pierce, and Andrew Head. 2022. Some Problems with Properties. In Proc. Workshop on the Human Aspects of Types and Reasoning Assistants (HATRA)."},{"key":"e_1_2_1_28_1","unstructured":"Google DevOps Research and Assessment. 2024. 2024 State of DevOps Report. Google."},{"key":"e_1_2_1_29_1","volume-title":"2018 IEEE\/ACM 26th International Conference on Program Comprehension (ICPC). 348\u20133483","author":"Grano Giovanni","year":"2018","unstructured":"Giovanni Grano, Simone Scalabrino, Harald C Gall, and Rocco Oliveto. 2018. An empirical investigation on the readability of manual and generated test cases. In 2018 IEEE\/ACM 26th International Conference on Program Comprehension (ICPC). 348\u20133483."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2013.59"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1080\/00031305.1998.10480559"},{"key":"e_1_2_1_32_1","unstructured":"ISO. 2018. Ergonomics of human-system interaction\u2014Part 11: Usability: Definitions and concepts ISO 9241\u201311: 2018 (en)."},{"key":"e_1_2_1_33_1","volume-title":"A concise introduction to software engineering","author":"Jalote Pankaj","unstructured":"Pankaj Jalote. 2008. A concise introduction to software engineering. Springer Science & Business Media."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-013-9279-3"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/QSIC.2014.33"},{"key":"e_1_2_1_36_1","volume-title":"USENIX Security Symposium. 2777\u20132794","author":"Li Yuwei","year":"2021","unstructured":"Yuwei Li, Shouling Ji, Yuan Chen, Sizhuang Liang, Wei-Han Lee, Yueyao Chen, Chenyang Lyu, Chunming Wu, Raheem Beyah, and Peng Cheng. 2021. UNIFUZZ: A Holistic and Pragmatic Metrics-Driven Platform for Evaluating Fuzzers.. In USENIX Security Symposium. 2777\u20132794."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3597503.3608128"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.21105\/joss.01891"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2019.2946563"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.2307\/2288652"},{"key":"e_1_2_1_41_1","volume-title":"Overview of Microsoft IntelliTester. https:\/\/learn.microsoft.com\/en-us\/visualstudio\/test\/intellitest-manual\/ [Online","year":"2023","unstructured":"Microsoft. 2023. Overview of Microsoft IntelliTester. https:\/\/learn.microsoft.com\/en-us\/visualstudio\/test\/intellitest-manual\/ [Online; accessed 2023-01-27]"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/174800.174808"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2016.200"},{"key":"e_1_2_1_44_1","article-title":"Generating unit tests for documentation","author":"Nassif Mathieu","year":"2021","unstructured":"Mathieu Nassif, Alexa Hernandez, Ashvitha Sridharan, and Martin P Robillard. 2021. Generating unit tests for documentation. IEEE Transactions on Software Engineering.","journal-title":"IEEE Transactions on Software Engineering."},{"key":"e_1_2_1_45_1","volume-title":"Proceedings.. 116\u2013125","author":"Ng Sebastian P","year":"2004","unstructured":"Sebastian P Ng, Tafline Murnane, Karl Reed, D Grant, and Tsong Yueh Chen. 2004. A preliminary survey on software testing practices in Australia. In 2004 Australian Software Engineering Conference. Proceedings.. 116\u2013125."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/191666.191729"},{"key":"e_1_2_1_47_1","volume-title":"Concise Guide to Software Testing","author":"O\u2019Regan Gerard","unstructured":"Gerard O\u2019Regan. 2019. Fundamentals of Software Testing. In Concise Guide to Software Testing. Springer, 59\u201378."},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2007.37"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2884781.2884847"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2018.03.052"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASE56229.2023.00193"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1176\/appi.ajp.2012.12070999"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/2771783.2771801"},{"key":"e_1_2_1_54_1","doi-asserted-by":"crossref","unstructured":"Robert Rosenthal and Ralph L Rosnow. 2008. Essentials of behavioral research: Methods and data analysis.","DOI":"10.1093\/acprof:oso\/9780195385540.001.0001"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISSRE.1997.630851"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.1998.671118"},{"key":"e_1_2_1_57_1","volume-title":"Proceedings of the 2000 International Conference on Software Engineering. ICSE 2000 the New Millennium. 230\u2013239","author":"Rothermel Karen J","year":"2000","unstructured":"Karen J Rothermel, Curtis R Cook, Margaret M Burnett, Justin Schonfeld, Thomas RG Green, and Gregg Rothermel. 2000. WYSIWYT testing in the spreadsheet paradigm: An empirical evaluation. In Proceedings of the 2000 International Conference on Software Engineering. ICSE 2000 the New Millennium. 230\u2013239."},{"key":"e_1_2_1_58_1","volume-title":"Proceedings of the 35th IEEE\/ACM International Conference on Automated Software Engineering. 287\u2013298","author":"Roy Devjeet","year":"2020","unstructured":"Devjeet Roy, Ziyi Zhang, Maggie Ma, Venera Arnaoudova, Annibale Panichella, Sebastiano Panichella, Danielle Gonzalez, and Mehdi Mirakhorli. 2020. DeepTC-Enhancer: Improving the readability of automatically generated tests. In Proceedings of the 35th IEEE\/ACM International Conference on Automated Software Engineering. 287\u2013298."},{"key":"e_1_2_1_59_1","volume-title":"Rethinking productivity in software engineering","author":"Sadowski Caitlin","unstructured":"Caitlin Sadowski and Thomas Zimmermann. 2019. Rethinking productivity in software engineering. Springer Nature."},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICICoS56336.2022.9930600"},{"key":"e_1_2_1_61_1","volume-title":"What if writing tests was a joyful experience? https:\/\/blog.janestreet.com\/the-joy-of-expect-tests\/ [Online","author":"Somers James","year":"2023","unstructured":"James Somers. 2023. What if writing tests was a joyful experience? https:\/\/blog.janestreet.com\/the-joy-of-expect-tests\/ [Online; accessed 2023-01-22]"},{"key":"e_1_2_1_62_1","volume-title":"Jest - Delightful Javascript Testing. https:\/\/jestjs.io\/ [Online","author":"Source Facebook Open","year":"2024","unstructured":"Facebook Open Source. 2023. Jest - Delightful Javascript Testing. https:\/\/jestjs.io\/ [Online; accessed 2024-07-08]"},{"key":"e_1_2_1_63_1","volume-title":"https:\/\/github.com\/firsttris\/vscode-jest-runner [Online","author":"Teufel Tristan","year":"2022","unstructured":"Tristan Teufel and contributors. 2022. Jest Runner. https:\/\/github.com\/firsttris\/vscode-jest-runner [Online; accessed 2022-11-10]"},{"key":"e_1_2_1_64_1","volume-title":"Software testing and quality assurance: theory and practice","author":"Tripathy Priyadarshi","unstructured":"Priyadarshi Tripathy and Kshirasagar Naik. 2011. Software testing and quality assurance: theory and practice. John Wiley & Sons."},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","unstructured":"Vasudev Vikram Caroline Lemieux Joshua Sunshine and Rohan Padhye. 2024. Can Large Language Models Write Good Property-Based Tests? https:\/\/doi.org\/10.48550\/arXiv.2307.04346 arXiv:2307.04346 [cs] 10.48550\/arXiv.2307.04346","DOI":"10.48550\/arXiv.2307.04346"},{"key":"e_1_2_1_66_1","volume-title":"Using Relational Problems to Teach Property-Based Testing. The art science and engineering of programming, 5, 2","author":"Wrenn John","year":"2021","unstructured":"John Wrenn, Tim Nelson, and Shriram Krishnamurthi. 2021. Using Relational Problems to Teach Property-Based Testing. The art science and engineering of programming, 5, 2 (2021)."},{"key":"e_1_2_1_67_1","volume-title":"american fuzzy lop. https:\/\/lcamtuf.coredump.cx\/afl\/ [Online","author":"Zalewski Micha\u0142","year":"2023","unstructured":"Micha\u0142 Zalewski. 2014. american fuzzy lop. https:\/\/lcamtuf.coredump.cx\/afl\/ [Online; accessed 2023-06-27]"}],"container-title":["Proceedings of the ACM on Software Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3729359","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T15:27:18Z","timestamp":1750346838000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3729359"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,19]]},"references-count":67,"journal-issue":{"issue":"FSE","published-print":{"date-parts":[[2025,6,19]]}},"alternative-id":["10.1145\/3729359"],"URL":"https:\/\/doi.org\/10.1145\/3729359","relation":{},"ISSN":["2994-970X"],"issn-type":[{"value":"2994-970X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,19]]}}}