{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T21:07:31Z","timestamp":1774732051135,"version":"3.50.1"},"reference-count":47,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2022,7,12]],"date-time":"2022-07-12T00:00:00Z","timestamp":1657584000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61772259, 62172205, 61872177, 62072194, 62172202"],"award-info":[{"award-number":["61772259, 62172205, 61872177, 62072194, 62172202"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2022,10,31]]},"abstract":"<jats:p>\n            <jats:bold>Background.<\/jats:bold>\n            Mutation testing is a commonly used defect injection technique for evaluating the effectiveness of a test suite. However, it is usually computationally expensive. Therefore, many mutation reduction strategies, which aim to reduce the number of mutants, have been proposed.\n          <\/jats:p>\n          <jats:p>\n            <jats:bold>Problem.<\/jats:bold>\n            It is important to measure the ability of a mutation reduction strategy to maintain test suite effectiveness evaluation. However, existing evaluation indicators are unable to measure the \u201corder-preserving ability\u201d, i.e., to what extent the mutation score order among test suites is maintained before and after mutation reduction. 
As a result, misleading conclusions can be achieved when using existing indicators to evaluate the reduction effectiveness.\n          <\/jats:p>\n          <jats:p>\n            <jats:bold>Objective.<\/jats:bold>\n            We aim to propose evaluation indicators to measure the \u201corder-preserving ability\u201d of a mutation reduction strategy, which is important but missing in our community.\n          <\/jats:p>\n          <jats:p>\n            <jats:bold>Method.<\/jats:bold>\n            Given a test suite on a\n            <jats:bold>Software Under Test (SUT)<\/jats:bold>\n            with a set of original mutants, we leverage the test suite to generate a group of test suites that have a partial order relationship in defect detecting ability. When evaluating a reduction strategy, we first construct two partial order relationships among the generated test suites in terms of mutation score, one with the original mutants and another with the reduced mutants. Then, we measure the extent to which the partial order under the original mutants remains unchanged in the partial order under the reduced mutants. The more partial order is unchanged, the stronger the\n            <jats:bold>\n              Order Preservation (\n              <jats:italic>OP<\/jats:italic>\n              )\n            <\/jats:bold>\n            of the mutation reduction strategy is, and the more effective the reduction strategy is. 
Furthermore, we propose\n            <jats:bold>\n              Effort-aware Relative Order Preservation (\n              <jats:italic>EROP<\/jats:italic>\n              )\n            <\/jats:bold>\n            to measure how much gain a mutation reduction strategy can provide compared with a random reduction strategy.\n          <\/jats:p>\n          <jats:p>\n            <jats:bold>Result.<\/jats:bold>\n            The experimental results show that OP and EROP are able to efficiently measure the \u201corder-preserving ability\u201d of a mutation reduction strategy. As a result, they have a better ability to distinguish various mutation reduction strategies compared with the existing evaluation indicators. In addition, we find that\n            <jats:bold>Subsuming Mutant Selection (SMS)<\/jats:bold>\n            and\n            <jats:bold>Clustering Mutant Selection (CMS)<\/jats:bold>\n            are more effective than the other strategies under OP and EROP.\n          <\/jats:p>\n          <jats:p>\n            <jats:bold>Conclusion.<\/jats:bold>\n            We suggest, for the researchers, that OP and EROP should be used to measure the effectiveness of a mutant reduction strategy, and for the practitioners, that SMS and CMS should be given priority in practice.\n          <\/jats:p>","DOI":"10.1145\/3522578","type":"journal-article","created":{"date-parts":[[2022,3,15]],"date-time":"2022-03-15T13:40:39Z","timestamp":1647351639000},"page":"1-46","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Mutant Reduction Evaluation: What is There and What is Missing?"],"prefix":"10.1145","volume":"31","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2902-2249","authenticated-orcid":false,"given":"Peng","family":"Zhang","sequence":"first","affiliation":[{"name":"Nanjing University, Jiangsu Province, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6771-2139","authenticated-orcid":false,"given":"Yang","family":"Wang","sequence":"additional","affiliation":[{"name":"Nanjing University, Jiangsu Province, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3831-5505","authenticated-orcid":false,"given":"Xutong","family":"Liu","sequence":"additional","affiliation":[{"name":"Nanjing University, Jiangsu Province, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2282-7175","authenticated-orcid":false,"given":"Yanhui","family":"Li","sequence":"additional","affiliation":[{"name":"Nanjing University, Jiangsu Province, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1153-2013","authenticated-orcid":false,"given":"Yibiao","family":"Yang","sequence":"additional","affiliation":[{"name":"Nanjing University, Jiangsu Province, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0494-5285","authenticated-orcid":false,"given":"Ziyuan","family":"Wang","sequence":"additional","affiliation":[{"name":"Nanjing University of Posts and Telecommunications, Jiangsu Province, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1177-6907","authenticated-orcid":false,"given":"Xiaoyu","family":"Zhou","sequence":"additional","affiliation":[{"name":"Southeast University, Jiangsu Province, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2352-2226","authenticated-orcid":false,"given":"Lin","family":"Chen","sequence":"additional","affiliation":[{"name":"Nanjing University, Jiangsu Province, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4645-2526","authenticated-orcid":false,"given":"Yuming","family":"Zhou","sequence":"additional","affiliation":[{"name":"Nanjing University, Jiangsu Province, 
China"}]}],"member":"320","published-online":{"date-parts":[[2022,7,12]]},"reference":[{"key":"e_1_3_4_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2010.62"},{"key":"e_1_3_4_3_2","doi-asserted-by":"publisher","DOI":"10.1016\/bs.adcom.2018.03.015"},{"key":"e_1_3_4_4_2","first-page":"10","article-title":"Sufficient mutation operators for measuring testing effectiveness","author":"Namin A. S.","year":"2008","unstructured":"A. S. Namin, J. H. Andrews, and D. J. Murdoch. 2008. Sufficient mutation operators for measuring testing effectiveness. ICSE 2008, 10\u201318.","journal-title":"ICSE"},{"key":"e_1_3_4_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/227607.227610"},{"issue":"4","key":"e_1_3_4_6_2","article-title":"Assessment of C++ object-oriented mutation operators: A selective mutation approach","volume":"27","author":"Delgado-P\u00e9rez P.","year":"2017","unstructured":"P. Delgado-P\u00e9rez, S. Segura, and I. Medina-Bulo. 2017. Assessment of C++ object-oriented mutation operators: A selective mutation approach. Software Testing, Verification and Reliability 27, 4\u20135 (2017), n\/a-n\/a.","journal-title":"Software Testing, Verification and Reliability"},{"key":"e_1_3_4_7_2","first-page":"11","article-title":"Designing deletion mutation operators","volume":"2014","author":"Delamaro M. E.","year":"2014","unstructured":"M. E. Delamaro, J. Offutt, and P. Ammann. 2014. Designing deletion mutation operators. ICST 2014, 11\u201320.","journal-title":"ICST"},{"key":"e_1_3_4_8_2","first-page":"203","article-title":"Experimental evaluation of SDL and one-op mutation for C.","author":"Delamaro M. E.","year":"2014","unstructured":"M. E. Delamaro, L. Deng, V. H. S. Durelli, N. Li, and J. Offutt. 2014. Experimental evaluation of SDL and one-op mutation for C. 
ICST 2014, 203\u2013212.","journal-title":"ICST"},{"key":"e_1_3_4_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2020.3002496"},{"key":"e_1_3_4_10_2","first-page":"90","article-title":"An empirical evaluation of the first and second order mutation testing strategies","volume":"2010","author":"Papadakis M.","year":"2010","unstructured":"M. Papadakis and N. Malevris. 2010. An empirical evaluation of the first and second order mutation testing strategies. ICST 2010, 90\u201399.","journal-title":"ICST"},{"key":"e_1_3_4_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/2950290.2950322"},{"key":"e_1_3_4_12_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2016.05.001"},{"key":"e_1_3_4_13_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2009.04.016"},{"key":"e_1_3_4_14_2","first-page":"176","article-title":"Mutant subsumption graphs","volume":"2014","author":"Kurtz B.","year":"2014","unstructured":"B. Kurtz, P. Ammann, M. E. Delamaro, J. Offutt, and L. Deng. 2014. Mutant subsumption graphs. ICSTW 2014, 176\u2013185.","journal-title":"ICSTW"},{"key":"e_1_3_4_15_2","first-page":"284","article-title":"Inferring mutant utility from program context","volume":"2017","author":"Just R.","year":"2017","unstructured":"R. Just, B. Kurtz, and P. Ammann. 2017. Inferring mutant utility from program context. ISSTA 2017, 284\u2013294.","journal-title":"ISSTA"},{"key":"e_1_3_4_16_2","first-page":"355","volume-title":"Proceedings of the Workshop on Software Testing and Test Documentation","author":"Lipton R. J.","year":"1978","unstructured":"R. J. Lipton and F. G. Sayward. 1978. The status of research on program mutation. 
In Proceedings of the Workshop on Software Testing and Test Documentation 1978, 355\u2013373."},{"key":"e_1_3_4_17_2","doi-asserted-by":"publisher","DOI":"10.1002\/stvr.4370040104"},{"key":"e_1_3_4_18_2","doi-asserted-by":"publisher","DOI":"10.1016\/0164-1212(94)00098-0"},{"key":"e_1_3_4_19_2","first-page":"1","article-title":"Static analysis of mutant subsumption","volume":"2015","author":"Kurtz B.","year":"2015","unstructured":"B. Kurtz, P. Ammann, and J. Offutt. 2015. Static analysis of mutant subsumption. ICST 2015, 1\u201310.","journal-title":"ICST"},{"key":"e_1_3_4_20_2","first-page":"435","article-title":"Is operator-based mutant selection superior to random mutant selection?","volume":"2010","author":"Zhang L.","year":"2010","unstructured":"L. Zhang, S. Hou, J. Hu, T. Xie, and H. Mei. 2010. Is operator-based mutant selection superior to random mutant selection? ICSE 2010, 435\u2013444.","journal-title":"ICSE"},{"key":"e_1_3_4_21_2","first-page":"92","article-title":"Operator-based and random mutant selection: Better together","volume":"2013","author":"Zhang L.","year":"2013","unstructured":"L. Zhang, M. Gligoric, D. Marinov, and S. Khurshid. 2013. Operator-based and random mutant selection: Better together. ASE 2013, 92\u2013102.","journal-title":"ASE"},{"key":"e_1_3_4_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/2660767"},{"key":"e_1_3_4_23_2","doi-asserted-by":"publisher","DOI":"10.1002\/stvr.1692"},{"key":"e_1_3_4_24_2","first-page":"604","article-title":"Performance, effectiveness, and reliability issues in software testing","author":"Mathur A. P.","year":"1991","unstructured":"A. P. Mathur. 1991. Performance, effectiveness, and reliability issues in software testing. COMPSAC 1991, 604\u2013605.","journal-title":"COMPSAC"},{"key":"e_1_3_4_25_2","article-title":"Mutation Clustering","author":"Hussain S.","year":"2008","unstructured":"S. Hussain. 2008. Mutation Clustering. M.S. thesis London, UK King's College. 2008.","journal-title":"M.S. 
thesis London, UK King's College"},{"key":"e_1_3_4_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2020.3010361"},{"key":"e_1_3_4_27_2","first-page":"157","article-title":"A systematic literature review of techniques and metrics to reduce the cost of mutation testing","author":"Pizzoleto A.","year":"2019","unstructured":"A. Pizzoleto, F. Ferrari, J. Offutt, L. Fernandes, and M. Ribeiro. 2019. A systematic literature review of techniques and metrics to reduce the cost of mutation testing. The Journal of Systems and Software 2019, 157.","journal-title":"The Journal of Systems and Software"},{"key":"e_1_3_4_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/TR.2017.2705662"},{"key":"e_1_3_4_29_2","unstructured":"Apache Commons. http:\/\/commons.apache.org."},{"key":"e_1_3_4_30_2","first-page":"511","article-title":"On the limits of mutation reduction strategies","author":"Gopinath R.","year":"2016","unstructured":"R. Gopinath, M. A. Alipour, I. Ahmed, C. Jensen, and A. Groce. 2016. On the limits of mutation reduction strategies. ICSE 2016, 511\u2013522.","journal-title":"ICSE"},{"key":"e_1_3_4_31_2","first-page":"430","article-title":"Assessing and improving the mutation testing practice of PIT","author":"Laurent T.","year":"2017","unstructured":"T. Laurent, M. Papadakis, M. Kintis, C. Henard, Y. L. Traon, and A. Ventresque. 2017. Assessing and improving the mutation testing practice of PIT. ICST 2017, 430\u2013435.","journal-title":"ICST"},{"key":"e_1_3_4_32_2","first-page":"654","article-title":"Are mutants a valid substitute for real faults in software testing?","volume":"2014","author":"Just R.","year":"2014","unstructured":"R. Just, D. Jalali, L. Inozemtseva, M. D. Ernst, R. Holmes, and G. Fraser. 2014. Are mutants a valid substitute for real faults in software testing? FSE 2014, 654\u2013665.","journal-title":"FSE"},{"key":"e_1_3_4_33_2","unstructured":"PIT. http:\/\/pitest.org\/."},{"key":"e_1_3_4_34_2","unstructured":"Cobertura. 
https:\/\/github.com\/cobertura."},{"key":"e_1_3_4_35_2","unstructured":"PIT mutators. https:\/\/pitest.org\/quickstart\/mutators."},{"key":"e_1_3_4_36_2","first-page":"560","article-title":"Code coverage and test suite effectiveness: Empirical study with real bugs in large systems","author":"Kochhar P. S.","year":"2015","unstructured":"P. S. Kochhar, F. Thung, and D. Lo. 2015. Code coverage and test suite effectiveness: Empirical study with real bugs in large systems. SANER 2015, 560\u2013564.","journal-title":"SANER"},{"key":"e_1_3_4_37_2","article-title":"Revisiting the relationship between fault detection, test adequacy criteria, and test set size","author":"Chen Y. T.","year":"2020","unstructured":"Y. T. Chen, R. Gopinath, A. Tadakamalla, M. D. Ernst, R. Holmes, G. Fraser, P. Ammann, and R. Just. 2020. Revisiting the relationship between fault detection, test adequacy criteria, and test set size. ASE 2020, n\/a-n\/a.","journal-title":"ASE"},{"key":"e_1_3_4_38_2","first-page":"435","article-title":"Coverage is not strongly correlated with test suite effectiveness","author":"Inozemtseva L.","year":"2014","unstructured":"L. Inozemtseva and R. Holmes. 2014. Coverage is not strongly correlated with test suite effectiveness. ICSE 2014, 435\u2013445.","journal-title":"ICSE"},{"key":"e_1_3_4_39_2","first-page":"354","article-title":"Threats to the validity of mutation-based test assessment","volume":"2016","author":"Papadakis M.","year":"2016","unstructured":"M. Papadakis, C. Henard, M. Harman, Y. Jia, and Y. L. Traon. 2016. Threats to the validity of mutation-based test assessment. ISSTA 2016, 354\u2013365.","journal-title":"ISSTA"},{"key":"e_1_3_4_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/tse.2016.2584050"},{"key":"e_1_3_4_41_2","article-title":"Statistical power analysis for the behavioral sciences","author":"Cohen J.","year":"1988","unstructured":"J. Cohen. 1988. Statistical power analysis for the behavioral sciences. Routledge. 
ISBN 978-1-134-74270-7.","journal-title":"Routledge"},{"key":"e_1_3_4_42_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF00053399"},{"key":"e_1_3_4_43_2","doi-asserted-by":"publisher","DOI":"10.1080\/03610928908830150"},{"key":"e_1_3_4_44_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-017-9582-5"},{"key":"e_1_3_4_45_2","first-page":"300","article-title":"Evaluating mutation testing alternatives: A collateral experiment","volume":"2010","author":"Kintis M.","year":"2010","unstructured":"M. Kintis, M. Papadakis, and N. Malevris. 2010. Evaluating mutation testing alternatives: A collateral experiment. APSEC 2010, 300\u2013309.","journal-title":"APSEC"},{"key":"e_1_3_4_46_2","first-page":"32","article-title":"Mutant quality indicators","author":"Papadakis M.","year":"2018","unstructured":"M. Papadakis, T. T. Chekam, and Y. L. Traon. 2018. Mutant quality indicators. ICSTW 2018, 32\u201339.","journal-title":"ICSTW"},{"key":"e_1_3_4_47_2","first-page":"437","article-title":"Defects4J: A database of existing faults to enable controlled testing studies for Java programs","author":"Just R.","year":"2014","unstructured":"R. Just, D. Jalali, and M. D. Ernst. 2014. Defects4J: A database of existing faults to enable controlled testing studies for Java programs. ISSTA 2014, 437\u2013440.","journal-title":"ISSTA"},{"key":"e_1_3_4_48_2","first-page":"936","article-title":"Trivial compiler equivalence: A large scale empirical study of a simple, fast and effective equivalent mutant detection technique","author":"Papadakis M.","year":"2015","unstructured":"M. Papadakis, Y. Jia, M. Harman, and Y. L. Traon. 2015. Trivial compiler equivalence: A large scale empirical study of a simple, fast and effective equivalent mutant detection technique. 
ICSE 2015, 936\u2013946.","journal-title":"ICSE"}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3522578","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3522578","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:09:33Z","timestamp":1750183773000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3522578"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,12]]},"references-count":47,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,10,31]]}},"alternative-id":["10.1145\/3522578"],"URL":"https:\/\/doi.org\/10.1145\/3522578","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,7,12]]},"assertion":[{"value":"2021-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-12-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-07-12","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}