{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,29]],"date-time":"2026-05-29T11:32:47Z","timestamp":1780054367648,"version":"3.54.0"},"reference-count":90,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2023,7,22]],"date-time":"2023-07-22T00:00:00Z","timestamp":1689984000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key R&D Program of China","doi-asserted-by":"crossref","award":["2020YFB2010901"],"award-info":[{"award-number":["2020YFB2010901"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Key R&D Program of Zhejiang","award":["2022C01018"],"award-info":[{"award-number":["2022C01018"]}]},{"name":"NSFC Program","award":["62102359, 61833015, 62293511, and U1911401"],"award-info":[{"award-number":["62102359, 61833015, 62293511, and U1911401"]}]},{"name":"Fundamental Research Funds for Central Universities"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2023,9,30]]},"abstract":"<jats:p>\n            Recently, there has been significant growth of interest in applying software engineering techniques for the quality assurance of deep learning (DL) systems. One popular direction is DL testing\u2014that is, given a property of test, defects of DL systems are found either by fuzzing or guided search with the help of certain testing metrics. 
However, recent studies have revealed that the neuron coverage metrics, which are commonly used by most existing DL testing approaches, are not necessarily correlated with model quality (e.g., robustness, the most studied model property), and are also not an effective measurement on the confidence of the model quality after testing. In this work, we address this gap by proposing a novel testing framework called\n            <jats:sc>QuoTe<\/jats:sc>\n            (i.e.,\n            <jats:italic>Qu<\/jats:italic>\n            ality-\n            <jats:italic>o<\/jats:italic>\n            riented\n            <jats:italic>Te<\/jats:italic>\n            sting). A key part of\n            <jats:sc>QuoTe<\/jats:sc>\n            is a quantitative measurement on (1) the value of each test case in enhancing the model property of interest (often via retraining) and (2) the convergence quality of the model property improvement.\n            <jats:sc>QuoTe<\/jats:sc>\n            utilizes the proposed metric to automatically select or generate valuable test cases for improving model quality. The proposed metric is also a lightweight yet strong indicator of how well the improvement converged. 
Extensive experiments on both image and tabular datasets with a variety of model architectures confirm the effectiveness and efficiency of\n            <jats:sc>QuoTe<\/jats:sc>\n            in improving DL model quality\u2014that is, robustness and fairness.\n            <jats:styled-content style=\"color:#000000\">As a generic quality-oriented testing framework, future adaptations can be made to other domains (e.g., text) as well as other model properties.<\/jats:styled-content>\n          <\/jats:p>","DOI":"10.1145\/3582573","type":"journal-article","created":{"date-parts":[[2023,2,10]],"date-time":"2023-02-10T12:11:15Z","timestamp":1676031075000},"page":"1-33","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":13,"title":["<scp>QuoTe<\/scp>\n            : Quality-oriented Testing for Deep Learning Systems"],"prefix":"10.1145","volume":"32","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4322-4285","authenticated-orcid":false,"given":"Jialuo","family":"Chen","sequence":"first","affiliation":[{"name":"Zhejiang University"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7113-7635","authenticated-orcid":false,"given":"Jingyi","family":"Wang","sequence":"additional","affiliation":[{"name":"Zhejiang University"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2099-4973","authenticated-orcid":false,"given":"Xingjun","family":"Ma","sequence":"additional","affiliation":[{"name":"Fudan University"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1893-6259","authenticated-orcid":false,"given":"Youcheng","family":"Sun","sequence":"additional","affiliation":[{"name":"University of 
Manchester"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3545-1392","authenticated-orcid":false,"given":"Jun","family":"Sun","sequence":"additional","affiliation":[{"name":"Singapore Management University"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5039-5651","authenticated-orcid":false,"given":"Peixin","family":"Zhang","sequence":"additional","affiliation":[{"name":"Zhejiang University"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4221-2162","authenticated-orcid":false,"given":"Peng","family":"Cheng","sequence":"additional","affiliation":[{"name":"Zhejiang University"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2023,7,22]]},"reference":[{"key":"e_1_3_4_2_2","doi-asserted-by":"publisher","DOI":"10.1145\/3338906.3338937"},{"key":"e_1_3_4_3_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.tjem.2018.08.001"},{"key":"e_1_3_4_4_2","article-title":"Black-box safety analysis and retraining of DNNs based on feature extraction and clustering","author":"Attaoui Mohammed Oualid","year":"2022","unstructured":"Mohammed Oualid Attaoui, Hazem Fahmy, Fabrizio Pastore, and Lionel Briand. 2022. Black-box safety analysis and retraining of DNNs based on feature extraction and clustering. arXiv preprint arXiv:2201.05077 (2022).","journal-title":"arXiv preprint arXiv:2201.05077"},{"key":"e_1_3_4_5_2","first-page":"1","volume-title":"Noise Reduction in Speech Processing","author":"Benesty Jacob","year":"2009","unstructured":"Jacob Benesty, Jingdong Chen, Yiteng Huang, and Israel Cohen. 2009. Pearson correlation coefficient. In Noise Reduction in Speech Processing. Springer, 1\u20134."},{"key":"e_1_3_4_6_2","article-title":"Tesla\u2019s self-driving system cleared in deadly crash","author":"Boudette Neal E.","year":"2017","unstructured":"Neal E. Boudette. 2017. 
Tesla\u2019s self-driving system cleared in deadly crash. New York Times. Retrieved February 21, 2023 from https:\/\/www.nytimes.com\/2017\/01\/19\/business\/tesla-model-s-autopilot-fatal-crash.html.","journal-title":"New York Times."},{"key":"e_1_3_4_7_2","first-page":"12861","volume-title":"Advances in Neural Information Processing Systems","author":"Brendel Wieland","year":"2019","unstructured":"Wieland Brendel, Jonas Rauber, Matthias K\u00fcmmerer, Ivan Ustyuzhaninov, and Matthias Bethge. 2019. Accurate, reliable and fast robustness evaluation. In Advances in Neural Information Processing Systems. 12861\u201312871."},{"key":"e_1_3_4_8_2","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1109\/AITest.2019.000-6","volume-title":"Proceedings of the 2019 IEEE International Conference on Artificial Intelligence Testing (AITest\u201919)","author":"Byun Taejoon","year":"2019","unstructured":"Taejoon Byun, Vaibhav Sharma, Abhishek Vijayakumar, Sanjai Rayadurgam, and Darren Cofer. 2019. Input prioritization for testing neural networks. In Proceedings of the 2019 IEEE International Conference on Artificial Intelligence Testing (AITest\u201919). IEEE, Los Alamitos, CA, 63\u201370."},{"key":"e_1_3_4_9_2","first-page":"209","volume-title":"Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201908)","author":"Cadar Cristian","year":"2008","unstructured":"Cristian Cadar, Daniel Dunbar, and Dawson R. Engler. 2008. KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs. In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201908). 209\u2013224."},{"key":"e_1_3_4_10_2","article-title":"On evaluating adversarial robustness","author":"Carlini Nicholas","year":"2019","unstructured":"Nicholas Carlini, Anish Athalye, Nicolas Papernot, Wieland Brendel, Jonas Rauber, Dimitris Tsipras, Ian Goodfellow, Aleksander Madry, and Alexey Kurakin. 2019. 
On evaluating adversarial robustness. arXiv preprint arXiv:1902.06705 (2019).","journal-title":"arXiv preprint arXiv:1902.06705"},{"key":"e_1_3_4_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/SP.2017.49"},{"key":"e_1_3_4_12_2","first-page":"2493","article-title":"Natural language processing (almost) from scratch","author":"Collobert Ronan","year":"2011","unstructured":"Ronan Collobert, Jason Weston, L\u00e9on Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa. 2011. Natural language processing (almost) from scratch. Journal of Machine Learning Research 12 (2011), 2493\u20132537.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_4_13_2","first-page":"2206","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Croce Francesco","year":"2020","unstructured":"Francesco Croce and Matthias Hein. 2020. Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In Proceedings of the International Conference on Machine Learning. 2206\u20132216."},{"key":"e_1_3_4_14_2","first-page":"73","volume-title":"Proceedings of the 2020 25th International Conference on Engineering of Complex Computer Systems (ICECCS\u201920)","author":"Dong Yizhen","year":"2020","unstructured":"Yizhen Dong, Peixin Zhang, Jingyi Wang, Shuang Liu, Jun Sun, Jianye Hao, Xinyu Wang, Li Wang, Jinsong Dong, and Ting Dai. 2020. An empirical study on correlation between coverage and robustness for deep neural networks. In Proceedings of the 2020 25th International Conference on Engineering of Complex Computer Systems (ICECCS\u201920). IEEE, Los Alamitos, CA, 73\u201382."},{"key":"e_1_3_4_15_2","unstructured":"Dheeru Dua and Casey Graff. 2017. UCI Machine Learning Repository. 
Retrieved February 21, 2023 from http:\/\/archive.ics.uci.edu\/ml."},{"key":"e_1_3_4_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00108"},{"key":"e_1_3_4_17_2","volume-title":"Proceedings of the 2022 IEEE\/ACM 44th International Conference on Software Engineering","author":"Fahmy Hazem","year":"2022","unstructured":"Hazem Fahmy, Fabrizio Pastore, and Lionel Briand. 2022. HUDD: A tool to debug DNNs for safety analysis. In Proceedings of the 2022 IEEE\/ACM 44th International Conference on Software Engineering."},{"key":"e_1_3_4_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/3395363.3397357"},{"key":"e_1_3_4_19_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.aaw4399"},{"key":"e_1_3_4_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/3106237.3106277"},{"key":"e_1_3_4_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/1375581.1375607"},{"key":"e_1_3_4_22_2","unstructured":"Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Proceedings of the 27th International Conference on Neural Information Processing Systems (NIPS\u201914). 2672\u20132680."},{"key":"e_1_3_4_23_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Goodfellow Ian","year":"2015","unstructured":"Ian Goodfellow, Jonathon Shlens, and Christian Szegedy. 2015. Explaining and harnessing adversarial examples. In Proceedings of the International Conference on Learning Representations. http:\/\/arxiv.org\/abs\/1412.6572."},{"key":"e_1_3_4_24_2","first-page":"6645","volume-title":"Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing","author":"Graves Alex","year":"2013","unstructured":"Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton. 2013. Speech recognition with deep recurrent neural networks. 
In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, Los Alamitos, CA, 6645\u20136649."},{"key":"e_1_3_4_25_2","article-title":"Google engineer apologizes after photos app tags two black people as gorillas","author":"Grush Loren","year":"2015","unstructured":"Loren Grush. 2015. Google engineer apologizes after photos app tags two black people as gorillas. The Verge. Retrieved February 21, 2023 from https:\/\/www.theverge.com\/2015\/7\/1\/8880363\/google-apologizes-photos-app-tags-two-black-people-gorillas.","journal-title":"The Verge."},{"key":"e_1_3_4_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/3236024.3264835"},{"key":"e_1_3_4_27_2","volume-title":"Proceedings of the Joint Meeting on Foundations of Software Engineering (FSE\u201920)","author":"Harel-Canada Fabrice","year":"2020","unstructured":"Fabrice Harel-Canada, Lingxiao Wang, Muhammad Ali Gulzar, Quanquan Gu, and Miryung Kim. 2020. Is neuron coverage a meaningful measure for testing deep neural networks? In Proceedings of the Joint Meeting on Foundations of Software Engineering (FSE\u201920)."},{"key":"e_1_3_4_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_4_29_2","article-title":"An empirical study on data distribution-aware test selection for deep learning enhancement","author":"Hu Qiang","year":"2022","unstructured":"Qiang Hu, Yuejun Guo, Maxime Cordy, Xiaofei Xie, Lei Ma, Mike Papadakis, and Yves Le Traon. 2022. An empirical study on data distribution-aware test selection for deep learning enhancement. 
ACM Transactions on Software Engineering and Methodology 31, 4 (2022), Article 78, 30.","journal-title":"ACM Transactions on Software Engineering and Methodology"},{"key":"e_1_3_4_30_2","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/978-3-319-63387-9_1","volume-title":"Proceedings of the International Conference on Computer Aided Verification","author":"Huang Xiaowei","year":"2017","unstructured":"Xiaowei Huang, Marta Kwiatkowska, Sen Wang, and Min Wu. 2017. Safety verification of deep neural networks. In Proceedings of the International Conference on Computer Aided Verification. 3\u201329."},{"key":"e_1_3_4_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/3377811.3380395"},{"key":"e_1_3_4_32_2","doi-asserted-by":"crossref","unstructured":"Todd Huster, Cho-Yu Jason Chiang, and Ritu Chadha. 2019. Limitations of the Lipschitz constant as a defense against adversarial examples. In ECML PKDD 2018 Workshops. Lecture Notes in Computer Science, Vol. 11329. Springer, 16\u201329.","DOI":"10.1007\/978-3-030-13453-2_2"},{"key":"e_1_3_4_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/WACV48630.2021.00159"},{"key":"e_1_3_4_34_2","article-title":"Towards proving the adversarial robustness of deep neural networks","author":"Katz Guy","year":"2017","unstructured":"Guy Katz, Clark Barrett, David L. Dill, Kyle Julian, and Mykel J. Kochenderfer. 2017. Towards proving the adversarial robustness of deep neural networks. arXiv preprint arXiv:1709.02802 (2017).","journal-title":"arXiv preprint arXiv:1709.02802"},{"key":"e_1_3_4_35_2","article-title":"Adversarial training with Voronoi constraints","author":"Khoury Marc","year":"2019","unstructured":"Marc Khoury and Dylan Hadfield-Menell. 2019. Adversarial training with Voronoi constraints. 
arXiv preprint arXiv:1905.01019 (2019).","journal-title":"arXiv preprint arXiv:1905.01019"},{"key":"e_1_3_4_36_2","first-page":"1039","volume-title":"Proceedings of the 2019 IEEE\/ACM 41st International Conference on Software Engineering (ICSE\u20192019)","author":"Kim Jinhan","year":"2019","unstructured":"Jinhan Kim, Robert Feldt, and Shin Yoo. 2019. Guiding deep learning system testing using surprise adequacy. In Proceedings of the 2019 IEEE\/ACM 41st International Conference on Software Engineering (ICSE\u20192019). IEEE, Los Alamitos, CA, 1039\u20131049."},{"key":"e_1_3_4_37_2","first-page":"202","volume-title":"Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD\u201996)","year":"1996","unstructured":"Ron Kohavi. 1996. Scaling up the accuracy of Naive-Bayes classifiers: A decision-tree hybrid. In Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD\u201996). 202\u2013207."},{"key":"e_1_3_4_38_2","unstructured":"Alex Krizhevsky and Geoffrey Hinton. 2009. Learning Multiple Layers of Features from Tiny Images. Technical Report. University of Toronto."},{"key":"e_1_3_4_39_2","doi-asserted-by":"publisher","DOI":"10.1038\/nature14539"},{"key":"e_1_3_4_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_3_4_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/3395363.3397346"},{"key":"e_1_3_4_42_2","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1109\/IVS.2011.5940562","volume-title":"Proceedings of the 2011 IEEE Intelligent Vehicles Symposium (IV\u201911)","author":"Levinson Jesse","year":"2011","unstructured":"Jesse Levinson, Jake Askeland, Jan Becker, Jennifer Dolson, David Held, Soeren Kammel, J. Zico Kolter, et\u00a0al. 2011. Towards fully autonomous driving: Systems and algorithms. In Proceedings of the 2011 IEEE Intelligent Vehicles Symposium (IV\u201911). 
IEEE, Los Alamitos, CA, 163\u2013168."},{"key":"e_1_3_4_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/3533767.3534408"},{"key":"e_1_3_4_44_2","first-page":"89","volume-title":"Proceedings of the 2019 IEEE\/ACM 41st International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER\u201919)","author":"Li Zenan","year":"2019","unstructured":"Zenan Li, Xiaoxing Ma, Chang Xu, and Chun Cao. 2019. Structural coverage criteria for neural networks could be misleading. In Proceedings of the 2019 IEEE\/ACM 41st International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER\u201919). IEEE, Los Alamitos, CA, 89\u201392."},{"key":"e_1_3_4_45_2","first-page":"614","volume-title":"Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution, and Reengineering (SANER\u201919)","author":"Ma Lei","year":"2019","unstructured":"Lei Ma, Felix Juefei-Xu, Minhui Xue, Bo Li, Li Li, Yang Liu, and Jianjun Zhao. 2019. DeepCT: Tomographic combinatorial testing for deep learning systems. In Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution, and Reengineering (SANER\u201919). IEEE, Los Alamitos, CA, 614\u2013618."},{"key":"e_1_3_4_46_2","doi-asserted-by":"publisher","DOI":"10.1145\/3238147.3238202"},{"key":"e_1_3_4_47_2","doi-asserted-by":"publisher","DOI":"10.1145\/3417330"},{"key":"e_1_3_4_48_2","first-page":"107332","article-title":"Understanding adversarial attacks on deep learning based medical image analysis systems","author":"Ma Xingjun","year":"2020","unstructured":"Xingjun Ma, Yuhao Niu, Lin Gu, Yisen Wang, Yitian Zhao, James Bailey, and Feng Lu. 2020. Understanding adversarial attacks on deep learning based medical image analysis systems. 
Pattern Recognition 110 (2020), 107332.","journal-title":"Pattern Recognition"},{"key":"e_1_3_4_49_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Madry Aleksander","year":"2018","unstructured":"Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. 2018. Towards deep learning models resistant to adversarial attacks. In Proceedings of the International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=rJzIBfZAb."},{"key":"e_1_3_4_50_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.dss.2014.03.001"},{"key":"e_1_3_4_51_2","volume-title":"Machine Learning: A Probabilistic Perspective","author":"Murphy Kevin P.","year":"2012","unstructured":"Kevin P. Murphy. 2012. Machine Learning: A Probabilistic Perspective. MIT Press, Cambridge, MA."},{"key":"e_1_3_4_52_2","unstructured":"Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y. Ng. 2011. Reading digits in natural images with unsupervised feature learning. In Proceedings of the NIPS Workshop on Deep Learning and Unsupervised Feature Learning."},{"key":"e_1_3_4_53_2","doi-asserted-by":"publisher","DOI":"10.1145\/1297846.1297902"},{"key":"e_1_3_4_54_2","doi-asserted-by":"publisher","DOI":"10.1109\/EuroSP.2016.36"},{"key":"e_1_3_4_55_2","doi-asserted-by":"crossref","unstructured":"Omkar M. Parkhi, Andrea Vedaldi, and Andrew Zisserman. 2015. Deep face recognition. In Proceedings of the 2015 British Machine Vision Conference (BMVC\u201915).","DOI":"10.5244\/C.29.41"},{"key":"e_1_3_4_56_2","doi-asserted-by":"publisher","DOI":"10.1145\/3132747.3132785"},{"key":"e_1_3_4_57_2","first-page":"355","volume-title":"Proceedings of the 2021 36th IEEE\/ACM International Conference on Automated Software Engineering (ASE\u201921)","author":"Riccio Vincenzo","year":"2021","unstructured":"Vincenzo Riccio, Nargiz Humbatova, Gunel Jahangirova, and Paolo Tonella. 2021. 
DeepMetis: Augmenting a deep learning test set to increase its mutation score. In Proceedings of the 2021 36th IEEE\/ACM International Conference on Automated Software Engineering (ASE\u201921). IEEE, Los Alamitos, CA, 355\u2013367."},{"key":"e_1_3_4_58_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-020-09881-0"},{"key":"e_1_3_4_59_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-008-9102-8"},{"key":"e_1_3_4_60_2","doi-asserted-by":"publisher","DOI":"10.1213\/ANE.0000000000002864"},{"key":"e_1_3_4_61_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2012.06.002"},{"key":"e_1_3_4_62_2","doi-asserted-by":"publisher","DOI":"10.1145\/3324884.3416621"},{"key":"e_1_3_4_63_2","article-title":"Very deep convolutional networks for large-scale image recognition","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).","journal-title":"arXiv preprint arXiv:1409.1556"},{"key":"e_1_3_4_64_2","doi-asserted-by":"publisher","DOI":"10.1145\/3290354"},{"key":"e_1_3_4_65_2","doi-asserted-by":"publisher","DOI":"10.1145\/3238147.3238172"},{"key":"e_1_3_4_66_2","volume-title":"Computer Vision: Algorithms and Applications","author":"Szeliski Richard","year":"2010","unstructured":"Richard Szeliski. 2010. Computer Vision: Algorithms and Applications. 
Springer Science & Business Media."},{"key":"e_1_3_4_67_2","doi-asserted-by":"publisher","DOI":"10.1145\/3180155.3180220"},{"key":"e_1_3_4_68_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-30942-8_39"},{"key":"e_1_3_4_69_2","doi-asserted-by":"publisher","DOI":"10.1145\/3238147.3238165"},{"key":"e_1_3_4_70_2","first-page":"300","volume-title":"Proceedings of the 2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE\u201921)","author":"Wang Jingyi","year":"2021","unstructured":"Jingyi Wang, Jialuo Chen, Youcheng Sun, Xingjun Ma, Dongxia Wang, Jun Sun, and Peng Cheng. 2021. RobOT: Robustness-oriented testing for deep learning systems. In Proceedings of the 2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE\u201921). IEEE, Los Alamitos, CA, 300\u2013311."},{"key":"e_1_3_4_71_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2019.00126"},{"key":"e_1_3_4_72_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00078"},{"key":"e_1_3_4_73_2","doi-asserted-by":"publisher","DOI":"10.1145\/3180155.3180177"},{"key":"e_1_3_4_74_2","unstructured":"Yisen Wang, Xingjun Ma, James Bailey, Jinfeng Yi, Bowen Zhou, and Quanquan Gu. 2019. On the convergence and robustness of adversarial training. In Proceedings of the 36th International Conference on Machine Learning. 6586\u20136595. http:\/\/proceedings.mlr.press\/v97\/wang19i.html."},{"key":"e_1_3_4_75_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Wang Yisen","year":"2019","unstructured":"Yisen Wang, Difan Zou, Jinfeng Yi, James Bailey, Xingjun Ma, and Quanquan Gu. 2019. Improving adversarial robustness requires revisiting misclassified examples. 
In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_4_76_2","first-page":"397","volume-title":"Proceedings of the 2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE\u201921)","author":"Wang Zan","year":"2021","unstructured":"Zan Wang, Hanmo You, Junjie Chen, Yingyi Zhang, Xuyuan Dong, and Wenbin Zhang. 2021. Prioritizing test inputs for deep neural networks via mutation analysis. In Proceedings of the 2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE\u201921). IEEE, Los Alamitos, CA, 397\u2013409."},{"key":"e_1_3_4_77_2","article-title":"Towards fast computation of certified robustness for ReLU networks","author":"Weng Tsui-Wei","year":"2018","unstructured":"Tsui-Wei Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Duane Boning, Inderjit S. Dhillon, and Luca Daniel. 2018. Towards fast computation of certified robustness for ReLU networks. arXiv preprint arXiv:1804.09699 (2018).","journal-title":"arXiv preprint arXiv:1804.09699"},{"key":"e_1_3_4_78_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Weng Tsui-Wei","year":"2018","unstructured":"Tsui-Wei Weng, Huan Zhang, Pin-Yu Chen, Jinfeng Yi, Dong Su, Yupeng Gao, Cho-Jui Hsieh, and Luca Daniel. 2018. Evaluating the robustness of neural networks: An extreme value theory approach. In Proceedings of the International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=BkUHlMZ0b."},{"key":"e_1_3_4_79_2","first-page":"408","volume-title":"Proceedings of the International Conference on Tools and Algorithms for the Construction and Analysis of Systems","author":"Wicker Matthew","year":"2018","unstructured":"Matthew Wicker, Xiaowei Huang, and Marta Kwiatkowska. 2018. Feature-guided black-box safety testing of deep neural networks. In Proceedings of the International Conference on Tools and Algorithms for the Construction and Analysis of Systems. 
408\u2013426."},{"key":"e_1_3_4_80_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-29044-2"},{"key":"e_1_3_4_81_2","article-title":"FASHION-MNIST: A novel image dataset for benchmarking machine learning algorithms","author":"Xiao Han","year":"2017","unstructured":"Han Xiao, Kashif Rasul, and Roland Vollgraf. 2017. FASHION-MNIST: A novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747 (2017).","journal-title":"arXiv preprint arXiv:1708.07747"},{"key":"e_1_3_4_82_2","doi-asserted-by":"publisher","DOI":"10.1145\/3293882.3330579"},{"key":"e_1_3_4_83_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-011-5268-1"},{"key":"e_1_3_4_84_2","article-title":"Improving neural network verification through spurious region guided refinement","author":"Yang Pengfei","year":"2020","unstructured":"Pengfei Yang, Renjue Li, Jianlin Li, Cheng-Chao Huang, Jingyi Wang, Jun Sun, Bai Xue, and Lijun Zhang. 2020. Improving neural network verification through spurious region guided refinement. arXiv preprint arXiv:2010.07722 (2020).","journal-title":"arXiv preprint arXiv:2010.07722"},{"key":"e_1_3_4_85_2","article-title":"DeepRepair: Style-guided repairing for deep neural networks in the real-world operational environment","author":"Yu Bing","year":"2022","unstructured":"Bing Yu, Hua Qi, Qing Guo, Felix Juefei-Xu, Xiaofei Xie, Lei Ma, and Jianjun Zhao. 2022. DeepRepair: Style-guided repairing for deep neural networks in the real-world operational environment. IEEE Transactions on Reliability 71, 4 (2022), 1401\u20131416.","journal-title":"IEEE Transactions on Reliability"},{"key":"e_1_3_4_86_2","first-page":"7472","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Zhang Hongyang","year":"2019","unstructured":"Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric Xing, Laurent El Ghaoui, and Michael Jordan. 2019. Theoretically principled trade-off between robustness and accuracy. 
In Proceedings of the International Conference on Machine Learning. PMLR, 7472\u20137482."},{"key":"e_1_3_4_87_2","article-title":"Machine learning testing: Survey, landscapes and horizons","author":"Zhang Jie M.","year":"2022","unstructured":"Jie M. Zhang, Mark Harman, Lei Ma, and Yang Liu. 2022. Machine learning testing: Survey, landscapes and horizons. IEEE Transactions on Software Engineering 48, 1 (2022), 1\u201336.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_3_4_88_2","doi-asserted-by":"publisher","DOI":"10.1145\/3377811.3380331"},{"key":"e_1_3_4_89_2","article-title":"Fairness testing of deep image classification with adequacy metrics","author":"Zhang Peixin","year":"2021","unstructured":"Peixin Zhang, Jingyi Wang, Jun Sun, and Xinyu Wang. 2021. Fairness testing of deep image classification with adequacy metrics. arXiv preprint arXiv:2111.08856 (2021).","journal-title":"arXiv preprint arXiv:2111.08856"},{"key":"e_1_3_4_90_2","article-title":"Automatic fairness testing of neural classifiers through adversarial sampling","author":"Zhang Peixin","year":"2022","unstructured":"Peixin Zhang, Jingyi Wang, Jun Sun, Xinyu Wang, Guoliang Dong, Xingen Wang, Ting Dai, and Jin Song Dong. 2022. Automatic fairness testing of neural classifiers through adversarial sampling. 
IEEE Transactions on Software Engineering 48, 9 (2022), 3593\u20133612.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_3_4_91_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.244"}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3582573","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3582573","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:09:13Z","timestamp":1750183753000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3582573"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,22]]},"references-count":90,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2023,9,30]]}},"alternative-id":["10.1145\/3582573"],"URL":"https:\/\/doi.org\/10.1145\/3582573","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,22]]},"assertion":[{"value":"2022-03-02","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-12-16","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-07-22","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}