{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,2]],"date-time":"2026-01-02T07:50:35Z","timestamp":1767340235856,"version":"build-2065373602"},"reference-count":63,"publisher":"MDPI AG","issue":"13","license":[{"start":{"date-parts":[[2023,7,5]],"date-time":"2023-07-05T00:00:00Z","timestamp":1688515200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Science and Technology Council (NSTC)","award":["110-2221-E-001-009-MY2","111-2634-F-002-022","111-2221-E-001-002","111-2634-F-006-012","111-2634-F-A49-010"],"award-info":[{"award-number":["110-2221-E-001-009-MY2","111-2634-F-002-022","111-2221-E-001-002","111-2634-F-006-012","111-2634-F-A49-010"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Adversarial attacks have become one of the most serious security issues in widely used deep neural networks. Even though real-world datasets usually have large intra-variations or multiple modes, most adversarial defense methods, such as adversarial training, which is currently one of the most effective defense methods, mainly focus on the single-mode setting and thus fail to capture the full data representation to defend against adversarial attacks. To confront this challenge, we propose a novel multi-prototype metric learning regularization for adversarial training which can effectively enhance the defense capability of adversarial training by preventing the latent representation of the adversarial example changing a lot from its clean one. With extensive experiments on CIFAR10, CIFAR100, MNIST, and Tiny ImageNet, the evaluation results show the proposed method improves the performance of different state-of-the-art adversarial training methods without additional computational cost. Furthermore, besides Tiny ImageNet, in the multi-prototype CIFAR10 and CIFAR100 where we reorganize the whole datasets of CIFAR10 and CIFAR100 into two and ten classes, respectively, the proposed method outperforms the state-of-the-art approach by 2.22% and 1.65%, respectively. Furthermore, the proposed multi-prototype method also outperforms its single-prototype version and other commonly used deep metric learning approaches as regularization for adversarial training and thus further demonstrates its effectiveness.<\/jats:p>","DOI":"10.3390\/s23136173","type":"journal-article","created":{"date-parts":[[2023,7,6]],"date-time":"2023-07-06T00:54:41Z","timestamp":1688604881000},"page":"6173","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Towards Adversarial Robustness for Multi-Mode Data through Metric Learning"],"prefix":"10.3390","volume":"23","author":[{"given":"Sarwar","family":"Khan","sequence":"first","affiliation":[{"name":"Research Center for Information Technology Innovation, Academia Sinica, Taipei 11529, Taiwan"},{"name":"Social Networks Human-Centered Computing, Taiwan International Graduate Program, Academia Sinica, Taipei 11529, Taiwan"},{"name":"Department of Computer Science, National Chengchi University, Taipei 11605, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jun-Cheng","family":"Chen","sequence":"additional","affiliation":[{"name":"Research Center for Information Technology Innovation, Academia Sinica, Taipei 11529, Taiwan"},{"name":"Social Networks Human-Centered Computing, Taiwan International Graduate Program, Academia Sinica, Taipei 11529, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wen-Hung","family":"Liao","sequence":"additional","affiliation":[{"name":"Social Networks Human-Centered Computing, Taiwan International Graduate Program, Academia Sinica, Taipei 11529, Taiwan"},{"name":"Department of Computer Science, National Chengchi University, Taipei 11605, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chu-Song","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Information Engineering, National Taiwan University, Taipei 106319, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,7,5]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Szeliski, R. (2010). Computer Vision: Algorithms and Applications, Springer Science, Business Media.","DOI":"10.1007\/978-1-84882-935-0"},{"key":"ref_2","unstructured":"Gonzalez, R., and Woods, R. (2008). Digital Image Processing, Pearson. [3rd ed.]. Available online: http:\/\/www.amazon.com\/Digital-Image-Processing-3rd-Edition\/dp\/013168728X."},{"key":"ref_3","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G. (2012, January 3\u20136). ImageNet Classification with Deep Convolutional Neural Networks. Proceedings of the Advances in Neural information Processing Systems 25, Lake Tahoe, NV, USA."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016\u20131, January 26). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_5","unstructured":"Szegedy, C., Toshev, A., and Erhan, D. (2013, January 5\u201310). Deep neural networks for object detection. Proceedings of the Advances in Neural information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_6","unstructured":"Sohn, K. (2016, January 5\u201310). Improved deep metric learning with multi-class n-pair loss objective. Proceedings of the 30th international Conference on Neural information Processing Systems, Barcelona, Spain."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_8","unstructured":"Simonyan, K., and Zisserman, A. (2015, January 7\u20139). Very Deep Convolutional Networks for Large-Scale Image Recognition. Proceedings of the International Conference on Learning Representations, San Diego, CA, USA."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Xu, M., Zhang, Z., Hu, H., Wang, J., Wang, L., Wei, F., Bai, X., and Liu, Z. (2021). End-to-End Semi-Supervised Object Detection with Soft Teacher. arXiv.","DOI":"10.1109\/ICCV48922.2021.00305"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Wang, C., Bochkovskiy, A., and Liao, H. (2021, January 20\u201325). Scaled-YOLOv4: Scaling Cross Stage Partial Network. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01283"},{"key":"ref_12","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015, January 7\u201312). Faster r-cnn: Towards real-time object detection with region proposal networks. Proceedings of the Advances in Neural information Processing Systems, Montreal, QC, Canada."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C., and Berg, A. (2016, January 11\u201314). Ssd: Single shot multibox detector. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_14","unstructured":"Howard, A., Sandler, M., Chu, G., Chen, L., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., and Vasudevan, V. (November, January 27). Searching for MobileNetV3. Proceedings of the IEEE\/CVF international Conference on Computer Vision (ICCV), Seoul, Republic of Korea."},{"key":"ref_15","unstructured":"Chen, L., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv."},{"key":"ref_16","unstructured":"Goodfellow, I., Shlens, J., and Szegedy, C. (2015, January 7\u20139). Explaining and harnessing adversarial examples. Proceedings of the International Conference on Learning Representations, San Diego, CA, USA."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Carlini, N., and Wagner, D. (2017, January 22\u201326). Towards evaluating the robustness of neural networks. Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP), San Jose, CA, USA.","DOI":"10.1109\/SP.2017.49"},{"key":"ref_18","unstructured":"Athalye, A., Carlini, N., and Wagner, D. (2018). Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. arXiv."},{"key":"ref_19","unstructured":"Madry, A., Makelov, A., Schmidt, L., Tsipras, D., and Vladu, A. (May, January 30). Towards deep learning models resistant to adversarial attacks. Proceedings of the international Conference on Learning Representations, Vancouver, BC, Canada."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Sun, Y., Cheng, C., Zhang, Y., Zhang, C., Zheng, L., Wang, Z., and Wei, Y. (2020, January 14\u201319). Circle loss: A unified perspective of pair similarity optimization. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00643"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Kaya, M., and Bilge, H. (2019). Deep metric learning: A survey. Symmetry, 11.","DOI":"10.3390\/sym11091066"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Oh Song, H., Xiang, Y., Jegelka, S., and Savarese, S. (2016\u20131, January 26). Deep metric learning via lifted structured feature embedding. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.434"},{"key":"ref_23","unstructured":"Roth, K., Milbich, T., Sinha, S., Gupta, P., Ommer, B., and Cohen, J. (2020, January 13\u201318). Revisiting training strategies and generalization performance in deep metric learning. Proceedings of the International Conference on Machine Learning, Virtual Event."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Duan, Y., Zheng, W., Lin, X., Lu, J., and Zhou, J. (2018, January 18\u201322). Deep adversarial metric learning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00294"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Wang, J., Zhou, F., Wen, S., Liu, X., and Lin, Y. (2017, January 22\u201329). Deep metric learning with angular loss. Proceedings of the IEEE international Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.283"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Hoffer, E., and Ailon, N. (2015, January 12\u201314). Deep metric learning using triplet network. Proceedings of the International Workshop on Similarity-Based Pattern Recognition, Copenhagen, Denmark.","DOI":"10.1007\/978-3-319-24261-3_7"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Movshovitz-Attias, Y., Toshev, A., Leung, T., Ioffe, S., and Singh, S. (2017, January 22\u201329). No fuss distance metric learning using proxies. Proceedings of the IEEE international Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.47"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Kim, S., Kim, D., Cho, M., and Kwak, S. (2020, January 14\u201319). Proxy Anchor Loss for Deep Metric Learning. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00330"},{"key":"ref_29","unstructured":"Qian, Q., Shang, L., Sun, B., Hu, J., Li, H., and Jin, R. (November, January 27). Softtriple loss: Deep metric learning without triplet sampling. Proceedings of the IEEE international Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_30","unstructured":"Mao, C., Zhong, Z., Yang, J., Vondrick, C., and Ray, B. (2019, January 8\u201314). Metric learning for adversarial robustness. Proceedings of the Advances in Neural information Processing Systems, Vancouver, BC, Canada."},{"key":"ref_31","unstructured":"Addepalli, S., and Jain, S. (December, January 28). Others Efficient and effective augmentation strategy for adversarial training. Proceedings of the Advances in Neural information Processing Systems, New Orleans, LA, USA."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Lee, S., and Lee, H. (2023, January 2\u20137). inducing Data Amplification Using Auxiliary Datasets in Adversarial Training. Proceedings of the IEEE\/CVF Winter Conference on Applications Of Computer Vision (WACV), Waikoloa, HI, USA.","DOI":"10.1109\/WACV56688.2023.00453"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Wei, Z., Wang, Y., Guo, Y., and Wang, Y. (2023, January 18\u201322). Cfa: Class-wise calibrated fair adversarial training. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00792"},{"key":"ref_34","unstructured":"LeCun, Y., and Cortes, C. (2021, April 06). MNIST Handwritten Digit Database. Available online: http:\/\/yann.lecun.com\/exdb\/mnist\/."},{"key":"ref_35","first-page":"2579","article-title":"Visualizing Data using t-SNE","volume":"9","author":"Maaten","year":"2008","journal-title":"J. Mach. Learn. Res."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"2278","DOI":"10.1109\/5.726791","article-title":"Gradient-based learning applied to document recognition","volume":"86","author":"LeCun","year":"1998","journal-title":"Proc. IEEE"},{"key":"ref_37","unstructured":"Shafahi, A., Najibi, M., Ghiasi, M., Xu, Z., Dickerson, J., Studer, C., Davis, L., Taylor, G., and Goldstein, T. (2019, January 8\u201314). Adversarial training for free!. Proceedings of the Advances in Neural information Processing Systems 32, Vancouver, BC, Canada."},{"key":"ref_38","unstructured":"Wong, E., Rice, L., and Kolter, J. (2020, January 26\u201330). Fast is better than free: Revisiting adversarial training. Proceedings of the International Conference on Learning Representations, Addis Ababa, Ethiopia."},{"key":"ref_39","unstructured":"Krizhevsky, A., Nair, V., and Hinton, G. (2023, April 23). CIFAR-10 (Canadian Institute for Advanced Research). Available online: http:\/\/www.cs.toronto.edu\/~kriz\/cifar.html."},{"key":"ref_40","first-page":"3","article-title":"Tiny imagenet visual recognition challenge","volume":"7","author":"Le","year":"2015","journal-title":"CS 231N"},{"key":"ref_41","unstructured":"Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., and Fergus, R. (2014, January 14\u201316). intriguing properties of neural networks. Proceedings of the international Conference on Learning Representations, Banff, AB, Canada."},{"key":"ref_42","unstructured":"Kurakin, A., Goodfellow, I., and Bengio, S. (2016, January 2\u20134). Adversarial examples in the physical world. Proceedings of the international Conference on Learning Representations, San Juan, Puerto Rico."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Papernot, N., McDaniel, P., Jha, S., Fredrikson, M., Celik, Z., and Swami, A. (2016, January 21\u201324). The limitations of deep learning in adversarial settings. Proceedings of the 2016 IEEE European Symposium on Security and Privacy (EuroS&P), Saarbrucken, Germany.","DOI":"10.1109\/EuroSP.2016.36"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Nguyen, A., Yosinski, J., and Clune, J. (2015, January 7\u201312). Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298640"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Dong, Y., Liao, F., Pang, T., Su, H., Zhu, J., Hu, X., and Li, J. (2018, January 18\u201322). Boosting adversarial attacks with momentum. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00957"},{"key":"ref_46","unstructured":"Papernot, N., McDaniel, P., and Goodfellow, I. (2016). Transferability in machine learning: From phenomena to black-box attacks using adversarial samples. arXiv."},{"key":"ref_47","unstructured":"Liu, Y., Chen, X., Liu, C., and Song, D. (2016, January 2\u20134). Delving into transferable adversarial examples and black-box attacks. Proceedings of the international Conference on Learning Representations, San Juan, Puerto Rico."},{"key":"ref_48","unstructured":"Samangouei, P., Kabkab, M., and Chellappa, R. (May, January 30). Defense-gan: Protecting classifiers against adversarial attacks using generative models. Proceedings of the international Conference on Learning Representations, Vancouver, BC, Canada."},{"key":"ref_49","unstructured":"Yan, Z., Guo, Y., and Zhang, C. (2018, January 3\u20138). Deep defense: Training dnns with improved adversarial robustness. Proceedings of the Advances in Neural information Processing Systems, Montreal, QC, Canada."},{"key":"ref_50","unstructured":"Zheng, S., Song, Y., Leung, T., and Goodfellow, I. (July, January 26). Improving the robustness of deep neural networks via stability training. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Liu, D., Wu, L., Zhao, H., Boussaid, F., Bennamoun, M., and Xie, X. (2022). Jacobian Norm with Selective input Gradient Regularization for Improved and interpretable Adversarial Defense. arXiv.","DOI":"10.2139\/ssrn.4452072"},{"key":"ref_52","unstructured":"Tram\u00e8r, F., Kurakin, A., Papernot, N., Goodfellow, I., Boneh, D., and McDaniel, P. (May, January 30). Ensemble adversarial training: Attacks and defenses. Proceedings of the international Conference on Learning Representations, Vancouver, BC, Canada."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Papernot, N., McDaniel, P., Wu, X., Jha, S., and Swami, A. (2016, January 22\u201326). Distillation as a defense to adversarial perturbations against deep neural networks. Proceedings of the 2016 IEEE Symposium on Security and Privacy (SP), San Jose, CA, USA.","DOI":"10.1109\/SP.2016.41"},{"key":"ref_54","unstructured":"Carlini, N., and Wagner, D. (2016). Defensive distillation is not robust to adversarial examples. arXiv."},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Xie, C., Wu, Y., Maaten, L., Yuille, A., and He, K. (2019, January 15\u201320). Feature denoising for improving adversarial robustness. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00059"},{"key":"ref_56","unstructured":"Goldberger, J., Hinton, G., Roweis, S., and Salakhutdinov, R. (2015, January 7\u201312). Neighbourhood components analysis. Proceedings of the Advances in Neural information Processing Systems, Montreal, QC, Canada."},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3587467","article-title":"Generative metric learning for adversarially robust open-world person re-identification","volume":"19","author":"Liu","year":"2023","journal-title":"ACM Trans. Multimed. Comput. Commun. Appl."},{"key":"ref_58","first-page":"2081","article-title":"Cross-entropy adversarial view adaptation for person re-identification","volume":"30","author":"Wu","year":"2019","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_59","unstructured":"Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., and Schiele, B. (July, January 26). The cityscapes dataset for semantic urban scene understanding. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1145\/3352020.3352024","article-title":"Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark","volume":"53","author":"Coleman","year":"2019","journal-title":"SIGOPS Oper. Syst. Rev."},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Zagoruyko, S., and Komodakis, N. (2016). Wide residual networks. arXiv.","DOI":"10.5244\/C.30.87"},{"key":"ref_62","unstructured":"Kim, M., Tack, J., and Hwang, S. (2020, January 6\u201312). Adversarial self-supervised contrastive learning. Proceedings of the Advances in Neural information Processing Systems, Online Conference."},{"key":"ref_63","unstructured":"Jiang, Z., Chen, T., Chen, T., and Wang, Z. (2020, January 6\u201312). Robust pre-training by adversarial contrastive learning. Proceedings of the Advances in Neural information Processing Systems, Online Conference."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/13\/6173\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:06:49Z","timestamp":1760126809000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/13\/6173"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,5]]},"references-count":63,"journal-issue":{"issue":"13","published-online":{"date-parts":[[2023,7]]}},"alternative-id":["s23136173"],"URL":"https:\/\/doi.org\/10.3390\/s23136173","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2023,7,5]]}}}