{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,15]],"date-time":"2026-01-15T07:41:25Z","timestamp":1768462885420,"version":"3.49.0"},"reference-count":44,"publisher":"Association for Computing Machinery (ACM)","issue":"10","license":[{"start":{"date-parts":[[2024,10,16]],"date-time":"2024-10-16T00:00:00Z","timestamp":1729036800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key R&D Program of China","doi-asserted-by":"crossref","award":["2020YFA0711400"],"award-info":[{"award-number":["2020YFA0711400"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62171323 and 62271155"],"award-info":[{"award-number":["62171323 and 62271155"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Shanghai Municipal Science and Technology Major Project","award":["2021SHZDZX0100"],"award-info":[{"award-number":["2021SHZDZX0100"]}]},{"name":"Changjiang Scholars Program of China"},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"crossref"}]},{"name":"State Key Program of National Natural Science of China","award":["62331001"],"award-info":[{"award-number":["62331001"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2024,10,31]]},"abstract":"<jats:p>The quest for enhancing the interpretability of neural networks has become a prominent focus in recent research endeavors. 
Prototype-based neural networks have emerged as a promising avenue for imbuing models with interpretability by gauging the similarity between image components and category prototypes to inform decision-making. However, these networks face challenges as they share similarity activations during both the inference and explanation processes, creating a tradeoff between accuracy and interpretability. To address this issue and ensure that a network achieves high accuracy and robust interpretability in the classification process, this article introduces a groundbreaking prototype-based neural network termed the \u201cDecoupling Prototypical Network\u201d (DProtoNet). This novel architecture comprises encoder, inference, and interpretation modules. In the encoder module, we introduce decoupling feature masks to facilitate the generation of feature vectors and prototypes, enhancing the generalization capabilities of the model. The inference module leverages these feature vectors and prototypes to make predictions based on similarity comparisons, thereby preserving an interpretable inference structure. Meanwhile, the interpretation module advances the field by presenting a novel approach: a \u201cmultiple dynamic masks decoder\u201d that replaces conventional upsampling similarity activations. This decoder operates by perturbing images with mask vectors of varying sizes and learning saliency maps through consistent activation. This methodology offers a precise and innovative means of interpreting prototype-based networks. DProtoNet effectively separates the inference and explanation components within prototype-based networks. By eliminating the constraints imposed by shared similarity activations during the inference and explanation phases, our approach concurrently elevates accuracy and interpretability. 
Experimental evaluations on diverse public natural datasets, including CUB-200-2011, Stanford Cars, and medical datasets like RSNA and iChallenge-PM, corroborate the substantial enhancements achieved by our method compared to previous state-of-the-art approaches. Furthermore, ablation studies are conducted to provide additional evidence of the effectiveness of our proposed components.<\/jats:p>","DOI":"10.1145\/3674837","type":"journal-article","created":{"date-parts":[[2024,7,10]],"date-time":"2024-07-10T15:12:20Z","timestamp":1720624340000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["Decoupling Deep Learning for Enhanced Image Recognition Interpretability"],"prefix":"10.1145","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2680-5822","authenticated-orcid":false,"given":"Yitao","family":"Peng","sequence":"first","affiliation":[{"name":"School of Electronic and Information Engineering, Tongji University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5250-170X","authenticated-orcid":false,"given":"Lianghua","family":"He","sequence":"additional","affiliation":[{"name":"Shanghai Eye Disease Prevention and Treatment Center, Shanghai, China and School of Electronic and Information Engineering, Tongji University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8081-8512","authenticated-orcid":false,"given":"Die","family":"Hu","sequence":"additional","affiliation":[{"name":"School of Information Science and Technology, Fudan University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4257-2528","authenticated-orcid":false,"given":"Yihang","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Electronic and Information Engineering, Tongji University, 
Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5791-145X","authenticated-orcid":false,"given":"Longzhen","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Electronic and Information Engineering, Tongji University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2809-1421","authenticated-orcid":false,"given":"Shaohua","family":"Shang","sequence":"additional","affiliation":[{"name":"School of Electronic and Information Engineering, Tongji University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,10,16]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"crossref","first-page":"839","DOI":"10.1109\/WACV.2018.00097","volume-title":"Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV)","author":"Chattopadhay Aditya","year":"2018","unstructured":"Aditya Chattopadhay, Anirban Sarkar, Prantik Howlader, and Vineeth N. Balasubramanian. 2018. Grad-cam++: Generalized gradient-based visual explanations for deep convolutional networks. In Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 839\u2013847."},{"key":"e_1_3_1_3_2","first-page":"8930","volume-title":"Proceedings of the 33rd International Conference on Neural Information Processing Systems","author":"Chen Chaofan","year":"2019","unstructured":"Chaofan Chen, Oscar Li, Daniel Tao, Alina Barnett, Cynthia Rudin, and Jonathan K Su. 2019. This looks like that: Deep learning for interpretable image recognition. Proceedings of the 33rd International Conference on Neural Information Processing Systems. 
8930\u20138941."},{"key":"e_1_3_1_4_2","doi-asserted-by":"crossref","first-page":"248","DOI":"10.1109\/CVPR.2009.5206848","volume-title":"Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition","author":"Deng Jia","year":"2009","unstructured":"Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. Imagenet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 248\u2013255."},{"key":"e_1_3_1_5_2","first-page":"10265","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Donnelly Jon","year":"2022","unstructured":"Jon Donnelly, Alina Jade Barnett, and Chaofan Chen. 2022. Deformable protopnet: An interpretable image classifier using deformable prototypes. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 10265\u201310275."},{"key":"e_1_3_1_6_2","volume-title":"PALM: Pathologic Myopia Challenge","author":"Fu Huazhu","year":"2019","unstructured":"Huazhu Fu, Fei Li, Jos\u00e9 Ignacio Orlando, Hrvoje Bogunovic, Xu Sun, Jingan Liao, Yanwu Xu, Shaochong Zhang, and Xiulan Zhang. 2019. PALM: Pathologic Myopia Challenge. IEEE Dataport."},{"key":"e_1_3_1_7_2","unstructured":"Ruigang Fu Qingyong Hu Xiaohu Dong Yulan Guo Yinghui Gao and Biao Li. 2020. Axiom-based Grad-CAM: Towards accurate visualization and explanation of CNNs. arXiv:2008.02312. Retrieved from https:\/\/arxiv.org\/abs\/2008.02312"},{"key":"e_1_3_1_8_2","first-page":"350","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Gabruseva Tatiana","year":"2020","unstructured":"Tatiana Gabruseva, Dmytro Poplavskiy, and Alexandr Kalinin. 2020. Deep learning for automatic pneumonia detection. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 
350\u2013351."},{"key":"e_1_3_1_9_2","first-page":"65","volume-title":"Neurocomputing","author":"Guo Peng","year":"2022","unstructured":"Peng Guo, Guoqing Du, Longsheng Wei, Huaiying Lu, Siwei Chen, Changxin Gao, Ying Chen, Jinsheng Li, and Dapeng Luo. 2022. Multiscale face recognition in cluttered backgrounds based on visual attention. Neurocomputing 469 (2022), 65\u201380."},{"key":"e_1_3_1_10_2","first-page":"770","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"He Kaiming","year":"2016","unstructured":"Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 770\u2013778."},{"key":"e_1_3_1_11_2","first-page":"4700","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Huang Gao","year":"2017","unstructured":"Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q. Weinberger. 2017. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4700\u20134708."},{"key":"e_1_3_1_12_2","doi-asserted-by":"crossref","first-page":"5875","DOI":"10.1109\/TIP.2021.3089943","article-title":"Layercam: Exploring hierarchical class activation maps for localization","volume":"30","author":"Jiang Peng-Tao","year":"2021","unstructured":"Peng-Tao Jiang, Chang-Bin Zhang, Qibin Hou, Ming-Ming Cheng, and Yunchao Wei. 2021. Layercam: Exploring hierarchical class activation maps for localization. IEEE Transactions on Image Processing 30 (2021), 5875\u20135888.","journal-title":"IEEE Transactions on Image Processing"},{"key":"e_1_3_1_13_2","first-page":"10233","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Keswani Monish","year":"2022","unstructured":"Monish Keswani, Sriranjani Ramakrishnan, Nishant Reddy, and Vineeth N. 
Balasubramanian. 2022. Proto2Proto: Can you recognize the car, the way I do? In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 10233\u201310243."},{"key":"e_1_3_1_14_2","first-page":"15719","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Kim Eunji","year":"2021","unstructured":"Eunji Kim, Siwon Kim, Minji Seo, and Sungroh Yoon. 2021. XProtoNet: Diagnosis in chest radiography with global and local explanations. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 15719\u201315728."},{"key":"e_1_3_1_15_2","first-page":"554","volume-title":"Proceedings of the IEEE International Conference on Computer Vision Workshops","author":"Krause Jonathan","year":"2013","unstructured":"Jonathan Krause, Michael Stark, Jia Deng, and Li Fei-Fei. 2013. 3D object representations for fine-grained categorization. In Proceedings of the IEEE International Conference on Computer Vision Workshops. 554\u2013561."},{"key":"e_1_3_1_16_2","first-page":"2453","volume-title":"Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision","author":"Laradji Issam","year":"2021","unstructured":"Issam Laradji, Pau Rodriguez, Oscar Manas, Keegan Lensink, Marco Law, Lironne Kurzman, William Parker, David Vazquez, and Derek Nowrouzezahrai. 2021. A weakly supervised consistency-based learning method for COVID-19 segmentation in CT images. In Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision. 2453\u20132462."},{"issue":"4","key":"e_1_3_1_17_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3574136","article-title":"DDIFN: A dual-discriminator multi-modal medical image fusion network","volume":"19","author":"Liu Hui","year":"2023","unstructured":"Hui Liu, Shanshan Li, Jicheng Zhu, Kai Deng, Meng Liu, and Liqiang Nie. 2023. DDIFN: A dual-discriminator multi-modal medical image fusion network. 
ACM Transactions on Multimedia Computing, Communications, and Applications 19, 4 (2023), 1\u201317.","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"issue":"5","key":"e_1_3_1_18_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3585388","article-title":"TEVL: Trilinear encoder for video-language representation learning","volume":"19","author":"Man Xin","year":"2023","unstructured":"Xin Man, Jie Shao, Feiyu Chen, Mingxing Zhang, and Heng Tao Shen. 2023. TEVL: Trilinear encoder for video-language representation learning. ACM Transactions on Multimedia Computing, Communications, and Applications 19, 5s (2023), 1\u201320.","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"key":"e_1_3_1_19_2","first-page":"198","volume-title":"Proceedings of the European Conference on Artificial Intelligence","author":"Nauta Meike","year":"2023","unstructured":"Meike Nauta, Johannes H. Hegeman, Jeroen Geerdink, J\u00f6rg Schl\u00f6tterer, Maurice van Keulen, and Christin Seifert. 2023. Interpreting and correcting medical image classification with pip-net. In Proceedings of the European Conference on Artificial Intelligence. Springer, 198\u2013215."},{"key":"e_1_3_1_20_2","first-page":"2744","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Nauta Meike","year":"2023","unstructured":"Meike Nauta, J\u00f6rg Schl\u00f6tterer, Maurice van Keulen, and Christin Seifert. 2023. PIP-Net: Patch-based intuitive prototypes for interpretable image classification. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 
2744\u20132753."},{"key":"e_1_3_1_21_2","doi-asserted-by":"crossref","first-page":"397","DOI":"10.1007\/978-3-031-44064-9_21","volume-title":"Proceedings of the World Conference on Explainable Artificial Intelligence","author":"Nauta Meike","year":"2023","unstructured":"Meike Nauta and Christin Seifert. 2023. The Co-12 recipe for evaluating interpretable part-prototype image classifiers. In Proceedings of the World Conference on Explainable Artificial Intelligence. Springer, 397\u2013420."},{"issue":"13","key":"e_1_3_1_22_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3583558","article-title":"From anecdotal evidence to quantitative evaluation methods: A systematic review on evaluating explainable AI","volume":"55","author":"Nauta Meike","year":"2023","unstructured":"Meike Nauta, Jan Trienes, Shreyasi Pathak, Elisa Nguyen, Michelle Peters, Yasmin Schmitt, J\u00f6rg Schl\u00f6tterer, Maurice van Keulen, and Christin Seifert. 2023. From anecdotal evidence to quantitative evaluation methods: A systematic review on evaluating explainable AI. ACM Computing Surveys 55, 13s (2023), 1\u201342.","journal-title":"ACM Computing Surveys"},{"key":"e_1_3_1_23_2","first-page":"14933","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Nauta Meike","year":"2021","unstructured":"Meike Nauta, Ron Van Bree, and Christin Seifert. 2021. Neural prototype trees for interpretable fine-grained image recognition. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 14933\u201314943."},{"key":"e_1_3_1_24_2","unstructured":"Vitali Petsiuk Abir Das and Kate Saenko. 2018. Rise: Randomized input sampling for explanation of black-box models. arXiv:1806.07421. 
Retrieved from https:\/\/arxiv.org\/abs\/1806.07421"},{"key":"e_1_3_1_25_2","first-page":"11443","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Petsiuk Vitali","year":"2021","unstructured":"Vitali Petsiuk, Rajiv Jain, Varun Manjunatha, Vlad I Morariu, Ashutosh Mehra, Vicente Ordonez, and Kate Saenko. 2021. Black-box explanation of object detectors via saliency maps. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 11443\u201311452."},{"key":"e_1_3_1_26_2","first-page":"983","volume-title":"Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision","author":"Desai Saurabh","year":"2020","unstructured":"Saurabh Desai and Harish Guruprasad Ramaswamy. 2020. Ablation-CAM: Visual explanations for deep convolutional network via gradient-free localization. In Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision. 983\u2013991."},{"issue":"5","key":"e_1_3_1_27_2","doi-asserted-by":"crossref","first-page":"206","DOI":"10.1038\/s42256-019-0048-x","article-title":"Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead","volume":"1","author":"Rudin Cynthia","year":"2019","unstructured":"Cynthia Rudin. 2019. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence 1, 5 (2019), 206\u2013215.","journal-title":"Nature Machine Intelligence"},{"key":"e_1_3_1_28_2","first-page":"351","volume-title":"Proceedings of the 17th European Conference on Computer Vision (ECCV \u201922)","author":"Rymarczyk Dawid","year":"2022","unstructured":"Dawid Rymarczyk, \u0141ukasz Struski, Micha\u0142 G\u00f3rszczak, Koryna Lewandowska, Jacek Tabor, and Bartosz Zieli\u0144ski. 2022. Interpretable image classification with differentiable prototypes assignment. 
In Proceedings of the 17th European Conference on Computer Vision (ECCV \u201922). Springer, 351\u2013368."},{"key":"e_1_3_1_29_2","doi-asserted-by":"crossref","first-page":"1420","DOI":"10.1145\/3447548.3467245","volume-title":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","author":"Rymarczyk Dawid","year":"2021","unstructured":"Dawid Rymarczyk, \u0141ukasz Struski, Jacek Tabor, and Bartosz Zieli\u0144ski. 2021. ProtoPShare: Prototypical parts sharing for similarity discovery in interpretable image classification. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 1420\u20131430."},{"key":"e_1_3_1_30_2","first-page":"618","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Selvaraju Ramprasaath R","year":"2017","unstructured":"Ramprasaath R Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. 2017. Grad-cam: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision. 618\u2013626."},{"key":"e_1_3_1_31_2","first-page":"3145","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Shrikumar Avanti","year":"2017","unstructured":"Avanti Shrikumar, Peyton Greenside, and Anshul Kundaje. 2017. Learning important features through propagating activation differences. In Proceedings of the International Conference on Machine Learning. PMLR, 3145\u20133153."},{"key":"e_1_3_1_32_2","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556. 
Retrieved from https:\/\/arxiv.org\/abs\/1409.1556"},{"key":"e_1_3_1_33_2","doi-asserted-by":"crossref","first-page":"85198","DOI":"10.1109\/ACCESS.2021.3087583","article-title":"An interpretable deep learning model for COVID-19 detection with chest x-ray images","volume":"9","author":"Singh Gurmail","year":"2021","unstructured":"Gurmail Singh and Kin-Choong Yow. 2021. An interpretable deep learning model for COVID-19 detection with chest x-ray images. IEEE Access 9 (2021), 85198\u201385208.","journal-title":"IEEE Access"},{"key":"e_1_3_1_34_2","doi-asserted-by":"crossref","first-page":"41482","DOI":"10.1109\/ACCESS.2021.3064838","article-title":"These do not look like those: An interpretable deep learning model for image recognition","volume":"9","author":"Singh Gurmail","year":"2021","unstructured":"Gurmail Singh and Kin-Choong Yow. 2021. These do not look like those: An interpretable deep learning model for image recognition. IEEE Access 9 (2021), 41482\u201341493.","journal-title":"IEEE Access"},{"key":"e_1_3_1_35_2","first-page":"3319","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Sundararajan Mukund","year":"2017","unstructured":"Mukund Sundararajan, Ankur Taly, and Qiqi Yan. 2017. Axiomatic attribution for deep networks. In Proceedings of the International Conference on Machine Learning. PMLR, 3319\u20133328."},{"issue":"2","key":"e_1_3_1_36_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3546194","article-title":"Optimized deep-neural network for content-based medical image retrieval in a brownfield IoMT network","volume":"18","author":"Tiwari Arti","year":"2022","unstructured":"Arti Tiwari and Millie Pant. 2022. Optimized deep-neural network for content-based medical image retrieval in a brownfield IoMT network. 
ACM Transactions on Multimedia Computing, Communications, and Applications 18, 2s (2022), 1\u201326.","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"key":"e_1_3_1_37_2","volume-title":"The Caltech-UCSD Birds-200-2011 Dataset","author":"Wah Catherine","year":"2011","unstructured":"Catherine Wah, Steve Branson, Peter Welinder, Pietro Perona, and Serge Belongie. 2011. The Caltech-UCSD Birds-200-2011 Dataset. Technical Report CNS-TR-2011-001. California Institute of Technology."},{"key":"e_1_3_1_38_2","first-page":"24","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Wang Haofan","year":"2020","unstructured":"Haofan Wang, Zifan Wang, Mengnan Du, Fan Yang, Zijian Zhang, Sirui Ding, Piotr Mardziel, and Xia Hu. 2020. Score-CAM: Score-weighted visual explanations for convolutional neural networks. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 24\u201325."},{"key":"e_1_3_1_39_2","first-page":"895","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Wang Jiaqi","year":"2021","unstructured":"Jiaqi Wang, Huafeng Liu, Xinyue Wang, and Liping Jing. 2021. Interpretable image recognition by constructing transparent embedding space. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 895\u2013904."},{"issue":"1","key":"e_1_3_1_40_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3522713","article-title":"Boosting hyperspectral image classification with dual hierarchical learning","volume":"19","author":"Wang Shuo","year":"2023","unstructured":"Shuo Wang, Huixia Ben, Yanbin Hao, Xiangnan He, and Meng Wang. 2023. Boosting hyperspectral image classification with dual hierarchical learning. 
ACM Transactions on Multimedia Computing, Communications, and Applications 19, 1 (2023), 1\u201319.","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"issue":"1","key":"e_1_3_1_41_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3531016","article-title":"BMIF: Privacy-preserving blockchain-based medical image fusion","volume":"19","author":"Xiang Tao","year":"2023","unstructured":"Tao Xiang, Honghong Zeng, Biwen Chen, and Shangwei Guo. 2023. BMIF: Privacy-preserving blockchain-based medical image fusion. ACM Transactions on Multimedia Computing, Communications, and Applications 19, 1s (2023), 1\u201323.","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"key":"e_1_3_1_42_2","first-page":"1s","article-title":"BiRe-ID: Binary neural network for efficient person re-ID","volume":"18","author":"Xu Sheng","year":"2022","unstructured":"Sheng Xu, Chang Liu, Baochang Zhang, Jinhu L\u00fc, Guodong Guo, and David Doermann. 2022. BiRe-ID: Binary neural network for efficient person re-ID. ACM Transactions on Multimedia Computing, Communications, and Applications 18, 1s (2022), 1\u201322.","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"issue":"4","key":"e_1_3_1_43_2","first-page":"2019","article-title":"Interpreting image classifiers by generating discrete masks","volume":"44","author":"Yuan Hao","year":"2020","unstructured":"Hao Yuan, Lei Cai, Xia Hu, Jie Wang, and Shuiwang Ji. 2020. Interpreting image classifiers by generating discrete masks. 
IEEE Transactions on Pattern Analysis and Machine Intelligence 44, 4 (2020), 2019\u20132030.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_1_44_2","first-page":"2921","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Zhou Bolei","year":"2016","unstructured":"Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. 2016. Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2921\u20132929."},{"issue":"2","key":"e_1_3_1_45_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3550278","article-title":"Aligning image semantics and label concepts for image multi-label classification","volume":"19","author":"Zhou Wei","year":"2023","unstructured":"Wei Zhou, Zhiwu Xia, Peng Dou, Tao Su, and Haifeng Hu. 2023. Aligning image semantics and label concepts for image multi-label classification. 
ACM Transactions on Multimedia Computing, Communications, and Applications 19, 2 (2023), 1\u201323.","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3674837","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3674837","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:06:02Z","timestamp":1750291562000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3674837"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,16]]},"references-count":44,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2024,10,31]]}},"alternative-id":["10.1145\/3674837"],"URL":"https:\/\/doi.org\/10.1145\/3674837","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,10,16]]},"assertion":[{"value":"2023-10-23","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-06-17","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-10-16","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}