{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,7]],"date-time":"2026-03-07T08:28:00Z","timestamp":1772872080040,"version":"3.50.1"},"reference-count":69,"publisher":"Association for Computing Machinery (ACM)","issue":"7","license":[{"start":{"date-parts":[[2024,5,15]],"date-time":"2024-05-15T00:00:00Z","timestamp":1715731200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61972068 and 61932020"],"award-info":[{"award-number":["61972068 and 61932020"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Joint Funds of Liaoning Science and Technology Program","award":["2023JH2\/101800032"],"award-info":[{"award-number":["2023JH2\/101800032"]}]},{"DOI":"10.13039\/501100018617","name":"Liaoning Revitalization Talents Program","doi-asserted-by":"crossref","award":["XLYC2007023"],"award-info":[{"award-number":["XLYC2007023"]}],"id":[{"id":"10.13039\/501100018617","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Taishan Scholars Program of Shandong Province","award":["tsqn202312188 and tstp20221128"],"award-info":[{"award-number":["tsqn202312188 and tstp20221128"]}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. 
Appl."],"published-print":{"date-parts":[[2024,7,31]]},"abstract":"<jats:p>\n            While significant progress has been made in recent years in the field of salient object detection, there are still limitations in heterogeneous modality fusion and salient feature integrity learning. The former is primarily attributed to a paucity of attention from researchers to the fusion of cross-scale information between different modalities during processing multi-modal heterogeneous data, coupled with an absence of methods for adaptive control of their respective contributions. The latter constraint stems from the shortcomings in existing approaches concerning the prediction of salient region\u2019s integrity. To address these problems, we propose a Heterogeneous Fusion and Integrity Learning Network for RGB-D Salient Object Detection (HFIL-Net). In response to the first challenge, we design an Advanced Semantic Guidance Aggregation (ASGA) module, which utilizes three fusion blocks to achieve the aggregation of three types of information: within-scale cross-modal, within-modal cross-scale, and cross-modal cross-scale. In addition, we embed the local fusion factor matrices in the ASGA module and utilize the global fusion factor matrices in the Multi-modal Information Adaptive Fusion module to control the contributions adaptively from different perspectives during the fusion process. For the second issue, we introduce the Feature Integrity Learning and Refinement Module. It leverages the idea of \u201cpart-whole\u201d relationships from capsule networks to learn feature integrity and further refine the learned features through attention mechanisms. Extensive experimental results demonstrate that our proposed HFIL-Net outperforms over 17 state-of-the-art detection methods in testing across seven challenging standard datasets. 
Codes and results are available on\n            <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"https:\/\/github.com\/BojueGao\/HFIL-Net\">https:\/\/github.com\/BojueGao\/HFIL-Net<\/jats:ext-link>\n            .\n          <\/jats:p>","DOI":"10.1145\/3656476","type":"journal-article","created":{"date-parts":[[2024,4,5]],"date-time":"2024-04-05T12:00:03Z","timestamp":1712318403000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":17,"title":["Heterogeneous Fusion and Integrity Learning Network for RGB-D Salient Object Detection"],"prefix":"10.1145","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-1951-3957","authenticated-orcid":false,"given":"Haorao","family":"Gao","sequence":"first","affiliation":[{"name":"School of Information and Communication Engineering, Dalian Minzu University, Dalian, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-2773-9969","authenticated-orcid":false,"given":"Yiming","family":"Su","sequence":"additional","affiliation":[{"name":"School of Information and Communication Engineering, Dalian Minzu University, Dalian, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0946-0789","authenticated-orcid":false,"given":"Fasheng","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Information and Communication Engineering, Dalian Minzu University, Dalian, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3882-2205","authenticated-orcid":false,"given":"Haojie","family":"Li","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,5,15]]},"reference":[{"key":"e_1_3_2_2_2","first-page":"1597","volume-title":"Proceedings of the International Conference on Computer Vision and Pattern Recognition","author":"Achanta Radhakrishna","year":"2009","unstructured":"Radhakrishna Achanta, Sheila Hemami, Francisco Estrada, and Sabine Susstrunk. 2009. Frequency-tuned salient region detection. In Proceedings of the International Conference on Computer Vision and Pattern Recognition. 1597\u20131604."},{"key":"e_1_3_2_3_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3597612","article-title":"Dynamic message propagation network for RGB-D and video salient object detection","volume":"20","author":"Chen Baian","year":"2023","unstructured":"Baian Chen, Zhilei Chen, Xiaowei Hu, Jun Xu, Haoran Xie, Jing Qin, and Mingqiang Wei. 2023. Dynamic message propagation network for RGB-D and video salient object detection. ACM Trans. Multimedia Comput. Commun. Appl. 20, 1 (2023), 1\u201321.","journal-title":"ACM Trans. Multimedia Comput. Commun. Appl."},{"issue":"9","key":"e_1_3_2_4_2","doi-asserted-by":"crossref","first-page":"6308","DOI":"10.1109\/TCSVT.2022.3166914","article-title":"CGMDRNet: Cross-guided modality difference reduction network for RGB-T salient object detection","volume":"32","author":"Chen Gang","year":"2022","unstructured":"Gang Chen, Feng Shao, Xiongli Chai, Hangwei Chen, Qiuping Jiang, Xiangchao Meng, and Yo-Sung Ho. 2022. CGMDRNet: Cross-guided modality difference reduction network for RGB-T salient object detection. IEEE Trans. Circ. Syst. Vid. Technol. 32, 9 (2022), 6308\u20136323.","journal-title":"IEEE Trans. Circ. Syst. Vid. 
Technol."},{"issue":"4","key":"e_1_3_2_5_2","doi-asserted-by":"crossref","first-page":"1787","DOI":"10.1109\/TCSVT.2022.3215979","article-title":"Modality-induced transfer-fusion network for RGB-D and RGB-T salient object detection","volume":"33","author":"Chen Gang","year":"2022","unstructured":"Gang Chen, Feng Shao, Xiongli Chai, Hangwei Chen, Qiuping Jiang, Xiangchao Meng, and Yo-Sung Ho. 2022. Modality-induced transfer-fusion network for RGB-D and RGB-T salient object detection. IEEE Trans. Circ. Syst. Vid. Technol. 33, 4 (2022), 1787\u20131801.","journal-title":"IEEE Trans. Circ. Syst. Vid. Technol."},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","unstructured":"Hao Chen and Feihong Shen. 2023. Hierarchical cross-modal transformer for RGB-D salient object detection. arXiv preprint arXiv:2302.08052 (2023). DOI:10.48550\/arXiv.2302.08052","DOI":"10.48550\/arXiv.2302.08052"},{"key":"e_1_3_2_7_2","doi-asserted-by":"crossref","first-page":"107740","DOI":"10.1016\/j.patcog.2020.107740","article-title":"EF-Net: A novel enhancement and fusion network for RGB-D saliency detection","volume":"112","author":"Chen Qian","year":"2021","unstructured":"Qian Chen, Keren Fu, Ze Liu, Geng Chen, Hongwei Du, Bensheng Qiu, and Ling Shao. 2021. EF-Net: A novel enhancement and fusion network for RGB-D saliency detection. Pattern Recogn. 112 (2021), 107740.","journal-title":"Pattern Recogn."},{"key":"e_1_3_2_8_2","article-title":"3-d convolutional neural networks for rgb-d salient object detection and beyond","author":"Chen Qian","year":"2024","unstructured":"Qian Chen, Zhenxi Zhang, Yanye Lu, Keren Fu, and Qijun Zhao. 2024. 3-d convolutional neural networks for rgb-d salient object detection and beyond. IEEE Trans. Neural Netw. Learn. Syst. 35, 3 (2024), 4309\u20134323.","journal-title":"IEEE Trans. Neural Netw. Learn. 
Syst."},{"key":"e_1_3_2_9_2","doi-asserted-by":"crossref","first-page":"4253","DOI":"10.1109\/TMM.2022.3172852","article-title":"Depth-induced gap-reducing network for RGB-D salient object detection: An interaction, guidance and refinement approach","volume":"25","author":"Cheng Xiaolong","year":"2023","unstructured":"Xiaolong Cheng, Xuan Zheng, Jialun Pei, He Tang, Zehua Lyu, and Chuanbo Chen. 2023. Depth-induced gap-reducing network for RGB-D salient object detection: An interaction, guidance and refinement approach. IEEE Trans. Multimedia 25 (2023), 4253\u20134266.","journal-title":"IEEE Trans. Multimedia"},{"key":"e_1_3_2_10_2","first-page":"23","volume-title":"Proceedings of the International Conference on Internet Multimedia Computing and Service","author":"Cheng Yupeng","year":"2014","unstructured":"Yupeng Cheng, Huazhu Fu, Xingxing Wei, Jiangjian Xiao, and Xiaochun Cao. 2014. Depth enhanced saliency detection method. In Proceedings of the International Conference on Internet Multimedia Computing and Service. 23\u201327."},{"key":"e_1_3_2_11_2","doi-asserted-by":"crossref","first-page":"6800","DOI":"10.1109\/TIP.2022.3216198","article-title":"CIR-Net: Cross-modality interaction and refinement for RGB-D salient object detection","volume":"31","author":"Cong Runmin","year":"2022","unstructured":"Runmin Cong, Qinwei Lin, Chen Zhang, Chongyi Li, Xiaochun Cao, Qingming Huang, and Yao Zhao. 2022. CIR-Net: Cross-modality interaction and refinement for RGB-D salient object detection. IEEE Trans. Image Process. 31 (2022), 6800\u20136815.","journal-title":"IEEE Trans. Image Process."},{"key":"e_1_3_2_12_2","doi-asserted-by":"crossref","first-page":"6971","DOI":"10.1109\/TMM.2022.3216476","article-title":"Does thermal really always matter for RGB-T salient object detection?","volume":"25","author":"Cong Runmin","year":"2023","unstructured":"Runmin Cong, Kepu Zhang, Chen Zhang, Feng Zheng, Yao Zhao, Qingming Huang, and Sam Kwong. 2023. 
Does thermal really always matter for RGB-T salient object detection? IEEE Trans. Multimedia 25 (2023), 6971\u20136982.","journal-title":"IEEE Trans. Multimedia"},{"key":"e_1_3_2_13_2","doi-asserted-by":"crossref","first-page":"104537","DOI":"10.1016\/j.autcon.2022.104537","article-title":"Automatic damage segmentation in pavement videos by fusing similar feature extraction siamese network (SFE-SNet) and pavement damage segmentation capsule network (PDS-CapsNet)","volume":"143","author":"Dong Jiaxiu","year":"2022","unstructured":"Jiaxiu Dong, Niannian Wang, Hongyuan Fang, Rui Wu, Chengzhi Zheng, Duo Ma, and Haobang Hu. 2022. Automatic damage segmentation in pavement videos by fusing similar feature extraction siamese network (SFE-SNet) and pavement damage segmentation capsule network (PDS-CapsNet). Autom. Constr. 143 (2022), 104537.","journal-title":"Autom. Constr."},{"key":"e_1_3_2_14_2","first-page":"4548","volume-title":"Proceedings of the International Conference on Computer Vision","author":"Fan Deng-Ping","year":"2017","unstructured":"Deng-Ping Fan, Ming-Ming Cheng, Yun Liu, Tao Li, and Ali Borji. 2017. Structure-measure: A new way to evaluate foreground maps. In Proceedings of the International Conference on Computer Vision. 4548\u20134557."},{"key":"e_1_3_2_15_2","first-page":"698","volume-title":"Proceedings of the 27th International Joint Conference on Artificial Intelligence","author":"Fan Deng-Ping","year":"2018","unstructured":"Deng-Ping Fan, Cheng Gong, Yang Cao, Bo Ren, Ming-Ming Cheng, and Ali Borji. 2018. Enhanced-alignment measure for binary foreground map evaluation. In Proceedings of the 27th International Joint Conference on Artificial Intelligence. 
698\u2013704."},{"issue":"5","key":"e_1_3_2_16_2","doi-asserted-by":"crossref","first-page":"2075","DOI":"10.1109\/TNNLS.2020.2996406","article-title":"Rethinking RGB-D salient object detection: Models, data sets, and large-scale benchmarks","volume":"32","author":"Fan Deng-Ping","year":"2021","unstructured":"Deng-Ping Fan, Zheng Lin, Zhao Zhang, Menglong Zhu, and Ming-Ming Cheng. 2021. Rethinking RGB-D salient object detection: Models, data sets, and large-scale benchmarks. IEEE Trans. Neural Netw. Learn. Syst. 32, 5 (2021), 2075\u20132089.","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"e_1_3_2_17_2","first-page":"275","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Fan Deng-Ping","year":"2020","unstructured":"Deng-Ping Fan, Yingjie Zhai, Ali Borji, Jufeng Yang, and Ling Shao. 2020. BBS-Net: RGB-D salient object detection with a bifurcated backbone strategy network. In Proceedings of the European Conference on Computer Vision. 275\u2013292."},{"issue":"4","key":"e_1_3_2_18_2","doi-asserted-by":"crossref","first-page":"2091","DOI":"10.1109\/TCSVT.2021.3082939","article-title":"Unified information fusion network for multi-modal RGB-D and RGB-T salient object detection","volume":"32","author":"Gao Wei","year":"2022","unstructured":"Wei Gao, Guibiao Liao, Siwei Ma, Ge Li, Yongsheng Liang, and Weisi Lin. 2022. Unified information fusion network for multi-modal RGB-D and RGB-T salient object detection. IEEE Trans. Circ. Syst. Vid. Technol. 32, 4 (2022), 2091\u20132106.","journal-title":"IEEE Trans. Circ. Syst. Vid. Technol."},{"key":"e_1_3_2_19_2","first-page":"44","volume-title":"Proceedings of the International Conference on Artificial Neural Network","author":"Hinton Geoffrey E.","year":"2011","unstructured":"Geoffrey E. Hinton, Alex Krizhevsky, and Sida D. Wang. 2011. Transforming auto-encoders. In Proceedings of the International Conference on Artificial Neural Network. 
44\u201351."},{"key":"e_1_3_2_20_2","first-page":"1","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Hinton Geoffrey E.","year":"2018","unstructured":"Geoffrey E. Hinton, Sara Sabour, and Nicholas Frosst. 2018. Matrix capsules with EM routing. In Proceedings of the International Conference on Learning Representations. 1\u201315."},{"key":"e_1_3_2_21_2","doi-asserted-by":"crossref","first-page":"2321","DOI":"10.1109\/TIP.2022.3154931","article-title":"DMRA: Depth-induced multi-scale recurrent attention network for RGB-D saliency detection","volume":"31","author":"Ji Wei","year":"2022","unstructured":"Wei Ji, Ge Yan, Jingjing Li, Yongri Piao, Shunyu Yao, Miao Zhang, Li Cheng, and Huchuan Lu. 2022. DMRA: Depth-induced multi-scale recurrent attention network for RGB-D saliency detection. IEEE Trans. Image Process. 31 (2022), 2321\u20132336.","journal-title":"IEEE Trans. Image Process."},{"key":"e_1_3_2_22_2","first-page":"1115","volume-title":"Proceedings of the International Conference on Image Processing","author":"Ju Ran","year":"2014","unstructured":"Ran Ju, Ling Ge, Wenjing Geng, Tongwei Ren, and Gangshan Wu. 2014. Depth saliency based on anisotropic center-surround difference. In Proceedings of the International Conference on Image Processing. 1115\u20131119."},{"key":"e_1_3_2_23_2","first-page":"1","volume-title":"Proceedings of the International Conference on Medical Imaging with Deep Learning","author":"LaLonde Rodney","year":"2018","unstructured":"Rodney LaLonde and Ulas Bagci. 2018. Capsules for object segmentation. In Proceedings of the International Conference on Medical Imaging with Deep Learning. 1\u20139."},{"key":"e_1_3_2_24_2","first-page":"630","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Lee Minhyeok","year":"2022","unstructured":"Minhyeok Lee, Chaewon Park, Suhwan Cho, and Sangyoun Lee. 2022. 
Spsn: Superpixel prototype sampling network for rgb-d salient object detection. In Proceedings of the European Conference on Computer Vision. 630\u2013647."},{"issue":"1","key":"e_1_3_2_25_2","doi-asserted-by":"crossref","first-page":"88","DOI":"10.1109\/TCYB.2020.2969255","article-title":"ASIF-Net: Attention steered interweave fusion network for RGB-D salient object detection","volume":"51","author":"Li Chongyi","year":"2021","unstructured":"Chongyi Li, Runmin Cong, Sam Kwong, Junhui Hou, Huazhu Fu, Guopu Zhu, Dingwen Zhang, and Qingming Huang. 2021. ASIF-Net: Attention steered interweave fusion network for RGB-D salient object detection. IEEE Trans. Cybernet. 51, 1 (2021), 88\u2013100.","journal-title":"IEEE Trans. Cybernet."},{"key":"e_1_3_2_26_2","first-page":"225","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Li Chongyi","year":"2020","unstructured":"Chongyi Li, Runmin Cong, Yongri Piao, Qianqian Xu, and Chen Change Loy. 2020. RGB-D salient object detection with cross-modality modulation and selection. In Proceedings of the European Conference on Computer Vision. 225\u2013241."},{"issue":"4","key":"e_1_3_2_27_2","doi-asserted-by":"crossref","first-page":"855","DOI":"10.1007\/s11263-022-01734-1","article-title":"Delving into calibrated depth for accurate RGB-D salient object detection","volume":"131","author":"Li Jingjing","year":"2023","unstructured":"Jingjing Li, Wei Ji, Miao Zhang, Yongri Piao, Huchuan Lu, and Li Cheng. 2023. Delving into calibrated depth for accurate RGB-D salient object detection. Int. J. Comput. Vis 131, 4 (2023), 855\u2013876.","journal-title":"Int. J. Comput. Vis"},{"key":"e_1_3_2_28_2","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1016\/j.neunet.2021.12.003","article-title":"Feature correlation-steered capsule network for object detection","volume":"147","author":"Lin Zhongqi","year":"2022","unstructured":"Zhongqi Lin, Jingdun Jia, Feng Huang, and Wanlin Gao. 2022. 
Feature correlation-steered capsule network for object detection. Neural Netw. 147 (2022), 25\u201341.","journal-title":"Neural Netw."},{"key":"e_1_3_2_29_2","first-page":"4722","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Liu Nian","year":"2021","unstructured":"Nian Liu, Ni Zhang, Kaiyuan Wan, Ling Shao, and Junwei Han. 2021. Visual saliency transformer. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 4722\u20134732."},{"issue":"7","key":"e_1_3_2_30_2","first-page":"3688","article-title":"Part-object relational visual saliency","volume":"44","author":"Liu Yi","year":"2022","unstructured":"Yi Liu, Dingwen Zhang, Qiang Zhang, and Jungong Han. 2022. Part-object relational visual saliency. IEEE Trans. Pattern Anal. Mach. Intell. 44, 7 (2022), 3688\u20133704.","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"e_1_3_2_31_2","first-page":"1232","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Liu Yi","year":"2019","unstructured":"Yi Liu, Qiang Zhang, Dingwen Zhang, and Jungong Han. 2019. Employing deep part-object relationships for salient object detection. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 1232\u20131241."},{"key":"e_1_3_2_32_2","doi-asserted-by":"crossref","first-page":"5423","DOI":"10.1109\/TIP.2023.3318953","article-title":"Deep hypersphere feature regularization for weakly supervised RGB-D salient object detection","volume":"32","author":"Liu Zhiyu","year":"2023","unstructured":"Zhiyu Liu, Munawar Hayat, Hong Yang, Duo Peng, and Yinjie Lei. 2023. Deep hypersphere feature regularization for weakly supervised RGB-D salient object detection. IEEE Trans. Image Process. 32 (2023), 5423\u20135437.","journal-title":"IEEE Trans. 
Image Process."},{"key":"e_1_3_2_33_2","first-page":"10012","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Liu Ze","year":"2021","unstructured":"Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. 2021. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 10012\u201310022."},{"issue":"7","key":"e_1_3_2_34_2","doi-asserted-by":"crossref","first-page":"4486","DOI":"10.1109\/TCSVT.2021.3127149","article-title":"SwinNet: Swin transformer drives edge-aware RGB-D and RGB-T salient object detection","volume":"32","author":"Liu Zhengyi","year":"2022","unstructured":"Zhengyi Liu, Yacheng Tan, Qian He, and Yun Xiao. 2022. SwinNet: Swin transformer drives edge-aware RGB-D and RGB-T salient object detection. IEEE Trans. Circ. Syst. Vid. Technol. 32, 7 (2022), 4486\u20134497.","journal-title":"IEEE Trans. Circ. Syst. Vid. Technol."},{"key":"e_1_3_2_35_2","first-page":"4481","volume-title":"Proceedings of the ACM International Conference on Multimedia","author":"Liu Zhengyi","year":"2021","unstructured":"Zhengyi Liu, Yuan Wang, Zhengzheng Tu, Yun Xiao, and Bin Tang. 2021. TriTransNet: RGB-D salient object detection with a triplet transformer embedding network. In Proceedings of the ACM International Conference on Multimedia. 4481\u20134490."},{"key":"e_1_3_2_36_2","doi-asserted-by":"crossref","first-page":"1026","DOI":"10.1109\/TIP.2022.3232209","article-title":"Boosting broader receptive fields for salient object detection","volume":"32","author":"Ma Mingcan","year":"2023","unstructured":"Mingcan Ma, Changqun Xia, Chenxi Xie, Xiaowu Chen, and Jia Li. 2023. Boosting broader receptive fields for salient object detection. IEEE Trans. Image Process. 32 (2023), 1026\u20131038.","journal-title":"IEEE Trans. 
Image Process."},{"key":"e_1_3_2_37_2","first-page":"248","volume-title":"Proceedings of the International Conference on Computer Vision and Pattern Recognition","author":"Margolin Ran","year":"2014","unstructured":"Ran Margolin, Lihi Zelnik-Manor, and Ayellet Tal. 2014. How to evaluate foreground maps? In Proceedings of the International Conference on Computer Vision and Pattern Recognition. 248\u2013255."},{"key":"e_1_3_2_38_2","first-page":"454","volume-title":"Proceedings of the International Conference on Computer Vision and Pattern Recognition","author":"Niu Yuzhen","year":"2012","unstructured":"Yuzhen Niu, Yujie Geng, Xueqing Li, and Feng Liu. 2012. Leveraging stereopsis for saliency analysis. In Proceedings of the International Conference on Computer Vision and Pattern Recognition. 454\u2013461."},{"key":"e_1_3_2_39_2","doi-asserted-by":"crossref","first-page":"892","DOI":"10.1109\/TIP.2023.3234702","article-title":"CAVER: Cross-modal view-mixed transformer for bi-modal salient object detection","volume":"32","author":"Pang Youwei","year":"2023","unstructured":"Youwei Pang, Xiaoqi Zhao, Lihe Zhang, and Huchuan Lu. 2023. CAVER: Cross-modal view-mixed transformer for bi-modal salient object detection. IEEE Trans. Image Process. 32 (2023), 892\u2013904.","journal-title":"IEEE Trans. Image Process."},{"key":"e_1_3_2_40_2","first-page":"92","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Peng Houwen","year":"2014","unstructured":"Houwen Peng, Bing Li, Weihua Xiong, Weiming Hu, and Rongrong Ji. 2014. RGBD salient object detection: A benchmark and algorithms. In Proceedings of the European Conference on Computer Vision. 92\u2013109."},{"key":"e_1_3_2_41_2","first-page":"733","volume-title":"Proceedings of the International Conference on Computer Vision and Pattern Recognition","author":"Perazzi Federico","year":"2012","unstructured":"Federico Perazzi, Philipp Kr\u00e4henb\u00fchl, Yael Pritch, and Alexander Hornung. 2012. 
Saliency filters: Contrast based filtering for salient region detection. In Proceedings of the International Conference on Computer Vision and Pattern Recognition. 733\u2013740."},{"key":"e_1_3_2_42_2","first-page":"7254","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Piao Yongri","year":"2019","unstructured":"Yongri Piao, Wei Ji, Jingjing Li, Miao Zhang, and Huchuan Lu. 2019. Depth-induced multi-scale recurrent attention network for saliency detection. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 7254\u20137263."},{"key":"e_1_3_2_43_2","first-page":"10725","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision and Pattern Recognition","author":"Rajasegaran Jathushan","year":"2019","unstructured":"Jathushan Rajasegaran, Vinoj Jayasundara, Sandaru Jayasekara, Hirunima Jayasekara, Suranga Seneviratne, and Ranga Rodrigo. 2019. Deepcaps: Going deeper with capsule networks. In Proceedings of the IEEE\/CVF International Conference on Computer Vision and Pattern Recognition. 10725\u201310733."},{"key":"e_1_3_2_44_2","first-page":"3859","article-title":"Dynamic routing between capsules","volume":"30","author":"Sabour Sara","year":"2017","unstructured":"Sara Sabour, Nicholas Frosst, and Geoffrey E. Hinton. 2017. Dynamic routing between capsules. In Advances in Neural Information Processing Systems, Vol. 30, 3859\u20133869.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_45_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TIM.2023.3236346","article-title":"A potential vision-based measurements technology: Information flow fusion detection method using RGB-thermal infrared images","volume":"72","author":"Song Kechen","year":"2023","unstructured":"Kechen Song, Yanqi Bao, Han Wang, Liming Huang, and Yunhui Yan. 2023. 
A potential vision-based measurements technology: Information flow fusion detection method using RGB-thermal infrared images. IEEE Trans. Instrum. Meas. 72 (2023), 1\u201313.","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"e_1_3_2_46_2","article-title":"CATNet: A cascaded and aggregated transformer network for RGB-D salient object detection","author":"Sun Fuming","year":"2024","unstructured":"Fuming Sun, Peng Ren, Bowen Yin, Fasheng Wang, and Haojie Li. 2024. CATNet: A cascaded and aggregated transformer network for RGB-D salient object detection. IEEE Trans. Multimedia 26 (2024), 2249\u20132262.","journal-title":"IEEE Trans. Multimedia"},{"key":"e_1_3_2_47_2","doi-asserted-by":"crossref","first-page":"5678","DOI":"10.1109\/TIP.2021.3087412","article-title":"Multi-interactive dual-decoder for RGB-thermal salient object detection","volume":"30","author":"Tu Zhengzheng","year":"2021","unstructured":"Zhengzheng Tu, Zhun Li, Chenglong Li, Yang Lang, and Jin Tang. 2021. Multi-interactive dual-decoder for RGB-thermal salient object detection. IEEE Trans. Image Process. 30 (2021), 5678\u20135691.","journal-title":"IEEE Trans. Image Process."},{"key":"e_1_3_2_48_2","doi-asserted-by":"crossref","first-page":"1285","DOI":"10.1109\/TIP.2022.3140606","article-title":"Learning discriminative cross-modality features for RGB-D saliency detection","volume":"31","author":"Wang Fengyun","year":"2022","unstructured":"Fengyun Wang, Jinshan Pan, Shoukun Xu, and Jinhui Tang. 2022. Learning discriminative cross-modality features for RGB-D saliency detection. IEEE Trans. Image Process. 31 (2022), 1285\u20131297.","journal-title":"IEEE Trans. Image Process."},{"key":"e_1_3_2_49_2","first-page":"1","article-title":"Cross-modal and cross-level attention interaction network for salient object detection","author":"Wang Fasheng","year":"2023","unstructured":"Fasheng Wang, Yiming Su, Ruimin Wang, Jing Sun, Fuming Sun, and Haojie Li. 2023. 
Cross-modal and cross-level attention interaction network for salient object detection. IEEE Trans. Artif. Intell. (2023), 1\u201315.","journal-title":"IEEE Trans. Artif. Intell."},{"key":"e_1_3_2_50_2","doi-asserted-by":"crossref","first-page":"119047","DOI":"10.1016\/j.eswa.2022.119047","article-title":"DCMNet: Discriminant and cross-modality network for RGB-D salient object detection","volume":"214","author":"Wang Fasheng","year":"2023","unstructured":"Fasheng Wang, Ruimin Wang, and Fuming Sun. 2023. DCMNet: Discriminant and cross-modality network for RGB-D salient object detection. Expert Syst. Appl. 214 (2023), 119047.","journal-title":"Expert Syst. Appl."},{"issue":"19","key":"e_1_3_2_51_2","doi-asserted-by":"crossref","first-page":"27879","DOI":"10.1007\/s11042-022-12760-z","article-title":"Context and saliency aware correlation filter for visual target tracking","volume":"81","author":"Wang Fasheng","year":"2022","unstructured":"Fasheng Wang, Shuangshuang Yin, Jimmy T. Mbelwa, and Fuming Sun. 2022. Context and saliency aware correlation filter for visual target tracking. Multimed. Tools. Appl. 81, 19 (2022), 27879\u201327893.","journal-title":"Multimed. Tools. Appl."},{"issue":"5","key":"e_1_3_2_52_2","doi-asserted-by":"crossref","first-page":"2949","DOI":"10.1109\/TCSVT.2021.3099120","article-title":"CGFNet: Cross-guided fusion network for RGB-T salient object detection","volume":"32","author":"Wang Jie","year":"2022","unstructured":"Jie Wang, Kechen Song, Yanqi Bao, Liming Huang, and Yunhui Yan. 2022. CGFNet: Cross-guided fusion network for RGB-T salient object detection. IEEE Trans. Circ. Syst. Vid. Technol. 32, 5 (2022), 2949\u20132961.","journal-title":"IEEE Trans. Circ. Syst. Vid. 
Technol."},{"issue":"3","key":"e_1_3_2_53_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3624747","article-title":"Attention-guided multi-modality interaction network for RGB-D salient object detection","volume":"20","author":"Wang Ruimin","year":"2024","unstructured":"Ruimin Wang, Fasheng Wang, Yiming Su, Jing Sun, Fuming Sun, and Haojie Li. 2024. Attention-guided multi-modality interaction network for RGB-D salient object detection. ACM Trans. Multimedia Comput. Commun. Appl. 20, 3, Article NO. 68 (2024), 1\u201322.","journal-title":"ACM Trans. Multimedia Comput. Commun. Appl."},{"issue":"7","key":"e_1_3_2_54_2","doi-asserted-by":"crossref","first-page":"1531","DOI":"10.1109\/TPAMI.2018.2840724","article-title":"A deep network solution for attention and aesthetics aware photo cropping","volume":"41","author":"Wang Wenguan","year":"2019","unstructured":"Wenguan Wang, Jianbing Shen, and Haibin Ling. 2019. A deep network solution for attention and aesthetics aware photo cropping. IEEE Trans. Pattern Anal. Mach. Intell. 41, 7 (2019), 1531\u20131544.","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"issue":"7","key":"e_1_3_2_55_2","doi-asserted-by":"crossref","first-page":"2413","DOI":"10.1109\/TPAMI.2020.2966453","article-title":"Paying attention to video object pattern understanding","volume":"43","author":"Wang Wenguan","year":"2020","unstructured":"Wenguan Wang, Jianbing Shen, Xiankai Lu, Steven CH Hoi, and Haibin Ling. 2020. Paying attention to video object pattern understanding. IEEE Trans. Pattern Anal. Mach. Intell. 43, 7 (2020), 2413\u20132428.","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"issue":"7","key":"e_1_3_2_56_2","doi-asserted-by":"crossref","first-page":"1846","DOI":"10.1093\/comjnl\/bxab026","article-title":"Learning saliency aware correlation filter for visual tracking","volume":"65","author":"Wang Yanbo","year":"2022","unstructured":"Yanbo Wang, Fasheng Wang, Chang Wang, Jianjun He, and Fuming Sun. 
2022. Learning saliency aware correlation filter for visual tracking. Comput. J. 65, 7 (2022), 1846\u20131859.","journal-title":"Comput. J."},{"key":"e_1_3_2_57_2","first-page":"3672","volume-title":"Proceedings of the Asian Conference on Computer Vision (ACCV \u201922)","author":"Wang Yang","year":"2022","unstructured":"Yang Wang and Yanqing Zhang. 2022. Three-stage bidirectional interaction network for efficient RGB-D salient object detection. In Proceedings of the Asian Conference on Computer Vision (ACCV \u201922). 3672\u20133689."},{"issue":"12","key":"e_1_3_2_58_2","doi-asserted-by":"crossref","first-page":"10261","DOI":"10.1109\/TPAMI.2021.3134684","article-title":"MobileSal: Extremely efficient RGB-D salient object detection","volume":"44","author":"Wu Yu-Huan","year":"2022","unstructured":"Yu-Huan Wu, Yun Liu, Jun Xu, Jia-Wang Bian, Yu-Chao Gu, and Ming-Ming Cheng. 2022. MobileSal: Extremely efficient RGB-D salient object detection. IEEE Trans. Pattern Anal. Mach. Intell. 44, 12 (2022), 10261\u201310269.","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"e_1_3_2_59_2","doi-asserted-by":"crossref","first-page":"2160","DOI":"10.1109\/TIP.2023.3263111","article-title":"Hidanet: RGB-D salient object detection via hierarchical depth awareness","volume":"32","author":"Wu Zongwei","year":"2023","unstructured":"Zongwei Wu, Guillaume Allibert, Fabrice Meriaudeau, Chao Ma, and C\u00e9dric Demonceaux. 2023. Hidanet: RGB-D salient object detection via hierarchical depth awareness. IEEE Trans. Image Process. 32 (2023), 2160\u20132173.","journal-title":"IEEE Trans. 
Image Process."},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2023.3315511"},{"key":"e_1_3_2_61_2","doi-asserted-by":"crossref","first-page":"105917","DOI":"10.1016\/j.compbiomed.2022.105917","article-title":"An improved capsule network for glioma segmentation on MRI images: A curriculum learning approach","volume":"148","author":"Zade Amin Amiri Tehrani","year":"2022","unstructured":"Amin Amiri Tehrani Zade, Maryam Jalili Aziz, Saeed Masoudnia, Alireza Mirbagheri, and Alireza Ahmadian. 2022. An improved capsule network for glioma segmentation on MRI images: A curriculum learning approach. Comput. Biol. Med. 148 (2022), 105917.","journal-title":"Comput. Biol. Med."},{"key":"e_1_3_2_62_2","first-page":"126","article-title":"Dual swin-transformer based mutual interactive network for RGB-D salient object detection","volume":"559","author":"Zeng Chao","year":"2023","unstructured":"Chao Zeng, Sam Kwong, and Horace Ip. 2023. Dual swin-transformer based mutual interactive network for RGB-D salient object detection. Neurocomputing 559 (2023), 126\u2013779.","journal-title":"Neurocomputing"},{"key":"e_1_3_2_63_2","first-page":"7223","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Zeng Yu","year":"2019","unstructured":"Yu Zeng, Yunzhi Zhuge, Huchuan Lu, and Lihe Zhang. 2019. Joint learning of saliency detection and weakly supervised semantic segmentation. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 7223\u20137233."},{"key":"e_1_3_2_64_2","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1007\/s11263-018-1112-4","article-title":"Leveraging prior-knowledge for weakly supervised object detection under a collaborative self-paced curriculum learning framework","volume":"127","author":"Zhang Dingwen","year":"2019","unstructured":"Dingwen Zhang, Junwei Han, Long Zhao, and Deyu Meng. 2019. 
Leveraging prior-knowledge for weakly supervised object detection under a collaborative self-paced curriculum learning framework. Int. J. Comput. Vis. 127 (2019), 363\u2013380.","journal-title":"Int. J. Comput. Vis."},{"key":"e_1_3_2_65_2","doi-asserted-by":"crossref","first-page":"5142","DOI":"10.1109\/TMM.2022.3187856","article-title":"C2DFNet: Criss-cross dynamic filter network for RGB-D salient object detection","volume":"25","author":"Zhang Miao","year":"2023","unstructured":"Miao Zhang, Shunyu Yao, Beiqi Hu, Yongri Piao, and Wei Ji. 2023. C2DFNet: Criss-cross dynamic filter network for RGB-D salient object detection. IEEE Trans. Multimedia 25 (2023), 5142\u20135154.","journal-title":"IEEE Trans. Multimedia"},{"key":"e_1_3_2_66_2","doi-asserted-by":"crossref","first-page":"2593","DOI":"10.1109\/TIP.2023.3270801","article-title":"Position-aware relation learning for RGB-thermal salient object detection","volume":"32","author":"Zhou Heng","year":"2023","unstructured":"Heng Zhou, Chunna Tian, Zhenxi Zhang, Chengyang Li, Yuxuan Ding, Yongqiang Xie, and Zhongbo Li. 2023. Position-aware relation learning for RGB-thermal salient object detection. IEEE Trans. Image Process. 32 (2023), 2593\u20132607.","journal-title":"IEEE Trans. Image Process."},{"issue":"3","key":"e_1_3_2_67_2","doi-asserted-by":"crossref","first-page":"1224","DOI":"10.1109\/TCSVT.2021.3077058","article-title":"ECFFNet: Effective and consistent feature fusion network for RGB-T salient object detection","volume":"32","author":"Zhou Wujie","year":"2022","unstructured":"Wujie Zhou, Qinling Guo, Jingsheng Lei, Lu Yu, and Jenq-Neng Hwang. 2022. ECFFNet: Effective and consistent feature fusion network for RGB-T salient object detection. IEEE Trans. Circ. Syst. Vid. Technol. 32, 3 (2022), 1224\u20131235.","journal-title":"IEEE Trans. Circ. Syst. Vid. 
Technol."},{"key":"e_1_3_2_68_2","first-page":"199","volume-title":"Proceedings of the International Conference on Multimedia and Expo","author":"Zhu Chunbiao","year":"2019","unstructured":"Chunbiao Zhu, Xing Cai, Kan Huang, Thomas H. Li, and Ge Li. 2019. PDNet: Prior-model guided depth-enhanced network for salient object detection. In Proceedings of the International Conference on Multimedia and Expo. 199\u2013204."},{"key":"e_1_3_2_69_2","first-page":"3008","volume-title":"Proceedings of the International Conference on Computer Vision and Pattern Recognition","author":"Zhu Chunbiao","year":"2017","unstructured":"Chunbiao Zhu and Ge Li. 2017. A three-pathway psychobiological framework of salient object detection using stereoscopic technology. In Proceedings of the International Conference on Computer Vision and Pattern Recognition. 3008\u20133014."},{"issue":"3","key":"e_1_3_2_70_2","first-page":"3738","article-title":"Salient object detection via integrity learning","volume":"45","author":"Zhuge Mingchen","year":"2023","unstructured":"Mingchen Zhuge, Deng-Ping Fan, Nian Liu, Dingwen Zhang, Dong Xu, and Ling Shao. 2023. Salient object detection via integrity learning. IEEE Trans. Pattern Anal. Mach. Intell. 45, 3 (2023), 3738\u20133752.","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3656476","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3656476","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T22:49:00Z","timestamp":1750286940000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3656476"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,15]]},"references-count":69,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2024,7,31]]}},"alternative-id":["10.1145\/3656476"],"URL":"https:\/\/doi.org\/10.1145\/3656476","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,5,15]]},"assertion":[{"value":"2023-12-23","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-04-03","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-05-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}