{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T13:19:13Z","timestamp":1753881553601,"version":"3.41.2"},"reference-count":66,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2021,5,5]],"date-time":"2021-05-05T00:00:00Z","timestamp":1620172800000},"content-version":"vor","delay-in-days":124,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61502429"],"award-info":[{"award-number":["61502429"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004731","name":"Natural Science Foundation of Zhejiang Province","doi-asserted-by":"publisher","award":["LY18F020012"],"award-info":[{"award-number":["LY18F020012"]}],"id":[{"id":"10.13039\/501100004731","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Computational Intelligence and Neuroscience"],"published-print":{"date-parts":[[2021,1]]},"abstract":"<jats:p>In recent years, the prediction of salient regions in RGB\u2010D images has become a focus of research. Compared to its RGB counterpart, the saliency prediction of RGB\u2010D images is more challenging. In this study, we propose a novel deep multimodal fusion autoencoder for the saliency prediction of RGB\u2010D images. The core trainable autoencoder of the RGB\u2010D saliency prediction model employs two raw modalities (RGB and depth\/disparity information) as inputs and their corresponding eye\u2010fixation attributes as labels. The autoencoder comprises four main networks: color channel network, disparity channel network, feature concatenated network, and feature learning network. 
The autoencoder can mine the complex relationship and make the utmost of the complementary characteristics between both color and disparity cues. Finally, the saliency map is predicted via a feature combination subnetwork, which combines the deep features extracted from a prior learning and convolutional feature learning subnetworks. We compare the proposed autoencoder with other saliency prediction models on two publicly available benchmark datasets. The results demonstrate that the proposed autoencoder outperforms these models by a significant margin.<\/jats:p>","DOI":"10.1155\/2021\/6610997","type":"journal-article","created":{"date-parts":[[2021,5,5]],"date-time":"2021-05-05T22:22:03Z","timestamp":1620253323000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Deep Multimodal Fusion Autoencoder for Saliency Prediction of RGB\u2010D Images"],"prefix":"10.1155","volume":"2021","author":[{"given":"Kengda","family":"Huang","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3055-2493","authenticated-orcid":false,"given":"Wujie","family":"Zhou","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Meixin","family":"Fang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"311","published-online":{"date-parts":[[2021,5,5]]},"reference":[{"key":"e_1_2_8_1_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSMC.2019.2957386"},{"key":"e_1_2_8_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCDS.2021.3051010"},{"key":"e_1_2_8_3_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.sigpro.2020.107766"},{"key":"e_1_2_8_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/tpami.2011.272"},{"key":"e_1_2_8_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/MIS.2020.2999462"},{"key":"e_1_2_8_6_2","doi-asserted-by":"crossref","unstructure
d":"MakantasisK. DoulamisA. andDoulamisN. Vision-based maritime surveillance system using fused visual attention maps and online adaptable tracker Proceedings of the 2013 14th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS) July 2013 Paris France 1\u20134 https:\/\/doi.org\/10.1109\/WIAMIS.2013.6616150 2-s2.0-84887260393.","DOI":"10.1109\/WIAMIS.2013.6616150"},{"key":"e_1_2_8_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/tci.2020.2993640"},{"key":"e_1_2_8_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2020.3025166"},{"key":"e_1_2_8_9_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2018.09.009"},{"key":"e_1_2_8_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/tmm.2016.2542580"},{"key":"e_1_2_8_11_2","doi-asserted-by":"crossref","unstructured":"ZhaoR. OuyangW. LiH. andWangX. Saliency detection by multi-context deep learning Proceedings of the 2015 IEEE Conference On Computer Vision And Pattern Recognition (CVPR) June 2015 Boston MA USA 1265\u20131274 https:\/\/doi.org\/10.1109\/CVPR.2015.7298731 2-s2.0-84959212183.","DOI":"10.1109\/CVPR.2015.7298731"},{"key":"e_1_2_8_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2021.3077058"},{"key":"e_1_2_8_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/tmm.2017.2763780"},{"key":"e_1_2_8_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/tmm.2017.2679898"},{"key":"e_1_2_8_15_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-015-2512-x"},{"key":"e_1_2_8_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2021.3077767"},{"key":"e_1_2_8_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/tits.2016.2535402"},{"key":"e_1_2_8_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/lsp.2014.2320956"},{"key":"e_1_2_8_19_2","doi-asserted-by":"crossref","unstructured":"MancasM. GlowinskiD. VolpeG. ColettaP. andCamurriA. 
Gesture saliency: a context-aware analysis Proceedings of the International Gesture Workshop February 2019 Berlin Heidelberg Springer 146\u2013157 https:\/\/doi.org\/10.1007\/978-3-642-12553-9_13 2-s2.0-78650320092.","DOI":"10.1007\/978-3-642-12553-9_13"},{"key":"e_1_2_8_20_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.sigpro.2017.12.002"},{"key":"e_1_2_8_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/tcsvt.2017.2650910"},{"key":"e_1_2_8_22_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-016-4126-3"},{"key":"e_1_2_8_23_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-016-3548-2"},{"key":"e_1_2_8_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/tip.2015.2506340"},{"key":"e_1_2_8_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/tip.2017.2721546"},{"key":"e_1_2_8_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/tmm.2017.2660440"},{"key":"e_1_2_8_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/34.730558"},{"key":"e_1_2_8_28_2","doi-asserted-by":"crossref","unstructured":"HouX.andZhangL. Saliency detection: a spectral residual approach Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition June 2007 Minneapolis MN USA IEEE 1\u20138 https:\/\/doi.org\/10.1109\/CVPR.2007.383267 2-s2.0-35148814949.","DOI":"10.1109\/CVPR.2007.383267"},{"key":"e_1_2_8_29_2","doi-asserted-by":"crossref","unstructured":"HarelJ. KochC. andPeronaP. Graph-based visual saliency Proceedings of the Advances in Neural Information Processing Systems 19 Proceedings of the Twentieth Annual Conference on Neural Information Processing Systems January 2006 Columbia Canada 545\u2013552.","DOI":"10.7551\/mitpress\/7503.003.0073"},{"key":"e_1_2_8_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/tmm.2011.2169775"},{"key":"e_1_2_8_31_2","doi-asserted-by":"crossref","unstructured":"ZhangL. GuZ. andLiH. 
SDSP: a novel saliency detection method by combining simple priors Proceedings of the 2013 IEEE International Conference on Image Processing September 2013 Melbourne Australia 171\u2013175 https:\/\/doi.org\/10.1109\/ICIP.2013.6738036 2-s2.0-84897732492.","DOI":"10.1109\/ICIP.2013.6738036"},{"key":"e_1_2_8_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/tmm.2017.2689327"},{"key":"e_1_2_8_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/tmm.2018.2829605"},{"key":"e_1_2_8_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/tcyb.2015.2404432"},{"key":"e_1_2_8_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2017.2762422"},{"key":"e_1_2_8_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/tmm.2018.2856126"},{"key":"e_1_2_8_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/tmm.2017.2713982"},{"key":"e_1_2_8_38_2","doi-asserted-by":"crossref","unstructured":"VigE. DorrM. andCoxD. Large-scale optimization of hierarchical features for saliency prediction in natural images Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) June 2014 Columbus OH USA 23\u201328 https:\/\/doi.org\/10.1109\/CVPR.2014.358 2-s2.0-84911369162.","DOI":"10.1109\/CVPR.2014.358"},{"key":"e_1_2_8_39_2","unstructured":"KummererM. TheisL. andBethgeM. Deep gaze I: boosting saliency prediction with feature maps trained on imagenet 2015 http:\/\/arxiv.org\/abs\/1411.1045."},{"key":"e_1_2_8_40_2","unstructured":"KummererM. WallisT. andBethgeM. Deepgaze II: reading fixations from deep features trained on object recognition 2016 http:\/\/arxiv.org\/abs\/1610.01563."},{"key":"e_1_2_8_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/3065386"},{"key":"e_1_2_8_42_2","unstructured":"SimonyanK.andZissermanA. Very deep convolutional networks for large-scale image recognition 2014 http:\/\/arxiv.org\/abs\/1409.1556."},{"key":"e_1_2_8_43_2","unstructured":"LiG.andYuY. 
Visual saliency based on multiscale deep features 2015 http:\/\/arxiv.org\/abs\/1503.08663."},{"key":"e_1_2_8_44_2","doi-asserted-by":"crossref","unstructured":"LiuN. HanJ. ZhangD. WenS. andLiuT. Predicting eye fixations using convolutional neural networks Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) June 2015 Boston MA USA 362\u2013370 https:\/\/doi.org\/10.1109\/CVPR.2015.7298633 2-s2.0-84946554818.","DOI":"10.1109\/CVPR.2015.7298633"},{"key":"e_1_2_8_45_2","doi-asserted-by":"crossref","unstructured":"HuangX. ShenC. BoixX. andZhaoQ. SALICON: reducing the semantic gap in saliency prediction by adapting deep neural networks Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV) December 2015 Santiago Chile IEEE 262\u2013270 https:\/\/doi.org\/10.1109\/ICCV.2015.38 2-s2.0-84973923049.","DOI":"10.1109\/ICCV.2015.38"},{"key":"e_1_2_8_46_2","doi-asserted-by":"crossref","unstructured":"SzegedyC. LiuW. JiaY.et al. Going deeper with convolutions Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) June 2015 Boston MA USA 1\u20139 https:\/\/doi.org\/10.1109\/CVPR.2015.7298594 2-s2.0-84937522268.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"e_1_2_8_47_2","doi-asserted-by":"publisher","DOI":"10.1109\/tip.2017.2710620"},{"key":"e_1_2_8_48_2","doi-asserted-by":"crossref","unstructured":"PanJ. McGuinnessK. O\u2019ConnorN. andGiro-i NietoX. Shallow and deep convolutional networks for saliency prediction 2016 http:\/\/arxiv.org\/abs\/1603.00845.","DOI":"10.1109\/CVPR.2016.71"},{"key":"e_1_2_8_49_2","doi-asserted-by":"crossref","unstructured":"KruthiventiS. GudisaV. DholakiyaJ. andBabuR. 
Saliency unified: a deep architecture for simultaneous eye fixation prediction and salient object segmentation Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) June 2016 Las Vegas NV USA 5781\u20135790 https:\/\/doi.org\/10.1109\/CVPR.2016.623 2-s2.0-84986243887.","DOI":"10.1109\/CVPR.2016.623"},{"key":"e_1_2_8_50_2","doi-asserted-by":"crossref","unstructured":"JetleyS. MurrayN. andVigE. End-to-End saliency mapping via probability distribution prediction 2016 http:\/\/arXiv.org\/abs\/1804.01793.","DOI":"10.1109\/CVPR.2016.620"},{"key":"e_1_2_8_51_2","doi-asserted-by":"crossref","unstructured":"CorniaM. BaraldiL. SerraG. andCucchiaraR. A deep multi-level network for saliency prediction Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR) December 2016 Cancun Mexico https:\/\/doi.org\/10.1109\/ICPR.2016.7900174 2-s2.0-85017036429.","DOI":"10.1109\/ICPR.2016.7900174"},{"key":"e_1_2_8_52_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2017.03.018"},{"key":"e_1_2_8_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/tip.2018.2817047"},{"key":"e_1_2_8_54_2","unstructured":"PanJ. FerrerC. McGuinnessK.et al. Visual saliency prediction with generative adversarial networks 2017 http:\/\/arxiv.org\/abs\/1701.01081."},{"key":"e_1_2_8_55_2","doi-asserted-by":"publisher","DOI":"10.1109\/tip.2018.2834826"},{"key":"e_1_2_8_56_2","unstructured":"CorniaM. BaraldiL. SerraG. andCucchiaraR. 
Predicting human eye fixations via an LSTM-BASED saliency attention model 2017 http:\/\/arxiv.org\/abs\/1611.09571."},{"key":"e_1_2_8_57_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2017.2787612"},{"key":"e_1_2_8_58_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2013.2246176"},{"key":"e_1_2_8_59_2","doi-asserted-by":"publisher","DOI":"10.1109\/tip.2014.2305100"},{"key":"e_1_2_8_60_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.image.2015.04.007"},{"key":"e_1_2_8_61_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-015-3229-6"},{"key":"e_1_2_8_62_2","doi-asserted-by":"publisher","DOI":"10.1167\/15.6.19"},{"key":"e_1_2_8_63_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2017.05.050"},{"key":"e_1_2_8_64_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-48896-7_57"},{"key":"e_1_2_8_65_2","doi-asserted-by":"crossref","unstructured":"LangC. NguyenT. KattiH. YadatiK. KankanhalliM. andYanS. Depth matters: influence of depth cues on visual saliency 7573 Proceedings of the 9th European Conference on Computer Vision January 2012 https:\/\/doi.org\/10.1007\/978-3-642-33709-3_8 2-s2.0-84867871481.","DOI":"10.1007\/978-3-642-33709-3_8"},{"key":"e_1_2_8_66_2","doi-asserted-by":"crossref","unstructured":"RicheN. DuvinageM. MancasM. GosselinB. andDutoitT. 
Saliency and human fixations: state-of-the-art and study of comparison metrics Proceedings of the 2013 IEEE International Conference on Computer Vision December 2013 Sydney Australia 1153\u20131160 https:\/\/doi.org\/10.1109\/ICCV.2013.147 2-s2.0-84898774374.","DOI":"10.1109\/ICCV.2013.147"}],"container-title":["Computational Intelligence and Neuroscience"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/cin\/2021\/6610997.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/cin\/2021\/6610997.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/2021\/6610997","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,6]],"date-time":"2024-08-06T11:58:19Z","timestamp":1722945499000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1155\/2021\/6610997"}},"subtitle":[],"editor":[{"given":"Anastasios D.","family":"Doulamis","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2021,1]]},"references-count":66,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,1]]}},"alternative-id":["10.1155\/2021\/6610997"],"URL":"https:\/\/doi.org\/10.1155\/2021\/6610997","archive":["Portico"],"relation":{},"ISSN":["1687-5265","1687-5273"],"issn-type":[{"type":"print","value":"1687-5265"},{"type":"electronic","value":"1687-5273"}],"subject":[],"published":{"date-parts":[[2021,1]]},"assertion":[{"value":"2020-11-03","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2021-04-23","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-05-05","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"6610997"}}