{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,30]],"date-time":"2025-08-30T16:38:51Z","timestamp":1756571931504,"version":"3.41.0"},"reference-count":69,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2024,1,22]],"date-time":"2024-01-22T00:00:00Z","timestamp":1705881600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Shenzhen Science and Technology Program","award":["JCYJ20200109113014456, and KCXFZ20211020163403005"],"award-info":[{"award-number":["JCYJ20200109113014456, and KCXFZ20211020163403005"]}]},{"name":"FengYun Application Pioneering Project","award":["FY-APP-ZX-2022.0220"],"award-info":[{"award-number":["FY-APP-ZX-2022.0220"]}]},{"name":"Science and Technology Innovation Team Project of Guangdong Meteorological Bureau","award":["GRMCTD202104"],"award-info":[{"award-number":["GRMCTD202104"]}]},{"name":"Innovation and Development Project of China Meteorological Administration","award":["CXFZ2022J002"],"award-info":[{"award-number":["CXFZ2022J002"]}]},{"DOI":"10.13039\/501100001809","name":"NSFC","doi-asserted-by":"crossref","award":["62376072"],"award-info":[{"award-number":["62376072"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2024,5,31]]},"abstract":"<jats:p>Satellite image sequence prediction aims to precisely infer future satellite image frames with historical observations, which is a significant and challenging dense prediction task. 
Though existing deep learning models deliver promising performance for satellite image sequence prediction, they suffer from high training costs, especially in training time and GPU memory demand, due to inefficient modeling of temporal variations. This issue seriously limits lightweight applications on satellites, such as space-borne forecast models. In this article, we propose a lightweight prediction framework, TinyPredNet, for satellite image sequence prediction, in which a spatial encoder and decoder model the intra-frame appearance features and a temporal translator captures inter-frame motion patterns. To efficiently model the temporal evolution of satellite image sequences, we carefully design a multi-scale temporal-cascaded structure and a channel attention-gated structure in the temporal translator. Comprehensive experiments on the FengYun-4A (FY-4A) satellite dataset show that the proposed framework achieves very competitive performance at much lower computation cost than state-of-the-art methods. In addition, corresponding interpretability experiments are conducted to show how our designed structures work. 
We believe the proposed method can serve as a solid lightweight baseline for satellite image sequence prediction.<\/jats:p>","DOI":"10.1145\/3638773","type":"journal-article","created":{"date-parts":[[2023,12,28]],"date-time":"2023-12-28T21:58:34Z","timestamp":1703800714000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["TinyPredNet: A Lightweight Framework for Satellite Image Sequence Prediction"],"prefix":"10.1145","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9921-4249","authenticated-orcid":false,"given":"Kuai","family":"Dai","sequence":"first","affiliation":[{"name":"Department of Computer Science, Harbin Institute of Technology, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8816-3856","authenticated-orcid":false,"given":"Xutao","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Harbin Institute of Technology, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4546-6870","authenticated-orcid":false,"given":"Huiwei","family":"Lin","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Harbin Institute of Technology, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-1055-883X","authenticated-orcid":false,"given":"Yin","family":"Jiang","sequence":"additional","affiliation":[{"name":"Shenzhen Meteorological Bureau, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6906-2654","authenticated-orcid":false,"given":"Xunlai","family":"Chen","sequence":"additional","affiliation":[{"name":"Shenzhen Meteorological Bureau, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1807-8581","authenticated-orcid":false,"given":"Yunming","family":"Ye","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Harbin Institute of Technology, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8156-2395","authenticated-orcid":false,"given":"Di","family":"Xian","sequence":"additional","affiliation":[{"name":"National Satellite Meteorological Center, China Meteorological Administration, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,1,22]]},"reference":[{"key":"e_1_3_1_2_2","unstructured":"Md Zahangir Alom Tarek M. Taha Christopher Yakopcic Stefan Westberg Paheding Sidike Mst Shamima Nasrin Brian C. Van Esesn Abdul A. S. Awwal and Vijayan K. Asari. 2018. The history began from alexnet: A comprehensive survey on deep learning approaches. arXiv:1803.01164. Retrieved from https:\/\/arxiv.org\/abs\/1803.01164"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jocs.2023.101984"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCYB.2021.3080121"},{"key":"e_1_3_1_5_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Ballas Nicolas","year":"2016","unstructured":"Nicolas Ballas, Li Yao, Chris Pal, and Aaron C. Courville. 2016. Delving deeper into convolutional networks for learning video representations. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_1_6_2","unstructured":"Vitus Benson Christian Requena-Mesa Claire Robin Lazaro Alonso Jos\u00e9 Cort\u00e9s Zhihan Gao Nora Linscheid M\u00e9lanie Weynants and Markus Reichstein. 2023. Forecasting localized weather impacts on vegetation as seen from space with meteo-guided video prediction. arXiv:2303.16198. 
Retrieved from https:\/\/arxiv.org\/abs\/2303.16198"},{"issue":"1","key":"e_1_3_1_7_2","first-page":"4","article-title":"Pyramidal implementation of the affine lucas kanade feature tracker description of the algorithm","volume":"5","year":"2001","unstructured":"Jean-Yves Bouguet. 2001. Pyramidal implementation of the affine lucas kanade feature tracker description of the algorithm. Intel Corporation 5, 1\u201310 (2001), 4.","journal-title":"Intel Corporation"},{"key":"e_1_3_1_8_2","first-page":"13946","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Chang Zheng","year":"2022","unstructured":"Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, and Wen Gao. 2022. STRPM: A spatiotemporal residual predictive model for high-resolution video prediction. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 13946\u201313955."},{"key":"e_1_3_1_9_2","first-page":"12021","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Chen Jierun","year":"2023","unstructured":"Jierun Chen, Shiu-hong Kao, Hao He, Weipeng Zhuo, Song Wen, Chul-Ho Lee, and S.-H. Gary Chan. 2023. Run, don\u2019t walk: chasing higher FLOPS for faster neural networks. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 12021\u201312031."},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1179"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/TGRS.2023.3303947"},{"key":"e_1_3_1_12_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TGRS.2022.3181279","article-title":"MSTCGAN: Multiscale time conditional generative adversarial network for long-term satellite image sequence prediction","volume":"60","author":"Dai Kuai","year":"2022","unstructured":"Kuai Dai, Xutao Li, Yunming Ye, Shanshan Feng, Danyu Qin, and Rui Ye. 2022. 
MSTCGAN: Multiscale time conditional generative adversarial network for long-term satellite image sequence prediction. IEEE Transactions on Geoscience and Remote Sensing 60 (2022), 1\u201316.","journal-title":"IEEE Transactions on Geoscience and Remote Sensing"},{"key":"e_1_3_1_13_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/LGRS.2023.3261317","article-title":"Exploiting spatial-temporal dynamics for satellite image sequence prediction","volume":"20","author":"Dai Kuai","year":"2023","unstructured":"Kuai Dai, Chi Ma, Zhaolin Wang, Yongshen Long, Xutao Li, Shanshan Feng, and Yunming Ye. 2023. Exploiting spatial-temporal dynamics for satellite image sequence prediction. IEEE Geoscience and Remote Sensing Letters 20 (2023), 1\u20135.","journal-title":"IEEE Geoscience and Remote Sensing Letters"},{"key":"e_1_3_1_14_2","doi-asserted-by":"crossref","unstructured":"Lasse Espeholt Shreya Agrawal Casper S\u00f8nderby Manoj Kumar Jonathan Heek Carla Bromberg Cenk Gazen Jason Hickey Aaron Bell and Nal Kalchbrenner. 2021. Skillful twelve hour precipitation forecasts using large context neural networks. arXiv:2111.07470. Retrieved from https:\/\/arxiv.org\/abs\/2111.07470","DOI":"10.1038\/s41467-022-32483-x"},{"key":"e_1_3_1_15_2","first-page":"3170","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Gao Zhangyang","year":"2022","unstructured":"Zhangyang Gao, Cheng Tan, Lirong Wu, and Stan Z. Li. 2022. SimVP: Simpler yet better video prediction. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 3170\u20133180."},{"key":"e_1_3_1_16_2","first-page":"1034","volume-title":"Proceedings of the IEEE International Conference on Data Mining","author":"Geng Yangliao","year":"2020","unstructured":"Yangliao Geng, Qingyong Li, Tianyang Lin, Jing Zhang, Liangtao Xu, Wen Yao, Dong Zheng, Weitao Lyu, and Heng Huang. 2020. 
A heterogeneous spatiotemporal network for lightning prediction. In Proceedings of the IEEE International Conference on Data Mining. 1034\u20131039."},{"key":"e_1_3_1_17_2","unstructured":"Alex Graves. 2013. Generating sequences with recurrent neural networks. arXiv:1308.0850. Retrieved from https:\/\/arxiv.org\/abs\/1308.0850"},{"key":"e_1_3_1_18_2","first-page":"11474","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Guen Vincent Le","year":"2020","unstructured":"Vincent Le Guen and Nicolas Thome. 2020. Disentangling physical dynamics from unknown factors for unsupervised video prediction. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 11474\u201311484."},{"key":"e_1_3_1_19_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Guibas John","year":"2022","unstructured":"John Guibas, Morteza Mardani, Zongyi Li, Andrew Tao, Anima Anandkumar, and Bryan Catanzaro. 2022. Adaptive fourier neural operators: Efficient token mixers for transformers. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_1_20_2","unstructured":"Jonathan Ho Nal Kalchbrenner Dirk Weissenborn and Tim Salimans. 2019. Axial attention in multidimensional transformers. arXiv:1912.12180. Retrieved from https:\/\/arxiv.org\/abs\/1912.12180"},{"key":"e_1_3_1_21_2","unstructured":"Andrew G. Howard Menglong Zhu Bo Chen Dmitry Kalenichenko Weijun Wang Tobias Weyand Marco Andreetto and Hartwig Adam. 2017. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861. 
Retrieved from https:\/\/arxiv.org\/abs\/1704.04861"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.179"},{"key":"e_1_3_1_23_2","first-page":"7333","volume-title":"Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing","author":"Kim Suyoun","year":"2021","unstructured":"Suyoun Kim, Yuan Shangguan, Jay Mahadeokar, Antoine Bruguier, Christian Fuegen, Michael L. Seltzer, and Duc Le. 2021. Improved neural language model fusion for streaming recurrent neural network transducer. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. 7333\u20137337."},{"issue":"12","key":"e_1_3_1_24_2","doi-asserted-by":"crossref","first-page":"3897","DOI":"10.1175\/MWR-D-21-0096.1","article-title":"Using deep learning to nowcast the spatial coverage of convection from Himawari-8 satellite data","volume":"149","author":"Lagerquist Ryan","year":"2021","unstructured":"Ryan Lagerquist, Jebb Q. Stewart, Imme Ebert-Uphoff, and Christina Kumler. 2021. Using deep learning to nowcast the spatial coverage of convection from Himawari-8 satellite data. Monthly Weather Review 149, 12 (2021), 3897\u20133921.","journal-title":"Monthly Weather Review"},{"issue":"3","key":"e_1_3_1_25_2","first-page":"2212","article-title":"Mcsip net: Multichannel satellite image prediction via deep neural network","volume":"58","author":"Lee Jae-Hyeok","year":"2019","unstructured":"Jae-Hyeok Lee, Sangmin S. Lee, Hak Gu Kim, Sa-Kwang Song, Seongchan Kim, and Yong Man Ro. 2019. Mcsip net: Multichannel satellite image prediction via deep neural network. 
IEEE Transactions on Geoscience and Remote Sensing 58, 3 (2019), 2212\u20132224.","journal-title":"IEEE Transactions on Geoscience and Remote Sensing"},{"key":"e_1_3_1_26_2","first-page":"3054","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Lee Sangmin","year":"2021","unstructured":"Sangmin Lee, Hak Gu Kim, Dae Hwi Choi, Hyung-Il Kim, and Yong Man Ro. 2021. Video prediction recalling long-term motion context via memory alignment learning. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 3054\u20133063."},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i07.6819"},{"key":"e_1_3_1_28_2","first-page":"14420","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Liu Xinyu","year":"2023","unstructured":"Xinyu Liu, Houwen Peng, Ningxin Zheng, Yuqing Yang, Han Hu, and Yixuan Yuan. 2023. EfficientViT: Memory efficient vision transformer with cascaded group attention. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 14420\u201314430."},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01264-9_8"},{"issue":"5","key":"e_1_3_1_31_2","doi-asserted-by":"crossref","first-page":"2557","DOI":"10.1109\/TGRS.2018.2874950","article-title":"Estimating summertime precipitation from himawari-8 and global forecast system based on machine learning","volume":"57","author":"Min Min","year":"2019","unstructured":"Min Min, Chen Bai, Jianping Guo, Fenglin Sun, Chao Liu, Fu Wang, Hui Xu, Shihao Tang, Bo Li, Di Di, Lixin Dong, and Jun Li. 2019. Estimating summertime precipitation from himawari-8 and global forecast system based on machine learning. 
IEEE Transactions on Geoscience and Remote Sensing 57, 5 (2019), 2557\u20132570.","journal-title":"IEEE Transactions on Geoscience and Remote Sensing"},{"issue":"6","key":"e_1_3_1_32_2","first-page":"18","article-title":"Cross-scale graph interaction network for semantic segmentation of remote sensing images","volume":"19","author":"Nie Jie","year":"2023","unstructured":"Jie Nie, Lei Huang, Chengyu Zheng, Xiaowei Lv, and Rui Wang. 2023. Cross-scale graph interaction network for semantic segmentation of remote sensing images. ACM Transactions on Multimedia Computing, Communications and Applications 19, 6 (2023), 18 pages.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"},{"key":"e_1_3_1_33_2","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Ning Shuliang","year":"2023","unstructured":"Shuliang Ning, Mengcheng Lan, Yanran Li, Chaofeng Chen, Qian Chen, Xunlai Chen, Xiaoguang Han, and Shuguang Cui. 2023. MIMO is all you need: A strong multi-in-multi-out baseline for video prediction. In Proceedings of the AAAI Conference on Artificial Intelligence."},{"key":"e_1_3_1_34_2","unstructured":"Jaideep Pathak Shashank Subramanian Peter Harrington Sanjeev Raja Ashesh Chattopadhyay Morteza Mardani Thorsten Kurth David Hall Zongyi Li Kamyar Azizzadenesheli Pedram Hassanzadeh Karthik Kashinath and Animashree Anandkumar. 2022. Fourcastnet: A global data-driven high-resolution weather model using adaptive fourier neural operators. arXiv:2202.11214. 
Retrieved from https:\/\/arxiv.org\/abs\/2202.11214"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41586-021-03854-z"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2020.114363"},{"key":"e_1_3_1_37_2","first-page":"1132","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Requena-Mesa Christian","year":"2021","unstructured":"Christian Requena-Mesa, Vitus Benson, Markus Reichstein, Jakob Runge, and Joachim Denzler. 2021. EarthNet2021: A large-scale dataset and challenge for earth surface forecasting as a guided video prediction task. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 1132\u20131142."},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00474"},{"key":"e_1_3_1_39_2","doi-asserted-by":"crossref","first-page":"2979","DOI":"10.1145\/3292500.3330682","volume-title":"Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","author":"Sch\u00f6n Christian","year":"2019","unstructured":"Christian Sch\u00f6n, Jens Dittrich, and Richard M\u00fcller. 2019. The error is the feature: How to forecast lightning using a model prediction error. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2979\u20132988."},{"key":"e_1_3_1_40_2","unstructured":"Minseok Seo Hakjin Lee Doyi Kim and Junghoon Seo. 2023. Implicit stacked autoregressive model for video prediction. arXiv:2303.07849. Retrieved from https:\/\/arxiv.org\/abs\/2303.07849"},{"key":"e_1_3_1_41_2","first-page":"802","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","author":"Shi X. J.","year":"2015","unstructured":"X. J. Shi, Z. R. Chen, H. Wang, D. Y. Yeung, W. K. Wong, and W. C. Woo. 2015. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. 
In Proceedings of the Advances in Neural Information Processing Systems. 802\u2013810."},{"key":"e_1_3_1_42_2","first-page":"5617","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","author":"Shi X. J.","year":"2017","unstructured":"X. J. Shi, Z. H. Gao, L. Lausen, H. Wang, D. Y. Yeung, W. K. Wong, and W. C. Woo. 2017. Deep learning for precipitation nowcasting: A benchmark and a new model. In Proceedings of the Advances in Neural Information Processing Systems. 5617\u20135627."},{"issue":"7","key":"e_1_3_1_43_2","doi-asserted-by":"crossref","first-page":"4155","DOI":"10.1109\/TGRS.2013.2280094","article-title":"Prediction of satellite image sequence for weather nowcasting using cluster-based spatiotemporal regression","volume":"52","author":"Shukla Bipasha Paul","year":"2013","unstructured":"Bipasha Paul Shukla, Chandra M. Kishtawal, and Pradip K. Pal. 2013. Prediction of satellite image sequence for weather nowcasting using cluster-based spatiotemporal regression. IEEE Transactions on Geoscience and Remote Sensing 52, 7 (2013), 4155\u20134160.","journal-title":"IEEE Transactions on Geoscience and Remote Sensing"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/TGRS.2013.2280094"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/LGRS.2010.2060311"},{"key":"e_1_3_1_46_2","first-page":"13714","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","author":"Su Jiahao","year":"2020","unstructured":"Jiahao Su, Wonmin Byeon, Jean Kossaifi, Furong Huang, Jan Kautz, and Anima Anandkumar. 2020. Convolutional tensor-train LSTM for spatio-temporal learning. In Proceedings of the Advances in Neural Information Processing Systems. 
13714\u201313726."},{"key":"e_1_3_1_47_2","first-page":"18727","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Sun Mingzhen","year":"2023","unstructured":"Mingzhen Sun, Weining Wang, Xinxin Zhu, and Jing Liu. 2023. MOSO: Decomposing motion, scene and object for video prediction. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 18727\u201318737."},{"key":"e_1_3_1_48_2","unstructured":"Cheng Tan Zhangyang Gao Siyuan Li and Stan Z. Li. 2022. Simvp: Towards simple yet powerful spatiotemporal predictive learning. arXiv:2211.12509. Retrieved from https:\/\/arxiv.org\/abs\/2211.12509"},{"key":"e_1_3_1_49_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00293"},{"issue":"2","key":"e_1_3_1_50_2","first-page":"11","article-title":"MILL: Channel attention-based deep multiple instance learning for landslide recognition","volume":"17","author":"Tang Xiaochuan","year":"2021","unstructured":"Xiaochuan Tang, Mingzhe Liu, Hao Zhong, Yuanzhen Ju, Weile Li, and Qiang Xu. 2021. MILL: Channel attention-based deep multiple instance learning for landslide recognition. ACM Transactions on Multimedia Computing, Communications and Applications 17, 2s (2021), 11 pages.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"},{"key":"e_1_3_1_51_2","first-page":"402","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Teed Zachary","year":"2020","unstructured":"Zachary Teed and Jia Deng. 2020. RAFT: Recurrent all-pairs field transforms for optical flow. In Proceedings of the European Conference on Computer Vision. 
402\u2013419."},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2021.01.036"},{"key":"e_1_3_1_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/MCE.2022.3181759"},{"key":"e_1_3_1_54_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58548-8_7"},{"key":"e_1_3_1_55_2","first-page":"5123","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Wang Yunbo","year":"2018","unstructured":"Yunbo Wang, Zhifeng Gao, Mingsheng Long, Jianmin Wang, and S. Yu Philip. 2018. Predrnn++: Towards a resolution of the deep-in-time dilemma in spatiotemporal predictive learning. In Proceedings of the International Conference on Machine Learning. 5123\u20135132."},{"key":"e_1_3_1_56_2","first-page":"9154","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wang Yunbo","year":"2019","unstructured":"Yunbo Wang, Jianjin Zhang, Hongyu Zhu, Mingsheng Long, Jianmin Wang, and Philip S. Yu. 2019. Memory in memory: A predictive neural network for learning higher-order non-stationarity from spatiotemporal dynamics. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 9154\u20139162."},{"key":"e_1_3_1_57_2","first-page":"879","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","author":"Wang Y. B.","year":"2017","unstructured":"Y. B. Wang, M. S. Long, J. M. Wang, Z. F. Gao, and P. S. Yu. 2017. PredRNN: Recurrent neural networks for predictive learning using spatiotemporal LSTMs. In Proceedings of the Advances in Neural Information Processing Systems. 879\u2013888."},{"key":"e_1_3_1_58_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.01099"},{"key":"e_1_3_1_59_2","first-page":"15435","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wu Haixu","year":"2021","unstructured":"Haixu Wu, Zhiyu Yao, Mingsheng Long, and Jianmin Wan. 2021. 
MotionRNN: A flexible model for video prediction with spacetime-varying motions. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 15435\u201315444."},{"key":"e_1_3_1_60_2","first-page":"8121","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Xu Haofei","year":"2022","unstructured":"Haofei Xu, Jing Zhang, Jianfei Cai, Hamid Rezatofighi, and Dacheng Tao. 2022. GMFlow: Learning optical flow via global matching. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 8121\u20138130."},{"issue":"2","key":"e_1_3_1_61_2","first-page":"21","article-title":"TripRes: Traffic flow prediction driven resource reservation for multimedia IoV with edge computing","volume":"17","author":"Xu Xiaolong","year":"2021","unstructured":"Xiaolong Xu, Zijie Fang, Lianyong Qi, Xuyun Zhang, Qiang He, and Xiaokang Zhou. 2021. TripRes: Traffic flow prediction driven resource reservation for multimedia IoV with edge computing. ACM Transactions on Multimedia Computing, Communications and Applications 17, 2 (2021), 21 pages.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"},{"key":"e_1_3_1_62_2","first-page":"1","volume-title":"Proceedings of the IEEE International Conference on Communications","author":"Xu Z.","year":"2019","unstructured":"Z. Xu, J. Du, J. J. Wang, C. X. Jiang, and Y. Ren. 2019. Satellite image prediction relying on GAN and LSTM neural networks. In Proceedings of the IEEE International Conference on Communications. 1\u20136."},{"key":"e_1_3_1_63_2","first-page":"2940","volume-title":"Proceedings of the International Joint Conference on Artificial Intelligence","author":"Xu Ziru","year":"2018","unstructured":"Ziru Xu, Yunbo Wang, Mingsheng Long, Jianmin Wang, and M KLiss. 2018. PredCNN: Predictive learning with cascade convolutions. 
In Proceedings of the International Joint Conference on Artificial Intelligence. 2940\u20132947."},{"key":"e_1_3_1_64_2","unstructured":"Wilson Yan Yunzhi Zhang Pieter Abbeel and Aravind Srinivas. 2021. Videogpt: Video generation using vq-vae and transformers. arXiv:2104.10157. Retrieved from https:\/\/arxiv.org\/abs\/2104.10157"},{"key":"e_1_3_1_65_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Yu Wei","year":"2020","unstructured":"Wei Yu, Yichao Lu, Steve Easterbrook, and Sanja Fidler. 2020. Efficient and information-preserving future frame prediction and beyond. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_1_66_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00564"},{"key":"e_1_3_1_67_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00716"},{"issue":"7970","key":"e_1_3_1_68_2","doi-asserted-by":"crossref","first-page":"526","DOI":"10.1038\/s41586-023-06184-4","article-title":"Skilful nowcasting of extreme precipitation with NowcastNet","volume":"619","author":"Zhang Yuchen","year":"2023","unstructured":"Yuchen Zhang, Mingsheng Long, Kaiyuan Chen, Lanxiang Xing, Ronghua Jin, Michael I. Jordan, and Jianmin Wang. 2023. Skilful nowcasting of extreme precipitation with NowcastNet. Nature 619, 7970 (2023), 526\u2013532.","journal-title":"Nature"},{"key":"e_1_3_1_69_2","doi-asserted-by":"publisher","DOI":"10.1109\/TGRS.2023.3336471"},{"key":"e_1_3_1_70_2","article-title":"Scale-semantic joint decoupling network for image-text retrieval in remote sensing","author":"Zheng Chengyu","year":"2023","unstructured":"Chengyu Zheng, Ning Song, Ruoyu Zhang, Lei Huang, Zhiqiang Wei, and Jie Nie. 2023. Scale-semantic joint decoupling network for image-text retrieval in remote sensing. 
ACM Transactions on Multimedia Computing, Communications and Applications 20, 1 (2023), 20.","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3638773","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3638773","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:03:34Z","timestamp":1750291414000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3638773"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,1,22]]},"references-count":69,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2024,5,31]]}},"alternative-id":["10.1145\/3638773"],"URL":"https:\/\/doi.org\/10.1145\/3638773","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"type":"print","value":"1551-6857"},{"type":"electronic","value":"1551-6865"}],"subject":[],"published":{"date-parts":[[2024,1,22]]},"assertion":[{"value":"2023-07-16","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-12-23","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-01-22","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}