{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,17]],"date-time":"2026-01-17T21:16:45Z","timestamp":1768684605327,"version":"3.49.0"},"reference-count":56,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2021,11,29]],"date-time":"2021-11-29T00:00:00Z","timestamp":1638144000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"Natural Science Foundation of China","doi-asserted-by":"crossref","award":["U20B2053 and 61872022"],"award-info":[{"award-number":["U20B2053 and 61872022"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100011347","name":"State Key Laboratory of Software Development Environment","doi-asserted-by":"crossref","award":["SKLSDE-2020ZX-12"],"award-info":[{"award-number":["SKLSDE-2020ZX-12"]}],"id":[{"id":"10.13039\/501100011347","id-type":"DOI","asserted-by":"crossref"}]},{"name":"CAAI-Huawei MindSpore Open Fund"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Intell. Syst. Technol."],"published-print":{"date-parts":[[2021,12,31]]},"abstract":"<jats:p>The spatial-temporal modeling on long sequences is of great importance in many real-world applications. Recent studies have shown the potential of applying the self-attention mechanism to improve capturing the complex spatial-temporal dependencies. However, the lack of underlying structure information weakens its general performance on long sequence spatial-temporal problem. To overcome this limitation, we proposed a novel method, named the Proximity-aware Long Sequence Learning framework, and apply it to the spatial-temporal forecasting task. The model substitutes the canonical self-attention by leveraging the proximity-aware attention, which enhances local structure clues in building long-range dependencies with a linear approximation of attention scores. The relief adjacency matrix technique can utilize the historical global graph information for consistent proximity learning. Meanwhile, the reduced decoder allows for fast inference in a non-autoregressive manner. 
Extensive experiments on five large-scale datasets demonstrate that our method achieves state-of-the-art performance and validate the effectiveness of local structure information.<\/jats:p>","DOI":"10.1145\/3447987","type":"journal-article","created":{"date-parts":[[2021,11,29]],"date-time":"2021-11-29T16:56:52Z","timestamp":1638205012000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["POLLA: Enhancing the Local Structure Awareness in Long Sequence Spatial-temporal Modeling"],"prefix":"10.1145","volume":"12","author":[{"given":"Haoyi","family":"Zhou","sequence":"first","affiliation":[{"name":"Beihang University, Beijing, China"}]},{"given":"Hao","family":"Peng","sequence":"additional","affiliation":[{"name":"Beihang University, Beijing, China"}]},{"given":"Jieqi","family":"Peng","sequence":"additional","affiliation":[{"name":"Beihang University, Beijing, China"}]},{"given":"Shuai","family":"Zhang","sequence":"additional","affiliation":[{"name":"Beihang University, Beijing, China"}]},{"given":"Jianxin","family":"Li","sequence":"additional","affiliation":[{"name":"Beihang University, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2021,11,29]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.apenergy.2019.01.013"},{"key":"e_1_3_2_3_2","series-title":"ICML\u201919","first-page":"21","volume":"97","author":"Abu-El-Haija Sami","year":"2019","unstructured":"Sami Abu-El-Haija, Bryan Perozzi, Amol Kapoor, Nazanin Alipourfard, Kristina Lerman, Hrayr Harutyunyan, Greg Ver Steeg, and Aram Galstyan. 2019. MixHop: Higher-order graph convolutional architectures via sparsified neighborhood mixing. In ICML\u201919, Proceedings of Machine Learning Research, Vol. 97. 21\u201329."},{"key":"e_1_3_2_4_2","unstructured":"Iz Beltagy Matthew E. Peters and Arman Cohan. 2020. Longformer: The long-document transformer. arXiv:2004.05150. Retrieved from https:\/\/arxiv.org\/abs\/2004.05150."},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.3390\/en11071636"},{"key":"e_1_3_2_6_2","unstructured":"Rewon Child Scott Gray Alec Radford and Ilya Sutskever. 2019. Generating long sequences with sparse transformers. arXiv:1904.10509. Retrieved from https:\/\/arxiv.org\/abs\/1904.10509."},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.5555\/3157382.3157527"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-018-1291-x"},{"key":"e_1_3_2_9_2","first-page":"4171","volume-title":"NAACL-HLT\u201919, Volume 1 (Long and Short Papers)","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In NAACL-HLT\u201919, Volume 1 (Long and Short Papers). Association for Computational Linguistics, 4171\u20134186."},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.5555\/2998981.2999003"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2019.2900481"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/3178876.3186058"},{"key":"e_1_3_2_13_2","unstructured":"Alex Graves. 2013. Generating sequences with recurrent neural networks. arXiv:1308.0850. 
Retrieved from https:\/\/arxiv.org\/abs\/1308.0850."},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2019.2906365"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.2307\/j.ctv14jx6sm"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/3269206.3271793"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4612-4380-9_35"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.3141\/1857-09"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cageo.2004.05.012"},{"key":"e_1_3_2_21_2","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1007\/978-3-030-01418-6_7","volume-title":"ICANN\u201918","author":"Karatzoglou Antonios","year":"2018","unstructured":"Antonios Karatzoglou, Nikolai Schnell, and Michael Beigl. 2018. A convolutional neural network approach for modeling semantic trajectories and predicting future locations. In ICANN\u201918, Lecture Notes in Computer Science, Vol. 11139. Springer, 61\u201372."},{"key":"e_1_3_2_22_2","unstructured":"Angelos Katharopoulos Apoorv Vyas Nikolaos Pappas and Fran\u00e7ois Fleuret. 2020. Transformers are RNNs: Fast autoregressive transformers with linear attention. arXiv:2006.16236. Retrieved from https:\/\/arxiv.org\/abs\/2006.16236."},{"key":"e_1_3_2_23_2","unstructured":"Seongchan Kim Seungkyun Hong Minsu Joh and Sa-Kwang Song. 2017. DeepRain: ConvLSTM network for precipitation prediction using multichannel radar data. arXiv:1711.02316. Retrieved from https:\/\/arxiv.org\/abs\/1711.02316."},{"key":"e_1_3_2_24_2","volume-title":"ICLR 2017","author":"Kipf Thomas N.","year":"2017","unstructured":"Thomas N. Kipf and Max Welling. 2017. Semi-supervised classification with graph convolutional networks. In ICLR 2017. OpenReview.net."},{"key":"e_1_3_2_25_2","unstructured":"Nikita Kitaev Lukasz Kaiser and Anselm Levskaya. 2020. Reformer: The efficient transformer. arXiv:2001.04451. Retrieved from https:\/\/arxiv.org\/abs\/2001.04451."},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.5555\/3504035.3504469"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.5555\/3454287.3454758"},{"key":"e_1_3_2_28_2","volume-title":"ICLR\u201918","author":"Li Yaguang","year":"2018","unstructured":"Yaguang Li, Rose Yu, Cyrus Shahabi, and Yan Liu. 2018. Diffusion convolutional recurrent neural network: Data-driven traffic forecasting. In ICLR\u201918. OpenReview.net."},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.5555\/3015812.3015841"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.5555\/3304222.3304250"},{"key":"e_1_3_2_31_2","unstructured":"Mengzhang Li and Zhanxing Zhu. 2020. Spatial-temporal fusion graph neural networks for traffic flow forecasting. arXiv:2012.09641 [cs.LG]. Retrieved from https:\/\/arxiv.org\/abs\/2012.09641."},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1145\/3340531.3411940"},{"key":"e_1_3_2_33_2","unstructured":"Cheonbok Park Chunggi Lee Hyojin Bahng Taeyun Won Kihwan Kim Seungmin Jin Sungahn Ko and Jaegul Choo. 2019. STGRAT: A spatio-temporal graph attention network for traffic forecasting. arXiv:1911.13181. 
Retrieved from https:\/\/arxiv.org\/abs\/1911.13181."},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2020.01.043"},{"key":"e_1_3_2_35_2","unstructured":"Chiara Plizzari Marco Cannici and Matteo Matteucci. 2020. Spatial temporal transformer network for skeleton-based action recognition. arXiv:2008.07404. Retrieved from https:\/\/arxiv.org\/abs\/2008.07404."},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.5555\/2969239.2969329"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2012.2235192"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.5555\/2968826.2968890"},{"key":"e_1_3_2_39_2","first-page":"914","volume-title":"AAAI\u201920","author":"Song Chao","year":"2020","unstructured":"Chao Song, Youfang Lin, Shengnan Guo, and Huaiyu Wan. 2020. Spatial-temporal synchronous graph convolutional networks: A new framework for spatial-temporal network data forecasting. In AAAI\u201920. AAAI Press, 914\u2013921."},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1080\/00031305.2017.1380080"},{"key":"e_1_3_2_41_2","first-page":"4344","volume-title":"EMNLP-IJCNLP\u201919","author":"Tsai Yao-Hung Hubert","year":"2019","unstructured":"Yao-Hung Hubert Tsai, Shaojie Bai, Makoto Yamada, Louis-Philippe Morency, and Ruslan Salakhutdinov. 2019. Transformer dissection: An unified understanding for transformer\u2019s attention via the lens of kernel. In EMNLP-IJCNLP\u201919. Association for Computational Linguistics, 4344\u20134353."},{"key":"e_1_3_2_42_2","first-page":"125","volume-title":"Proceedings of the 9th ISCA Speech Synthesis Workshop","author":"van den Oord A\u00e4ron","year":"2016","unstructured":"A\u00e4ron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew W. Senior, and Koray Kavukcuoglu. 2016. WaveNet: A generative model for raw audio. In Proceedings of the 9th ISCA Speech Synthesis Workshop. ISCA, 125."},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295349"},{"key":"e_1_3_2_44_2","unstructured":"Bao Wang Xiyang Luo Fangbo Zhang Baichuan Yuan Andrea L. Bertozzi and P. Jeffrey Brantingham. 2018. Graph-based deep modeling and real time forecasting of sparse spatio-temporal data. arXiv:1804.00684. Retrieved from https:\/\/arxiv.org\/abs\/1804.00684."},{"key":"e_1_3_2_45_2","unstructured":"Leye Wang Xu Geng Xiaojuan Ma Feng Liu and Qiang Yang. 2018. Crowd flow prediction by deep spatio-temporal transfer learning. arXiv:1802.00386. Retrieved from https:\/\/arxiv.org\/abs\/1802.00386."},{"key":"e_1_3_2_46_2","unstructured":"Senzhang Wang Jiannong Cao and Philip S. Yu. 2019. Deep learning for spatio-temporal data mining: A survey. arXiv:1906.04928. Retrieved from https:\/\/arxiv.org\/abs\/1906.04928."},{"key":"e_1_3_2_47_2","unstructured":"Sinong Wang Belinda Z. Li Madian Khabsa Han Fang and Hao Ma. 2020. Linformer: Self-attention with linear complexity. arXiv:2006.04768. 
Retrieved from https:\/\/arxiv.org\/abs\/2006.04768."},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1061\/(ASCE)0733-947X(2003)129:6(664)"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.5555\/3367243.3367303"},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.5555\/3504035.3504947"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1145\/1899441.1899446"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.5555\/3304222.3304273"},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219922"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2019.2955794"},{"key":"e_1_3_2_55_2","first-page":"1234","volume-title":"AAAI\u201920","author":"Zheng Chuanpan","year":"2020","unstructured":"Chuanpan Zheng, Xiaoliang Fan, Cheng Wang, and Jianzhong Qi. 2020. GMAN: A graph multi-attention network for traffic prediction. In AAAI\u201920. AAAI Press, 1234\u20131241."},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2019.2932785"},{"key":"e_1_3_2_57_2","volume-title":"AAAI\u201921","author":"Zhou Haoyi","year":"2021","unstructured":"Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui Xiong, and Wancai Zhang. 2021. Informer: Beyond efficient transformer for long sequence time-series forecasting. In AAAI\u201921. AAAI Press."}],"container-title":["ACM Transactions on Intelligent Systems and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3447987","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3447987","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:28:24Z","timestamp":1750195704000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3447987"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11,29]]},"references-count":56,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2021,12,31]]}},"alternative-id":["10.1145\/3447987"],"URL":"https:\/\/doi.org\/10.1145\/3447987","relation":{},"ISSN":["2157-6904","2157-6912"],"issn-type":[{"value":"2157-6904","type":"print"},{"value":"2157-6912","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,11,29]]},"assertion":[{"value":"2020-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-01-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-11-29","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}
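
The abstract above describes a "linear approximation of attention scores." A minimal sketch of kernelized linear attention in the spirit of the cited Katharopoulos et al. 2020 reference (ref. e_1_3_2_22_2), not the authors' exact POLLA formulation; all names below are illustrative:

import numpy as np

# Sketch of kernelized linear attention (after Katharopoulos et al. 2020;
# NOT the exact POLLA proximity-aware variant). softmax(Q K^T) V is
# approximated by phi(Q) (phi(K)^T V) with a positive feature map phi,
# reducing cost from O(L^2) to O(L) in the sequence length L.
def elu_feature_map(x):
    return np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1, always positive

def linear_attention(Q, K, V):
    Qp, Kp = elu_feature_map(Q), elu_feature_map(K)  # (L, d) each
    kv = Kp.T @ V                                    # (d, d_v), built once in O(L)
    z = Qp @ Kp.sum(axis=0)                          # (L,) normalizer
    return (Qp @ kv) / z[:, None]                    # (L, d_v)

L, d = 1024, 64
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, L, d))
out = linear_attention(Q, K, V)
print(out.shape)  # (1024, 64), computed without the L x L score matrix

The non-causal (full) form shown here matches the abstract's non-autoregressive decoder setting, where all outputs attend over the whole input at once.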
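For readers consuming this record programmatically: it has the shape of a Crossref REST API work message, where the envelope carries "status" and "message-type" and the bibliographic payload sits under "message". A minimal sketch of fetching and reading such a record through Crossref's public api.crossref.org/works/{DOI} endpoint, assuming network access (fetch_work is an illustrative helper, not part of any library):

import json
import urllib.request

def fetch_work(doi: str) -> dict:
    # Crossref's public REST API returns {"status", "message-type", "message"}
    # exactly as in the record above; the work metadata is under "message".
    url = f"https://api.crossref.org/works/{doi}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["message"]

work = fetch_work("10.1145/3447987")
print(work["title"][0])            # POLLA: Enhancing the Local Structure Awareness ...
print(work["container-title"][0])  # ACM Transactions on Intelligent Systems and Technology
print(len(work.get("reference", [])), "references")  # 56 references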