{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,28]],"date-time":"2025-10-28T10:57:55Z","timestamp":1761649075624,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":51,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,10,17]],"date-time":"2022-10-17T00:00:00Z","timestamp":1665964800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,10,17]]},"DOI":"10.1145\/3511808.3557095","type":"proceedings-article","created":{"date-parts":[[2022,10,16]],"date-time":"2022-10-16T01:29:57Z","timestamp":1665883797000},"page":"3312-3321","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Sparse Attentive Memory Network for Click-through Rate Prediction with Long Sequences"],"prefix":"10.1145","author":[{"given":"Qianying","family":"Lin","sequence":"first","affiliation":[{"name":"Alibaba Group, Hangzhou, China"}]},{"given":"Wen-Ji","family":"Zhou","sequence":"additional","affiliation":[{"name":"Alibaba Group, Hangzhou, China"}]},{"given":"Yanshi","family":"Wang","sequence":"additional","affiliation":[{"name":"Alibaba Group, Hangzhou, China"}]},{"given":"Qing","family":"Da","sequence":"additional","affiliation":[{"name":"Alibaba Group, Hangzhou, China"}]},{"given":"Qing-Guo","family":"Chen","sequence":"additional","affiliation":[{"name":"Alibaba Group, Hangzhou, China"}]},{"given":"Bing","family":"Wang","sequence":"additional","affiliation":[{"name":"Alibaba Group, Hangzhou, China"}]}],"member":"320","published-online":{"date-parts":[[2022,10,17]]},"reference":[{"key":"e_1_3_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1526709.1526713"},{"key":"e_1_3_2_2_2_1","volume-title":"Neural machine 
translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473","author":"Bahdanau Dzmitry","year":"2014","unstructured":"Dzmitry Bahdanau , Kyunghyun Cho , and Yoshua Bengio . 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 ( 2014 ). Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014)."},{"key":"e_1_3_2_2_3_1","unstructured":"Jos\u00e9 M. Cecilia Jos\u00e9 M. Garc\u00eda and Manuel Ujald\u00f3n. 2009. The GPU on the Matrix-Matrix Multiply: Performance Study and Contributions. In PARCO. Jos\u00e9 M. Cecilia Jos\u00e9 M. Garc\u00eda and Manuel Ujald\u00f3n. 2009. The GPU on the Matrix-Matrix Multiply: Performance Study and Contributions. In PARCO."},{"key":"e_1_3_2_2_4_1","volume-title":"attend and spell. arXiv preprint arXiv:1508.01211","author":"Chan William","year":"2015","unstructured":"William Chan , Navdeep Jaitly , Quoc V Le , and Oriol Vinyals . 2015. Listen , attend and spell. arXiv preprint arXiv:1508.01211 ( 2015 ). William Chan, Navdeep Jaitly, Quoc V Le, and Oriol Vinyals. 2015. Listen, attend and spell. arXiv preprint arXiv:1508.01211 (2015)."},{"key":"e_1_3_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3465055"},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3326937.3341261"},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3159652.3159668"},{"key":"e_1_3_2_2_8_1","volume-title":"Generating long sequences with sparse transformers. arXiv preprint arXiv:1904.10509","author":"Child Rewon","year":"2019","unstructured":"Rewon Child , Scott Gray , Alec Radford , and Ilya Sutskever . 2019. Generating long sequences with sparse transformers. arXiv preprint arXiv:1904.10509 ( 2019 ). Rewon Child, Scott Gray, Alec Radford, and Ilya Sutskever. 2019. Generating long sequences with sparse transformers. 
arXiv preprint arXiv:1904.10509 (2019)."},{"key":"e_1_3_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2959100.2959190"},{"key":"e_1_3_2_2_10_1","volume-title":"Attention-over-attention neural networks for reading comprehension. arXiv preprint arXiv:1607.04423","author":"Cui Yiming","year":"2016","unstructured":"Yiming Cui , Zhipeng Chen , Si Wei , Shijin Wang , Ting Liu , and Guoping Hu. 2016. Attention-over-attention neural networks for reading comprehension. arXiv preprint arXiv:1607.04423 ( 2016 ). Yiming Cui, Zhipeng Chen, Si Wei, Shijin Wang, Ting Liu, and Guoping Hu. 2016. Attention-over-attention neural networks for reading comprehension. arXiv preprint arXiv:1607.04423 (2016)."},{"key":"e_1_3_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2015.98"},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1058129.1058148"},{"key":"e_1_3_2_2_13_1","volume-title":"What does attention in neural machine translation pay attention to? arXiv preprint arXiv:1710.03348","author":"Ghader Hamidreza","year":"2017","unstructured":"Hamidreza Ghader and Christof Monz . 2017. What does attention in neural machine translation pay attention to? arXiv preprint arXiv:1710.03348 ( 2017 ). Hamidreza Ghader and Christof Monz. 2017. What does attention in neural machine translation pay attention to? arXiv preprint arXiv:1710.03348 (2017)."},{"key":"e_1_3_2_2_14_1","volume-title":"Neural turing machines. arXiv preprint arXiv:1410.5401","author":"Graves Alex","year":"2014","unstructured":"Alex Graves , Greg Wayne , and Ivo Danihelka . 2014. Neural turing machines. arXiv preprint arXiv:1410.5401 ( 2014 ). Alex Graves, Greg Wayne, and Ivo Danihelka. 2014. Neural turing machines. arXiv preprint arXiv:1410.5401 (2014)."},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3366423.3380190"},{"key":"e_1_3_2_2_16_1","volume-title":"Session-based recommendations with recurrent neural networks. 
arXiv preprint arXiv:1511.06939","author":"Hidasi Bal\u00e1zs","year":"2015","unstructured":"Bal\u00e1zs Hidasi , Alexandros Karatzoglou , Linas Baltrunas , and Domonkos Tikk . 2015. Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939 ( 2015 ). Bal\u00e1zs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, and Domonkos Tikk. 2015. Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939 (2015)."},{"key":"e_1_3_2_2_17_1","volume-title":"Long short-term memory. Neural computation 9, 8","author":"Hochreiter Sepp","year":"1997","unstructured":"Sepp Hochreiter and J\u00fcrgen Schmidhuber . 1997. Long short-term memory. Neural computation 9, 8 ( 1997 ), 1735--1780. Sepp Hochreiter and J\u00fcrgen Schmidhuber. 1997. Long short-term memory. Neural computation 9, 8 (1997), 1735--1780."},{"key":"e_1_3_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2959100.2959134"},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2018.00035"},{"key":"e_1_3_2_2_20_1","volume-title":"Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980","author":"Kingma Diederik P","year":"2014","unstructured":"Diederik P Kingma and Jimmy Ba . 2014 . Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014). Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)."},{"key":"e_1_3_2_2_21_1","volume-title":"Reformer: The efficient transformer. arXiv preprint arXiv:2001.04451","author":"Kitaev Nikita","year":"2020","unstructured":"Nikita Kitaev , Lukasz Kaiser , and Anselm Levskaya . 2020 . Reformer: The efficient transformer. arXiv preprint arXiv:2001.04451 (2020). Nikita Kitaev, Lukasz Kaiser, and Anselm Levskaya. 2020. Reformer: The efficient transformer. 
arXiv preprint arXiv:2001.04451 (2020)."},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1557019.1557072"},{"key":"e_1_3_2_2_23_1","volume-title":"International conference on machine learning. PMLR, 1378--1387","author":"Kumar Ankit","year":"2016","unstructured":"Ankit Kumar , Ozan Irsoy , Peter Ondruska , Mohit Iyyer , James Bradbury , Ishaan Gulrajani , Victor Zhong , Romain Paulus , and Richard Socher . 2016 . Ask me anything: Dynamic memory networks for natural language processing . In International conference on machine learning. PMLR, 1378--1387 . Ankit Kumar, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus, and Richard Socher. 2016. Ask me anything: Dynamic memory networks for natural language processing. In International conference on machine learning. PMLR, 1378--1387."},{"key":"e_1_3_2_2_24_1","first-page":"16997","article-title":"Sac: Accelerating and structuring self-attention via sparse adaptive connection","volume":"33","author":"Li Xiaoya","year":"2020","unstructured":"Xiaoya Li , Yuxian Meng , Mingxin Zhou , Qinghong Han , Fei Wu , and Jiwei Li . 2020 . Sac: Accelerating and structuring self-attention via sparse adaptive connection . Advances in Neural Information Processing Systems 33 (2020), 16997 -- 17008 . Xiaoya Li, Yuxian Meng, Mingxin Zhou, Qinghong Han, Fei Wu, and Jiwei Li. 2020. Sac: Accelerating and structuring self-attention via sparse adaptive connection. Advances in Neural Information Processing Systems 33 (2020), 16997--17008.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3220023"},{"key":"e_1_3_2_2_26_1","volume-title":"Mo Yu, Bing Xiang, Bowen Zhou, and Yoshua Bengio.","author":"Lin Zhouhan","year":"2017","unstructured":"Zhouhan Lin , Minwei Feng , Cicero Nogueira dos Santos , Mo Yu, Bing Xiang, Bowen Zhou, and Yoshua Bengio. 2017 . 
A structured self-attentive sentence embedding. arXiv preprint arXiv:1703.03130 (2017). Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, and Yoshua Bengio. 2017. A structured self-attentive sentence embedding. arXiv preprint arXiv:1703.03130 (2017)."},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i04.5945"},{"key":"e_1_3_2_2_28_1","volume-title":"Kiran Kumar and Kishore Kothapalli","author":"Krishna Bharadwaj Matam Siva Rama","year":"2012","unstructured":"Siva Rama Krishna Bharadwaj Matam , Kiran Kumar and Kishore Kothapalli . 2012 . Sparse matrix matrix multiplication on hybrid CPU GPU platforms. In PARCO. Siva Rama Krishna Bharadwaj Matam, Kiran Kumar and Kishore Kothapalli. 2012. Sparse matrix matrix multiplication on hybrid CPU GPU platforms. In PARCO."},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2766462.2767755"},{"key":"e_1_3_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330666"},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401440"},{"key":"e_1_3_2_2_32_1","volume-title":"cosFormer: Rethinking Softmax in Attention. arXiv preprint arXiv:2202.08791","author":"Qin Zhen","year":"2022","unstructured":"Zhen Qin , Weixuan Sun , Hui Deng , Dongxu Li , Yunshen Wei , Baohong Lv , Junjie Yan , Lingpeng Kong , and Yiran Zhong . 2022. cosFormer: Rethinking Softmax in Attention. arXiv preprint arXiv:2202.08791 ( 2022 ). Zhen Qin, Weixuan Sun, Hui Deng, Dongxu Li, Yunshen Wei, Baohong Lv, Junjie Yan, Lingpeng Kong, and Yiran Zhong. 2022. cosFormer: Rethinking Softmax in Attention. arXiv preprint arXiv:2202.08791 (2022)."},{"key":"e_1_3_2_2_33_1","volume-title":"Compressive transformers for long-range sequence modelling. arXiv preprint arXiv:1911.05507","author":"Rae Jack W","year":"2019","unstructured":"Jack W Rae , Anna Potapenko , Siddhant M Jayakumar , and Timothy P Lillicrap . 2019. 
Compressive transformers for long-range sequence modelling. arXiv preprint arXiv:1911.05507 ( 2019 ). Jack W Rae, Anna Potapenko, Siddhant M Jayakumar, and Timothy P Lillicrap. 2019. Compressive transformers for long-range sequence modelling. arXiv preprint arXiv:1911.05507 (2019)."},{"key":"e_1_3_2_2_34_1","volume-title":"Ask me even more: dynamic memory tensor networks (extended model). arXiv preprint arXiv:1703.03939","author":"Ramachandran Govardana Sachithanandam","year":"2017","unstructured":"Govardana Sachithanandam Ramachandran and Ajay Sohmshetty . 2017. Ask me even more: dynamic memory tensor networks (extended model). arXiv preprint arXiv:1703.03939 ( 2017 ). Govardana Sachithanandam Ramachandran and Ajay Sohmshetty. 2017. Ask me even more: dynamic memory tensor networks (extended model). arXiv preprint arXiv:1703.03939 (2017)."},{"key":"e_1_3_2_2_35_1","volume-title":"Stand-alone self-attention in vision models. Advances in Neural Information Processing Systems 32","author":"Ramachandran Prajit","year":"2019","unstructured":"Prajit Ramachandran , Niki Parmar , Ashish Vaswani , Irwan Bello , Anselm Levskaya , and Jon Shlens . 2019. Stand-alone self-attention in vision models. Advances in Neural Information Processing Systems 32 ( 2019 ). Prajit Ramachandran, Niki Parmar, Ashish Vaswani, Irwan Bello, Anselm Levskaya, and Jon Shlens. 2019. Stand-alone self-attention in vision models. Advances in Neural Information Processing Systems 32 (2019)."},{"key":"e_1_3_2_2_36_1","volume-title":"Toward training recurrent neural networks for lifelong learning. Neural computation 32, 1","author":"Sodhani Shagun","year":"2020","unstructured":"Shagun Sodhani , Sarath Chandar , and Yoshua Bengio . 2020. Toward training recurrent neural networks for lifelong learning. Neural computation 32, 1 ( 2020 ), 1--35. Shagun Sodhani, Sarath Chandar, and Yoshua Bengio. 2020. Toward training recurrent neural networks for lifelong learning. 
Neural computation 32, 1 (2020), 1--35."},{"key":"e_1_3_2_2_37_1","volume-title":"et al","author":"Tan Qiaoyu","year":"2021","unstructured":"Qiaoyu Tan , Jianwei Zhang , Ninghao Liu , Xiao Huang , Hongxia Yang , Jignren Zhou , Xia Hu , et al . 2021 . Dynamic memory based attention network for sequential recommendation. arXiv preprint arXiv:2102.09269 (2021). Qiaoyu Tan, Jianwei Zhang, Ninghao Liu, Xiao Huang, Hongxia Yang, Jignren Zhou, Xia Hu, et al . 2021. Dynamic memory based attention network for sequential recommendation. arXiv preprint arXiv:2102.09269 (2021)."},{"key":"e_1_3_2_2_38_1","volume-title":"Attention is all you need. Advances in neural information processing systems 30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani , Noam Shazeer , Niki Parmar , Jakob Uszkoreit , Llion Jones , Aidan N Gomez , Lukasz Kaiser , and Illia Polosukhin . 2017. Attention is all you need. Advances in neural information processing systems 30 ( 2017 ). Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Advances in neural information processing systems 30 (2017)."},{"key":"e_1_3_2_2_39_1","volume-title":"Analyzing the structure of attention in a transformer language model. arXiv preprint arXiv:1906.04284","author":"Vig Jesse","year":"2019","unstructured":"Jesse Vig and Yonatan Belinkov . 2019. Analyzing the structure of attention in a transformer language model. arXiv preprint arXiv:1906.04284 ( 2019 ). Jesse Vig and Yonatan Belinkov. 2019. Analyzing the structure of attention in a transformer language model. arXiv preprint arXiv:1906.04284 (2019)."},{"key":"e_1_3_2_2_40_1","volume-title":"Linformer: Self-attention with linear complexity. arXiv preprint arXiv:2006.04768","author":"Wang Sinong","year":"2020","unstructured":"Sinong Wang , Belinda Z Li , Madian Khabsa , Han Fang , and Hao Ma . 2020 . Linformer: Self-attention with linear complexity. 
arXiv preprint arXiv:2006.04768 (2020). Sinong Wang, Belinda Z Li, Madian Khabsa, Han Fang, and Hao Ma. 2020. Linformer: Self-attention with linear complexity. arXiv preprint arXiv:2006.04768 (2020)."},{"key":"e_1_3_2_2_41_1","unstructured":"Jason Weston Sumit Chopra and Antoine Bordes. 2014. Memory Networks. In arXiv preprint arXiv:1410.3916. Jason Weston Sumit Chopra and Antoine Bordes. 2014. Memory Networks. In arXiv preprint arXiv:1410.3916."},{"key":"e_1_3_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2016.7498326"},{"key":"e_1_3_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.3301346"},{"key":"e_1_3_2_2_44_1","volume-title":"International conference on machine learning. PMLR, 2397--2406","author":"Xiong Caiming","year":"2016","unstructured":"Caiming Xiong , Stephen Merity , and Richard Socher . 2016 . Dynamic memory networks for visual and textual question answering . In International conference on machine learning. PMLR, 2397--2406 . Caiming Xiong, Stephen Merity, and Richard Socher. 2016. Dynamic memory networks for visual and textual question answering. In International conference on machine learning. PMLR, 2397--2406."},{"key":"e_1_3_2_2_45_1","volume-title":"International conference on machine learning. PMLR","author":"Xu Kelvin","year":"2015","unstructured":"Kelvin Xu , Jimmy Ba , Ryan Kiros , Kyunghyun Cho , Aaron Courville , Ruslan Salakhudinov , Rich Zemel , and Yoshua Bengio . 2015 . Show, attend and tell: Neural image caption generation with visual attention . In International conference on machine learning. PMLR , 2048--2057. Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhudinov, Rich Zemel, and Yoshua Bengio. 2015. Show, attend and tell: Neural image caption generation with visual attention. In International conference on machine learning. 
PMLR, 2048--2057."},{"key":"e_1_3_2_2_46_1","doi-asserted-by":"crossref","unstructured":"Zeping Yu Jianxun Lian Ahmad Mahmoody Gongshen Liu and Xing Xie. 2019. Adaptive User Modeling with Long and Short-Term Preferences for Personalized Recommendation. In IJCAI. 4213--4219. Zeping Yu Jianxun Lian Ahmad Mahmoody Gongshen Liu and Xing Xie. 2019. Adaptive User Modeling with Long and Short-Term Preferences for Personalized Recommendation. In IJCAI. 4213--4219.","DOI":"10.24963\/ijcai.2019\/585"},{"key":"e_1_3_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3289600.3290975"},{"key":"e_1_3_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11618"},{"key":"e_1_3_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33015941"},{"key":"e_1_3_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219823"},{"key":"e_1_3_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i12.17325"}],"event":{"name":"CIKM '22: The 31st ACM International Conference on Information and Knowledge Management","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGIR ACM Special Interest Group on Information Retrieval"],"location":"Atlanta GA USA","acronym":"CIKM '22"},"container-title":["Proceedings of the 31st ACM International Conference on Information & Knowledge 
Management"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3511808.3557095","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3511808.3557095","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:30:56Z","timestamp":1750188656000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3511808.3557095"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,17]]},"references-count":51,"alternative-id":["10.1145\/3511808.3557095","10.1145\/3511808"],"URL":"https:\/\/doi.org\/10.1145\/3511808.3557095","relation":{},"subject":[],"published":{"date-parts":[[2022,10,17]]},"assertion":[{"value":"2022-10-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}