{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,1]],"date-time":"2026-06-01T23:18:30Z","timestamp":1780355910825,"version":"3.54.1"},"publisher-location":"New York, NY, USA","reference-count":38,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,4,9]],"date-time":"2022-04-09T00:00:00Z","timestamp":1649462400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,4,9]]},"DOI":"10.1145\/3489525.3511692","type":"proceedings-article","created":{"date-parts":[[2022,3,25]],"date-time":"2022-03-25T22:11:46Z","timestamp":1648246306000},"page":"133-144","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Performance Model and Profile Guided Design of a High-Performance Session Based Recommendation Engine"],"prefix":"10.1145","author":[{"given":"Ashwin","family":"Krishnan","sequence":"first","affiliation":[{"name":"TCS Research, Mumbai, India"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Manoj","family":"Nambiar","sequence":"additional","affiliation":[{"name":"TCS Research, Mumbai, India"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Nupur","family":"Sumeet","sequence":"additional","affiliation":[{"name":"TCS Research, Mumbai, India"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Sana","family":"Iqbal","sequence":"additional","affiliation":[{"name":"TCS Research, Mumbai, India"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2022,4,9]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"2016. CIKM Cup 2016 Track 2: Personalized E-Commerce Search Challenge. https: \/\/competitions.codalab.org\/competitions\/11161  2016. CIKM Cup 2016 Track 2: Personalized E-Commerce Search Challenge. https: \/\/competitions.codalab.org\/competitions\/11161"},{"key":"e_1_3_2_1_2_1","unstructured":"2019. PYTORCH DOCUMENTATION. https:\/\/pytorch.org\/docs\/stable\/index.html  2019. PYTORCH DOCUMENTATION. https:\/\/pytorch.org\/docs\/stable\/index.html"},{"key":"e_1_3_2_1_3_1","unstructured":"2021. Alveo u280 data center accelerator card. https:\/\/www.xilinx.com\/content\/ dam\/xilinx\/support\/documentation\/data_sheets\/ds963-u280.pdf  2021. Alveo u280 data center accelerator card. https:\/\/www.xilinx.com\/content\/ dam\/xilinx\/support\/documentation\/data_sheets\/ds963-u280.pdf"},{"key":"e_1_3_2_1_4_1","unstructured":"2021. GSL - GNU Scientific Library. https:\/\/www.gnu.org\/software\/gsl\/  2021. GSL - GNU Scientific Library. https:\/\/www.gnu.org\/software\/gsl\/"},{"key":"e_1_3_2_1_5_1","unstructured":"2021. Intel-Optimized Math Library for Numerical Computing. https: \/\/www.intel.com\/content\/www\/us\/en\/develop\/documentation\/get-startedwith- mkl-for-dpcpp\/top.html  2021. Intel-Optimized Math Library for Numerical Computing. https: \/\/www.intel.com\/content\/www\/us\/en\/develop\/documentation\/get-startedwith- mkl-for-dpcpp\/top.html"},{"key":"e_1_3_2_1_6_1","unstructured":"2022. CUDA Toolkit Documentation v11.4.0. https:\/\/docs.nvidia.com\/cuda\/pdf\/ CUDA_Profiler_Users_Guide.pdf  2022. CUDA Toolkit Documentation v11.4.0. https:\/\/docs.nvidia.com\/cuda\/pdf\/ CUDA_Profiler_Users_Guide.pdf"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3281659"},{"key":"e_1_3_2_1_8_1","unstructured":"N. Corp. 2020. Neuchips recommendation accelerator recaccel. https:\/\/2ca8d951--4386--4e41--9cab-50c86da5f5a8.filesusr.com\/ugd\/d79931_ 9382d53600f54d21a6eabe46d1f0ffa2.pdf  N. Corp. 2020. Neuchips recommendation accelerator recaccel. https:\/\/2ca8d951--4386--4e41--9cab-50c86da5f5a8.filesusr.com\/ugd\/d79931_ 9382d53600f54d21a6eabe46d1f0ffa2.pdf"},{"key":"e_1_3_2_1_9_1","volume-title":"NISER: Normalized item and session representations to handle popularity bias. arXiv preprint arXiv:1909.04276","author":"Gupta Priyanka","year":"2019","unstructured":"Priyanka Gupta , Diksha Garg , Pankaj Malhotra , Lovekesh Vig , and Gautam Shroff . 2019 . NISER: Normalized item and session representations to handle popularity bias. arXiv preprint arXiv:1909.04276 (2019). Priyanka Gupta, Diksha Garg, Pankaj Malhotra, Lovekesh Vig, and Gautam Shroff. 2019. NISER: Normalized item and session representations to handle popularity bias. arXiv preprint arXiv:1909.04276 (2019)."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA45697.2020.00084"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"crossref","unstructured":"Udit Gupta Samuel Hsia Mark Wilkening Javin Pombra Hsien-Hsin S Lee Gu-Yeon Wei Carole-Jean Wu David Brooks etal 2021. RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance. arXiv preprint arXiv:2105.08820 (2021).  Udit Gupta Samuel Hsia Mark Wilkening Javin Pombra Hsien-Hsin S Lee Gu-Yeon Wei Carole-Jean Wu David Brooks et al. 2021. RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance. arXiv preprint arXiv:2105.08820 (2021).","DOI":"10.1145\/3466752.3480127"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3038912.3052569"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/963770.963772"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA45697.2020.00083"},{"key":"e_1_3_2_1_15_1","volume-title":"Proceedings of Machine Learning and Systems 3","author":"Jiang Wenqi","year":"2021","unstructured":"Wenqi Jiang , Zhenhao He , Shuai Zhang , Thomas B Preu\u00dfer , Kai Zeng , Liang Feng , Jiansong Zhang , Tongxuan Liu , Yong Li , Jingren Zhou , 2021 . MicroRec: efficient recommendation inference by hardware and data structure solutions . Proceedings of Machine Learning and Systems 3 (2021). Wenqi Jiang, Zhenhao He, Shuai Zhang, Thomas B Preu\u00dfer, Kai Zeng, Liang Feng, Jiansong Zhang, Tongxuan Liu, Yong Li, Jingren Zhou, et al. 2021. MicroRec: efficient recommendation inference by hardware and data structure solutions. Proceedings of Machine Learning and Systems 3 (2021)."},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3373087.3375887"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA45697.2020.00070"},{"key":"e_1_3_2_1_18_1","volume-title":"TRiM: Tensor Reduction in Memory","author":"Kim Byeongho","year":"2020","unstructured":"Byeongho Kim , Jaehyun Park , Eojin Lee , Minsoo Rhu , and Jung Ho Ahn . 2020. TRiM: Tensor Reduction in Memory . IEEE Computer Architecture Letters PP ( 12 2020 ), 1--1. https:\/\/doi.org\/10.1109\/LCA.2020.3042805 10.1109\/LCA.2020.3042805 Byeongho Kim, Jaehyun Park, Eojin Lee, Minsoo Rhu, and Jung Ho Ahn. 2020. TRiM: Tensor Reduction in Memory. IEEE Computer Architecture Letters PP (12 2020), 1--1. https:\/\/doi.org\/10.1109\/LCA.2020.3042805"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3352460.3358284"},{"key":"e_1_3_2_1_20_1","volume-title":"Sorting networks on FPGAs. VLDB J. 21 (02","author":"Mueller Rene","year":"2012","unstructured":"Rene Mueller , Jens Teubner , and Gustavo Alonso . 2012. Sorting networks on FPGAs. VLDB J. 21 (02 2012 ), 1--23. https:\/\/doi.org\/10.1007\/s00778-011-0232-z 10.1007\/s00778-011-0232-z Rene Mueller, Jens Teubner, and Gustavo Alonso. 2012. Sorting networks on FPGAs. VLDB J. 21 (02 2012), 1--23. https:\/\/doi.org\/10.1007\/s00778-011-0232-z"},{"key":"e_1_3_2_1_21_1","volume-title":"Jianyu Huang, Narayanan Sundaraman, Jongsoo Park, Xiaodong Wang, Udit Gupta, Carole- Jean Wu, Alisson G Azzolini, et al.","author":"Naumov Maxim","year":"2019","unstructured":"Maxim Naumov , Dheevatsa Mudigere , Hao-Jun Michael Shi , Jianyu Huang, Narayanan Sundaraman, Jongsoo Park, Xiaodong Wang, Udit Gupta, Carole- Jean Wu, Alisson G Azzolini, et al. 2019 . Deep learning recommendation model for personalization and recommendation systems. arXiv preprint arXiv:1906.00091 (2019). Maxim Naumov, Dheevatsa Mudigere, Hao-Jun Michael Shi, Jianyu Huang, Narayanan Sundaraman, Jongsoo Park, Xiaodong Wang, Udit Gupta, Carole- Jean Wu, Alisson G Azzolini, et al. 2019. Deep learning recommendation model for personalization and recommendation systems. arXiv preprint arXiv:1906.00091 (2019)."},{"key":"e_1_3_2_1_22_1","unstructured":"Zhiqiang Pan Fei Cai Wanyu Chen Honghui Chen and Maarten de Rijke. 2020. Star graph neural networks for session-based recommendation. 1195-- 1204 pages.  Zhiqiang Pan Fei Cai Wanyu Chen Honghui Chen and Maarten de Rijke. 2020. Star graph neural networks for session-based recommendation. 1195-- 1204 pages."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3358010"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772773"},{"key":"e_1_3_2_1_25_1","volume-title":"Markus Hagenbuchner, and Gabriele Monfardini.","author":"Scarselli Franco","year":"2008","unstructured":"Franco Scarselli , Marco Gori , Ah Chung Tsoi , Markus Hagenbuchner, and Gabriele Monfardini. 2008 . The graph neural network model. IEEE transactions on neural networks 20, 1 (2008), 61--80. Franco Scarselli, Marco Gori, Ah Chung Tsoi, Markus Hagenbuchner, and Gabriele Monfardini. 2008. The graph neural network model. IEEE transactions on neural networks 20, 1 (2008), 61--80."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCSE.2010.69"},{"key":"e_1_3_2_1_27_1","unstructured":"Joshua Vasquez. 2016. SORT FASTER WITH FPGAS. https:\/\/hackaday.com\/2016\/ 01\/20\/a-linear-time-sorting-algorithm-for-fpgas\/  Joshua Vasquez. 2016. SORT FASTER WITH FPGAS. https:\/\/hackaday.com\/2016\/ 01\/20\/a-linear-time-sorting-algorithm-for-fpgas\/"},{"key":"e_1_3_2_1_28_1","volume-title":"Benchmarking High Bandwidth Memory on FPGAs. arXiv preprint arXiv:2005.04324","author":"Wang Zeke","year":"2020","unstructured":"Zeke Wang , Hongjing Huang , Jie Zhang , and Gustavo Alonso . 2020. Benchmarking High Bandwidth Memory on FPGAs. arXiv preprint arXiv:2005.04324 ( 2020 ). Zeke Wang, Hongjing Huang, Jie Zhang, and Gustavo Alonso. 2020. Benchmarking High Bandwidth Memory on FPGAs. arXiv preprint arXiv:2005.04324 (2020)."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401142"},{"key":"e_1_3_2_1_30_1","volume-title":"Exploring Global Information for Session-based Recommendation. arXiv preprint arXiv:2011.10173","author":"Wang Ziyang","year":"2020","unstructured":"Ziyang Wang , Wei Wei , Gao Cong , Xiao-Li Li , Xian-Ling Mao , Minghui Qiu , and Shanshan Feng . 2020. Exploring Global Information for Session-based Recommendation. arXiv preprint arXiv:2011.10173 ( 2020 ). Ziyang Wang, Wei Wei, Gao Cong, Xiao-Li Li, Xian-Ling Mao, Minghui Qiu, and Shanshan Feng. 2020. Exploring Global Information for Session-based Recommendation. arXiv preprint arXiv:2011.10173 (2020)."},{"key":"e_1_3_2_1_31_1","volume-title":"Graph neural networks in recommender systems: a survey. arXiv preprint arXiv:2011.02260","author":"Wu Shiwen","year":"2020","unstructured":"Shiwen Wu , Fei Sun , Wentao Zhang , and Bin Cui . 2020. Graph neural networks in recommender systems: a survey. arXiv preprint arXiv:2011.02260 ( 2020 ). Shiwen Wu, Fei Sun, Wentao Zhang, and Bin Cui. 2020. Graph neural networks in recommender systems: a survey. arXiv preprint arXiv:2011.02260 (2020)."},{"key":"e_1_3_2_1_32_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"33","author":"Tang Yuyuan","year":"2019","unstructured":"ShuWu, Yuyuan Tang , Yanqiao Zhu , LiangWang, Xing Xie , and Tieniu Tan . 2019 . Session-based recommendation with graph neural networks . In Proceedings of the AAAI Conference on Artificial Intelligence , Vol. 33 . 346--353. ShuWu, Yuyuan Tang, Yanqiao Zhu, LiangWang, Xing Xie, and Tieniu Tan. 2019. Session-based recommendation with graph neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33. 346--353."},{"key":"e_1_3_2_1_33_1","unstructured":"Inc. Xilinx. 2021. AXI High Bandwidth Memory Controller v1.0 LogiCORE IP Product Guide. https:\/\/www.xilinx.com\/support\/documentation\/ip_documentation\/hbm\/ v1_0\/pg276-axi-hbm.pdf  Inc. Xilinx. 2021. AXI High Bandwidth Memory Controller v1.0 LogiCORE IP Product Guide. https:\/\/www.xilinx.com\/support\/documentation\/ip_documentation\/hbm\/ v1_0\/pg276-axi-hbm.pdf"},{"key":"e_1_3_2_1_34_1","unstructured":"Inc. Xilinx. 2021. Vitis High-Level Synthesis User Guide UG1399 (v2020.2). https:\/\/www.xilinx.com\/support\/documentation\/sw_manuals\/xilinx2020_2\/ ug1399-vitis-hls.pdf  Inc. Xilinx. 2021. Vitis High-Level Synthesis User Guide UG1399 (v2020.2). https:\/\/www.xilinx.com\/support\/documentation\/sw_manuals\/xilinx2020_2\/ ug1399-vitis-hls.pdf"},{"key":"e_1_3_2_1_35_1","volume-title":"DAGNN: Demand-aware Graph Neural Networks for Session-based Recommendation. arXiv preprint arXiv:2105.14428","author":"Yang Liqi","year":"2021","unstructured":"Liqi Yang , Linhan Luo , Lifeng Xin , Xiaofeng Zhang , and Xinni Zhang . 2021 . DAGNN: Demand-aware Graph Neural Networks for Session-based Recommendation. arXiv preprint arXiv:2105.14428 (2021). Liqi Yang, Linhan Luo, Lifeng Xin, Xiaofeng Zhang, and Xinni Zhang. 2021. DAGNN: Demand-aware Graph Neural Networks for Session-based Recommendation. arXiv preprint arXiv:2105.14428 (2021)."},{"key":"#cr-split#-e_1_3_2_1_36_1.1","doi-asserted-by":"crossref","unstructured":"Guorui Zhou Kun Gai Xiaoqiang Zhu Chenru Song Ying Fan Han Zhu Xiao Ma Yanghui Yan Junqi Jin and Han Li. 2018. Deep Interest Network for Click- Through Rate Prediction. 1059--1068. https:\/\/doi.org\/10.1145\/3219819.3219823 10.1145\/3219819.3219823","DOI":"10.1145\/3219819.3219823"},{"key":"#cr-split#-e_1_3_2_1_36_1.2","doi-asserted-by":"crossref","unstructured":"Guorui Zhou Kun Gai Xiaoqiang Zhu Chenru Song Ying Fan Han Zhu Xiao Ma Yanghui Yan Junqi Jin and Han Li. 2018. Deep Interest Network for Click- Through Rate Prediction. 1059--1068. https:\/\/doi.org\/10.1145\/3219819.3219823","DOI":"10.1145\/3219819.3219823"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33015941"}],"event":{"name":"ICPE '22: ACM\/SPEC International Conference on Performance Engineering","location":"Beijing China","acronym":"ICPE '22","sponsor":["SIGMETRICS ACM Special Interest Group on Measurement and Evaluation","SIGSOFT ACM Special Interest Group on Software Engineering"]},"container-title":["Proceedings of the 2022 ACM\/SPEC on International Conference on Performance Engineering"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3489525.3511692","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3489525.3511692","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:02:24Z","timestamp":1750186944000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3489525.3511692"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,4,9]]},"references-count":38,"alternative-id":["10.1145\/3489525.3511692","10.1145\/3489525"],"URL":"https:\/\/doi.org\/10.1145\/3489525.3511692","relation":{},"subject":[],"published":{"date-parts":[[2022,4,9]]},"assertion":[{"value":"2022-04-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}