{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,16]],"date-time":"2026-01-16T03:05:42Z","timestamp":1768532742569,"version":"3.49.0"},"publisher-location":"New York, NY, USA","reference-count":60,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,6,26]],"date-time":"2019-06-26T00:00:00Z","timestamp":1561507200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,6,26]]},"DOI":"10.1145\/3330345.3330361","type":"proceedings-article","created":{"date-parts":[[2019,6,18]],"date-time":"2019-06-18T12:14:30Z","timestamp":1560860070000},"page":"171-183","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":11,"title":["GPU snapshot"],"prefix":"10.1145","author":[{"given":"Kyushick","family":"Lee","sequence":"first","affiliation":[{"name":"University of Texas at Austin"}]},{"given":"Michael B.","family":"Sullivan","sequence":"additional","affiliation":[{"name":"NVIDIA"}]},{"given":"Siva Kumar Sastry","family":"Hari","sequence":"additional","affiliation":[{"name":"NVIDIA"}]},{"given":"Timothy","family":"Tsai","sequence":"additional","affiliation":[{"name":"NVIDIA"}]},{"given":"Stephen W.","family":"Keckler","sequence":"additional","affiliation":[{"name":"NVIDIA"}]},{"given":"Mattan","family":"Erez","sequence":"additional","affiliation":[{"name":"University of Texas at 
Austin"}]}],"member":"320","published-online":{"date-parts":[[2019,6,26]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3126908.3126918"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1006209.1006248"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3126908.3126918"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDCS.2008.19"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/32.666828"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/2600212.2600224"},{"key":"e_1_3_2_1_7_1","volume-title":"CUG2016 Proceedings","author":"Bhimji Wahid","year":"2016","unstructured":"Wahid Bhimji , Debbie Bard , Melissa Romanus , David Paul , Andrey Ovsyannikov , Brian Friesen , Matt Bryson , Joaquin Correa , Glenn K Lockwood , Vakho Tsulaia , 2016 . Accelerating science with the NERSC burst buffer early user program . CUG2016 Proceedings (2016). Wahid Bhimji, Debbie Bard, Melissa Romanus, David Paul, Andrey Ovsyannikov, Brian Friesen, Matt Bryson, Joaquin Correa, Glenn K Lockwood, Vakho Tsulaia, et al. 2016. Accelerating science with the NERSC burst buffer early user program. CUG2016 Proceedings (2016)."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2132876.2132887"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.5555\/1251203.1251223"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1188455.1188587"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/DSN.2014.62"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2010.5470427"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/568522.568525"},{"key":"e_1_3_2_1_14_1","unstructured":"Fernanda Foertter. 2017. Preparing GPU-Accelerated Applications for the Summit Supercomputer. 
http:\/\/on-demand.gputechconf.com\/gtc\/2017\/presentation\/s7642-fernanda-foertter-preparing-gpu-accelerated-app.pdf. http:\/\/on-demand.gputechconf.com\/gtc\/2017\/presentation\/s7642-fernanda-foertter-preparing-gpu-accelerated-app.pdf GPU Technology Conference (GTC).  Fernanda Foertter. 2017. Preparing GPU-Accelerated Applications for the Summit Supercomputer. http:\/\/on-demand.gputechconf.com\/gtc\/2017\/presentation\/s7642-fernanda-foertter-preparing-gpu-accelerated-app.pdf. http:\/\/on-demand.gputechconf.com\/gtc\/2017\/presentation\/s7642-fernanda-foertter-preparing-gpu-accelerated-app.pdf GPU Technology Conference (GTC)."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/CLUSTER.2018.00047"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2005.76"},{"key":"e_1_3_2_1_17_1","volume-title":"Unified Memory for CUDA Beginners. https:\/\/devblogs.nvidia.com\/unified-memory-cuda-beginners\/. {Online","author":"Harris Mark","year":"2018","unstructured":"Mark Harris . 2016. Unified Memory for CUDA Beginners. https:\/\/devblogs.nvidia.com\/unified-memory-cuda-beginners\/. {Online ; accessed 18- Jan- 2018 }. Mark Harris. 2016. Unified Memory for CUDA Beginners. https:\/\/devblogs.nvidia.com\/unified-memory-cuda-beginners\/. {Online; accessed 18-Jan-2018}."},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1066677.1067026"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080238"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1618525.1618528"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.5555\/1894122.1894151"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-29740-3_34"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342015570921"},{"key":"e_1_3_2_1_24_1","unstructured":"Intel. 2015. Ushering in a New Era. 
https:\/\/www.intel.com\/content\/dam\/www\/public\/us\/en\/documents\/presentation\/intel-argonne-aurora-announcement-presentation.pdf. https:\/\/www.intel.com\/content\/dam\/www\/public\/us\/en\/documents\/presentation\/intel-argonne-aurora-announcement-presentation.pdf  Intel. 2015. Ushering in a New Era. https:\/\/www.intel.com\/content\/dam\/www\/public\/us\/en\/documents\/presentation\/intel-argonne-aurora-announcement-presentation.pdf. https:\/\/www.intel.com\/content\/dam\/www\/public\/us\/en\/documents\/presentation\/intel-argonne-aurora-announcement-presentation.pdf"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2247684.2247698"},{"key":"e_1_3_2_1_26_1","volume-title":"Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI).","author":"Kim Sangman","year":"2014","unstructured":"Sangman Kim , Seonggu Huh , Xinya Zhang , Yige Hu , Amir Wated , Emmett Witchel , and Mark Silberstein . 2014 . GPUnet: Networking Abstractions for GPU Programs . In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI). Sangman Kim, Seonggu Huh, Xinya Zhang, Yige Hu, Amir Wated, Emmett Witchel, and Mark Silberstein. 2014. GPUnet: Networking Abstractions for GPU Programs. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI)."},{"key":"e_1_3_2_1_27_1","volume-title":"Proceedings of Computer-Aided Design (ICCAD)","author":"Li Sheng","unstructured":"Sheng Li , Ke Chen , Jung Ho Ahn , Jay B. Brockman , and Norman P. Jouppi . 2011. CACTI-P: Architecture-level Modeling for SRAM-based Structures with Advanced Leakage Reduction Techniques . In Proceedings of Computer-Aided Design (ICCAD) . Piscataway, NJ, USA. Sheng Li, Ke Chen, Jung Ho Ahn, Jay B. Brockman, and Norman P. Jouppi. 2011. CACTI-P: Architecture-level Modeling for SRAM-based Structures with Advanced Leakage Reduction Techniques. In Proceedings of Computer-Aided Design (ICCAD). 
Piscataway, NJ, USA."},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSST.2012.6232369"},{"key":"e_1_3_2_1_29_1","volume-title":"Linux Symposium","volume":"120","author":"Mehnert-Spahn John","year":"2009","unstructured":"John Mehnert-Spahn , Eugen Feller , and Michael Schoettner . 2009 . Incremental checkpointing for grids . In Linux Symposium , Vol. 120 . John Mehnert-Spahn, Eugen Feller, and Michael Schoettner. 2009. Incremental checkpointing for grids. In Linux Symposium, Vol. 120."},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.5555\/2337159.2337168"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2010.18"},{"key":"e_1_3_2_1_32_1","unstructured":"NERSC. 2013. Edison Storage and IO. http:\/\/www.nersc.gov\/users\/computational-systems\/edison\/file-storage-and-i-o\/.  NERSC. 2013. Edison Storage and IO. http:\/\/www.nersc.gov\/users\/computational-systems\/edison\/file-storage-and-i-o\/."},{"key":"e_1_3_2_1_33_1","unstructured":"NERSC. 2017. Cori Storage and IO. http:\/\/www.nersc.gov\/users\/computational-systems\/cori\/file-storage-and-i-o\/.  NERSC. 2017. Cori Storage and IO. http:\/\/www.nersc.gov\/users\/computational-systems\/cori\/file-storage-and-i-o\/."},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/CLUSTER.2012.82"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2011.131"},{"key":"e_1_3_2_1_36_1","unstructured":"NVIDIA. 2017. CUDA C Programming Guide Appendix K: Unified Memory Programming. http:\/\/docs.nvidia.com\/cuda\/pdf\/CUDA_C_Programming_Guide.pdf. 267--286 pages. PG-02829-001_v9.1 {Online; accessed 17-Jan-2018}.  NVIDIA. 2017. CUDA C Programming Guide Appendix K: Unified Memory Programming. http:\/\/docs.nvidia.com\/cuda\/pdf\/CUDA_C_Programming_Guide.pdf. 267--286 pages. PG-02829-001_v9.1 {Online; accessed 17-Jan-2018}."},{"key":"e_1_3_2_1_37_1","unstructured":"NVIDIA. 2018. 
NVIDIA DGX-2: The world's most powerful AI system for the most complex AI challenges. https:\/\/www.nvidia.com\/en-us\/data-center\/dgx-2\/.  NVIDIA. 2018. NVIDIA DGX-2: The world's most powerful AI system for the most complex AI challenges. https:\/\/www.nvidia.com\/en-us\/data-center\/dgx-2\/."},{"key":"e_1_3_2_1_38_1","unstructured":"NVIDIA. 2019. The NVIDIA profiling tool (nvprof). http:\/\/docs.nvidia.com\/cuda\/profiler-users-guide\/.  NVIDIA. 2019. The NVIDIA profiling tool (nvprof). http:\/\/docs.nvidia.com\/cuda\/profiler-users-guide\/."},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.5555\/1267411.1267429"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.5555\/545215.545228"},{"key":"e_1_3_2_1_41_1","unstructured":"Roy Kim. 2016. NVIDIA DGX SATURNV Ranked World's Most Efficient Supercomputer by Wide Margin. https:\/\/blogs.nvidia.com\/blog\/2016\/11\/14\/dgx-saturnv\/. https:\/\/blogs.nvidia.com\/blog\/2016\/11\/14\/dgx-saturnv\/ NVIDIA Blog.  Roy Kim. 2016. NVIDIA DGX SATURNV Ranked World's Most Efficient Supercomputer by Wide Margin. https:\/\/blogs.nvidia.com\/blog\/2016\/11\/14\/dgx-saturnv\/. https:\/\/blogs.nvidia.com\/blog\/2016\/11\/14\/dgx-saturnv\/ NVIDIA Blog."},{"key":"e_1_3_2_1_42_1","unstructured":"Samsung. 2018. Samsung PM1725a NVMe SSD. https:\/\/www.samsung.com\/semiconductor\/global.semi.static\/Samsung_PM1725a_NVMe_SSD-0.pdf.  Samsung. 2018. Samsung PM1725a NVMe SSD. https:\/\/www.samsung.com\/semiconductor\/global.semi.static\/Samsung_PM1725a_NVMe_SSD-0.pdf."},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2015.67"},{"key":"e_1_3_2_1_44_1","volume-title":"Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC).","author":"Sato K.","unstructured":"K. Sato , N. Maruyama , K. Mohror , A. Moody , T. Gamblin , B. R. de Supinski , and S. Matsuoka . 2012. Design and modeling of a non-blocking checkpointing system . 
In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC). K. Sato, N. Maruyama, K. Mohror, A. Moody, T. Gamblin, B. R. de Supinski, and S. Matsuoka. 2012. Design and modeling of a non-blocking checkpointing system. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC)."},{"key":"e_1_3_2_1_45_1","unstructured":"Akira Nukada Shinichi Miura Akihiro Nomura Hitoshi Sato Hideyuki Jitsumoto Aleksandr Drozd Satoshi Matsuoka Toshio Endo. 2017. Overview of TSUBAME3.0 Green Cloud Supercomputer for Convergence of HPC AI and Big-Data. https:\/\/www.titech.ac.jp\/news\/pdf\/news_17675_2.pdf.  Akira Nukada Shinichi Miura Akihiro Nomura Hitoshi Sato Hideyuki Jitsumoto Aleksandr Drozd Satoshi Matsuoka Toshio Endo. 2017. Overview of TSUBAME3.0 Green Cloud Supercomputer for Convergence of HPC AI and Big-Data. https:\/\/www.titech.ac.jp\/news\/pdf\/news_17675_2.pdf."},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/2451116.2451169"},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.5555\/545215.545229"},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/PDCAT.2009.78"},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123939.3123950"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2015.7056044"},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.5555\/829517.830674"},{"key":"e_1_3_2_1_52_1","volume-title":"Linux Symposium. 69","author":"Vasavada Manav","year":"2011","unstructured":"Manav Vasavada , Frank Mueller , Paul H Hargrove , and Eric Roman . 2011 . Comparing different approaches for incremental checkpointing: The showdown . In Linux Symposium. 69 . Manav Vasavada, Frank Mueller, Paul H Hargrove, and Eric Roman. 2011. Comparing different approaches for incremental checkpointing: The showdown. In Linux Symposium. 
69."},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.5555\/3291656.3291726"},{"key":"e_1_3_2_1_54_1","volume-title":"Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC). IEEE Press, 43","author":"Wang Chao","year":"2008","unstructured":"Chao Wang , Frank Mueller , Christian Engelmann , and Stephen L Scott . 2008 . Proactive process-level live migration in HPC environments . In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC). IEEE Press, 43 . Chao Wang, Frank Mueller, Christian Engelmann, and Stephen L Scott. 2008. Proactive process-level live migration in HPC environments. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC). IEEE Press, 43."},{"key":"e_1_3_2_1_55_1","volume-title":"Proceedings of the International Symposium on Parallel and Distributed Processing (IPDPS).","author":"Wang Chao","year":"2011","unstructured":"Chao Wang , Frank Mueller , Christian Engelmann , and Stephen L Scott . 2011 . Hybrid full\/incremental checkpoint\/restart for MPI jobs in HPC environments . In Proceedings of the International Symposium on Parallel and Distributed Processing (IPDPS). Chao Wang, Frank Mueller, Christian Engelmann, and Stephen L Scott. 2011. Hybrid full\/incremental checkpoint\/restart for MPI jobs in HPC environments. In Proceedings of the International Symposium on Parallel and Distributed Processing (IPDPS)."},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/1141277.1141620"},{"key":"e_1_3_2_1_57_1","volume-title":"IEEE\/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid). IEEE, 120--127","author":"Zhang Chenggang","year":"2013","unstructured":"Chenggang Zhang , Guodong Han , and Cho-Li Wang . 2013 . GPU-TLS: An efficient runtime for speculative loop parallelization on gpus . 
In IEEE\/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid). IEEE, 120--127 . Chenggang Zhang, Guodong Han, and Cho-Li Wang. 2013. GPU-TLS: An efficient runtime for speculative loop parallelization on gpus. In IEEE\/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid). IEEE, 120--127."},{"key":"e_1_3_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/DSNW.2012.6264677"},{"key":"e_1_3_2_1_59_1","volume-title":"Proceedings of the International Symposium on High Performance Computer Architecture (HPCA). 345--357","author":"Zheng T.","unstructured":"T. Zheng , D. Nellans , A. Zulfiqar , M. Stephenson , and S. W. Keckler . 2016. Towards high performance paged memory for GPUs . In Proceedings of the International Symposium on High Performance Computer Architecture (HPCA). 345--357 . T. Zheng, D. Nellans, A. Zulfiqar, M. Stephenson, and S. W. Keckler. 2016. Towards high performance paged memory for GPUs. In Proceedings of the International Symposium on High Performance Computer Architecture (HPCA). 345--357."},{"key":"e_1_3_2_1_60_1","unstructured":"Chris Zimmer. 2018. Summit Burst Buffer. https:\/\/www.olcf.ornl.gov\/wp-content\/uploads\/2018\/05\/Intro_Summit_Burst-Buffer-Webinar.pdf.  Chris Zimmer. 2018. Summit Burst Buffer. 
https:\/\/www.olcf.ornl.gov\/wp-content\/uploads\/2018\/05\/Intro_Summit_Burst-Buffer-Webinar.pdf."}],"event":{"name":"ICS '19: 2019 International Conference on Supercomputing","location":"Phoenix Arizona","acronym":"ICS '19","sponsor":["SIGARCH ACM Special Interest Group on Computer Architecture"]},"container-title":["Proceedings of the ACM International Conference on Supercomputing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3330345.3330361","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3330345.3330361","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:54:05Z","timestamp":1750204445000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3330345.3330361"}},"subtitle":["checkpoint offloading for GPU-dense systems"],"short-title":[],"issued":{"date-parts":[[2019,6,26]]},"references-count":60,"alternative-id":["10.1145\/3330345.3330361","10.1145\/3330345"],"URL":"https:\/\/doi.org\/10.1145\/3330345.3330361","relation":{},"subject":[],"published":{"date-parts":[[2019,6,26]]},"assertion":[{"value":"2019-06-26","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}