{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,28]],"date-time":"2025-11-28T17:21:22Z","timestamp":1764350482229,"version":"3.41.0"},"reference-count":57,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2019,5,29]],"date-time":"2019-05-29T00:00:00Z","timestamp":1559088000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["CNS-171748, 1829142, and 1914717"],"award-info":[{"award-number":["CNS-171748, 1829142, and 1914717"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2019,6,30]]},"abstract":"<jats:p>Future main memory will likely include Non-Volatile Memory. Non-Volatile Main Memory (NVMM) provides an opportunity to rethink checkpointing strategies for providing failure safety to applications. While there are many checkpointing and logging schemes in the literature, their use must be revisited as they incur high execution time overheads as well as a large number of additional writes to NVMM, which may significantly impact write endurance.<\/jats:p>\n          <jats:p>In this article, we propose a novel recompute-based failure safety approach and demonstrate its applicability to loop-based code. Rather than keeping a fully consistent logging state, we only log enough state to enable recomputation. Upon a failure, our approach recovers to a consistent state by determining which parts of the computation were not completed and recomputing them. 
Effectively, our approach removes the need to keep checkpoints or logs, thus reducing execution time overheads and improving NVMM write endurance at the expense of more complex recovery. We compare our new approach against logging and checkpointing on five scientific workloads, including tiled matrix multiplication, on a computer system model that was built on gem5 and supports Intel PMEM instruction extensions. For tiled matrix multiplication, our recompute approach incurs an execution time overhead of only 5%, in contrast to 8% overhead with logging and 207% overhead with checkpointing. Furthermore, recompute only adds 7% additional NVMM writes, compared to 111% with logging and 330% with checkpointing. We also conduct experiments on real hardware, allowing us to run our workloads to completion while varying the number of threads used for computation. These experiments substantiate our simulation-based observations and provide a sensitivity study and performance comparison between the Recompute Scheme and Naive Checkpointing.<\/jats:p>","DOI":"10.1145\/3323091","type":"journal-article","created":{"date-parts":[[2019,5,30]],"date-time":"2019-05-30T12:41:00Z","timestamp":1559220060000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":11,"title":["Efficient Checkpointing with Recompute Scheme for Non-volatile Main Memory"],"prefix":"10.1145","volume":"16","author":[{"given":"Mohammad","family":"Alshboul","sequence":"first","affiliation":[{"name":"North Carolina State University, USA"}]},{"given":"Hussein","family":"Elnawawy","sequence":"additional","affiliation":[{"name":"North Carolina State University, USA"}]},{"given":"Reem","family":"Elkhouly","sequence":"additional","affiliation":[{"name":"Tanta University, Egypt and Waseda University, Japan"}]},{"given":"Keiji","family":"Kimura","sequence":"additional","affiliation":[{"name":"Waseda University, 
Japan"}]},{"given":"James","family":"Tuck","sequence":"additional","affiliation":[{"name":"North Carolina State University, USA"}]},{"given":"Yan","family":"Solihin","sequence":"additional","affiliation":[{"name":"University of Central Florida, USA"}]}],"member":"320","published-online":{"date-parts":[[2019,5,29]]},"reference":[{"unstructured":"2016. Ruby Memory System. Retrieved from http:\/\/gem5.org\/Ruby.  2016. Ruby Memory System. Retrieved from http:\/\/gem5.org\/Ruby.","key":"e_1_2_1_1_1"},{"unstructured":"Song Ho Ahn. 2005. Convolution. Retrieved from http:\/\/www.songho.ca\/dsp\/convolution\/convolution.html.  Song Ho Ahn. 2005. Convolution. Retrieved from http:\/\/www.songho.ca\/dsp\/convolution\/convolution.html.","key":"e_1_2_1_2_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_3_1","DOI":"10.1109\/JPROC.2010.2070830"},{"doi-asserted-by":"publisher","key":"e_1_2_1_4_1","DOI":"10.1109\/ISCA.2018.00044"},{"doi-asserted-by":"publisher","key":"e_1_2_1_5_1","DOI":"10.1145\/2925426.2926284"},{"doi-asserted-by":"publisher","key":"e_1_2_1_6_1","DOI":"10.1109\/ISPASS.2015.7095793"},{"doi-asserted-by":"publisher","key":"e_1_2_1_7_1","DOI":"10.1145\/2872362.2872377"},{"doi-asserted-by":"publisher","key":"e_1_2_1_8_1","DOI":"10.1145\/3079856.3080230"},{"doi-asserted-by":"publisher","key":"e_1_2_1_9_1","DOI":"10.1109\/VLSIC.2004.1346644"},{"doi-asserted-by":"publisher","key":"e_1_2_1_10_1","DOI":"10.1145\/143365.143523"},{"doi-asserted-by":"publisher","key":"e_1_2_1_11_1","DOI":"10.1145\/2024716.2024718"},{"doi-asserted-by":"publisher","key":"e_1_2_1_12_1","DOI":"10.1145\/1024393.1024421"},{"doi-asserted-by":"publisher","key":"e_1_2_1_13_1","DOI":"10.1145\/2660193.2660224"},{"doi-asserted-by":"publisher","key":"e_1_2_1_14_1","DOI":"10.14778\/2735479.2735483"},{"doi-asserted-by":"publisher","key":"e_1_2_1_15_1","DOI":"10.1145\/1950365.1950380"},{"doi-asserted-by":"publisher","key":"e_1_2_1_16_1","DOI":"10.1145\/1629575.1629589"},{"unstructured":"Intel Corp. 2016. 
Intel 64 and IA-32 Architectures Developer\u2019s Manual: Vol. 3A.  Intel Corp. 2016. Intel 64 and IA-32 Architectures Developer\u2019s Manual: Vol. 3A.","key":"e_1_2_1_17_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_18_1","DOI":"10.1145\/2155620.2155637"},{"doi-asserted-by":"publisher","key":"e_1_2_1_19_1","DOI":"10.1145\/2155620.2155637"},{"doi-asserted-by":"publisher","key":"e_1_2_1_20_1","DOI":"10.1109\/CGO.2013.6495002"},{"doi-asserted-by":"publisher","key":"e_1_2_1_21_1","DOI":"10.1145\/2254064.2254120"},{"doi-asserted-by":"publisher","key":"e_1_2_1_22_1","DOI":"10.1145\/1654059.1654117"},{"volume-title":"2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT). 318--329","author":"Elnawawy H.","unstructured":"H. Elnawawy , M. Alshboul , J. Tuck , and Y. Solihin . 2017. Efficient checkpointing of loop-based codes for non-volatile main memory . In 2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT). 318--329 . H. Elnawawy, M. Alshboul, J. Tuck, and Y. Solihin. 2017. Efficient checkpointing of loop-based codes for non-volatile main memory. In 2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT). 318--329.","key":"e_1_2_1_23_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_24_1","DOI":"10.1145\/3192366.3192367"},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the 12th European Conference on Computer Systems (EuroSys\u201917)","author":"Ching-Hsiang Hsu Terry","year":"2017","unstructured":"Terry Ching-Hsiang Hsu , Helge Br\u00fcgner , Indrajit Roy , Kimberly Keeton , and Patrick Eugster . 2017 . NVthreads: Practical persistence for multi-threaded applications . In Proceedings of the 12th European Conference on Computer Systems (EuroSys\u201917) . Terry Ching-Hsiang Hsu, Helge Br\u00fcgner, Indrajit Roy, Kimberly Keeton, and Patrick Eugster. 2017. NVthreads: Practical persistence for multi-threaded applications. 
In Proceedings of the 12th European Conference on Computer Systems (EuroSys\u201917)."},{"doi-asserted-by":"publisher","key":"e_1_2_1_26_1","DOI":"10.1177\/1094342015570921"},{"unstructured":"Intel. 2016. Persistent Memory Programming. Retrieved from http:\/\/pmem.io.  Intel. 2016. Persistent Memory Programming. Retrieved from http:\/\/pmem.io.","key":"e_1_2_1_27_1"},{"unstructured":"Intel and Micron. 2015. Intel and Micron Produce Breakthrough Memory Technology.  Intel and Micron. 2015. Intel and Micron Produce Breakthrough Memory Technology.","key":"e_1_2_1_28_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_29_1","DOI":"10.1145\/2872362.2872410"},{"doi-asserted-by":"publisher","key":"e_1_2_1_31_1","DOI":"10.1145\/2830772.2830805"},{"volume-title":"2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).","author":"Joshi A.","unstructured":"A. Joshi , V. Nagarajan , S. Viglas , and M. Cintra . 2017. ATOM: Atomic durability in non-volatile memory through hardware logging . In 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA). A. Joshi, V. Nagarajan, S. Viglas, and M. Cintra. 2017. ATOM: Atomic durability in non-volatile memory through hardware logging. In 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).","key":"e_1_2_1_32_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_33_1","DOI":"10.1109\/IPDPS.2013.69"},{"volume-title":"Proceedings of the International Solid-State Circuits Conference (ISSCC).","author":"Kawahara T.","unstructured":"T. Kawahara , R. Takemura , K. Miura , J. Hayakawa , S. Ikeda , Y. Lee , R. Sasaki , Y. Goto , K. Ito , T. Meguro , F. Matsukura , H. Takahashi , H. Matsuoka , and H. Ohno . 2007. 2Mb spin-transfer torque RAM (SPRAM) with bit-by-bit bidirectional current write and parallelizing-direction current read . In Proceedings of the International Solid-State Circuits Conference (ISSCC). T. Kawahara, R. Takemura, K. Miura, J. Hayakawa, S. 
Ikeda, Y. Lee, R. Sasaki, Y. Goto, K. Ito, T. Meguro, F. Matsukura, H. Takahashi, H. Matsuoka, and H. Ohno. 2007. 2Mb spin-transfer torque RAM (SPRAM) with bit-by-bit bidirectional current write and parallelizing-direction current read. In Proceedings of the International Solid-State Circuits Conference (ISSCC).","key":"e_1_2_1_34_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_35_1","DOI":"10.1145\/2872362.2872392"},{"doi-asserted-by":"publisher","key":"e_1_2_1_36_1","DOI":"10.1145\/2872362.2872381"},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the International Symposium on Performance Analysis of Systems and Software (ISPASS).","author":"Kultursay Emre","year":"2013","unstructured":"Emre Kultursay , Mahmut Kandemir , Anand Sivasubramaniam , and Onur Mutlu . 2013 . Evaluating STT-RAM as an energy-efficient main memory alternative . In Proceedings of the International Symposium on Performance Analysis of Systems and Software (ISPASS). Emre Kultursay, Mahmut Kandemir, Anand Sivasubramaniam, and Onur Mutlu. 2013. Evaluating STT-RAM as an energy-efficient main memory alternative. In Proceedings of the International Symposium on Performance Analysis of Systems and Software (ISPASS)."},{"doi-asserted-by":"publisher","key":"e_1_2_1_38_1","DOI":"10.1109\/MM.2010.24"},{"doi-asserted-by":"publisher","key":"e_1_2_1_39_1","DOI":"10.1145\/3037697.3037714"},{"volume-title":"2018 51st Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO).","author":"Liu Q.","unstructured":"Q. Liu , J. Izraelevitz , S. K. Lee , M. L. Scott , S. H. Noh , and C. Jung . 2018. iDO: Compiler-directed failure atomicity for nonvolatile memory . In 2018 51st Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO). Q. Liu, J. Izraelevitz, S. K. Lee, M. L. Scott, S. H. Noh, and C. Jung. 2018. iDO: Compiler-directed failure atomicity for nonvolatile memory. 
In 2018 51st Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO).","key":"e_1_2_1_40_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_41_1","DOI":"10.1109\/MICRO.2018.00029"},{"volume-title":"SC\u201916: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis.","author":"Liu Q.","unstructured":"Q. Liu , C. Jung , D. Lee , and D. Tiwari . 2016. Compiler-directed lightweight checkpointing for fine-grained guaranteed soft error recovery . In SC\u201916: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. Q. Liu, C. Jung, D. Lee, and D. Tiwari. 2016. Compiler-directed lightweight checkpointing for fine-grained guaranteed soft error recovery. In SC\u201916: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis.","key":"e_1_2_1_42_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_43_1","DOI":"10.1109\/ICCD.2014.6974684"},{"doi-asserted-by":"publisher","key":"e_1_2_1_44_1","DOI":"10.1145\/143365.143529"},{"doi-asserted-by":"publisher","key":"e_1_2_1_45_1","DOI":"10.1145\/128765.128770"},{"doi-asserted-by":"publisher","key":"e_1_2_1_46_1","DOI":"10.1109\/SC.2010.18"},{"volume-title":"2017 International Conference on High Performance Computing Simulation (HPCS).","author":"Osawa K.","unstructured":"K. Osawa , A. Sekiya , H. Naganuma , and R. Yokota . 2017. Accelerating matrix multiplication in deep learning by using low-rank approximation . In 2017 International Conference on High Performance Computing Simulation (HPCS). K. Osawa, A. Sekiya, H. Naganuma, and R. Yokota. 2017. Accelerating matrix multiplication in deep learning by using low-rank approximation. 
In 2017 International Conference on High Performance Computing Simulation (HPCS).","key":"e_1_2_1_47_1"},{"volume-title":"Proceedings of International Symposium on Computer Architecture (ISCA).","author":"Pelley Steven","unstructured":"Steven Pelley , Peter M. Chen , and Thomas F. Wenisch . 2014. Memory persistency . In Proceedings of International Symposium on Computer Architecture (ISCA). Steven Pelley, Peter M. Chen, and Thomas F. Wenisch. 2014. Memory persistency. In Proceedings of International Symposium on Computer Architecture (ISCA).","key":"e_1_2_1_48_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_49_1","DOI":"10.1145\/2155620.2155658"},{"doi-asserted-by":"publisher","key":"e_1_2_1_50_1","DOI":"10.1145\/2600212.2600713"},{"volume-title":"Proceedings of the 2010 USENIX Conference on USENIX Annual Technical Conference (USENIXATC\u201910)","author":"Saxena Mohit","unstructured":"Mohit Saxena and Michael M. Swift . 2010. FlashVM: Virtual memory management on flash . In Proceedings of the 2010 USENIX Conference on USENIX Annual Technical Conference (USENIXATC\u201910) . Mohit Saxena and Michael M. Swift. 2010. FlashVM: Virtual memory management on flash. 
In Proceedings of the 2010 USENIX Conference on USENIX Annual Technical Conference (USENIXATC\u201910).","key":"e_1_2_1_51_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_52_1","DOI":"10.1088\/1742-6596\/78\/1\/012022"},{"doi-asserted-by":"publisher","key":"e_1_2_1_53_1","DOI":"10.1145\/3123939.3124539"},{"doi-asserted-by":"publisher","key":"e_1_2_1_54_1","DOI":"10.1145\/3079856.3080240"},{"doi-asserted-by":"publisher","key":"e_1_2_1_55_1","DOI":"10.1145\/1950365.1950379"},{"doi-asserted-by":"publisher","key":"e_1_2_1_56_1","DOI":"10.1145\/113445.113449"},{"doi-asserted-by":"publisher","key":"e_1_2_1_57_1","DOI":"10.1145\/223982.223990"},{"doi-asserted-by":"publisher","key":"e_1_2_1_58_1","DOI":"10.1145\/195473.195547"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3323091","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3323091","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:23:16Z","timestamp":1750202596000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3323091"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,5,29]]},"references-count":57,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2019,6,30]]}},"alternative-id":["10.1145\/3323091"],"URL":"https:\/\/doi.org\/10.1145\/3323091","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2019,5,29]]},"assertion":[{"value":"2018-12-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2019-02-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-05-29","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}