{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:39:51Z","timestamp":1750307991669,"version":"3.41.0"},"reference-count":4,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2006,4,1]],"date-time":"2006-04-01T00:00:00Z","timestamp":1143849600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGOPS Oper. Syst. Rev."],"published-print":{"date-parts":[[2006,4]]},"abstract":"<jats:p>\n            In this paper we will present a new technology that we are currently developing within the\n            <jats:italic>SFT: Scalable Fault Tolerance<\/jats:italic>\n            FastOS project which seeks to implement fault tolerance at the operating system level. Major design goals include dynamic reallocation of resources to allow continuing execution in the presence of hardware failures, very high scalability, high efficiency (low overhead), and transparency---requiring no changes to user applications. Our technology is based on a global coordination mechanism, that enforces transparent recovery lines in the system, and TICK, a lightweight, incremental checkpointing software architecture implemented as a Linux kernel module. 
TICK is completely user-transparent and does not require any changes to user code or system libraries; it is highly responsive: an interrupt, such as a timer interrupt, can trigger a checkpoint in as little as 2.5\u03bcs; and it supports incremental and full checkpoints with minimal overhead---less than 6% with full checkpointing to disk performed as frequently as once per minute.\n          <\/jats:p>","DOI":"10.1145\/1131322.1131336","type":"journal-article","created":{"date-parts":[[2006,7,24]],"date-time":"2006-07-24T17:00:26Z","timestamp":1153760426000},"page":"55-62","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["SFT"],"prefix":"10.1145","volume":"40","author":[{"given":"Fabrizio","family":"Petrini","sequence":"first","affiliation":[{"name":"Pacific Northwest National Laboratory, Richland, WA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jarek","family":"Nieplocha","sequence":"additional","affiliation":[{"name":"Pacific Northwest National Laboratory, Richland, WA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vinod","family":"Tipparaju","sequence":"additional","affiliation":[{"name":"Pacific Northwest National Laboratory, Richland, WA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2006,4]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2005.76"},{"key":"e_1_2_1_2_1","volume-title":"Hank Alme. A General Predictive Performance Model for Wavefront Algorithms on Clusters of SMPs. In Proceedings of the 2000 International Conference on Parallel Processing (ICPP-2000)","author":"Hoisie Adolfy","year":"2000","unstructured":"Adolfy Hoisie , Olaf Lubeck , Harvey Wasserman , Fabrizio Petrini , and Hank Alme. A General Predictive Performance Model for Wavefront Algorithms on Clusters of SMPs. 
In Proceedings of the 2000 International Conference on Parallel Processing (ICPP-2000), Toronto, Canada, August 21-24, 2000. Available from http:\/\/hpc.pnl.gov\/people\/fabrizio\/papers\/icpp00.pdf."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/582034.582071"},{"key":"e_1_2_1_4_1","volume-title":"Beniamino di Martino, Jack Dongarra, Adolfy Hoisie, Laurence T","author":"Peinador Juan Fern\u00e1ndez","year":"2005","unstructured":"Juan Fern\u00e1ndez Peinador, Fabrizio Petrini, and Eitan Frachtenberg. Achieving Predictable and Scalable Performance with BCS-MPI. In Beniamino di Martino, Jack Dongarra, Adolfy Hoisie, Laurence T. Yang, and Hans Zima, editors, Engineering the Grid: Status and Perspective. Nova Science, 2005. 
Available from http:\/\/hpc.pnl.gov\/people\/fabrizio\/papers\/bcs_book.pdf."}],"container-title":["ACM SIGOPS Operating Systems Review"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1131322.1131336","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1131322.1131336","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T15:06:16Z","timestamp":1750259176000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1131322.1131336"}},"subtitle":["scalable fault tolerance"],"short-title":[],"issued":{"date-parts":[[2006,4]]},"references-count":4,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2006,4]]}},"alternative-id":["10.1145\/1131322.1131336"],"URL":"https:\/\/doi.org\/10.1145\/1131322.1131336","relation":{},"ISSN":["0163-5980"],"issn-type":[{"type":"print","value":"0163-5980"}],"subject":[],"published":{"date-parts":[[2006,4]]},"assertion":[{"value":"2006-04-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}