{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,15]],"date-time":"2025-10-15T17:32:35Z","timestamp":1760549555352},"reference-count":53,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2008,8]]},"abstract":"<jats:p>\n            We present SGuard, a new fault-tolerance technique for distributed stream processing engines (SPEs) running in clusters of commodity servers. SGuard is less disruptive to normal stream processing and leaves more resources available for normal stream processing than previous proposals. Like several previous schemes, SGuard is based on rollback recovery [18]: it checkpoints the state of stream processing nodes periodically and restarts failed nodes from their most recent checkpoints. In contrast to previous proposals, however, SGuard performs checkpoints asynchronously:\n            <jats:italic>i.e.<\/jats:italic>\n            , operators continue processing streams during the checkpoint thus reducing the potential disruption due to the checkpointing activity. Additionally, SGuard saves the checkpointed state into a new type of distributed and replicated file system (DFS) such as GFS [22] or HDFS [9], leaving more memory resources available for normal stream processing. To manage resource contention due to simultaneous checkpoints by different SPE nodes, SGuard adds a scheduler to the DFS. This scheduler coordinates large batches of write requests in a manner that reduces individual checkpoint times while maintaining good overall resource utilization. 
We demonstrate the effectiveness of the approach through measurements of a prototype implementation in the Borealis [2] open-source SPE using HDFS [9] as the DFS.\n          <\/jats:p>","DOI":"10.14778\/1453856.1453920","type":"journal-article","created":{"date-parts":[[2014,6,24]],"date-time":"2014-06-24T12:17:57Z","timestamp":1403612277000},"page":"574-585","source":"Crossref","is-referenced-by-count":47,"title":["Fault-tolerant stream processing using a distributed, replicated file system"],"prefix":"10.14778","volume":"1","author":[{"given":"YongChul","family":"Kwon","sequence":"first","affiliation":[{"name":"University of Washington, Seattle, WA"}]},{"given":"Magdalena","family":"Balazinska","sequence":"additional","affiliation":[{"name":"University of Washington, Seattle, WA"}]},{"given":"Albert","family":"Greenberg","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, WA"}]}],"member":"320","published-online":{"date-parts":[[2008,8]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-003-0095-z"},{"key":"e_1_2_1_2_1","volume-title":"Proc. of the Second CIDR Conf.","author":"Abadi","year":"2005","unstructured":"Abadi et al. The design of the Borealis stream processing engine. In Proc. of the Second CIDR Conf., Jan. 2005."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/647881.760848"},{"key":"e_1_2_1_4_1","volume-title":"Network flows: theory, algorithms, and applications","author":"Ahuja R. K.","year":"1993","unstructured":"R. K. Ahuja, T. L. Magnanti, and J. B. Orlin. Network flows: theory, algorithms, and applications. Prentice-Hall, Inc., 1993."},{"key":"e_1_2_1_5_1","unstructured":"
Aleri. http:\/\/www.aleri.com\/index.html."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1066157.1066160"},{"key":"e_1_2_1_7_1","volume-title":"Proc. of the Third CIDR Conf.","author":"Barga R.","year":"2007","unstructured":"R. Barga, J. Goldstein, M. Ali, and M. Hong. Consistent streaming through time: A vision for event stream processing. In Proc. of the Third CIDR Conf., Jan. 2007."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1006\/jpdc.1997.1338"},{"key":"e_1_2_1_9_1","unstructured":"D. Borthakur. The Hadoop distributed file system: Architecture and design. http:\/\/lucene.apache.org\/hadoop\/hdfs_design.pdf 2007."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/781498.781513"},{"key":"e_1_2_1_11_1","volume-title":"Proc. of the First CIDR Conf.","author":"Chandrasekaran","year":"2003","unstructured":"Chandrasekaran et al. TelegraphCQ: Continuous dataflow processing for an uncertain world. In Proc. of the First CIDR Conf., Jan. 2003."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/1760988.1761099"},{"key":"e_1_2_1_13_1","volume-title":"Sept.","author":"Chen","year":"2007","unstructured":"Chen et al. High availability and scalability guide for DB2 on linux, unix, and windows. IBM Redbooks http:\/\/www.redbooks.ibm.com\/redbooks\/pdfs\/sg247363.pdf, Sept.
2007."},{"key":"e_1_2_1_14_1","volume-title":"Proc. of the First CIDR Conf.","author":"Cherniack","year":"2003","unstructured":"Cherniack et al. Scalable distributed stream processing. In Proc. of the First CIDR Conf., Jan. 2003."},{"key":"e_1_2_1_15_1","unstructured":"Coral8. http:\/\/coral8.com\/."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/872757.872838"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/125223.125238"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/568522.568525"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/191843.191915"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1294261.1294281"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/69.180602"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/945445.945450"},{"key":"e_1_2_1_23_1","volume-title":"Proc. of the Usenix Mach Workshop","author":"Goldberg A.","year":"1990","unstructured":"A. Goldberg, A. Gopal, K. Li, R. Strom, and D. F. Bacon. Transparent recovery of Mach applications. In Proc. of the Usenix Mach Workshop, Oct. 1990."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/35037.35059"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2005.72"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2007.367863"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1142473.1142522"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.5555\/582629.848295"},{"key":"e_1_2_1_29_1","unstructured":"Kosmix Corp. Kosmos distributed file system (kfs).
http:\/\/kosmosfs.sourceforge.net 2007."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/125223.125244"},{"issue":"4","key":"e_1_2_1_31_1","first-page":"51","article-title":"Paxos made simple","volume":"32","author":"Lamport L.","year":"2001","unstructured":"L. Lamport. Paxos made simple. SIGACT News, 32(4):51--58, Dec. 2001.","journal-title":"SIGACT News"},{"key":"e_1_2_1_32_1","volume-title":"Proc. of the 17th ICDE Conf.","author":"Lee J.","year":"2001","unstructured":"J. Lee, K. Kim, and S. K. Cha. Differential logging: A commutative and associative logging scheme for highly parallel main memory database. In Proc. of the 17th ICDE Conf., Apr. 2001."},{"key":"e_1_2_1_33_1","first-page":"294","volume-title":"Proc. of the 12th VLDB Conf.","author":"Lehman T. J.","year":"1986","unstructured":"T. J. Lehman and M. J. Carey. A study of index structures for main memory database management systems. In Proc. of the 12th VLDB Conf., pages 294--303, Aug. 1986."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/69.180604"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.298215"},{"key":"e_1_2_1_36_1","first-page":"212","volume-title":"Proc. of the 6th VLDB Conf.","author":"Litwin W.","year":"1980","unstructured":"W. Litwin.
Linear hashing: A new tool for file and table addressing. In Proc. of the 6th VLDB Conf., pages 212--223, Oct. 1980."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/128765.128770"},{"key":"e_1_2_1_38_1","volume-title":"Proc. of the First CIDR Conf.","author":"Motwani","year":"2003","unstructured":"Motwani et al. Query processing, approximation, and resource management in a data stream management system. In Proc. of the First CIDR Conf., Jan. 2003."},{"key":"e_1_2_1_39_1","volume-title":"NFS: Network file system protocol specification","author":"Network Working Group","unstructured":"Network Working Group, Sun Microsystems, Inc. RFC 1094 - NFS: Network file system protocol specification. http:\/\/www.faqs.org\/rfcs\/rfc1094.html, 1989."},{"key":"e_1_2_1_40_1","unstructured":"Oracle Corp. Oracle berkeley db. http:\/\/www.oracle.com\/technology\/products\/berkeley-db\/db\/index.html 2008."},{"key":"e_1_2_1_42_1","first-page":"213","volume-title":"Usenix Winter Technical Conference","author":"Plank J. S.","year":"1995","unstructured":"J. S. Plank, M. Beck, G. Kingsley, and K. Li. Libckpt: Transparent checkpointing under Unix. In Usenix Winter Technical Conference, pages 213--223, Jan. 1995."},{"key":"e_1_2_1_43_1","volume-title":"Mar.","author":"Ray A.","year":"2002","unstructured":"A. Ray. Oracle data guard: Ensuring disaster recovery for the enterprise.
An Oracle white paper, Mar. 2002."},{"key":"e_1_2_1_44_1","volume-title":"Proc. of the Int. Parallel and Distributed Proc. Symp.","author":"Sancho J. C.","year":"2004","unstructured":"J. C. Sancho, F. Petrini, G. Johnson, J. Fernandez, and E. Frachtenberg. On the feasibility of incremental checkpointing for scientific computing. In Proc. of the Int. Parallel and Distributed Proc. Symp., Apr. 2004."},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/98163.98167"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/1007568.1007662"},{"key":"e_1_2_1_47_1","unstructured":"Silicon Graphics Inc. Standard template library programmer's guide. http:\/\/www.sgi.com\/tech\/stl\/."},{"key":"e_1_2_1_48_1","unstructured":"Streambase. http:\/\/www.streambase.com\/."},{"key":"e_1_2_1_49_1","volume-title":"Database mirroring in SQL Server","author":"Talmage R.","year":"2005","unstructured":"R. Talmage. Database mirroring in SQL Server 2005. http:\/\/www.microsoft.com\/technet\/prodtechnol\/sql\/2005\/dbmirror.mspx, Apr. 2005."},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1142\/S0129626403001288"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.780864"},{"key":"e_1_2_1_52_1","volume-title":"http:\/\/www.vmware.com","author":"VMware Inc. Vmware.","year":"2008","unstructured":"VMware Inc. Vmware.
http:\/\/www.vmware.com, 2008."},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1007\/11951957_21"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.5555\/615232.615238"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/1453856.1453920","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T11:13:07Z","timestamp":1672225987000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/1453856.1453920"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,8]]},"references-count":53,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2008,8]]}},"alternative-id":["10.14778\/1453856.1453920"],"URL":"https:\/\/doi.org\/10.14778\/1453856.1453920","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2008,8]]}}}