{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,2]],"date-time":"2026-01-02T07:47:30Z","timestamp":1767340050390},"reference-count":50,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2015,8]]},"abstract":"<jats:p>We present a new approach for data analytics with iterations. Users express their analysis in Datalog with bag-monotonic aggregate operators, which enables the expression of computations from a broad variety of application domains. Queries are translated into query plans that can execute in shared-nothing engines, are incremental, and support a variety of iterative models (synchronous, asynchronous, different processing priorities) and failure-handling techniques. The plans require only small extensions to an existing shared-nothing engine, making the approach easily implementable. We implement the approach in the Myria big-data management system and use our implementation to empirically study the performance characteristics of different combinations of iterative models, failure handling methods, and applications. Our evaluation uses workloads from a variety of application domains. We find that no single method outperforms others but rather that application properties must drive the selection of the iterative query execution model.<\/jats:p>","DOI":"10.14778\/2824032.2824052","type":"journal-article","created":{"date-parts":[[2015,9,16]],"date-time":"2015-09-16T12:18:17Z","timestamp":1442405897000},"page":"1542-1553","source":"Crossref","is-referenced-by-count":44,"title":["Asynchronous and fault-tolerant recursive datalog evaluation in shared-nothing engines"],"prefix":"10.14778","volume":"8","author":[{"given":"Jingjing","family":"Wang","sequence":"first","affiliation":[{"name":"University of Washington"}]},{"given":"Magdalena","family":"Balazinska","sequence":"additional","affiliation":[{"name":"University of Washington"}]},{"given":"Daniel","family":"Halperin","sequence":"additional","affiliation":[{"name":"University of Washington"}]}],"member":"320","published-online":{"date-parts":[[2015,8]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Amazon EC2 spot instances. http:\/\/aws.amazon.com\/ec2\/purchasing-options\/spot-instances\/.  Amazon EC2 spot instances. http:\/\/aws.amazon.com\/ec2\/purchasing-options\/spot-instances\/."},{"key":"e_1_2_1_2_1","unstructured":"Apache flink. http:\/\/flink.apache.org\/.  Apache flink. http:\/\/flink.apache.org\/."},{"key":"e_1_2_1_3_1","unstructured":"Greenplum. http:\/\/pivotal.io\/big-data\/pivotal-greenplum-database.  Greenplum. http:\/\/pivotal.io\/big-data\/pivotal-greenplum-database."},{"key":"e_1_2_1_4_1","unstructured":"LogicBlox inc. http:\/\/www.logicblox.com\/.  LogicBlox inc. http:\/\/www.logicblox.com\/."},{"key":"e_1_2_1_5_1","unstructured":"Myria: Big Data as a Service. http:\/\/myria.cs.washington.edu\/.  Myria: Big Data as a Service. http:\/\/myria.cs.washington.edu\/."},{"key":"e_1_2_1_6_1","unstructured":"SDSS SkyServer DR7. http:\/\/skyserver.sdss.org\/dr7.  SDSS SkyServer DR7. http:\/\/skyserver.sdss.org\/dr7."},{"key":"e_1_2_1_7_1","volume-title":"VLDB","author":"Abouzeid A.","year":"2009","unstructured":"A. Abouzeid : An architectural hybrid of MapReduce and DBMS technologies for analytical workloads . In VLDB , 2009 . 10.14778\/1687627.1687731 A. Abouzeid et al. HadoopDB: An architectural hybrid of MapReduce and DBMS technologies for analytical workloads. In VLDB, 2009. 10.14778\/1687627.1687731"},{"key":"e_1_2_1_8_1","volume-title":"EDBT, 2011","author":"Afrati F. N.","year":"1951","unstructured":"F. N. Afrati Mapreduce extensions and recursive queries . In EDBT, 2011 . 10.1145\/ 1951 365.1951367 F. N. Afrati et al. Mapreduce extensions and recursive queries. In EDBT, 2011. 10.1145\/1951365.1951367"},{"key":"e_1_2_1_9_1","volume-title":"VLDB","author":"Alsubaiee S.","year":"2014","unstructured":"S. Alsubaiee : A scalable, open source BDMS . In VLDB , 2014 . 10.14778\/2733085.2733096 S. Alsubaiee et al. AsterixDB: A scalable, open source BDMS. In VLDB, 2014. 10.14778\/2733085.2733096"},{"key":"e_1_2_1_10_1","volume-title":"CIDR","author":"Alvaro P.","year":"2011","unstructured":"P. Alvaro Consistency analysis in Bloom: a CALM and collected approach . In CIDR , 2011 . P. Alvaro et al. Consistency analysis in Bloom: a CALM and collected approach. In CIDR, 2011."},{"key":"e_1_2_1_11_1","volume-title":"IOPADS, 1999","author":"R.","year":"1816","unstructured":"R. H. Arpaci-Dusseau et al. Cluster I\/O with river: Making the fast case common . In IOPADS, 1999 . 10.1145\/30 1816 .301823 R. H. Arpaci-Dusseau et al. Cluster I\/O with river: Making the fast case common. In IOPADS, 1999. 10.1145\/301816.301823"},{"key":"e_1_2_1_12_1","volume-title":"SIGMOD, 2000","author":"Avnur R.","year":"2009","unstructured":"R. Avnur : Continuously adaptive query processing . In SIGMOD, 2000 . 10.1145\/34 2009 .335420 R. Avnur et al. Eddies: Continuously adaptive query processing. In SIGMOD, 2000. 10.1145\/342009.335420"},{"key":"e_1_2_1_13_1","volume-title":"ICDE, 2011","author":"Borkar V.","year":"2011","unstructured":"V. Borkar : A flexible and extensible foundation for data-intensive computing . In ICDE, 2011 . 10.1109\/ICDE. 2011 .5767921 V. Borkar et al. Hyracks: A flexible and extensible foundation for data-intensive computing. In ICDE, 2011. 10.1109\/ICDE.2011.5767921"},{"key":"e_1_2_1_14_1","volume-title":"VLDB, 2010","author":"Bu Y.","year":"1920","unstructured":"Y. Bu : Efficient iterative data processing on large clusters . In VLDB, 2010 . 10.14778\/ 1920 841.1920881 Y. Bu et al. HaLoop: Efficient iterative data processing on large clusters. In VLDB, 2010. 10.14778\/1920841.1920881"},{"key":"e_1_2_1_15_1","volume-title":"VLDB","author":"Bu Y.","year":"2014","unstructured":"Y. Bu : Big(ger) graph analytics on a dataflow engine . In VLDB , 2014 . 10.14778\/2735471.2735477 Y. Bu et al. Pregelix: Big(ger) graph analytics on a dataflow engine. In VLDB, 2014. 10.14778\/2735471.2735477"},{"key":"e_1_2_1_16_1","volume-title":"SoCC","author":"Conway N.","year":"2012","unstructured":"N. Conway Logic and lattices for distributed programming . In SoCC , 2012 . 10.1145\/2391229.2391230 N. Conway et al. Logic and lattices for distributed programming. In SoCC, 2012. 10.1145\/2391229.2391230"},{"key":"e_1_2_1_17_1","volume-title":"SBBD","author":"D. E.","year":"2013","unstructured":"D. E. M. de Oliveira et al. Orbit: Efficient processing of iterations . In SBBD , 2013 . D. E. M. de Oliveira et al. Orbit: Efficient processing of iterations. In SBBD, 2013."},{"key":"e_1_2_1_18_1","volume-title":"OSDI","author":"Dean J.","year":"2004","unstructured":"J. Dean : Simplified data processing on large clusters . In OSDI , 2004 . J. Dean et al. Mapreduce: Simplified data processing on large clusters. In OSDI, 2004."},{"key":"e_1_2_1_19_1","volume-title":"HPDC, 2010","author":"Ekanayake J.","year":"1851","unstructured":"J. Ekanayake : A runtime for iterative MapReduce . In HPDC, 2010 . 10.1145\/ 1851 476.1851593 J. Ekanayake et al. Twister: A runtime for iterative MapReduce. In HPDC, 2010. 10.1145\/1851476.1851593"},{"key":"e_1_2_1_20_1","volume-title":"A density-based algorithm for discovering clusters in large spatial databases with noise","author":"Ester M.","year":"1996","unstructured":"M. Ester A density-based algorithm for discovering clusters in large spatial databases with noise . AAAI Press , 1996 . M. Ester et al. A density-based algorithm for discovering clusters in large spatial databases with noise. AAAI Press, 1996."},{"key":"e_1_2_1_21_1","volume-title":"VLDB","author":"Ewen S.","year":"2012","unstructured":"S. Ewen Spinning fast iterative data flows . In VLDB , 2012 . 10.14778\/2350229.2350245 S. Ewen et al. Spinning fast iterative data flows. In VLDB, 2012. 10.14778\/2350229.2350245"},{"key":"e_1_2_1_22_1","volume-title":"ICDE","author":"Gao J.","year":"2014","unstructured":"J. Gao : A high level graph analysis system using MapReduce . In ICDE , 2014 . J. Gao et al. GLog: A high level graph analysis system using MapReduce. In ICDE, 2014."},{"key":"e_1_2_1_23_1","volume-title":"SIGMOD","author":"Ghazal A.","year":"2012","unstructured":"A. Ghazal Adaptive optimizations of recursive queries in teradata . In SIGMOD , 2012 . 10.1145\/2213836.2213966 A. Ghazal et al. Adaptive optimizations of recursive queries in teradata. In SIGMOD, 2012. 10.1145\/2213836.2213966"},{"key":"e_1_2_1_24_1","volume-title":"OSDI","author":"Gonzalez J. E.","year":"2014","unstructured":"J. E. Gonzalez : Graph processing in a distributed dataflow framework . In OSDI , 2014 . J. E. Gonzalez et al. GraphX: Graph processing in a distributed dataflow framework. In OSDI, 2014."},{"key":"e_1_2_1_25_1","volume-title":"SIGMOD","author":"Halperin D.","year":"2014","unstructured":"D. Halperin Demo of the Myria big data management service . In SIGMOD , 2014 . 10.1145\/2588555.2594530 D. Halperin et al. Demo of the Myria big data management service. In SIGMOD, 2014. 10.1145\/2588555.2594530"},{"key":"e_1_2_1_26_1","volume-title":"ICDE, 2005","author":"Hwang J.","year":"2005","unstructured":"J. Hwang High-availability algorithms for distributed stream processing . In ICDE, 2005 . 10.1109\/ICDE. 2005 .72 J. Hwang et al. High-availability algorithms for distributed stream processing. In ICDE, 2005. 10.1109\/ICDE.2005.72"},{"key":"e_1_2_1_27_1","volume-title":"EuroSys","author":"Isard M.","year":"2007","unstructured":"M. Isard : distributed data-parallel programs from sequential building blocks . In EuroSys , 2007 . 10.1145\/1272996.1273005 M. Isard et al. Dryad: distributed data-parallel programs from sequential building blocks. In EuroSys, 2007. 10.1145\/1272996.1273005"},{"key":"e_1_2_1_28_1","volume-title":"VLDB","author":"Jiang D.","year":"2014","unstructured":"D. Jiang : An extensible and scalable system for processing big data . In VLDB , 2014 . 10.14778\/2732286.2732291 D. Jiang et al. epiC: An extensible and scalable system for processing big data. In VLDB, 2014. 10.14778\/2732286.2732291"},{"key":"e_1_2_1_29_1","volume-title":"WWW","author":"Kwak H.","year":"2010","unstructured":"H. Kwak What is Twitter, a social network or a news media ? In WWW , 2010 . 10.1145\/1772690.1772751 H. Kwak et al. What is Twitter, a social network or a news media? In WWW, 2010. 10.1145\/1772690.1772751"},{"key":"e_1_2_1_30_1","unstructured":"J. Leskovec etal Snap.py: SNAP for Python a general purpose network analysis and graph mining tool in Python. http:\/\/snap.stanford.edu\/snappy 2014.  J. Leskovec et al. Snap.py: SNAP for Python a general purpose network analysis and graph mining tool in Python. http:\/\/snap.stanford.edu\/snappy 2014."},{"key":"e_1_2_1_31_1","volume-title":"VLDB","author":"Low Y.","year":"2012","unstructured":"Y. Low : a framework for machine learning and data mining in the cloud . In VLDB , 2012 . 10.14778\/2212351.2212354 Y. Low et al. Distributed GraphLab: a framework for machine learning and data mining in the cloud. In VLDB, 2012. 10.14778\/2212351.2212354"},{"key":"e_1_2_1_32_1","unstructured":"Large Synoptic Survey Telescope. http:\/\/www.lsst.org\/.  Large Synoptic Survey Telescope. http:\/\/www.lsst.org\/."},{"key":"e_1_2_1_33_1","volume-title":"SIGMOD, 2010","author":"Malewicz G.","year":"1807","unstructured":"G. Malewicz : a system for large-scale graph processing . In SIGMOD, 2010 . 10.1145\/ 1807 167.1807184 G. Malewicz et al. Pregel: a system for large-scale graph processing. In SIGMOD, 2010. 10.1145\/1807167.1807184"},{"key":"e_1_2_1_34_1","volume-title":"Sept.","author":"Menon H.","year":"2014","unstructured":"H. Menon Adaptive Techniques for Clustered N-Body Cosmological Simulations. ArXiv e-prints , Sept. 2014 . H. Menon et al. Adaptive Techniques for Clustered N-Body Cosmological Simulations. ArXiv e-prints, Sept. 2014."},{"key":"e_1_2_1_35_1","volume-title":"VLDB","author":"Mihaylov S.","year":"2012","unstructured":"S. Mihaylov : Recursive, delta-based data-centric computation . In VLDB , 2012 . 10.14778\/2350229.2350246 S. Mihaylov et al. REX: Recursive, delta-based data-centric computation. In VLDB, 2012. 10.14778\/2350229.2350246"},{"key":"e_1_2_1_36_1","volume-title":"SOSP","author":"Murray D. G.","year":"2013","unstructured":"D. G. Murray : A timely dataflow system . In SOSP , 2013 . 10.1145\/2517349.2522738 D. G. Murray et al. Naiad: A timely dataflow system. In SOSP, 2013. 10.1145\/2517349.2522738"},{"key":"e_1_2_1_37_1","volume-title":"VLDB","author":"Onizuka M.","year":"2013","unstructured":"M. Onizuka Optimization for iterative queries on MapReduce . In VLDB , 2013 . 10.14778\/2732240.2732243 M. Onizuka et al. Optimization for iterative queries on MapReduce. In VLDB, 2013. 10.14778\/2732240.2732243"},{"key":"e_1_2_1_38_1","volume-title":"VLDB","author":"Seo J.","year":"2013","unstructured":"J. Seo : A Datalog-based language for large-scale graph analysis . In VLDB , 2013 . 10.14778\/2556549.2556572 J. Seo et al. Distributed SociaLite: A Datalog-based language for large-scale graph analysis. In VLDB, 2013. 10.14778\/2556549.2556572"},{"key":"e_1_2_1_39_1","volume-title":"ICDE","author":"Seo J.","year":"2013","unstructured":"J. Seo : Datalog extensions for efficient social network analysis . In ICDE , 2013 . J. Seo et al. SociaLite: Datalog extensions for efficient social network analysis. In ICDE, 2013."},{"key":"e_1_2_1_40_1","volume-title":"VLDB","author":"Shen Y.","year":"2014","unstructured":"Y. Shen Fast failure recovery in distributed graph processing systems . In VLDB , 2014 . 10.14778\/2735496.2735506 Y. Shen et al. Fast failure recovery in distributed graph processing systems. In VLDB, 2014. 10.14778\/2735496.2735506"},{"key":"e_1_2_1_41_1","volume-title":"VLDB","author":"Shkapsky A.","year":"2013","unstructured":"A. Shkapsky Graph queries in a next-generation datalog system . In VLDB , 2013 . 10.14778\/2536274.2536290 A. Shkapsky et al. Graph queries in a next-generation datalog system. In VLDB, 2013. 10.14778\/2536274.2536290"},{"key":"e_1_2_1_42_1","unstructured":"Sloan Digital Sky Survey. http:\/\/cas.sdss.org\/.  Sloan Digital Sky Survey. http:\/\/cas.sdss.org\/."},{"key":"e_1_2_1_43_1","unstructured":"University of Washington eScience Institute. http:\/\/escience.washington.edu\/.  University of Washington eScience Institute. http:\/\/escience.washington.edu\/."},{"key":"e_1_2_1_44_1","volume-title":"SIGMOD, 2011","author":"Upadhyaya P.","year":"1989","unstructured":"P. Upadhyaya A latency and fault-tolerance optimizer for online parallel query plans . In SIGMOD, 2011 . 10.1145\/ 1989 323.1989350 P. Upadhyaya et al. A latency and fault-tolerance optimizer for online parallel query plans. In SIGMOD, 2011. 10.1145\/1989323.1989350"},{"key":"e_1_2_1_45_1","volume-title":"FTCS","author":"Vogels W.","year":"1998","unstructured":"W. Vogels The design and architecture of the Microsoft Cluster Service - a practical approach to high-availability and scalability . In FTCS , 1998 . W. Vogels et al. The design and architecture of the Microsoft Cluster Service - a practical approach to high-availability and scalability. In FTCS, 1998."},{"key":"e_1_2_1_46_1","volume-title":"CIDR","author":"Wang G.","year":"2013","unstructured":"G. Wang Asynchronous large-scale graph processing made easy . In CIDR , 2013 . G. Wang et al. Asynchronous large-scale graph processing made easy. In CIDR, 2013."},{"key":"e_1_2_1_47_1","volume-title":"The Definitive Guide","author":"White T.","year":"2009","unstructured":"T. White . Hadoop : The Definitive Guide . 2009 . T. White. Hadoop: The Definitive Guide. 2009."},{"key":"e_1_2_1_48_1","volume-title":"PPoPP","author":"Xie C.","year":"2015","unstructured":"C. Xie SYNC or ASYNC: Time to fuse for distributed graph-parallel computation . In PPoPP , 2015 . 10.1145\/2688500.2688508 C. Xie et al. SYNC or ASYNC: Time to fuse for distributed graph-parallel computation. In PPoPP, 2015. 10.1145\/2688500.2688508"},{"key":"e_1_2_1_49_1","volume-title":"NSDI","author":"Zaharia M.","year":"2012","unstructured":"M. Zaharia Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing . In NSDI , 2012 . M. Zaharia et al. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In NSDI, 2012."},{"key":"e_1_2_1_50_1","volume-title":"SoCC, 2011","author":"Zhang Y.","year":"2038","unstructured":"Y. Zhang : a distributed framework for prioritized iterative computations . In SoCC, 2011 . 10.1145\/ 2038 916.2038929 Y. Zhang et al. PrIter: a distributed framework for prioritized iterative computations. In SoCC, 2011. 10.1145\/2038916.2038929"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/2824032.2824052","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T10:14:51Z","timestamp":1672222491000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/2824032.2824052"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,8]]},"references-count":50,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2015,8]]}},"alternative-id":["10.14778\/2824032.2824052"],"URL":"https:\/\/doi.org\/10.14778\/2824032.2824052","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2015,8]]}}}