{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T13:59:41Z","timestamp":1760709581299},"reference-count":72,"publisher":"Cambridge University Press (CUP)","license":[{"start":{"date-parts":[[2018,4,10]],"date-time":"2018-04-10T00:00:00Z","timestamp":1523318400000},"content-version":"unspecified","delay-in-days":99,"URL":"https:\/\/www.cambridge.org\/core\/terms"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Funct. Prog."],"published-print":{"date-parts":[[2018]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Database query engines use pull-based or push-based approaches to avoid the materialization of data across query operators. In this paper, we study these two types of query engines in depth and present the limitations and advantages of each engine. Similarly, the programming languages community has developed loop fusion techniques to remove intermediate collections in the context of collection programming. We draw parallels between databases (DB) and programming language (PL) research by demonstrating the connection between pipelined query engines and loop fusion techniques. Based on this connection, we propose a new type of pull-based engine, inspired by a loop fusion technique, which combines the benefits of both approaches. Then, we experimentally evaluate the various engines, in the context of query compilation, for the first time in a fair environment, eliminating the biasing impact of ancillary optimizations that have traditionally only been used with one of the approaches. We show that for realistic analytical workloads, there is no considerable advantage for either form of pipelined query engine, as opposed to what recent research suggests. Also, by using micro-benchmarks, which demonstrate certain edge cases on which one approach or the other performs better, we show that our proposed engine dominates the existing engines by combining the benefits of both.<\/jats:p>","DOI":"10.1017\/s0956796818000102","type":"journal-article","created":{"date-parts":[[2018,4,10]],"date-time":"2018-04-10T09:50:11Z","timestamp":1523353811000},"source":"Crossref","is-referenced-by-count":23,"title":["Push versus pull-based loop fusion in query engines"],"prefix":"10.1017","volume":"28","author":[{"given":"AMIR","family":"SHAIKHHA","sequence":"first","affiliation":[]},{"given":"MOHAMMAD","family":"DASHTI","sequence":"additional","affiliation":[]},{"given":"CHRISTOPH","family":"KOCH","sequence":"additional","affiliation":[]}],"member":"56","published-online":{"date-parts":[[2018,4,10]]},"reference":[{"key":"S0956796818000102_ref25","first-page":"162","article-title":"Avalanche-safe LINQ compilation","volume":"3","author":"Grust","year":"2010","journal-title":"PVLDB"},{"key":"S0956796818000102_ref4","doi-asserted-by":"crossref","unstructured":"Armbrust M. , Xin R. S. , Lian C. , Huai Y. , Liu D. , Bradley J. K. , Meng X. , Kaftan T. , Franklin M. J. , Ghodsi A. & Zaharia M. (2015) Spark SQL: Relational data processing in spark. In Proceedings of the SIGMOD '15. New York, NY, USA: ACM.","DOI":"10.1145\/2723372.2742797"},{"key":"S0956796818000102_ref16","doi-asserted-by":"crossref","unstructured":"Emir B. , Odersky M. & Williams J. (2007) Matching objects with patterns. In Proceedings of the ECOOP'07. Berlin, Heidelberg: Springer-Verlag.","DOI":"10.1007\/978-3-540-73589-2_14"},{"key":"S0956796818000102_ref32","unstructured":"Jones S. L. P. , Hall C. , Hammond K. , Partain W. & Wadler P. (1993) The glasgow Haskell compiler: A technical overview. In Proceedings of the UK Joint Framework for Information Technology, Technical Conference, vol. 93. Citeseer."},{"key":"S0956796818000102_ref10","doi-asserted-by":"publisher","DOI":"10.1016\/j.entcs.2005.11.061"},{"key":"S0956796818000102_ref12","doi-asserted-by":"publisher","DOI":"10.1145\/320385.320386"},{"key":"S0956796818000102_ref22","doi-asserted-by":"publisher","DOI":"10.1145\/152610.152611"},{"key":"S0956796818000102_ref70","doi-asserted-by":"crossref","unstructured":"Zhou J. & Ross K. A. (2002) Implementing database operations using SIMD instructions. In Proceedings of the SIGMOD '02. New York, NY, USA: ACM.","DOI":"10.1145\/564708.564709"},{"key":"S0956796818000102_ref28","doi-asserted-by":"publisher","DOI":"10.1145\/2528412"},{"key":"S0956796818000102_ref48","first-page":"539","article-title":"Efficiently compiling efficient query plans for modern hardware","volume":"4","author":"Neumann","year":"2011","journal-title":"PVLDB"},{"key":"S0956796818000102_ref59","unstructured":"Stonebraker M. , Abadi D. J. , Batkin A. , Chen X. , Cherniack M. , Ferreira M. , Lau E. , Lin A. , Madden S. , O'Neil E. , O'Neil P. , Rasin A. , Tran N. & Zdonik S. (2005) C-store: A column-oriented DBMS. In Proceedings of the VLDB '05. VLDB Endowment."},{"key":"S0956796818000102_ref50","doi-asserted-by":"crossref","unstructured":"Paredaens J. & Gucht D. V. (1988) Possibilities and limitations of using flat operators in nested algebra expressions. In Proceedings of the Seventh ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, March 21\u201323, 1988, Austin, Texas, USA, pp. 29\u201338.","DOI":"10.1145\/308386.308402"},{"key":"S0956796818000102_ref3","first-page":"1566","article-title":"DBToaster: A SQL compiler for high-performance delta processing in main-memory databases","volume":"2","author":"Ahmad","year":"2009","journal-title":"PVLDB"},{"key":"S0956796818000102_ref24","doi-asserted-by":"crossref","unstructured":"Grust T. , Mayr M. , Rittinger J. & Schreiber T. (2009) FERRY: Database-supported program execution. In Proceedings of the SIGMOD 2009. ACM.","DOI":"10.1145\/1559845.1559982"},{"key":"S0956796818000102_ref33","doi-asserted-by":"crossref","unstructured":"Jonnalagedda M. & Stucki S. (2015) Fold-based fusion as a library: A generative programming pearl. In Proceedings of the 6th ACM SIGPLAN Symposium on Scala. ACM, pp. 41\u201350.","DOI":"10.1145\/2774975.2774981"},{"key":"S0956796818000102_ref63","unstructured":"Trinder P. (1992) Comprehensions, a query notation for DBPLs. In Proceedings of the 3rd DBPL Workshop, DBPL3. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc, pp. 55\u201368."},{"key":"S0956796818000102_ref19","doi-asserted-by":"publisher","DOI":"10.1017\/S0956796809007291"},{"key":"S0956796818000102_ref42","doi-asserted-by":"publisher","DOI":"10.1007\/s007780050071"},{"key":"S0956796818000102_ref2","doi-asserted-by":"crossref","unstructured":"Abadi D. J. , Myers D. S. , DeWitt D. J. & Madden S. R. (2007) Materialization strategies in a column-oriented DBMS. In Proceedings of the IEEE 23rd International Conference on Data Engineering, ICDE 2007. IEEE, pp. 466\u2013475.","DOI":"10.1109\/ICDE.2007.367892"},{"key":"S0956796818000102_ref43","volume-title":"XRM: An Extended (N-ary) Relational Memory","author":"Lorie","year":"1974"},{"key":"S0956796818000102_ref56","doi-asserted-by":"crossref","DOI":"10.1145\/3183653","article-title":"Building efficient query engines in a high-level language","volume":"43","author":"Shaikhha","year":"2018","journal-title":"Trans. Database Syst."},{"key":"S0956796818000102_ref11","first-page":"1313","article-title":"Efficient implementation of sorting on multi-core SIMD CPU architecture","volume":"1","author":"Chhugani","year":"2008","journal-title":"PVLDB"},{"key":"S0956796818000102_ref72","doi-asserted-by":"crossref","unstructured":"Zukowski M. , Heman S. , Nes N. , & Boncz P. (2006) Super-scalar RAM-CPU cache compression. In Proceedings of the 22nd International Conference on Data Engineering, ICDE '06. Washington, DC, USA: IEEE Computer Society, p. 59.","DOI":"10.1109\/ICDE.2006.150"},{"key":"S0956796818000102_ref23","doi-asserted-by":"publisher","DOI":"10.1023\/A:1008705026446"},{"key":"S0956796818000102_ref71","first-page":"17","article-title":"MonetDB\/X100 \u2013 A DBMS In The CPU Cache.","volume":"28","author":"Zukowski","year":"2005","journal-title":"IEEE Data Eng. Bull."},{"key":"S0956796818000102_ref21","doi-asserted-by":"publisher","DOI":"10.1109\/69.273032"},{"key":"S0956796818000102_ref36","first-page":"853","article-title":"Building efficient query engines in a high-level language","volume":"7","author":"Klonatos","year":"2014","journal-title":"PVLDB"},{"key":"S0956796818000102_ref30","doi-asserted-by":"publisher","DOI":"10.1145\/242224.242477"},{"key":"S0956796818000102_ref61","unstructured":"Tibbetts R. , Yang S. , MacNeill, R. & Rydzewski D. (2011) StreamBase LiveView: Push-based real-time analytics. In Proceedings of the StreamBase Systems (Jan 2012)."},{"key":"S0956796818000102_ref27","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-24276-2_2"},{"key":"S0956796818000102_ref5","unstructured":"Biboudis A. , Palladinos N. , Fourtounis G. & Smaragdakis Y. (2015) Streams \u00e0 la carte: Extensible pipelines with object algebras. In Proceedings of the 29th European Conference on Object-Oriented Programming, p. 591."},{"key":"S0956796818000102_ref51","doi-asserted-by":"publisher","DOI":"10.1145\/2189750.2151014"},{"key":"S0956796818000102_ref1","doi-asserted-by":"publisher","DOI":"10.1145\/1142473.1142548"},{"key":"S0956796818000102_ref6","doi-asserted-by":"publisher","DOI":"10.1145\/1559845.1559877"},{"key":"S0956796818000102_ref7","doi-asserted-by":"publisher","DOI":"10.1016\/0304-3975(85)90135-5"},{"key":"S0956796818000102_ref8","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-54233-7_125"},{"key":"S0956796818000102_ref66","first-page":"11","article-title":"Design patterns: Elements of reusable object-oriented software","volume":"49","author":"Vlissides","year":"1995","journal-title":"Reading: Addison-Wesley"},{"key":"S0956796818000102_ref9","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-56039-4_38"},{"key":"S0956796818000102_ref13","doi-asserted-by":"crossref","unstructured":"Coutts D. , Leshchinskiy R. & Stewart D. (2007) Stream fusion. From lists to streams to nothing at all. In Proceedings of the ICFP '07.","DOI":"10.1145\/1291151.1291199"},{"key":"S0956796818000102_ref14","unstructured":"Crotty A. , Galakatos A. , Dursun K. , Kraska T. , \u00c7etintemel U. & Zdonik S. B. (2015) Tupleware: \u201cBig\u201d data, big analytics, small clusters. In Proceedings of the CIDR."},{"key":"S0956796818000102_ref15","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2463710"},{"key":"S0956796818000102_ref40","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-013-0348-4"},{"key":"S0956796818000102_ref17","doi-asserted-by":"publisher","DOI":"10.1145\/377674.377676"},{"key":"S0956796818000102_ref18","doi-asserted-by":"crossref","unstructured":"Gedik B. , Andrade H. , Wu K.-L. , Yu P. & Doo M. (2008) SPADE: The system S seclarative stream processing engine. In Proceedings of the SIGMOD.","DOI":"10.1145\/1376616.1376729"},{"key":"S0956796818000102_ref26","doi-asserted-by":"publisher","DOI":"10.1561\/1900000002"},{"key":"S0956796818000102_ref29","doi-asserted-by":"crossref","unstructured":"Hofer C. & Ostermann K. (2010) Modular domain-specific language components in scala. In Proceedings of the 9th International Conference on Generative Programming and Component Engineering, GPCE '10. New York, NY, USA: ACM, pp. 83\u201392.","DOI":"10.1145\/1868294.1868307"},{"key":"S0956796818000102_ref31","first-page":"40","article-title":"MonetDB: Two decades of research in column-oriented database architectures","volume":"35","author":"Idreos","year":"2012","journal-title":"IEEE Data Eng. Bull."},{"key":"S0956796818000102_ref34","unstructured":"Karpathiotakis M. , Alagiannis I. , Heinis, T, Branco, M. & Ailamaki A. (2015) Just-in-time data virtualization: Lightweight data management with ViDa. In Proceedings of the CIDR."},{"key":"S0956796818000102_ref35","doi-asserted-by":"publisher","DOI":"10.14778\/2994509.2994516"},{"key":"S0956796818000102_ref37","first-page":"1784","article-title":"Errata for \u201cBuilding efficient query engines in a high-level language\u201d PVLDB 7(10):853-864","volume":"7","author":"Klonatos","year":"2014","journal-title":"PVLDB"},{"key":"S0956796818000102_ref38","doi-asserted-by":"crossref","unstructured":"Koch C. (2010) Incremental query evaluation in a ring of databases. In Proceedings of the PODS 2010. ACM.","DOI":"10.1145\/1807085.1807100"},{"key":"S0956796818000102_ref39","first-page":"70","article-title":"Abstraction without regret in database systems building: A manifesto","volume":"37","author":"Koch","year":"2014","journal-title":"IEEE Data Eng. Bull."},{"key":"S0956796818000102_ref41","doi-asserted-by":"crossref","unstructured":"Krikellas K. , Viglas S. & Cintra M. (2010) Generating code for holistic query evaluation. In Proceedings of the ICDE, pp. 613\u2013624.","DOI":"10.1109\/ICDE.2010.5447892"},{"key":"S0956796818000102_ref44","doi-asserted-by":"crossref","unstructured":"Mainland G. , Leshchinskiy R. & Peyton Jones S. (2013) Exploiting vector instructions with generalized stream fusion. In Proceedings of the ICFP '13. New York, NY, USA: ACM.","DOI":"10.1145\/2500365.2500601"},{"key":"S0956796818000102_ref46","doi-asserted-by":"crossref","unstructured":"Murray D. G. , Isard M. & Yu Y. (2011) Steno: Automatic optimization of declarative queries. In Proceedings of the PLDI '11. New York, NY, USA: ACM.","DOI":"10.1145\/1993498.1993513"},{"key":"S0956796818000102_ref69","unstructured":"Zaharia M. , Chowdhury M. , Das T. , Dave A. , Ma J. , McCauley M. , Franklin M. J. , Shenker S. , & Stoica I. (2012) Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In Proceedings of the NSDI'12. USENIX Association."},{"key":"S0956796818000102_ref47","first-page":"1095","article-title":"Code generation for efficient query processing in managed runtimes","volume":"7","author":"Nagel","year":"2014","journal-title":"PVLDB"},{"key":"S0956796818000102_ref49","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2001.914871"},{"key":"S0956796818000102_ref52","doi-asserted-by":"crossref","unstructured":"Peyton Jones S. , Leshchinskiy R. , Keller G. & MT Chakravarty M. . (2008) Harnessing the multicores: Nested data parallelism in Haskell. In Proceedings of the LIPIcs-Leibniz International Proceedings in Informatics, vol. 2. Schloss Dagstuhl-Leibniz-Zentrum f\u00fcr Informatik.","DOI":"10.1007\/978-3-540-89330-1_10"},{"key":"S0956796818000102_ref53","volume-title":"Types and Programming Languages","author":"Pierce","year":"2002"},{"key":"S0956796818000102_ref55","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2882917"},{"key":"S0956796818000102_ref57","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2915244"},{"key":"S0956796818000102_ref60","doi-asserted-by":"crossref","unstructured":"Svenningsson J. (2002) Shortcut fusion for accumulating parameters & zip-like Functions. In Proceedings of the ICFP '02. ACM.","DOI":"10.1145\/581478.581491"},{"key":"S0956796818000102_ref62","unstructured":"Transaction Processing Performance Council. (2017) TPC-H, a Decision Support Benchmark. http:\/\/www.tpc.org\/tpch."},{"key":"S0956796818000102_ref64","unstructured":"Veldhuizen T. L. (2014) Leapfrog triejoin: A simple, worst-case optimal join algorithm. In Proceedings of the 17th International Conference on Database Theory (ICDT), Athens, Greece, March 24\u201328, 2014."},{"key":"S0956796818000102_ref65","first-page":"12","article-title":"Processing declarative queries through generating imperative code in managed runtimes","volume":"37","author":"Viglas","year":"2014","journal-title":"IEEE Data Eng. Bull."},{"key":"S0956796818000102_ref58","doi-asserted-by":"crossref","unstructured":"Shivers O. & Might M. (2006) Continuations and transducer composition. In Proceedings of the PLDI '06. ACM.","DOI":"10.1145\/1133981.1134016"},{"key":"S0956796818000102_ref67","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-19027-9_23"},{"key":"S0956796818000102_ref68","doi-asserted-by":"publisher","DOI":"10.1145\/91556.91592"},{"key":"S0956796818000102_ref45","doi-asserted-by":"crossref","unstructured":"Meijer E. , Beckman B. & Bierman G. (2006) LINQ: Reconciling object, relations and XML in the .NET framework. In Proceedings of the SIGMOD '06. ACM.","DOI":"10.1145\/1142473.1142552"},{"key":"S0956796818000102_ref54","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2747645"},{"key":"S0956796818000102_ref20","doi-asserted-by":"crossref","unstructured":"Gill A. , Launchbury J. & Peyton Jones S. L. (1993) A short cut to deforestation. In Proceedings of the FPCA. ACM.","DOI":"10.1145\/165180.165214"}],"container-title":["Journal of Functional Programming"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S0956796818000102","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,2]],"date-time":"2023-09-02T00:44:49Z","timestamp":1693615489000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S0956796818000102\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018]]},"references-count":72,"alternative-id":["S0956796818000102"],"URL":"https:\/\/doi.org\/10.1017\/s0956796818000102","relation":{},"ISSN":["0956-7968","1469-7653"],"issn-type":[{"value":"0956-7968","type":"print"},{"value":"1469-7653","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018]]},"article-number":"e10"}}