{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T14:13:31Z","timestamp":1760710411310,"version":"3.41.0"},"reference-count":64,"publisher":"Association for Computing Machinery (ACM)","issue":"OOPSLA","license":[{"start":{"date-parts":[[2020,11,13]],"date-time":"2020-11-13T00:00:00Z","timestamp":1605225600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004750","name":"National Science Foundation","doi-asserted-by":"publisher","award":["1763514"],"award-info":[{"award-number":["1763514"]}],"id":[{"id":"10.13039\/501100004750","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Program. Lang."],"published-print":{"date-parts":[[2020,11,13]]},"abstract":"<jats:p>High performance architectures for processing distributed data streams, such as Flink, Spark Streaming, and Storm, are increasingly deployed in emerging data-driven computing systems. Exploiting the parallelism afforded by such platforms, while preserving the semantics of the desired computation, is prone to errors, and motivates the development of tools for specification, testing, and verification. We focus on the problem of differential output testing for distributed stream processing systems, that is, checking whether two implementations produce equivalent output streams in response to a given input stream. The notion of equivalence allows reordering of logically independent data items, and the main technical contribution of the paper is an optimal online algorithm for checking this equivalence. Our testing framework is implemented as a library called DiffStream in Flink. We present four case studies to illustrate how our framework can be used to (1) correctly identify bugs in a set of benchmark MapReduce programs, (2) facilitate the development of difficult-to-parallelize high performance applications, and (3) monitor an application for a long period of time with minimal performance overhead.<\/jats:p>","DOI":"10.1145\/3428221","type":"journal-article","created":{"date-parts":[[2020,11,24]],"date-time":"2020-11-24T23:40:14Z","timestamp":1606261214000},"page":"1-29","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":16,"title":["DiffStream: differential output testing for stream processing programs"],"prefix":"10.1145","volume":"4","author":[{"given":"Konstantinos","family":"Kallas","sequence":"first","affiliation":[{"name":"University of Pennsylvania, USA"}]},{"given":"Filip","family":"Niksic","sequence":"additional","affiliation":[{"name":"University of Pennsylvania, USA"}]},{"given":"Caleb","family":"Stanford","sequence":"additional","affiliation":[{"name":"University of Pennsylvania, USA"}]},{"given":"Rajeev","family":"Alur","sequence":"additional","affiliation":[{"name":"University of Pennsylvania, USA"}]}],"member":"320","published-online":{"date-parts":[[2020,11,13]]},"reference":[{"doi-asserted-by":"crossref","unstructured":"Parosh Abdulla Stavros Aronis Bengt Jonsson and Konstantinos Sagonas. 2014. Optimal dynamic partial order reduction. ACM SIGPLAN Notices 49 1 ( 2014 ) 373-384.  Parosh Abdulla Stavros Aronis Bengt Jonsson and Konstantinos Sagonas. 2014. Optimal dynamic partial order reduction. ACM SIGPLAN Notices 49 1 ( 2014 ) 373-384.","key":"e_1_2_2_1_1","DOI":"10.1145\/2578855.2535845"},{"doi-asserted-by":"publisher","key":"e_1_2_2_2_1","DOI":"10.1145\/1806596.1806634"},{"doi-asserted-by":"publisher","key":"e_1_2_2_3_1","DOI":"10.1145\/2535838.2535848"},{"unstructured":"Paris Carbone Asterios Katsifodimos Stephan Ewen Volker Markl Seif Haridi and Kostas Tzoumas. 2015. Apache flink: Stream and batch processing in a single engine. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering 36 4 ( 2015 ).  Paris Carbone Asterios Katsifodimos Stephan Ewen Volker Markl Seif Haridi and Kostas Tzoumas. 2015. Apache flink: Stream and batch processing in a single engine. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering 36 4 ( 2015 ).","key":"e_1_2_2_4_1"},{"doi-asserted-by":"publisher","key":"e_1_2_2_5_1","DOI":"10.1007\/978-3-319-48989-6_8"},{"volume-title":"GOVERNOR: Smoother Stream Processing Through Smarter Backpressure. In 2017 IEEE International Conference on Autonomic Computing (ICAC). IEEE, 145-154","year":"2017","author":"Chen Xin","key":"e_1_2_2_6_1"},{"doi-asserted-by":"publisher","key":"e_1_2_2_7_1","DOI":"10.1007\/978-3-319-41540-6_6"},{"key":"e_1_2_2_8_1","first-page":"1789","article-title":"Benchmarking streaming computation engines: Storm, flink and spark streaming. In 2016 IEEE international parallel and distributed processing symposium workshops (IPDPSW)","author":"Chintapalli Sanket","year":"2016","journal-title":"IEEE"},{"doi-asserted-by":"publisher","key":"e_1_2_2_9_1","DOI":"10.1145\/1629335.1629363"},{"doi-asserted-by":"publisher","key":"e_1_2_2_10_1","DOI":"10.1145\/2025113.2025204"},{"doi-asserted-by":"publisher","key":"e_1_2_2_11_1","DOI":"10.1145\/1327452.1327492"},{"doi-asserted-by":"publisher","key":"e_1_2_2_12_1","DOI":"10.1142\/2563"},{"volume-title":"The 6th Joint Meeting on European software engineering conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering: Companion Papers. ACM, 549-552","year":"2007","author":"Evans Robert B","key":"e_1_2_2_13_1"},{"volume-title":"http:\/\/storm.apache.org\/. [Online","year":"2019","author":"Foundation Apache Software","key":"e_1_2_2_14_1"},{"doi-asserted-by":"publisher","key":"e_1_2_2_15_1","DOI":"10.1109\/SPDP.1992.242728"},{"doi-asserted-by":"publisher","key":"e_1_2_2_16_1","DOI":"10.1137\/S0097539794279614"},{"volume-title":"Partial-order methods for the verification of concurrent systems: an approach to the state-explosion problem","author":"Godefroid Patrice","doi-asserted-by":"crossref","key":"e_1_2_2_17_1","DOI":"10.1007\/3-540-60761-7"},{"doi-asserted-by":"publisher","key":"e_1_2_2_18_1","DOI":"10.1109\/ICSE.2007.68"},{"doi-asserted-by":"publisher","key":"e_1_2_2_19_1","DOI":"10.1145\/2884781.2884813"},{"doi-asserted-by":"publisher","key":"e_1_2_2_20_1","DOI":"10.1007\/s10009-003-0117-6"},{"doi-asserted-by":"publisher","key":"e_1_2_2_21_1","DOI":"10.1145\/2815400.2815428"},{"doi-asserted-by":"publisher","key":"e_1_2_2_22_1","DOI":"10.1145\/78969.78972"},{"unstructured":"Paul Holser. 2013. junit-quickcheck (software). https:\/\/github.com\/pholser\/junit-quickcheck.  Paul Holser. 2013. junit-quickcheck (software). https:\/\/github.com\/pholser\/junit-quickcheck.","key":"e_1_2_2_23_1"},{"doi-asserted-by":"publisher","key":"e_1_2_2_24_1","DOI":"10.5555\/2486788.2486869"},{"volume-title":"JUnit testing framework. https:\/\/junit.org\/junit5\/. [Online","year":"2019","key":"e_1_2_2_25_1"},{"doi-asserted-by":"publisher","key":"e_1_2_2_26_1","DOI":"10.1109\/CCGRID.2010.112"},{"doi-asserted-by":"publisher","key":"e_1_2_2_27_1","DOI":"10.1145\/2723372.2742788"},{"doi-asserted-by":"publisher","key":"e_1_2_2_28_1","DOI":"10.1109\/PROC.1987.13876"},{"doi-asserted-by":"publisher","key":"e_1_2_2_29_1","DOI":"10.1016\/j.jlap.2008.08.004"},{"doi-asserted-by":"publisher","key":"e_1_2_2_30_1","DOI":"10.1109\/ICSE.2013.6606646"},{"doi-asserted-by":"publisher","key":"e_1_2_2_31_1","DOI":"10.1145\/2670979.2670980"},{"doi-asserted-by":"crossref","unstructured":"Gavin Lowe. 2017. Testing for linearizability. Concurrency and Computation: Practice and Experience 29 4 ( 2017 ) e3928.  Gavin Lowe. 2017. Testing for linearizability. Concurrency and Computation: Practice and Experience 29 4 ( 2017 ) e3928.","key":"e_1_2_2_32_1","DOI":"10.1002\/cpe.3928"},{"doi-asserted-by":"publisher","key":"e_1_2_2_33_1","DOI":"10.1145\/3314221.3314580"},{"volume-title":"Eduardo Cunha de Almeida, and Gerson Suny\u00e9","year":"2012","author":"Marynowski Jo\u00e3o Eugenio","key":"e_1_2_2_34_1"},{"key":"e_1_2_2_35_1","first-page":"617","article-title":"Tachyon: Tandem Execution for Eficient Live Patch Testing. In Presented as part of the 21st USENIX Security Symposium (USENIX Security 12). USENIX, Bellevue","author":"Maurer Matthew","year":"2012","journal-title":"WA"},{"volume-title":"Advanced course on Petri nets","author":"Mazurkiewicz Antoni","key":"e_1_2_2_36_1"},{"key":"e_1_2_2_37_1","article-title":"Diferential testing for software","volume":"10","author":"McKeeman William M","year":"1998","journal-title":"Digital Technical Journal"},{"doi-asserted-by":"publisher","key":"e_1_2_2_38_1","DOI":"10.1002\/smr.2120"},{"key":"e_1_2_2_40_1","article-title":"What are race conditions? Some issues and formalizations","volume":"1","author":"Netzer Robert HB","year":"1992","journal-title":"ACM Letters on Programming Languages and Systems (LOPLAS)"},{"doi-asserted-by":"publisher","key":"e_1_2_2_41_1","DOI":"10.14778\/3137765.3137770"},{"doi-asserted-by":"publisher","key":"e_1_2_2_42_1","DOI":"10.1145\/1559845.1559873"},{"doi-asserted-by":"publisher","key":"e_1_2_2_43_1","DOI":"10.1145\/1989323.1989459"},{"volume-title":"Questions tagged with apache-flink on Stack Overflow. https:\/\/stackoverflow.com\/questions\/tagged\/ apache-flink. [Online","year":"2020","author":"Overflow Stack","key":"e_1_2_2_44_1"},{"doi-asserted-by":"publisher","key":"e_1_2_2_45_1","DOI":"10.1145\/3276530"},{"doi-asserted-by":"crossref","unstructured":"Oded Padon Kenneth L McMillan Aurojit Panda Mooly Sagiv and Sharon Shoham. 2016. Ivy: safety verification by interactive generalization. ACM SIGPLAN Notices 51 6 ( 2016 ) 614-630.  Oded Padon Kenneth L McMillan Aurojit Panda Mooly Sagiv and Sharon Shoham. 2016. Ivy: safety verification by interactive generalization. ACM SIGPLAN Notices 51 6 ( 2016 ) 614-630.","key":"e_1_2_2_46_1","DOI":"10.1145\/2980983.2908118"},{"doi-asserted-by":"publisher","key":"e_1_2_2_47_1","DOI":"10.1145\/2063384.2063452"},{"volume-title":"Proc. 6th Int. Conference (LNCS 818)","year":"1994","author":"Peled Doron","key":"e_1_2_2_48_1"},{"doi-asserted-by":"crossref","unstructured":"Alex Raizman Asvin Ananthanarayan Anton Kirilov Badrish Chandramouli and Mohamed H Ali. 2010. An extensible test framework for the Microsoft StreamInsight query processor.. In DBTest.  Alex Raizman Asvin Ananthanarayan Anton Kirilov Badrish Chandramouli and Mohamed H Ali. 2010. An extensible test framework for the Microsoft StreamInsight query processor.. In DBTest.","key":"e_1_2_2_49_1","DOI":"10.1145\/1838126.1838128"},{"doi-asserted-by":"publisher","key":"e_1_2_2_50_1","DOI":"10.1145\/2815400.2815418"},{"doi-asserted-by":"publisher","key":"e_1_2_2_51_1","DOI":"10.1145\/265924.265927"},{"volume-title":"Safe data parallelism for general streaming","year":"2013","author":"Schneider Scott","key":"e_1_2_2_52_1"},{"volume-title":"A large-scale study of failures in high-performance computing systems","year":"2009","author":"Schroeder Bianca","key":"e_1_2_2_53_1"},{"doi-asserted-by":"crossref","unstructured":"Koushik Sen. 2008. Race directed random testing of concurrent programs. ACM Sigplan Notices 43 6 ( 2008 ) 11-21.  Koushik Sen. 2008. Race directed random testing of concurrent programs. ACM Sigplan Notices 43 6 ( 2008 ) 11-21.","key":"e_1_2_2_54_1","DOI":"10.1145\/1379022.1375584"},{"doi-asserted-by":"publisher","key":"e_1_2_2_55_1","DOI":"10.1007\/3-540-45937-5_14"},{"doi-asserted-by":"publisher","key":"e_1_2_2_56_1","DOI":"10.1145\/1508244.1508267"},{"volume-title":"An Exploratory Study of How Specialists Deal with Testing in Data Stream Processing Applications. arXiv preprint arXiv","year":"1909","author":"Vianna Alexandre","key":"e_1_2_2_57_1"},{"doi-asserted-by":"crossref","unstructured":"James R Wilcox Doug Woos Pavel Panchekha Zachary Tatlock Xi Wang Michael D Ernst and Thomas Anderson. 2015. Verdi: a framework for implementing and formally verifying distributed systems. ACM SIGPLAN Notices 50 6 ( 2015 ) 357-368.  James R Wilcox Doug Woos Pavel Panchekha Zachary Tatlock Xi Wang Michael D Ernst and Thomas Anderson. 2015. Verdi: a framework for implementing and formally verifying distributed systems. ACM SIGPLAN Notices 50 6 ( 2015 ) 357-368.","key":"e_1_2_2_58_1","DOI":"10.1145\/2813885.2737958"},{"doi-asserted-by":"publisher","key":"e_1_2_2_59_1","DOI":"10.1006\/jpdc.1993.1015"},{"doi-asserted-by":"publisher","key":"e_1_2_2_60_1","DOI":"10.1145\/2591062.2591177"},{"doi-asserted-by":"publisher","key":"e_1_2_2_61_1","DOI":"10.1109\/IISWC.2013.6704673"},{"doi-asserted-by":"publisher","key":"e_1_2_2_62_1","DOI":"10.1109\/ASE.2013.6693071"},{"doi-asserted-by":"publisher","key":"e_1_2_2_63_1","DOI":"10.1145\/1993498.1993532"},{"doi-asserted-by":"publisher","key":"e_1_2_2_64_1","DOI":"10.1145\/2517349.2522737"},{"doi-asserted-by":"publisher","key":"e_1_2_2_65_1","DOI":"10.1109\/ICSE.2015.130"}],"container-title":["Proceedings of the ACM on Programming Languages"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3428221","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3428221","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:02:57Z","timestamp":1750197777000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3428221"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,11,13]]},"references-count":64,"journal-issue":{"issue":"OOPSLA","published-print":{"date-parts":[[2020,11,13]]}},"alternative-id":["10.1145\/3428221"],"URL":"https:\/\/doi.org\/10.1145\/3428221","relation":{},"ISSN":["2475-1421"],"issn-type":[{"type":"electronic","value":"2475-1421"}],"subject":[],"published":{"date-parts":[[2020,11,13]]},"assertion":[{"value":"2020-11-13","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}