{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,22]],"date-time":"2026-04-22T20:57:16Z","timestamp":1776891436814,"version":"3.51.2"},"reference-count":56,"publisher":"Centre pour la Communication Scientifique Directe (CCSD)","license":[{"start":{"date-parts":[[2017,10,24]],"date-time":"2017-10-24T00:00:00Z","timestamp":1508803200000},"content-version":"unspecified","delay-in-days":296,"URL":"https:\/\/www.cambridge.org\/core\/terms"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Funct. Prog."],"published-print":{"date-parts":[[2017]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Cuneiform is a minimal functional programming language for large-scale scientific data analysis. Implementing a strict black-box view on external operators and data, it allows the direct embedding of code in a variety of external languages like Python or R, provides data-parallel higher order operators for processing large partitioned data sets, allows conditionals and general recursion, and has a naturally parallelizable evaluation strategy suitable for multi-core servers and distributed execution environments like Hadoop, HTCondor, or distributed Erlang. Cuneiform has been applied in several data-intensive research areas including remote sensing, machine learning, and bioinformatics, all of which critically depend on the flexible assembly of pre-existing tools and libraries written in different languages into complex pipelines. This paper introduces the computation semantics for Cuneiform. It presents Cuneiform's abstract syntax, a simple type system, and the semantics of evaluation. Providing an unambiguous specification of the behavior of Cuneiform eases the implementation of interpreters which we showcase by providing a concise reference implementation in Erlang. The similarity of Cuneiform's syntax to the simply typed lambda calculus puts Cuneiform in perspective and allows a straightforward discussion of its design in the context of functional programming. Moreover, the simple type system allows the deduction of the language's safety up to black-box operators. Last, the formulation of the semantics also permits the verification of compilers to and from other workflow languages.<\/jats:p>","DOI":"10.1017\/s0956796817000119","type":"journal-article","created":{"date-parts":[[2017,10,24]],"date-time":"2017-10-24T00:27:42Z","timestamp":1508804862000},"source":"Crossref","is-referenced-by-count":12,"title":["Computation semantics of the functional scientific workflow language Cuneiform"],"prefix":"10.46298","volume":"27","author":[{"given":"J\u00d6RGEN","family":"BRANDT","sequence":"first","affiliation":[]},{"given":"WOLFGANG","family":"REISIG","sequence":"additional","affiliation":[]},{"given":"ULF","family":"LESER","sequence":"additional","affiliation":[]}],"member":"25203","published-online":{"date-parts":[[2017,10,24]]},"reference":[{"key":"S0956796817000119_ref28","first-page":"22","volume-title":"Natural Semantics","author":"Kahn","year":"1987"},{"key":"S0956796817000119_ref54","first-page":"95","article-title":"Spark: Cluster computing with working sets","volume":"10","author":"Zaharia","year":"2010","journal-title":"Hotcloud"},{"key":"S0956796817000119_ref48","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2009.11.009"},{"key":"S0956796817000119_ref41","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.993"},{"key":"S0956796817000119_ref3","doi-asserted-by":"crossref","unstructured":"Bessani A. , Brandt J. , Bux M. , Cogo V. , Dimitrova L. , Dowling J. , Gholami A. , Hakimzadeh K. , Hummel M. , Ismail M. , Laure E. , Leser U. , Litton J.-E. , Martinez R. , Niazi S. , Reichel J. & Zimmermann K. (2015) Biobankcloud: A platform for the secure storage, sharing, and processing of large biomedical data sets. In Proceedings of 1st International Workshop on Data Management and Analytics for Medicine and Healthcare (DMAH 2015).","DOI":"10.1007\/978-3-319-41576-5_7"},{"key":"S0956796817000119_ref2","doi-asserted-by":"crossref","unstructured":"Arts T. , Hughes J. , Johansson J. & Wiger U. (2006) Testing telecoms software with quviq quickcheck. In Proceedings of the 2006 ACM SIGPLAN Workshop on Erlang, ERLANG '06. New York, NY, USA: ACM.","DOI":"10.1145\/1159789.1159792"},{"key":"S0956796817000119_ref32","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/bts480"},{"key":"S0956796817000119_ref47","doi-asserted-by":"crossref","first-page":"373","DOI":"10.3233\/FI-2009-0075","article-title":"Towards a formal semantics for the process model of the taverna workbench. Part ii","volume":"92","author":"Sroka","year":"2009","journal-title":"Fundam. Inform."},{"key":"S0956796817000119_ref29","unstructured":"Kalayci S. , Dasgupta G. , Fong L. , Ezenwoye O. & Sadjadi S. M. (2010) Distributed and adaptive execution of condor dagman workflows. In SEKE, pp. 587\u2013590."},{"key":"S0956796817000119_ref52","volume-title":"Hadoop: The Definitive Guide","author":"White","year":"2012"},{"key":"S0956796817000119_ref19","doi-asserted-by":"publisher","DOI":"10.1186\/gb-2010-11-8-r86"},{"key":"S0956796817000119_ref6","first-page":"318","volume-title":"From (Sequential) Haskell to (Parallel) Eden: An Implementation Point of View","author":"Breitinger","year":"1998"},{"key":"S0956796817000119_ref38","volume-title":"An Introduction to Functional Programming Through Lambda Calculus","author":"Michaelson","year":"2011"},{"key":"S0956796817000119_ref8","doi-asserted-by":"crossref","unstructured":"Bux M. , Brandt J. , Lipka C. , Hakimzadeh K. , Dowling J. & Leser U. (2015 September) Saasfee: Scalable scientific workflow execution engine. In Proceedings of the VLDB Endowment, vol. 8, pp. 1892\u20131895.","DOI":"10.14778\/2824032.2824094"},{"key":"S0956796817000119_ref39","doi-asserted-by":"publisher","DOI":"10.1016\/0890-5401(91)90052-4"},{"key":"S0956796817000119_ref12","doi-asserted-by":"publisher","DOI":"10.1145\/1327452.1327492"},{"key":"S0956796817000119_ref51","doi-asserted-by":"crossref","unstructured":"Turi D. , Missier P. , Goble C. , De Roure D. & Oinn T. (2007) Taverna workflows: Syntax and semantics. In Proceedings of IEEE International Conference on e-Science and Grid Computing. IEEE, pp. 441\u2013448.","DOI":"10.1109\/E-SCIENCE.2007.71"},{"key":"S0956796817000119_ref22","volume-title":"Neural Networks and Learning Machines","author":"Haykin","year":"2009"},{"key":"S0956796817000119_ref45","first-page":"53","volume-title":"The Design and Implementation of Glasgow Distributed Haskell","author":"Pointon","year":"2001"},{"key":"S0956796817000119_ref50","doi-asserted-by":"publisher","DOI":"10.14778\/1687553.1687609"},{"key":"S0956796817000119_ref36","volume-title":"Randomization, Bootstrap and Monte Carlo Methods in Biology","author":"Manly","year":"2006"},{"key":"S0956796817000119_ref7","unstructured":"Budiu M. & Goldstein S. C. (2002) Pegasus: An Efficient Intermediate Representation. Technical Report. DTIC Document."},{"key":"S0956796817000119_ref15","doi-asserted-by":"publisher","DOI":"10.1038\/nbt.3820"},{"key":"S0956796817000119_ref53","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/3054.001.0001","volume-title":"The Formal Semantics of Programming Languages: An Introduction","author":"Winskel","year":"1993"},{"key":"S0956796817000119_ref13","unstructured":"Deelman E. , Livny M. , Mehta G. , Pavlo A. , Singh G. , Su M.-H. , Vahi K. & Wenger R. K. (2006) Pegasus and dagman from concept to execution: Mapping scientific workflows onto today's cyberinfrastructure. In High Performance Computing Workshop, pp. 56\u201374."},{"key":"S0956796817000119_ref4","unstructured":"Bishop C. M. (2006) Pattern Recognition and Machine Learning (Information Science and Statistics). Springer-Verlag New York, Inc., Secaucus, NJ, USA."},{"key":"S0956796817000119_ref30","unstructured":"Kelly P. M. (2011) Applying functional programming theory to the design of workflow engines. PhD thesis, University of Adelaide."},{"key":"S0956796817000119_ref26","first-page":"1","volume-title":"Quickcheck Testing for Fun and Profit","author":"Hughes","year":"2007"},{"key":"S0956796817000119_ref33","doi-asserted-by":"publisher","DOI":"10.1007\/s10723-015-9329-8"},{"key":"S0956796817000119_ref21","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9781316576892"},{"key":"S0956796817000119_ref46","doi-asserted-by":"crossref","first-page":"279","DOI":"10.3233\/FI-2009-0075","article-title":"Towards a formal semantics for the process model of the taverna workbench. Part i","volume":"92","author":"Sroka","year":"2009","journal-title":"Fundam. Inform."},{"key":"S0956796817000119_ref44","unstructured":"Plotkin G. D. (1981) A structural approach to operational semantics. Computer Science Department, Aarhus University Aarhus, Denmark."},{"key":"S0956796817000119_ref1","unstructured":"Armstrong J. , Virding R. , Wikstr\u00f6m C. & Williams M. (1996) Concurrent Programming in ERLANG (2nd Ed.). Prentice Hall International (UK) Ltd., Hertfordshire, UK."},{"key":"S0956796817000119_ref5","unstructured":"Brandt J. , Bux M. & Leser U. (2015 March) Cuneiform: A functional language for large scale scientific data analysis. In Proceedings of the Workshops of the EDBT\/ICDT, vol. 1330, pp. 17\u201326."},{"key":"S0956796817000119_ref9","unstructured":"Bux M. , Brandt J. , Witt C. , Dowling J. & Leser U. (2017) Hi-way: Execution of scientific workflows on hadoop yarn. In Proceedings of the 20th International Conference on Extending Database Technology (EDBT)."},{"key":"S0956796817000119_ref10","doi-asserted-by":"publisher","DOI":"10.1090\/S0002-9947-1936-1501858-0"},{"key":"S0956796817000119_ref11","doi-asserted-by":"publisher","DOI":"10.1145\/2034863.2034865"},{"key":"S0956796817000119_ref14","first-page":"80","volume-title":"Programming-in-the-Large versus Programming-in-the-Small","author":"DeRemer","year":"1976"},{"key":"S0956796817000119_ref16","volume-title":"Pattern Classification","author":"Duda","year":"2012"},{"key":"S0956796817000119_ref55","first-page":"45","article-title":"Fast and interactive analytics over hadoop data with spark","volume":"37","author":"Zaharia","year":"2012","journal-title":"Usenix Login"},{"key":"S0956796817000119_ref17","doi-asserted-by":"crossref","DOI":"10.1201\/9780429246593","volume-title":"An Introduction to the Bootstrap","author":"Efron","year":"1994"},{"key":"S0956796817000119_ref18","first-page":"182","volume-title":"Composing Different Models of Computation in Kepler and Ptolemy ii","author":"Goderis","year":"2007"},{"key":"S0956796817000119_ref20","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.988"},{"key":"S0956796817000119_ref23","volume-title":"The Semantics of Programming Languages: An Elementary Introduction using Structural Operational Semantics","author":"Hennessy","year":"1990"},{"key":"S0956796817000119_ref24","volume-title":"The Fourth Paradigm: Data-Intensive Scientific Discovery","author":"Hey","year":"2009"},{"key":"S0956796817000119_ref25","first-page":"374","volume-title":"Towards a Calculus for Collection-Oriented Scientific Workflows with Side Effects","author":"Hidders","year":"2008"},{"key":"S0956796817000119_ref27","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gkl320"},{"key":"S0956796817000119_ref31","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.1448"},{"key":"S0956796817000119_ref34","doi-asserted-by":"publisher","DOI":"10.1017\/S0956796805005526"},{"key":"S0956796817000119_ref35","unstructured":"Lud\u00e4scher B. & Altintas I. (2003) On providing declarative design and programming constructs for scientific workflows based on process networks. San Diego Supercomputer Center."},{"key":"S0956796817000119_ref37","first-page":"248","volume-title":"Collection-Oriented Scientific Workflows for Integrating and Analyzing Biological Data","author":"McPhillips","year":"2006"},{"key":"S0956796817000119_ref40","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pgen.1003565"},{"key":"S0956796817000119_ref42","doi-asserted-by":"publisher","DOI":"10.1145\/1376616.1376726"},{"key":"S0956796817000119_ref43","volume-title":"Types and Programming Languages","author":"Pierce","year":"2002"},{"key":"S0956796817000119_ref49","doi-asserted-by":"publisher","DOI":"10.1145\/360303.360308"},{"key":"S0956796817000119_ref56","doi-asserted-by":"crossref","unstructured":"Zinn D. , Bowers S. , McPhillips T. & Lud\u00e4scher B. (2009) Scientific workflow design with data assembly lines. In Proceedings of the 4th Workshop on Workflows in Support of Large-Scale Science, WORKS '09. New York, NY, USA: ACM, pp. 14:1\u201314:10.","DOI":"10.1145\/1645164.1645178"}],"container-title":["Journal of Functional Programming"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S0956796817000119","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,22]],"date-time":"2026-04-22T20:19:51Z","timestamp":1776889191000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S0956796817000119\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017]]},"references-count":56,"alternative-id":["S0956796817000119"],"URL":"https:\/\/doi.org\/10.1017\/s0956796817000119","relation":{},"ISSN":["0956-7968","1469-7653"],"issn-type":[{"value":"0956-7968","type":"print"},{"value":"1469-7653","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017]]},"article-number":"e22"}}