{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2022,4,3]],"date-time":"2022-04-03T13:10:23Z","timestamp":1648991423724},"reference-count":0,"publisher":"Cambridge University Press (CUP)","issue":"4","license":[{"start":{"date-parts":[[2001,9,4]],"date-time":"2001-09-04T00:00:00Z","timestamp":999561600000},"content-version":"unspecified","delay-in-days":65,"URL":"https:\/\/www.cambridge.org\/core\/terms"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Funct. Prog."],"published-print":{"date-parts":[[2001,7]]},"abstract":"<jats:p>This paper is an exploration of the parallel graph reduction approach to parallel functional programming, illustrated by a particular example: pipelined, dynamically-scheduled implementation of search, updates and read-modify-write transactions on an in-store binary search tree. We use program transformation, execution-driven simulation and analytical modelling to expose the maximum potential parallelism, the minimum communication and synchronisation overheads, and to control the overall space requirement. We begin with a lazy functional program specifying a series of transactions on a binary tree, each involving several searches and updates, in a side-effect-free fashion. Transformation of the source code produces a formulation of the program with greater locality and larger grain size than can be achieved using naive parallelization methods, and we show that, with care, these tasks can be scheduled effectively. Even with a workload using random keys, significant spatial locality is found, and we evaluate a modified cache coherency protocol which avoids false sharing so that large cache lines can be used to minimise the number of messages required. As expected with a pipeline, the application should reach a steady state as soon as the first transaction is completed. However, if the network latency is too large, the rate of completion lags behind the rate at which work is admitted, and internal queues grow without bound. We determine the conditions under which this occurs, and show how it can be avoided while maximising speedup.<\/jats:p>","DOI":"10.1017\/s0956796801003793","type":"journal-article","created":{"date-parts":[[2002,7,27]],"date-time":"2002-07-27T09:16:36Z","timestamp":1027761396000},"page":"359-393","source":"Crossref","is-referenced-by-count":0,"title":["Pipelined functional tree accesses and updates: scheduling, synchronization, caching and coherence"],"prefix":"10.1017","volume":"11","author":[{"given":"ANDREW J.","family":"BENNETT","sequence":"first","affiliation":[]},{"given":"PAUL H. J.","family":"KELLY","sequence":"additional","affiliation":[]},{"given":"ROSS A.","family":"PATERSON","sequence":"additional","affiliation":[]}],"member":"56","published-online":{"date-parts":[[2001,9,4]]},"container-title":["Journal of Functional Programming"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S0956796801003793","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,4,1]],"date-time":"2019-04-01T15:15:26Z","timestamp":1554131726000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S0956796801003793\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2001,7]]},"references-count":0,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2001,7]]}},"alternative-id":["S0956796801003793"],"URL":"https:\/\/doi.org\/10.1017\/s0956796801003793","relation":{},"ISSN":["0956-7968","1469-7653"],"issn-type":[{"value":"0956-7968","type":"print"},{"value":"1469-7653","type":"electronic"}],"subject":[],"published":{"date-parts":[[2001,7]]}}}