{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,3]],"date-time":"2025-07-03T01:43:37Z","timestamp":1751507017880},"reference-count":47,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,6,22]],"date-time":"2023-06-22T00:00:00Z","timestamp":1687392000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,6,22]],"date-time":"2023-06-22T00:00:00Z","timestamp":1687392000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"PROFAS B+, an Algerian-French scholarship program offered by the Algerian Ministry of Higher Education and Scientific Research, and Campus France"},{"name":"National Institute for Research in Digital Science and Technology"},{"name":"Rennes Metropole"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cloud Comp"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Function-as-a-Service (FaaS) is a popular programming model for building serverless applications, supported by all major cloud providers and many open-source software frameworks. One of the main challenges for FaaS providers is providing fault tolerance for the deployed applications, that is, providing the ability to mask failures of function invocations from clients. The basic fault tolerance approach in current FaaS platforms is automatically retrying function invocations. Although the retry approach is well suited for transient failures, it incurs delays in recovering from other types of failures, such as node crashes. This paper proposes the integration of a Request Replication mechanism in FaaS platforms and describes how this integration was implemented in Fission, a well-known, open-source platform. 
It provides a detailed experimental comparison of the proposed approach with the retry approach and an Active-Standby approach in terms of performance, availability, and resource consumption under different failure scenarios.<\/jats:p>","DOI":"10.1186\/s13677-023-00457-z","type":"journal-article","created":{"date-parts":[[2023,6,22]],"date-time":"2023-06-22T07:09:03Z","timestamp":1687417743000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Integrating request replication into FaaS platforms: an experimental evaluation"],"prefix":"10.1186","volume":"12","author":[{"given":"Yasmina","family":"Bouizem","sequence":"first","affiliation":[]},{"given":"Djawida","family":"Dib","sequence":"additional","affiliation":[]},{"given":"Nikos","family":"Parlavantzas","sequence":"additional","affiliation":[]},{"given":"Christine","family":"Morin","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,6,22]]},"reference":[{"issue":"12","key":"457_CR1","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1145\/3368454","volume":"62","author":"P Castro","year":"2019","unstructured":"Castro P, Ishakian V, Muthusamy V, Slominski A (2019) The rise of serverless computing. Commun ACM 62(12):44\u201354","journal-title":"Commun ACM"},{"key":"457_CR2","unstructured":"Jonas E, Schleier-Smith J, Sreekanti V, Tsai CC, Khandelwal A, Pu Q, Shankar V, Carreira J, Krauth K, Yadwadkar N, et\u00a0al (2019) Cloud programming simplified: A berkeley view on serverless computing. arXiv preprint arXiv:1902.03383"},{"key":"457_CR3","unstructured":"Amazon Web Services (2020) Aws lambda features. https:\/\/aws.amazon.com\/lambda\/features\/. Accessed 07 July 2021"},{"key":"457_CR4","unstructured":"Google cloud functions (2019) Retrying background functions. https:\/\/cloud.google.com\/functions\/docs\/bestpractices\/retries. 
Accessed 07 July 2021"},{"key":"457_CR5","unstructured":"Azure Functions (2020) Azure functions. https:\/\/azure.microsoft.com\/fr-fr\/services\/functions\/. Accessed 07 July 2021"},{"key":"457_CR6","unstructured":"Fission (2019) Fission. https:\/\/docs.fission.io\/docs\/. Accessed 07 July 2021"},{"key":"457_CR7","unstructured":"OpenFaaS (2019) Openfaas. https:\/\/www.openfaas.com. Accessed 07 July 2021"},{"key":"457_CR8","unstructured":"Kubeless (2021) Kubeless. https:\/\/kubeless.io. Accessed 07 July 2021"},{"key":"457_CR9","unstructured":"Apache OpenWhisk (2021) Apache openwhisk. https:\/\/openwhisk.apache.org. Accessed 07 July 2021"},{"key":"457_CR10","unstructured":"Amazon Web Services (2020) Error handling and automatic retries in aws lambda. https:\/\/docs.aws.amazon.com\/lambda\/latest\/dg\/invocation-retries.html. Accessed 07 July 2021"},{"key":"457_CR11","unstructured":"AWS Admin (2019) Using aws serverless technology as an enabler for cloud adoption. https:\/\/aws.amazon.com\/blogs\/apn\/using-aws-serverless-technology-as-an-enabler-for-cloud-adoption\/. Accessed 07 July 2021"},{"key":"457_CR12","unstructured":"Cloud design patterns (2020) Retry pattern. https:\/\/docs.microsoft.com\/en-us\/azure\/architecture\/patterns\/retry. Accessed 07 July 2021"},{"key":"457_CR13","unstructured":"OpenFaaS (2019) Timeouts - asynchronous invocations. https:\/\/docs.openfaas.com\/deployment\/troubleshooting\/#timeouts-asynchronous-invocations. Accessed 07 July 2021"},{"issue":"5","key":"457_CR14","doi-asserted-by":"publisher","first-page":"497","DOI":"10.1109\/TC.2004.1275293","volume":"53","author":"P Felber","year":"2004","unstructured":"Felber P, Narasimhan P (2004) Experiences, strategies, and challenges in building fault-tolerant corba systems. 
IEEE Trans Comput 53(5):497\u2013511","journal-title":"IEEE Trans Comput"},{"key":"457_CR15","doi-asserted-by":"publisher","unstructured":"Bouizem Y, Dib D, Parlavantzas N, Morin C (2020) Active-Standby for High-Availability in FaaS. In: Sixth International Workshop on Serverless Computing (WoSC6) 2020, Delft, Netherlands. https:\/\/doi.org\/10.1145\/3429880.3430097. https:\/\/hal.inria.fr\/hal-03043479","DOI":"10.1145\/3429880.3430097"},{"key":"457_CR16","unstructured":"Amazon Web Services (2022) How do i make my lambda function idempotent? https:\/\/aws.amazon.com\/premiumsupport\/knowledge-center\/lambda-function-idempotent\/. Accessed 29 Apr 2022"},{"key":"457_CR17","unstructured":"Google cloud functions (2022) Retrying event-driven functions. https:\/\/cloud.google.com\/functions\/docs\/bestpractices\/retries. Accessed 30 Apr 2022"},{"key":"457_CR18","unstructured":"Amazon Web Services (2022) Configuring provisioned concurrency. https:\/\/docs.aws.amazon.com\/lambda\/latest\/dg\/provisioned-concurrency.html. Accessed 09 Dec 2022"},{"issue":"2","key":"457_CR19","doi-asserted-by":"publisher","first-page":"589","DOI":"10.1109\/TSC.2018.2816644","volume":"14","author":"MA Mukwevho","year":"2021","unstructured":"Mukwevho MA, Celik T (2021) Toward a smart cloud: A review of fault-tolerance methods in cloud systems. IEEE Trans Serv Comput 14(2):589\u2013605. https:\/\/doi.org\/10.1109\/TSC.2018.2816644","journal-title":"IEEE Trans Serv Comput"},{"key":"457_CR20","unstructured":"AWS Admin (2020) Azure functions geo-disaster recovery. https:\/\/docs.microsoft.com\/en-us\/azure\/azure-functions\/functions-geo-disaster-recovery. Accessed 07 July 2021"},{"key":"457_CR21","unstructured":"Google Cloud Workflows (2021) Google cloud workflows. https:\/\/cloud.google.com\/workflows. Accessed 07 July 2021"},{"key":"457_CR22","unstructured":"AWS Step Functions (2021) Aws step functions. https:\/\/aws.amazon.com\/step-functions. 
Accessed 07 July 2021"},{"key":"457_CR23","unstructured":"Azure Durable Functions (2021) Azure durable functions. https:\/\/docs.microsoft.com\/en-us\/azure\/azure-functions\/durable. Accessed 07 July 2021"},{"key":"457_CR24","unstructured":"Resume AWS Step Functions (2021) Resume aws step functions from any state. https:\/\/aws.amazon.com\/blogs\/compute\/resume-aws-step-functions-from-any-state\/. Accessed 07 July 2021"},{"key":"457_CR25","unstructured":"Apache OpenWhisk Composer (2021) Apache openwhisk composer. https:\/\/github.com\/apache\/openwhisk-composer. Accessed 07 July 2021"},{"key":"457_CR26","unstructured":"Faas-flow (2021) Faas-flow. https:\/\/github.com\/s8sg\/faas-flow. Accessed 07 July 2021"},{"key":"457_CR27","doi-asserted-by":"publisher","unstructured":"Sreekanti V, Wu C, Chhatrapati S, Gonzalez JE, Hellerstein JM, Faleiro JM (2020) A fault-tolerance shim for serverless computing. In: Proceedings of the Fifteenth European Conference on Computer Systems, Association for Computing Machinery, New York, NY, USA, EuroSys \u201920. https:\/\/doi.org\/10.1145\/3342195.3387535","DOI":"10.1145\/3342195.3387535"},{"key":"457_CR28","unstructured":"Zhang H, Cardoza A, Chen PB, Angel S, Liu V (2020) Fault-tolerant and transactional stateful serverless workflows. In: 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20), USENIX Association, Banff, Alberta. https:\/\/www.usenix.org\/conference\/osdi20\/presentation\/zhang-haoran.\u00a0Accessed 21 June 2023"},{"key":"457_CR29","doi-asserted-by":"crossref","unstructured":"Jia Z, Witchel E (2021) Boki: Stateful serverless computing with shared logs. 
In: Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles CD-ROM.\u00a0Association for Computing Machinery, New York, pp 691\u2013707","DOI":"10.1145\/3477132.3483541"},{"key":"457_CR30","doi-asserted-by":"crossref","unstructured":"Wu C, Sreekanti V, Hellerstein JM (2020) Transactional causal consistency for serverless computing. In: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data.\u00a0Association for Computing Machinery, New York, pp 83\u201397","DOI":"10.1145\/3318464.3389710"},{"key":"457_CR31","doi-asserted-by":"crossref","unstructured":"Wu C, Faleiro J, Lin Y, Hellerstein J (2019) Anna: A kvs for any scale. IEEE Trans Knowl Data Eng\u00a033(2):344\u201358","DOI":"10.1109\/TKDE.2019.2898401"},{"key":"457_CR32","doi-asserted-by":"crossref","unstructured":"de\u00a0Heus M, Psarakis K, Fragkoulis M, Katsifodimos A (2021) Distributed transactions on serverless stateful functions. In: Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems.\u00a0Association for Computing Machinery, New York, pp 31\u201342","DOI":"10.1145\/3465480.3466920"},{"key":"457_CR33","unstructured":"Apache Flink StateFu (2021) Apache flink statefu. https:\/\/flink.apache.org\/stateful-functions.html. Accessed 14 Nov 2021"},{"key":"457_CR34","doi-asserted-by":"crossref","unstructured":"Zhang W, Fang V, Panda A, Shenker S (2020) Kappa: A programming framework for serverless computing. In: Proceedings of the 11th ACM Symposium on Cloud Computing.\u00a0Association for Computing Machinery, New York, pp 328\u2013343","DOI":"10.1145\/3419111.3421277"},{"key":"457_CR35","doi-asserted-by":"crossref","unstructured":"Carver B, Zhang J, Wang A, Anwar A, Wu P, Cheng Y (2020) Wukong: A scalable and locality-enhanced framework for serverless parallel computing. 
In: Proceedings of the 11th ACM Symposium on Cloud Computing.\u00a0Association for Computing Machinery, New York, pp 1\u201315","DOI":"10.1145\/3419111.3421286"},{"key":"457_CR36","doi-asserted-by":"publisher","unstructured":"Karhula P, Janak J, Schulzrinne H (2019) Checkpointing and migration of iot edge functions. In: Proceedings of the 2nd International Workshop on Edge Systems, Analytics and Networking, Association for Computing Machinery, New York, NY, USA, EdgeSys \u201919, p 60\u201365. https:\/\/doi.org\/10.1145\/3301418.3313947","DOI":"10.1145\/3301418.3313947"},{"key":"457_CR37","doi-asserted-by":"publisher","first-page":"110906","DOI":"10.1016\/j.jss.2021.110906","volume":"175","author":"V Yussupov","year":"2021","unstructured":"Yussupov V, Soldani J, Breitenb\u00fccher U, Brogi A, Leymann F (2021) Faasten your decisions: A classification framework and technology review of function-as-a-service platforms. J Syst Softw 175:110906. https:\/\/doi.org\/10.1016\/j.jss.2021.110906","journal-title":"J Syst Softw"},{"key":"457_CR38","unstructured":"Hightower K, Burns B, Beda J (2017) Kubernetes: up and running: dive into the future of infrastructure. O\u2019Reilly Media, Inc"},{"key":"457_CR39","unstructured":"Shilkov M (2019) What is a cold start? https:\/\/mikhail.io\/serverless\/coldstarts\/define\/. Accessed 07 July 2021"},{"key":"457_CR40","unstructured":"Fission Router (2020) Fission router. https:\/\/godoc.org\/github.com\/fission\/fission\/pkg\/router. Accessed 07 July 2021"},{"key":"457_CR41","unstructured":"Kubernetes Readiness Probe (2021) Configure liveness, readiness and startup probes. https:\/\/kubernetes.io\/docs\/tasks\/configure-pod-container\/configure-liveness-readiness-startup-probes\/. Accessed 07 July 2021"},{"key":"457_CR42","unstructured":"Grid5000 (2020) Grid5000. https:\/\/www.grid5000.fr\/w\/Grid5000:Home. Accessed 07 July 2021"},{"key":"457_CR43","unstructured":"Kubernetes (2021) Kubernetes. https:\/\/kubernetes.io\/. 
Accessed 07 July 2021"},{"key":"457_CR44","unstructured":"Tsung (2021) Tsung. http:\/\/tsung.erlang-projects.org\/user_manual\/. Accessed 07 July 2021"},{"key":"457_CR45","unstructured":"chaos (2021) Chaos mesh. https:\/\/chaos-mesh.org\/. Accessed 07 July 2021"},{"key":"457_CR46","doi-asserted-by":"publisher","unstructured":"Sreekanti V, Wu C, Lin XC, Schleier-Smith J, Gonzalez JE, Hellerstein JM, Tumanov A (2020) Cloudburst: Stateful functions-as-a-service. Proc VLDB Endow 13(12):2438\u20132452. https:\/\/doi.org\/10.14778\/3407790.3407836","DOI":"10.14778\/3407790.3407836"},{"key":"457_CR47","doi-asserted-by":"crossref","unstructured":"bin Bandan MI, Bhattacharjee S, Pradhan DK, Mathew J, (2015) Adaptive checkpoint interval algorithm considering task deadline and lifetime reliability for real-time system. Procedia Comput Sci 70:821\u2013828","DOI":"10.1016\/j.procs.2015.10.124"}],"container-title":["Journal of Cloud Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13677-023-00457-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13677-023-00457-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13677-023-00457-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,22]],"date-time":"2023-06-22T07:11:52Z","timestamp":1687417912000},"score":1,"resource":{"primary":{"URL":"https:\/\/journalofcloudcomputing.springeropen.com\/articles\/10.1186\/s13677-023-00457-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,22]]},"references-count":47,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["457"],"URL":"https:\/\/doi.org\/10.1186\/
s13677-023-00457-z","relation":{},"ISSN":["2192-113X"],"issn-type":[{"value":"2192-113X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,6,22]]},"assertion":[{"value":"18 September 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 May 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 June 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"94"}}