{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T18:14:11Z","timestamp":1760033651092,"version":"build-2065373602"},"reference-count":57,"publisher":"Association for Computing Machinery (ACM)","issue":"OOPSLA2","funder":[{"name":"Independent Research Fund Denmark","award":["Hyben"],"award-info":[{"award-number":["Hyben"]}]},{"name":"University of Malta Research Seed Fund","award":["CPSRP01-25"],"award-info":[{"award-number":["CPSRP01-25"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Program. Lang."],"published-print":{"date-parts":[[2025,10,9]]},"abstract":"<jats:p>Many software applications rely on concurrent and distributed (micro)services that interact via message-passing and various forms of remote procedure calls (RPC). As these systems organically evolve and grow in scale and complexity, the risk of introducing deadlocks increases and their impact may worsen: even if only a few services deadlock, many other services may block while awaiting responses from the deadlocked ones. As a result, the \"core\" of the deadlock can be obfuscated by its consequences on the rest of the system, and diagnosing and fixing the problem can be challenging.<\/jats:p>\n          <jats:p>In this work we tackle the challenge by proposing distributed black-box monitors that are deployed alongside each service and detect deadlocks by only observing the incoming and outgoing messages, and exchanging probes with other monitors. We present a formal model that captures popular RPC-based application styles (e.g., gen_servers in Erlang\/OTP), and a distributed black-box monitoring algorithm that we prove sound and complete (i.e., identifies deadlocked services with neither false positives nor false negatives). We implement our results in a tool called DDMon for the monitoring of Erlang\/OTP applications, and we evaluate its performance.<\/jats:p>\n          <jats:p>This is the first work that formalises, proves the correctness, and implements distributed black-box monitors for deadlock detection. Our results are mechanised in Coq. DDMon is the companion artifact of this paper.<\/jats:p>","DOI":"10.1145\/3763069","type":"journal-article","created":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T08:49:50Z","timestamp":1759999790000},"page":"527-554","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Correct Black-Box Monitors for Distributed Deadlock Detection: Formalisation and Implementation"],"prefix":"10.1145","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-0758-722X","authenticated-orcid":false,"given":"Rados\u0142aw Jan","family":"Rowicki","sequence":"first","affiliation":[{"name":"Technical University of Denmark, Kongens Lyngby, Denmark"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3829-7391","authenticated-orcid":false,"given":"Adrian","family":"Francalanza","sequence":"additional","affiliation":[{"name":"University of Malta, Msida, Malta"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1153-6164","authenticated-orcid":false,"given":"Alceste","family":"Scalas","sequence":"additional","affiliation":[{"name":"Technical University of Denmark, Kongens Lyngby, Denmark"}]}],"member":"320","published-online":{"date-parts":[[2025,10,9]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-08679-3_1"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.4230\/LIPICS.CONCUR.2024.4"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-71500-7_1"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.ECOOP.2024.2"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/11678779_14"},{"key":"e_1_2_1_6_1","unstructured":"Akka and Pekko developers team. 2025. Apache Pekko gRPC. https:\/\/pekko.apache.org\/docs\/pekko-grpc\/current\/index.html Accessed: 2025-03-10"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.JSS.2021.111014"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-33693-0_26"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPDC2018.2018.00032"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.JSS.2023.111788"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3605159.3605857"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/6513.6516"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-75632-5_1"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","unstructured":"Saddek Bensalem Jean-Claude Fernandez Klaus Havelund and Laurent Mounier. 2006. Confirmation of deadlock potentials detected by runtime analysis. In PADTAD. ACM 41\u201350. https:\/\/doi.org\/10.1145\/1147403.1147412 10.1145\/1147403.1147412","DOI":"10.1145\/1147403.1147412"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/11678779_15"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/800222.806756"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/357360.357365"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1094811.1094852"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/32.21721"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3229060"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1107.5722"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jlamp.2021.100717"},{"key":"e_1_2_1_23_1","unstructured":"Ericsson AB. 2025. Erlang\/OTP System Documentation 14.2.5.8. https:\/\/www.erlang.org\/docs\/26\/pdf\/otp-system-documentation.pdf Accessed: 2025-03-10"},{"key":"e_1_2_1_24_1","unstructured":"Erlang\/OTP Team. 2025. Erlang\/OTP documentation: gen_server Behaviour. https:\/\/www.erlang.org\/doc\/system\/gen_server_concepts Accessed: 2025-03-10"},{"key":"e_1_2_1_25_1","unstructured":"Erlang\/OTP Team. 2025. Erlang\/OTP documentation: gen_statem Behaviour. https:\/\/www.erlang.org\/doc\/system\/statem.html Accessed: 2025-03-10"},{"key":"e_1_2_1_26_1","unstructured":"Erlang\/OTP Team. 2025. Erlang\/OTP documentation: the trace interface. https:\/\/www.erlang.org\/doc\/apps\/kernel\/trace.html Accessed: 2025-03-10"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10703-019-00334-z"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ic.2021.104704"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3329125"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.1980.230491"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.IC.2010.05.002"},{"key":"e_1_2_1_32_1","volume-title":"Small (Enough) World After All. In 18th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2021","author":"Hance Travis","year":"2021","unstructured":"Travis Hance, Marijn Heule, Ruben Martins, and Bryan Parno. 2021. Finding Invariants of Distributed Systems: It\u2019s a Small (Enough) World After All. In 18th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2021, April 12-14, 2021, James Mickens and Renata Teixeira (Eds.). USENIX Association, 115\u2013131. isbn:978-1-939133-21-2 https:\/\/www.usenix.org\/conference\/nsdi21\/presentation\/hance"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/10722468_15"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2503210.2503237"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/CMPSAC.1978.810401"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/PRDC.2010.49"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/360248.360251"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPADS.1997.652603"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/276393.278524"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1007\/11817949_16"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ic.2016.03.004"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2006.151"},{"key":"e_1_2_1_43_1","volume-title":"Communication and concurrency","author":"Milner Robin","unstructured":"Robin Milner. 1989. Communication and concurrency. Prentice Hall. isbn:0131150073"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/800222.806755"},{"key":"e_1_2_1_45_1","volume-title":"Nested transactions: an approach to reliable distributed computing","author":"Moss J. E.B.","year":"2001","unstructured":"J. E.B. Moss. 1985. Nested transactions: an approach to reliable distributed computing. Massachusetts Institute of Technology, USA. isbn:0262132001"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1007\/S00450-019-00407-8"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/2892208.2892232"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","unstructured":"Luca Padovani. 2014. Deadlock and Lock Freedom in the Linear \u03c0 -Calculus. In Proceedings of the Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM\/IEEE Symposium on Logic in Computer Science (LICS) (CSL-LICS \u201914). Association for Computing Machinery New York NY USA. Article 72 10 pages. isbn:9781450328869 https:\/\/doi.org\/10.1145\/2603088.2603116 10.1145\/2603088.2603116","DOI":"10.1145\/2603088.2603116"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.4204\/EPTCS.190.4"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","unstructured":"Rados\u0142 aw Jan Rowicki. 2025. Coq Mechanisation of a Theory of Black-Box Monitors for Correct Distributed Deadlock Detection (version 0.9.0). https:\/\/doi.org\/10.5281\/zenodo.16909482 10.5281\/zenodo.16909482","DOI":"10.5281\/zenodo.16909482"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","unstructured":"Rados\u0142 aw Jan Rowicki. 2025. DDMon: a Monitoring Tool for Distributed Deadlock Detection (version 0.1.0). https:\/\/doi.org\/10.5281\/zenodo.16909304 10.5281\/zenodo.16909304","DOI":"10.5281\/zenodo.16909304"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","unstructured":"Rados\u0142 aw Jan Rowicki Adrian Francalanza and Alceste Scalas. 2025. Correct Black-Box Monitors for Distributed Deadlock Detection: Formalisation and Implementation (Technical Report). https:\/\/doi.org\/10.48550\/arXiv.2508.14851 10.48550\/arXiv.2508.14851","DOI":"10.48550\/arXiv.2508.14851"},{"key":"e_1_2_1_53_1","volume-title":"The Pi-Calculus - a theory of mobile processes","author":"Sangiorgi Davide","unstructured":"Davide Sangiorgi and David Walker. 2001. The Pi-Calculus - a theory of mobile processes. Cambridge University Press. isbn:0521781779"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.1985.231844"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1007\/S10619-011-7078-7"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/2737924.2737958"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/187436.187447"}],"container-title":["Proceedings of the ACM on Programming Languages"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3763069","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:44:57Z","timestamp":1760031897000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3763069"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,9]]},"references-count":57,"journal-issue":{"issue":"OOPSLA2","published-print":{"date-parts":[[2025,10,9]]}},"alternative-id":["10.1145\/3763069"],"URL":"https:\/\/doi.org\/10.1145\/3763069","relation":{},"ISSN":["2475-1421"],"issn-type":[{"value":"2475-1421","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,9]]},"assertion":[{"value":"2025-03-26","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-08-12","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-10-09","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}