{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,2]],"date-time":"2026-04-02T09:43:19Z","timestamp":1775122999272,"version":"3.50.1"},"reference-count":32,"publisher":"Association for Computing Machinery (ACM)","issue":"OOPSLA","license":[{"start":{"date-parts":[[2019,10,10]],"date-time":"2019-10-10T00:00:00Z","timestamp":1570665600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["CNS-1617424"],"award-info":[{"award-number":["CNS-1617424"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Program. Lang."],"published-print":{"date-parts":[[2019,10,10]]},"abstract":"<jats:p>Performance analysis of a distributed system is typically achieved by collecting profiles whose underlying events are timestamped with unsynchronized clocks of multiple machines in the system. To allow comparison of timestamps taken at different machines, several timestamp synchronization algorithms have been developed. However, the inaccuracies associated with these algorithms can lead to inaccuracies in the final results of performance analysis. To address this problem, in this paper, we develop a system for constructing distributed performance profiles called DProf. At the core of DProf is a new timestamp synchronization algorithm, FreeZer, that tightly bounds the inaccuracy in a converted timestamp to a time interval. This not only allows timestamps from different machines to be compared, it also enables maintaining strong guarantees throughout the comparison which can be carefully transformed into guarantees for analysis results. 
To demonstrate the utility of DProf, we use it to implement dCSP and dCOZ, which are accuracy-bounded distributed versions of Context Sensitive Profiles and Causal Profiles developed for shared memory systems. While dCSP enables users to ascertain the existence of a performance bottleneck, dCOZ estimates the expected performance benefit from eliminating that bottleneck. Experiments with three distributed applications on a cluster of heterogeneous machines validate that inferences via dCSP and dCOZ are highly accurate. Moreover, if FreeZer is replaced by two existing timestamp algorithms (linear regression &amp; convex hull), the inferences provided by dCSP and dCOZ are severely degraded.<\/jats:p>","DOI":"10.1145\/3360582","type":"journal-article","created":{"date-parts":[[2019,10,11]],"date-time":"2019-10-11T14:53:33Z","timestamp":1570805613000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["DProf: distributed profiler with strong guarantees"],"prefix":"10.1145","volume":"3","author":[{"given":"Zachary","family":"Benavides","sequence":"first","affiliation":[{"name":"University of California at Riverside, USA"}]},{"given":"Keval","family":"Vora","sequence":"additional","affiliation":[{"name":"Simon Fraser University, Canada"}]},{"given":"Rajiv","family":"Gupta","sequence":"additional","affiliation":[{"name":"University of California at Riverside, USA"}]}],"member":"320","published-online":{"date-parts":[[2019,10,10]]},"reference":[{"key":"e_1_2_2_1_1","volume-title":"Workshop on Node Level Parallelism for Large Scale Supercomputers, in conjunction with ACM\/IEEE SC.","author":"Adhianto L."},{"key":"e_1_2_2_2_1","volume-title":"ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems. 
115\u2013125","author":"Anderson T.E."},{"key":"e_1_2_2_4_1","series-title":"IAS Series","volume-title":"Schriften des Forschungszentrums Julich","author":"Becker D."},{"key":"e_1_2_2_5_1","volume-title":"The 17th International Conference on Runtime Verification, LNCS 10548","author":"Benavides Z."},{"key":"e_1_2_2_6_1","volume-title":"26th International Parallel &amp; Distributed Processing Symposium. 1330\u20131340","author":"B\u00f6hme D."},{"key":"e_1_2_2_7_1","volume-title":"ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages &amp; Applications. 355\u2013372","author":"Du Bois K."},{"key":"e_1_2_2_8_1","volume-title":"The 25th Symposium on Operating Systems Principles. 184\u2013197","author":"Curtsinger C."},{"key":"e_1_2_2_9_1","volume-title":"USENIX Annual Technical Conference. 139\u2013150","author":"Ding R."},{"key":"e_1_2_2_10_1","first-page":"299","article-title":"Estimating global time in distributed systems","volume":"87","author":"Duda A.","year":"1987","journal-title":"ICDCS"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.4330040305"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/1753228.1753234"},{"key":"e_1_2_2_13_1","volume-title":"Fine-grained Clock Synchronization. In 15th USENIX Symposium on Networked Systems Design and Implementation. 81\u201394","author":"Geng Y."},{"key":"e_1_2_2_14_1","volume-title":"Sixth Euromicro Workshop on Parallel and Distributed Processing. 173\u2013179","author":"Hofman R."},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/238020.238024"},{"key":"e_1_2_2_16_1","unstructured":"J.K. Hollingsworth and B.P. Miller. 1994. Slack: a new performance metric for parallel programs. In Univ. of Maryland and Univ. of Wisconsin-Madison Tech. 
Rep."},{"key":"e_1_2_2_17_1","unstructured":"J. Leskovec, K.J. Lang, A. Dasgupta, and M.W. Mahoney. 2008. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. In CoRR abs\/0810.1355."},{"key":"e_1_2_2_18_1","volume-title":"IEEE\/ACM 9th Annual International Symposium on Code Generation and Optimization. 171\u2013180","author":"Liu X."},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1006\/jpdc.1995.1090"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3302424.3303974"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.80132"},{"key":"e_1_2_2_22_1","volume-title":"Vampir: Visualization and analysis of mpi resources. https:\/\/tu-dresden.de\/zih\/forschung\/projekte\/vampir?set_language=en","author":"Nagel W.E.","year":"1996"},{"key":"e_1_2_2_23_1","unstructured":"L. Page, S. Brin, and R. Motwani. 1999. The PageRank citation ranking: Bringing order to the web. In Stanford InfoLab."},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1842733.1842747"},{"key":"e_1_2_2_25_1","volume-title":"The 5th EUROMICRO Workshop on Parallel and Distributed Processing. 477\u2013484","author":"Rabenseifner R.","year":"1997"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342006064482"},{"key":"e_1_2_2_27_1","volume-title":"13th ACM Conference on Embedded Networked Sensor Systems. 
127\u2013140","author":"Stisen A."},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3037697.3037748"},{"key":"e_1_2_2_29_1","volume-title":"International Conference on Object Oriented Programming Systems, Languages and Applications (OOPSLA). 861\u2013878","author":"Vora K."},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3037697.3037747"},{"key":"e_1_2_2_31_1","volume-title":"8th International Conference on Distributed Computing Systems. 366\u2013373","author":"Yang C.-Q."},{"key":"e_1_2_2_32_1","volume-title":"Fast Causal Profiler for Task Parallel Programs. In 11th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering.","author":"Yoga A."},{"key":"e_1_2_2_33_1","first-page":"10","article-title":"Spark: Cluster computing with working sets","volume":"10","author":"Zaharia M.","year":"2010","journal-title":"HotCloud"}],"container-title":["Proceedings of the ACM on Programming 
Languages"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3360582","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3360582","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3360582","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:22:59Z","timestamp":1750202579000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3360582"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,10,10]]},"references-count":32,"journal-issue":{"issue":"OOPSLA","published-print":{"date-parts":[[2019,10,10]]}},"alternative-id":["10.1145\/3360582"],"URL":"https:\/\/doi.org\/10.1145\/3360582","relation":{},"ISSN":["2475-1421"],"issn-type":[{"value":"2475-1421","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,10,10]]},"assertion":[{"value":"2019-10-10","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}