{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T13:29:38Z","timestamp":1753882178678,"version":"3.41.2"},"reference-count":65,"publisher":"SAGE Publications","issue":"4","license":[{"start":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T00:00:00Z","timestamp":1774915200000},"content-version":"vor","delay-in-days":365,"URL":"http:\/\/www.sagepub.com\/licence-information-for-chorus"},{"start":{"date-parts":[[2025,3,31]],"date-time":"2025-03-31T00:00:00Z","timestamp":1743379200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"funder":[{"DOI":"10.13039\/100006192","name":"Advanced Scientific Computing Research","doi-asserted-by":"publisher","award":["DESC001270","DESC001274"],"award-info":[{"award-number":["DESC001270","DESC001274"]}],"id":[{"id":"10.13039\/100006192","id-type":"DOI","asserted-by":"publisher"}]},{"name":"U.S. Department of Energy Office of Science, Exascale Computing Project","award":["ORNL agreement #4000201895"],"award-info":[{"award-number":["ORNL agreement #4000201895"]}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2025,7]]},"abstract":"<jats:p>Chimbuko is the first in situ, scalable, workflow-level performance analysis tool for trace-level analysis and visualization of application performance. This tool was developed by the Co-design Center for Online Data Analysis and Reduction and funded by the U.S. Department of Energy\u2019s Exascale Computing Project. We provide a detailed description of Chimbuko\u2019s architecture and illustrate our online and offline visualization with multiple use cases. 
We also present results for the deployment and scalability of the tool as applied to a high-energy physics workflow running at large scale on the Frontier supercomputer.<\/jats:p>","DOI":"10.1177\/10943420251316253","type":"journal-article","created":{"date-parts":[[2025,4,1]],"date-time":"2025-04-01T04:16:29Z","timestamp":1743480989000},"page":"553-578","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":0,"title":["Performance analysis and data reduction for exascale scientific workflows"],"prefix":"10.1177","volume":"39","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3033-1196","authenticated-orcid":false,"given":"Christopher","family":"Kelly","sequence":"first","affiliation":[{"name":"Brookhaven National Laboratory, Upton, NY, USA"},{"name":"Computational Science Department, Upton, NY, USA"}]},{"given":"Wei","family":"Xu","sequence":"additional","affiliation":[{"name":"Brookhaven National Laboratory, Upton, NY, USA"},{"name":"Artificial Intelligence Department, Upton, NY, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2120-6521","authenticated-orcid":false,"given":"Line C","family":"Pouchard","sequence":"additional","affiliation":[{"name":"Center for Computing Research, Sandia National Laboratories, Albuquerque, NM, USA"}]},{"given":"Hubertus","family":"Van Dam","sequence":"additional","affiliation":[{"name":"Brookhaven National Laboratory, Upton, NY, USA"},{"name":"Computational Science Department, Upton, NY, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2877-5871","authenticated-orcid":false,"given":"Tanzima Z","family":"Islam","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Texas State University, San Marcos, TX, USA"}]},{"given":"Shinjae","family":"Yoo","sequence":"additional","affiliation":[{"name":"Brookhaven National Laboratory, Upton, NY, USA"},{"name":"Artificial Intelligence Department, Upton, NY, USA"}]},{"given":"Kerstin","family":"Kleese Van 
Dam","sequence":"additional","affiliation":[{"name":"Brookhaven National Laboratory, Upton, NY, USA"},{"name":"Computing and Data Sciences Directorate, Upton, NY, USA"}]}],"member":"179","published-online":{"date-parts":[[2025,3,31]]},"reference":[{"key":"e_1_3_4_2_1","unstructured":"AI DD (2024) Dynatrace davis ai. https:\/\/docs.dynatrace.com\/docs\/platform\/davis-ai\/anomaly-detection"},{"key":"e_1_3_4_3_1","doi-asserted-by":"publisher","DOI":"10.1063\/5.0004997"},{"key":"e_1_3_4_4_1","doi-asserted-by":"crossref","unstructured":"Bhatele A Jain N Livnat Y et al. (2016) Analyzing network health and congestion in dragonfly-based supercomputers. In: 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS) Chicago IL USA 23\u201327 May 2016 93\u2013102.","DOI":"10.1109\/IPDPS.2016.123"},{"key":"e_1_3_4_5_1","doi-asserted-by":"crossref","unstructured":"Boyle P Cossu G Yamaguchi A et al. (2016) Grid: a next generation data parallel C++ QCD library. In: The 33rd International Symposium on Lattice Field Theory was held at the Kobe International Conference Center Kobe Japan 14 July 2015.","DOI":"10.22323\/1.251.0023"},{"key":"e_1_3_4_6_1","unstructured":"Boyle P Kelly C (2024) Grid. branch: \u2018\u2018feature\/Xconjugate_BCs_dirichlet\u201d commit: d1b4963e8b3423049e4913bb5ec5d68eaf7807dd. https:\/\/github.com\/giltirn\/Grid"},{"key":"e_1_3_4_7_1","volume-title":"Readings in Information Visualization: Using Vision to Think","author":"Card M","year":"1999","unstructured":"Card M (1999) Readings in Information Visualization: Using Vision to Think. Burlington, MA: Morgan Kaufmann."},{"key":"e_1_3_4_8_1","doi-asserted-by":"publisher","DOI":"10.1126\/science.1252043"},{"key":"e_1_3_4_9_1","doi-asserted-by":"crossref","unstructured":"Chen T Guestrin C (2016) Xgboost: a scalable tree boosting system. 
In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD\u2019 16) New York NY USA 13 August 2016 785\u2013794.","DOI":"10.1145\/2939672.2939785"},{"key":"e_1_3_4_10_1","unstructured":"Frontier O (2024) The OLCF Frontier supercomputer. https:\/\/www.olcf.ornl.gov\/frontier"},{"key":"e_1_3_4_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.visinf.2018.04.010"},{"key":"e_1_3_4_12_1","doi-asserted-by":"crossref","unstructured":"Gamblin T LeGendre M Collette MR et al. (2015) The Spack package manager: bringing order to HPC software chaos. In: Proceedings of the International conference for high performance computing networking storage and analysis SC \u201915 New York NY USA 15\u201320 November 2015.","DOI":"10.1145\/2807591.2807623"},{"key":"e_1_3_4_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2017.2718532"},{"key":"e_1_3_4_14_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.softx.2020.100561"},{"key":"e_1_3_4_15_1","first-page":"59","article-title":"Histogram-based outlier score (hbos): a fast unsupervised anomaly detection algorithm","volume":"1","author":"Goldstein M","year":"2012","unstructured":"Goldstein M, Dengel A (2012) Histogram-based outlier score (hbos): a fast unsupervised anomaly detection algorithm. KI-2012: Poster and Demo Track 1: 59\u201363.","journal-title":"KI-2012: Poster and Demo Track"},{"key":"e_1_3_4_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2014.2346456"},{"key":"e_1_3_4_17_1","first-page":"6","article-title":"State of the art of performance visualization","volume":"3","author":"Isaacs KE","year":"2014","unstructured":"Isaacs KE, Gim\u00e9nez A, Jusufi I, et al. (2014b) State of the art of performance visualization. 
EuroVis (STARs) 3: 6.","journal-title":"EuroVis (STARs)"},{"key":"e_1_3_4_18_1","doi-asserted-by":"crossref","unstructured":"Janecek M Ezzati-Jivan N Hamou-Lhadj A (2022) Performance anomaly detection through sequence alignment of system-level traces. In: 2022 IEEE\/ACM 30th International Conference on Program Comprehension (ICPC) Pittsburgh PA USA 16\u201317 May 2022 264\u2013274.","DOI":"10.1145\/3524610.3527898"},{"key":"e_1_3_4_19_1","doi-asserted-by":"crossref","unstructured":"Kelly C Ha S Huck K Van Dam H Pouchard L Matyasfalvi G et al. (2020) Chimbuko: a workflow-level scalable performance trace analysis tool. In: ISAV\u201920 in situ infrastructures for enabling extreme-scale analysis and visualization Atlanta GA USA 122020 November 15\u201319.","DOI":"10.1145\/3426462.3426465"},{"key":"e_1_3_4_20_1","unstructured":"Kelly C Xu W Yoo S et al. (2024a) Chimbuko backend documentation. https:\/\/chimbuko-performance-analysis.readthedocs.io\/en\/latest\/index.html"},{"key":"e_1_3_4_21_1","unstructured":"Kelly C Xu W Yoo S et al. (2024b) Chimbuko docker images. https:\/\/hub.docker.com\/u\/chimbuko"},{"key":"e_1_3_4_22_1","unstructured":"Kelly C Xu W Yoo S et al. (2024c) Chimbuko GitHub. https:\/\/github.com\/CODARcode\/Chimbuko"},{"key":"e_1_3_4_23_1","unstructured":"Kelly C Xu W Yoo S et al. (2024d) Chimbuko version used for scalability study. branch: \u2018\u2018ckelly_develop\u201d commit:2ccf3c4f89daab98f59fa374ebcf0a5f66c68ab7. https:\/\/github.com\/CODARcode\/PerformanceAnalysis"},{"key":"e_1_3_4_24_1","unstructured":"Kelly C Xu W Yoo S et al. (2024e) Git repository for Mochi Spack packages. https:\/\/github.com\/mochi-hpc\/mochi-spack-packages"},{"key":"e_1_3_4_25_1","unstructured":"Kelly C Christ NH Boyle PA et al. (2024f) Accelerating gauge field evolution for simulations with G-parity boundary conditions. 
In: Preparation for Publication in Physical Review D."},{"key":"#cr-split#-e_1_3_4_26_1.1","doi-asserted-by":"crossref","unstructured":"Kesavan SP Fujiwara T Li JK et al. (2020) A visual analytics framework for reviewing streaming performance data. In","DOI":"10.1109\/PacificVis48177.2020.9280"},{"key":"#cr-split#-e_1_3_4_26_1.2","unstructured":"2020 IEEE Pacific Visualization Symposium (PacificVis) Tianjin China 03-05 June 2020 206-215."},{"key":"e_1_3_4_27_1","doi-asserted-by":"crossref","unstructured":"Kn\u00fcpfer A et al. (2008). The Vampir Performance Analysis Tool-Set. In: Resch M. Keller R. Himmler V. Krammer B. Schulz A. (eds) Tools for High Performance Computing. Springer Berlin Heidelberg. https:\/\/doi.org\/10.1007\/978-3-540-68564-7_9","DOI":"10.1007\/978-3-540-68564-7_9"},{"key":"e_1_3_4_28_1","doi-asserted-by":"crossref","unstructured":"Kn\u00fcpfer A et al. (2012). Score-P: A Joint Performance Measurement Run-Time Infrastructure for Periscope Scalasca TAU and Vampir. In: Brunst H. M\u00fcller M. Nagel W. Resch M. (eds) Tools for High Performance Computing 2011. Springer Berlin Heidelberg. https:\/\/doi.org\/10.1007\/978-3-642-31476-6_7","DOI":"10.1007\/978-3-642-31476-6_7"},{"key":"e_1_3_4_29_1","doi-asserted-by":"crossref","unstructured":"Kohyarnejadfard I Shakeri M Aloise D (2019) System performance anomaly detection using tracing data analysis. In: Proceedings of the 2019 5th International Conference on Computer and Technology Applications ICCTA \u201919 New York NY USA 16 April 2019 169\u2013173.","DOI":"10.1145\/3323933.3324085"},{"key":"e_1_3_4_30_1","doi-asserted-by":"publisher","DOI":"10.1021\/acs.chemrev.0c00998"},{"key":"e_1_3_4_31_1","doi-asserted-by":"crossref","unstructured":"Li JK Mubarak M Ross RB et al. (2017) Visual analytics techniques for exploring the design space of large-scale high-radix networks. 
In: 2017 IEEE International Conference on Cluster Computing (CLUSTER) Honolulu HI USA 05\u201308 September 2017 193\u2013203.","DOI":"10.1109\/CLUSTER.2017.26"},{"key":"e_1_3_4_32_1","doi-asserted-by":"crossref","unstructured":"Li Z Zhao Y Botta N et al. (2020) COPOD: copula-based outlier detection. In: 2020 IEEE International Conference on Data Mining (ICDM) Sorrento Italy 17\u201320 November 2020 1118\u20131123.","DOI":"10.1109\/ICDM50108.2020.00135"},{"key":"e_1_3_4_33_1","doi-asserted-by":"crossref","unstructured":"Liu P Xu H Ouyang Q et al. (2020) Unsupervised detection of microservice trace anomalies through service-level deep bayesian networks. In: 2020 IEEE 31st International Symposium on Software Reliability Engineering (ISSRE) Coimbra Portugal 12\u201315 October 2020 48\u201358.","DOI":"10.1109\/ISSRE5003.2020.00014"},{"key":"e_1_3_4_34_1","unstructured":"Microsoft (2024) Azure anomaly detector. https:\/\/learn.microsoft.com\/en-us\/azure\/ai-services\/anomaly-detector\/overview"},{"key":"e_1_3_4_35_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2022.12.001"},{"key":"e_1_3_4_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2016.2534558"},{"key":"e_1_3_4_37_1","doi-asserted-by":"crossref","unstructured":"Nedelkoski S Cardoso J Kao O (2019) Anomaly detection from system tracing data using multimodal deep learning. In: 2019 IEEE 12th International Conference on Cloud Computing (CLOUD) Milan Italy 08\u201313 July 2019 179\u2013186.","DOI":"10.1109\/CLOUD.2019.00038"},{"key":"e_1_3_4_38_1","doi-asserted-by":"crossref","unstructured":"Nicolae B Islam TZ Ross R et al. (2023) Building the I (interoperability) of fair for performance reproducibility of large-scale composable workflows in RECUP. 
In: 2023 IEEE 19th International Conference on E-Science (E-Science) Limassol Cyprus 09\u201313 October 2023 1\u20137.","DOI":"10.1109\/e-Science58273.2023.10254808"},{"key":"e_1_3_4_39_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342020913628"},{"key":"e_1_3_4_40_1","doi-asserted-by":"crossref","unstructured":"Podhorszki N Liu Q Klasky S et al. (2009) Plasma fusion code coupling using scalable I\/O services and scientific workflows. In: Proceedings of the 4th workshop on workflows in support of large-scale science WORKS 2009 Portland OR USA 16 November 2009 1\u20139.","DOI":"10.1145\/1645164.1645172"},{"key":"e_1_3_4_41_1","doi-asserted-by":"crossref","unstructured":"Pouchard L Malik A Dam HV et al. (2017) Capturing provenance as a diagnostic tool for workflow performance evaluation and optimization. In: 2017 New York Scientific Data Summit (NYSDS) New York NY USA 06\u201309 August 2017 1\u20138.","DOI":"10.1109\/NYSDS.2017.8085043"},{"key":"e_1_3_4_42_1","doi-asserted-by":"crossref","unstructured":"Pouchard L Huck K Matyasfalvi G et al. (2018) Prescriptive provenance for streaming analysis of workflows at scale. In: 2018 New York Scientific Data Summit (NYSDS) New York NY USA 06\u201308 August 2018 1\u20136.","DOI":"10.1109\/NYSDS.2018.8538951"},{"key":"e_1_3_4_43_1","unstructured":"Project C (2024) Celery: distributed task queue. https:\/\/www.celeryproject.org\/"},{"key":"e_1_3_4_44_1","unstructured":"RedisLabs (2024) Redis. https:\/\/www.redislabs.com\/"},{"key":"e_1_3_4_45_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11390-020-9802-0"},{"key":"e_1_3_4_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2022.3209375"},{"key":"e_1_3_4_47_1","doi-asserted-by":"crossref","unstructured":"Sanderson A Humphrey A Schmidt J Sisneros R. (2018). Coupling the Uintah Framework and the VisIt Toolkit for Parallel In Situ Data Analysis and Visualization and Computational Steering. In: Yokota R. Weiland M. Shalf J. Alam S. (eds) High Performance Computing. 
ISC High Performance 2018. Lecture Notes in Computer Science vol 11203. Springer Cham. https:\/\/doi.org\/10.1007\/978-3-030-02465-9_14","DOI":"10.1007\/978-3-030-02465-9_14"},{"key":"e_1_3_4_48_1","doi-asserted-by":"publisher","DOI":"10.3390\/informatics4030021"},{"key":"e_1_3_4_49_1","article-title":"An evaluation of real-time adaptive sampling change point detection algorithm using KCUSUM","author":"Saravanan V","year":"2024","unstructured":"Saravanan V, Siehien P, Yoo S, et al. (2024) An evaluation of real-time adaptive sampling change point detection algorithm using KCUSUM. Submitted to ACM Transactions on Knowledge Discovery from Data.","journal-title":"Submitted to ACM Transactions on Knowledge Discovery from Data"},{"key":"e_1_3_4_50_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342006064482"},{"key":"e_1_3_4_51_1","doi-asserted-by":"crossref","unstructured":"Shende S Malony AD Cuny J et al. (1998) Portable profiling and tracing for parallel scientific applications using C++. In: Proceedings of the SIGMETRICS Symposium on Parallel and Distributed Tools Welches OR USA 01 August 1998 134\u2013145.","DOI":"10.1145\/281035.281049"},{"key":"e_1_3_4_52_1","unstructured":"socketio (2024). https:\/\/socket.IO"},{"key":"e_1_3_4_53_1","unstructured":"Sonata M (2024) Mochi Sonata. https:\/\/github.com\/mochi-hpc\/mochi-sonata"},{"key":"e_1_3_4_54_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0010-4655(00)00054-0"},{"key":"e_1_3_4_55_1","unstructured":"Summit O (2024) The OLCF Summit supercomputer. https:\/\/www.olcf.ornl.gov\/summit"},{"key":"e_1_3_4_56_1","doi-asserted-by":"publisher","DOI":"10.1088\/1742-6596\/125\/1\/012088"},{"key":"e_1_3_4_57_1","unstructured":"uWSGI (2024). 
https:\/\/uwsgi-docs.readthedocs.io\/en\/latest\/"},{"key":"e_1_3_4_58_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cpc.2010.04.018"},{"key":"e_1_3_4_59_1","doi-asserted-by":"publisher","DOI":"10.5220\/0006646803330340"},{"key":"e_1_3_4_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2018.2865026"},{"key":"e_1_3_4_61_1","doi-asserted-by":"crossref","unstructured":"Xie C et al. (2019). Exploratory Visual Analysis of Anomalous Runtime Behavior in Streaming High Performance Computing Applications. In: Rodrigues J. et al. Computational Science \u2013 ICCS 2019. ICCS 2019. Lecture Notes in Computer Science vol 11536. Springer Cham. https:\/\/doi.org\/10.1007\/978-3-030-22734-0_12","DOI":"10.1007\/978-3-030-22734-0_12"},{"key":"e_1_3_4_62_1","doi-asserted-by":"crossref","unstructured":"Yamaguchi A Boyle P Cossu G et al. (2022) Grid: OneCode and FourAPIs. In: The 38th International Symposium on Lattice Field Theory MIT 26 July 2021.","DOI":"10.22323\/1.396.0035"},{"key":"e_1_3_4_63_1","unstructured":"Yokan M (2024) Mochi Yokan. https:\/\/mochi.readthedocs.io\/en\/latest\/yokan.html"},{"key":"e_1_3_4_64_1","volume-title":"Cray User Group 2023","author":"Yokelson D","year":"2023","unstructured":"Yokelson D, Lappi O, Ramesh S, et al. (2023) Observability, monitoring, and in situ analytics in exascale applications. 
In: Cray User Group 2023."},{"key":"e_1_3_4_65_1","doi-asserted-by":"publisher","DOI":"10.1177\/109434209901300310"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/10943420251316253","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/10943420251316253","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/10943420251316253","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/10943420251316253","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,8]],"date-time":"2025-07-08T17:31:25Z","timestamp":1751995885000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/10943420251316253"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3,31]]},"references-count":65,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,7]]}},"alternative-id":["10.1177\/10943420251316253"],"URL":"https:\/\/doi.org\/10.1177\/10943420251316253","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"type":"print","value":"1094-3420"},{"type":"electronic","value":"1741-2846"}],"subject":[],"published":{"date-parts":[[2025,3,31]]}}}