{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,21]],"date-time":"2026-03-21T19:23:33Z","timestamp":1774121013871,"version":"3.50.1"},"reference-count":144,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2022,11,30]],"date-time":"2022-11-30T00:00:00Z","timestamp":1669766400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"U.S. Department of Energy National Nuclear Security Agency ATDM"},{"name":"U.S. Department of Energy Office of Science, under the SSIO"},{"name":"SIRIUS project and the Data Management"},{"name":"United States Department of Energy through the Computational Sciences Graduate Fellowship","award":["DE-SC0020347"],"award-info":[{"award-number":["DE-SC0020347"]}]},{"name":"State of Illinois"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Storage"],"published-print":{"date-parts":[[2022,11,30]]},"abstract":"<jats:p>High-performance computing scientists are producing unprecedented volumes of data that take a long time to load for analysis. However, many analyses only require loading in the data containing particular features of interest and scientists have many approaches for identifying these features. Therefore, if scientists store information (descriptive metadata) about these identified features, then for subsequent analyses they can use this information to only read in the data containing these features. This can greatly reduce the amount of data that scientists have to read in, thereby accelerating analysis. Despite the potential benefits of descriptive metadata management, no prior work has created a descriptive metadata system that can help scientists working with a wide range of applications and analyses to restrict their reads to data containing features of interest. In this article, we present EMPRESS, the first such solution. EMPRESS offers all of the features needed to help accelerate discovery: It can accelerate analysis by up to 300 \u00d7, supports a wide range of applications and analyses, is high-performing, is highly scalable, and requires minimal storage space. In addition, EMPRESS offers features required for a production-oriented system: scalable metadata consistency techniques, flexible system configurations, fault tolerance as a service, and portability.<\/jats:p>","DOI":"10.1145\/3523698","type":"journal-article","created":{"date-parts":[[2022,9,27]],"date-time":"2022-09-27T11:25:22Z","timestamp":1664277922000},"page":"1-49","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["EMPRESS: Accelerating Scientific Discovery through Descriptive Metadata Management"],"prefix":"10.1145","volume":"18","author":[{"given":"Margaret","family":"Lawson","sequence":"first","affiliation":[{"name":"The University of Illinois at Urbana-Champaign, United States"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"William","family":"Gropp","sequence":"additional","affiliation":[{"name":"The University of Illinois at Urbana-Champaign, United States"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jay","family":"Lofstead","sequence":"additional","affiliation":[{"name":"Sandia National Laboratories, United States"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,12,12]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10586-010-0135-6"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1088\/1742-6596\/119\/7\/072003"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1093\/mnras\/sty1308"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-36265-7_49"},{"key":"e_1_3_1_6_2","first-page":"57","article-title":"Benchmarking a hemodynamics application on Intel based HPC systems","volume":"32","author":"Auricchio Ferdinando","year":"2018","unstructured":"Ferdinando Auricchio, Marco Fedele, Marco Ferretti, Adrien Lefieux, Rodrigo Romarowski, Luigi Santangelo, and Alessandro Veneziani. 2018. Benchmarking a hemodynamics application on Intel based HPC systems. Parallel Comput. Everyw. 32 (2018), 57.","journal-title":"Parallel Comput. Everyw."},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1142\/S0129626411000060"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.2312\/envirvis.20181134"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.2514\/6.2021-1750"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.5555\/2388996.2389063"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-43659-3_31"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1051\/0004-6361\/201220610"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2013.142"},{"key":"e_1_3_1_14_2","first-page":"13","volume-title":"Workshop on In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization","author":"Biswas Ayan","year":"2018","unstructured":"Ayan Biswas, Soumya Dutta, Jesus Pulido, and James Ahrens. 2018. In situ data-driven adaptive sampling for large-scale simulation data summarization. In Workshop on In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization. ACM, New York, NY, 13\u201318."},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/I2CT42659.2018.9058252"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.2172\/1422987"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2010.253"},{"key":"e_1_3_1_18_2","volume-title":"Cray User Group Conference (CUG\u201917)","author":"Byna Suren","year":"2017","unstructured":"Suren Byna, Mohamad Chaarawi, Quincey Koziol, John Mainzer, and Frank Willmore. 2017. Tuning HDF5 subfiling performance on parallel file systems. In Cray User Group Conference (CUG\u201917)."},{"key":"e_1_3_1_19_2","first-page":"49","volume-title":"Symposium on Data Visualisation","author":"Carr Hamish","year":"2003","unstructured":"Hamish Carr and Jack Snoeyink. 2003. Path seeds and flexible isosurfaces using topology for exploratory visualization. In Symposium on Data Visualisation. The Eurographics Association, 49\u201358."},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/IC2E.2017.22"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPADS.2015.53"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.3389\/fphys.2018.00268"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevLett.114.108001"},{"key":"e_1_3_1_24_2","unstructured":"Jai Dayal and Jay Lofstead. 2016. Doubly Distributed Transactions. Retrieved from https:\/\/github.com\/gflofst\/d2t."},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1155\/2005\/128026"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.17487\/RFC1950"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1093\/mnras\/stw655"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2009.2021005"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2017.2749300"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10586-011-0162-y"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/CLUSTER.2016.25"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/SASO.2008.47"},{"key":"e_1_3_1_33_2","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1145\/1966895.1966900","volume-title":"EDBT\/ICDT Workshop on Array Databases","author":"Folk Mike","year":"2011","unstructured":"Mike Folk, Gerd Heber, Quincey Koziol, Elena Pourmal, and Dana Robinson. 2011. An overview of the HDF5 technology suite and its applications. In EDBT\/ICDT Workshop on Array Databases. ACM, New York, NY, 36\u201347."},{"key":"e_1_3_1_34_2","unstructured":"MPI Forum. 2012. MPI: A Message-Passing Interface Standard Version 3.0. Retrieved November 20 2022 from https:\/\/www.mpi-forum.org\/mpi-30\/."},{"key":"e_1_3_1_35_2","first-page":"176","volume-title":"High-Performance Computing","author":"Fujishiro Issei","year":"2005","unstructured":"Issei Fujishiro, Rieko Otsuka, Shigeo Takahashi, and Yuriko Takeshima. 2005. T-Map: A topological approach to visual exploration of time-varying volume data. In High-Performance Computing. Springer, New York, NY, 176\u2013190."},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.5555\/1869381.1869384"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/1995896.1995924"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1177\/1094342019842919"},{"key":"e_1_3_1_39_2","article-title":"Machine learning for galaxy morphology classification","author":"Gauci Adam","year":"2010","unstructured":"Adam Gauci, Kristian Zarb Adami, and John Abela. 2010. Machine learning for galaxy morphology classification. arXiv preprint arXiv:1005.0390 (2010).","journal-title":"arXiv preprint arXiv:1005.0390"},{"key":"e_1_3_1_40_2","first-page":"343","volume-title":"13th IEEE\/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)","author":"Gong Zhenhuan","year":"2013","unstructured":"Zhenhuan Gong, David A. Boyuka II, Xiaocheng Zou, Qing Liu, Norbert Podhorszki, Scott Klasky, Xiaosong Ma, and Nagiza F. Samatova. 2013. PARLO: PArallel Run-time Layout Optimization for scientific data explorations with heterogeneous access patterns. In 13th IEEE\/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid). IEEE, New York, NY, 343\u2013351."},{"key":"e_1_3_1_41_2","first-page":"873","volume-title":"IEEE 26th International Symposium on Parallel & Distributed Processing Symposium (IPDPS)","author":"Gong Zhenhuan","year":"2012","unstructured":"Zhenhuan Gong, Sriram Lakshminarasimhan, John Jenkins, Hemanth Kolla, Stephane Ethier, Jackie Chen, Robert Ross, Scott Klasky, and Nagiza F. Samatova. 2012. Multi-level layout optimization for efficient spatio-temporal queries on ISABELA-compressed data. In IEEE 26th International Symposium on Parallel & Distributed Processing Symposium (IPDPS). IEEE, New York, NY, 873\u2013884."},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.2012.39"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2007.70519"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/SSDBM.2006.27"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2010.80"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1093\/mnras\/stu642"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.5555\/2750482.2750484"},{"key":"e_1_3_1_48_2","volume-title":"Using MPI: Portable Parallel Programming with the Message-passing Interface","author":"Gropp William","year":"1999","unstructured":"William Gropp, William D. Gropp, Ewing Lusk, Anthony Skjellum, and Argonne Distinguished Fellow Emeritus Ewing Lusk. 1999. Using MPI: Portable Parallel Programming with the Message-passing Interface. Vol. 1. MIT Press, Cambridge, MA."},{"key":"e_1_3_1_49_2","volume-title":"Using Advanced MPI: Modern Features of the Message-passing Interface","author":"Gropp William","year":"2014","unstructured":"William Gropp, Torsten Hoefler, Rajeev Thakur, and Ewing Lusk. 2014. Using Advanced MPI: Modern Features of the Message-passing Interface. MIT Press."},{"key":"e_1_3_1_50_2","volume-title":"Using MPI-2: Advanced Features of the Message-passing Interface","author":"Gropp William","year":"1999","unstructured":"William Gropp, Rajeev Thakur, and Ewing Lusk. 1999. Using MPI-2: Advanced Features of the Message-passing Interface. MIT Press, Cambridge, MA."},{"key":"e_1_3_1_51_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-69953-0_4"},{"key":"e_1_3_1_52_2","volume-title":"IEEE Visualization Conference","author":"Gyulassy Attila","year":"2005","unstructured":"Attila Gyulassy and Vijay Natarajan. 2005. Topology-based simplification for feature extraction from 3D scalar fields. In IEEE Visualization Conference. IEEE, New York, NY."},{"key":"e_1_3_1_53_2","article-title":"ASCR\/HEP exascale requirements review report","author":"Habib Salman","year":"2016","unstructured":"Salman Habib, Robert Roser, Richard Gerber, Katie Antypas, Katherine Riley, Tim Williams, Jack Wells, Tjerk Straatsma, A. Almgren, J. Amundson, et\u00a0al. 2016. ASCR\/HEP exascale requirements review report. arXiv preprint arXiv:1603.09303 (2016).","journal-title":"arXiv preprint arXiv:1603.09303"},{"key":"e_1_3_1_54_2","article-title":"GlobeNet: Convolutional neural networks for typhoon eye tracking from remote sensing imagery","author":"Hong Seungkyun","year":"2017","unstructured":"Seungkyun Hong, Seongchan Kim, Minsu Joh, and Sa-kwang Song. 2017. GlobeNet: Convolutional neural networks for typhoon eye tracking from remote sensing imagery. arXiv preprint arXiv:1708.03417 (2017).","journal-title":"arXiv preprint arXiv:1708.03417"},{"key":"e_1_3_1_55_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.compfluid.2011.05.002"},{"key":"e_1_3_1_56_2","article-title":"LSST: From science drivers to reference design and anticipated data products","author":"Ivezic Zeljko","year":"2008","unstructured":"Zeljko Ivezic, J. A. Tyson, B. Abel, E. Acosta, R. Allsman, Y. AlSayyad, S. F. Anderson, J. Andrew, R. Angel, G. Angeli, et\u00a0al. 2008. LSST: From science drivers to reference design and anticipated data products. arXiv preprint arXiv:0805.2366 (2008).","journal-title":"arXiv preprint arXiv:0805.2366"},{"key":"e_1_3_1_57_2","unstructured":"JAMO 2018. JAMO\u2014JGI Archive and Metadata Organizer. Retrieved from https:\/\/storageconference.us\/2018\/Presentations\/Beecroft.pdf."},{"key":"e_1_3_1_58_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-32597-7_2"},{"key":"e_1_3_1_59_2","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2012.26"},{"key":"e_1_3_1_60_2","first-page":"191","volume-title":"USENIX Conference on File and Storage Technologies","author":"Johnson Charles","year":"2014","unstructured":"Charles Johnson, Kimberly Keeton, Charles B. Morrey III, Craig A. N. Soules, Alistair C. Veitch, Stephen Bacon, Oskar Batuner, Marcelo Condotta, Hamilton Coutinho, Patrick J. Doyle, et\u00a0al. 2014. From research to practice: Experiences engineering a production metadata database for a scale out file system. In USENIX Conference on File and Storage Technologies. USENIX Association, Berkeley, CA, 191\u2013198."},{"key":"e_1_3_1_61_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2019.06.006"},{"key":"e_1_3_1_62_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10278-010-9328-z"},{"key":"e_1_3_1_63_2","doi-asserted-by":"publisher","DOI":"10.1088\/0029-5515\/49\/11\/115021"},{"key":"e_1_3_1_64_2","doi-asserted-by":"publisher","DOI":"10.1109\/CSCI51800.2020.00229"},{"key":"e_1_3_1_65_2","doi-asserted-by":"publisher","DOI":"10.5555\/2033345.2033384"},{"key":"e_1_3_1_66_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10586-014-0358-z"},{"key":"e_1_3_1_67_2","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2014.88"},{"key":"e_1_3_1_68_2","volume-title":"The Next Generation of EMPRESS: A Metadata Management System for Accelerated Scientific Discovery at Exascale","author":"Lawson Margaret","year":"2018","unstructured":"Margaret Lawson. 2018. The Next Generation of EMPRESS: A Metadata Management System for Accelerated Scientific Discovery at Exascale. Bachelor\u2019s Thesis. Dartmouth College."},{"key":"e_1_3_1_69_2","doi-asserted-by":"publisher","DOI":"10.1109\/PDSW-DISCS.2018.00004"},{"key":"e_1_3_1_70_2","first-page":"153","volume-title":"USENIX Conference on File and Storage Technologies","author":"Leung Andrew W.","year":"2009","unstructured":"Andrew W. Leung, Minglong Shao, Timothy Bisson, Shankar Pasupathy, and Ethan L. Miller. 2009. Spyglass: Fast, scalable metadata search for large-scale storage systems. In USENIX Conference on File and Storage Technologies. 153\u2013166."},{"key":"e_1_3_1_71_2","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2003.10053"},{"key":"e_1_3_1_72_2","article-title":"Application of deep convolutional neural networks for detecting extreme weather in climate datasets","author":"Liu Yunjie","year":"2016","unstructured":"Yunjie Liu, Evan Racah, Joaquin Correa, Amir Khosrowshahi, David Lavers, Kenneth Kunkel, Michael Wehner, William Collins, et\u00a0al. 2016. Application of deep convolutional neural networks for detecting extreme weather in climate datasets. arXiv preprint arXiv:1605.01156 (2016).","journal-title":"arXiv preprint arXiv:1605.01156"},{"key":"e_1_3_1_73_2","doi-asserted-by":"publisher","DOI":"10.1109\/2945.489388"},{"key":"e_1_3_1_74_2","volume-title":"High Performance Computing Meets Databases at Supercomputing (HPCDB\u201912)","author":"Lofstead Jay","year":"2012","unstructured":"Jay Lofstead and Jai Dayal. 2012. Transactional parallel metadata services for integrated application workflows. In High Performance Computing Meets Databases at Supercomputing (HPCDB\u201912)."},{"key":"e_1_3_1_75_2","doi-asserted-by":"publisher","DOI":"10.1109\/CLUSTER.2012.79"},{"key":"e_1_3_1_76_2","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2016.49"},{"key":"e_1_3_1_77_2","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1145\/1996130.1996139","volume-title":"20th International Symposium on High-performance Distributed Computing (HPDC\u201911)","author":"Lofstead Jay","year":"2011","unstructured":"Jay Lofstead, Milo Polte, Garth Gibson, Scott Klasky, Karsten Schwan, Ron Oldfield, Matthew Wolf, and Qing Liu. 2011. Six degrees of scientific data: Reading patterns for extreme scale science IO. In 20th International Symposium on High-performance Distributed Computing (HPDC\u201911). ACM, New York, NY, 49\u201360. DOI:http:\/\/doi.acm.org\/10.1145\/1996130.1996139"},{"key":"e_1_3_1_78_2","first-page":"1","volume-title":"IEEE International Symposium on Parallel & Distributed Processing (IPDPS)","author":"Lofstead Jay","year":"2009","unstructured":"Jay Lofstead, Fang Zheng, Scott Klasky, and Karsten Schwan. 2009. Adaptable, metadata rich IO methods for portable high performance IO. In IEEE International Symposium on Parallel & Distributed Processing (IPDPS). IEEE, New York, NY, 1\u201310."},{"key":"e_1_3_1_79_2","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2010.32"},{"key":"e_1_3_1_80_2","article-title":"nsCouette\u2014A high-performance code for direct numerical simulations of turbulent Taylor-Couette flow","author":"Lopez Jose Manuel","year":"2019","unstructured":"Jose Manuel Lopez, Daniel Feldmann, Markus Rampp, Alberto Vela-Martin, Liang Shi, and Marc Avila. 2019. nsCouette\u2014A high-performance code for direct numerical simulations of turbulent Taylor-Couette flow. arXiv preprint arXiv:1908.00587 (2019).","journal-title":"arXiv preprint arXiv:1908.00587"},{"key":"e_1_3_1_81_2","doi-asserted-by":"publisher","DOI":"10.1145\/37402.37422"},{"key":"e_1_3_1_82_2","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.994"},{"key":"e_1_3_1_83_2","doi-asserted-by":"publisher","DOI":"10.14529\/jsfi180103"},{"key":"e_1_3_1_84_2","doi-asserted-by":"publisher","DOI":"10.5555\/1720672.1720675"},{"key":"e_1_3_1_85_2","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN.2016.7727189"},{"key":"e_1_3_1_86_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10723-007-9065-9"},{"key":"e_1_3_1_87_2","doi-asserted-by":"publisher","DOI":"10.5555\/2388996.2389006"},{"key":"e_1_3_1_88_2","volume-title":"Applying Graph Partitioning Methods in Measurement-based Dynamic Load Balancing","author":"Menon Harshitha","year":"2015","unstructured":"Harshitha Menon, Abhinav Bhatele, Sebastien Fourestier, Laxmikant Kale, and Francois Pellegrini. 2015. Applying Graph Partitioning Methods in Measurement-based Dynamic Load Balancing. Technical Report, UIUC. https:\/\/www.ideals.illinois.edu\/items\/77157."},{"key":"e_1_3_1_89_2","doi-asserted-by":"publisher","DOI":"10.1080\/00401706.2016.1158740"},{"key":"e_1_3_1_90_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2003.1207437"},{"key":"e_1_3_1_91_2","volume-title":"International Workshop on High Performance I\/O Techniques and Deployment of Very Large Scale I\/O Systems","author":"Oldfield Ron A.","year":"2006","unstructured":"Ron A. Oldfield, Patrick Widener, Arthur B. Maccabe, Lee Ward, and Todd Kordenbrock. 2006. Efficient data-movement for lightweight I\/O. In International Workshop on High Performance I\/O Techniques and Deployment of Very Large Scale I\/O Systems."},{"key":"e_1_3_1_92_2","first-page":"1","volume-title":"1st Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems (PDSW-DISCS)","author":"Ovsyannikov Andrey","year":"2016","unstructured":"Andrey Ovsyannikov, Melissa Romanus, Brian Van Straalen, Gunther H. Weber, and David Trebotich. 2016. Scientific workflows at DataWarp-speed: Accelerated data-intensive science using NERSC\u2019s Burst Buffer. In 1st Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems (PDSW-DISCS). IEEE, New York, NY, 1\u20136."},{"key":"e_1_3_1_93_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2011.05.010"},{"key":"e_1_3_1_94_2","unstructured":"Panasas 2022. Retrieved November 20 2022 from https:\/\/www.panasas.com\/."},{"key":"e_1_3_1_95_2","doi-asserted-by":"publisher","DOI":"10.1145\/1142473.1142535"},{"key":"e_1_3_1_96_2","first-page":"775","volume-title":"Computer Graphics Forum","author":"Post Frits H.","year":"2003","unstructured":"Frits H. Post, Benjamin Vrolijk, Helwig Hauser, Robert S. Laramee, and Helmut Doleisch. 2003. The state of the art in flow visualisation: Feature extraction and tracking. In Computer Graphics Forum, Vol. 22. Wiley Online Library, New York, NY, 775\u2013792."},{"issue":"1","key":"e_1_3_1_97_2","first-page":"1","article-title":"PKDGRAV3: Beyond trillion particle cosmological simulations for the next era of galaxy surveys","volume":"4","author":"Potter Douglas","year":"2017","unstructured":"Douglas Potter, Joachim Stadel, and Romain Teyssier. 2017. PKDGRAV3: Beyond trillion particle cosmological simulations for the next era of galaxy surveys. Computat. Astrophys. Cosmol. 4, 1 (2017), 1\u201313.","journal-title":"Computat. Astrophys. Cosmol."},{"key":"e_1_3_1_98_2","first-page":"56","volume-title":"Programming and Performance Visualization Tools","author":"Prabhu Tarun","year":"2017","unstructured":"Tarun Prabhu and William Gropp. 2017. Moya\u2013A JIT compiler for HPC. In Programming and Performance Visualization Tools. Springer, New York, NY, 56\u201373."},{"key":"e_1_3_1_99_2","doi-asserted-by":"publisher","DOI":"10.3847\/1538-4357\/aabfed"},{"key":"e_1_3_1_100_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.euromechsol.2018.03.021"},{"key":"e_1_3_1_101_2","first-page":"3405","volume-title":"31st International Conference on Neural Information Processing Systems (NIPS\u201917)","author":"Racah Evan","year":"2017","unstructured":"Evan Racah, Christopher Beckham, Tegan Maharaj, Samira Ebrahimi Kahou, Prabhat, and Christopher Pal. 2017. ExtremeWeather: A large-scale climate dataset for semi-supervised detection, localization, and understanding of extreme weather events. In 31st International Conference on Neural Information Processing Systems (NIPS\u201917). Curran Associates Inc., Red Hook, NY, 3405\u20133416."},{"key":"e_1_3_1_102_2","volume-title":"22nd International Conference on Interactive Information Processing Systems for Meteorology, Oceanograph, and Hydrology","author":"Rew R.","year":"2006","unstructured":"R. Rew, E. Hartnett, J. Caron, et\u00a0al. 2006. netCDF-4: Software implementing an enhanced data model for the geosciences. In 22nd International Conference on Interactive Information Processing Systems for Meteorology, Oceanograph, and Hydrology."},{"key":"e_1_3_1_103_2","volume-title":"Storage Systems and I\/O: Organizing, Storing, and Accessing Data for Scientific Discovery (Report for the DOE ASCR Workshop on Storage Systems and I\/O)","author":"Ross Robert","year":"2018","unstructured":"Robert Ross, Lee Ward, Philip Carns, Gary Grider, Scott Klasky, Quincey Koziol, Glenn K. Lockwood, Kathryn Mohror, Bradley Settlemyer, and Matthew Wolf. 2018. Storage Systems and I\/O: Organizing, Storing, and Accessing Data for Scientific Discovery (Report for the DOE ASCR Workshop on Storage Systems and I\/O). Technical Report. USDOE Office of Science (SC)."},{"key":"e_1_3_1_104_2","first-page":"50","volume-title":"2nd International Workshop on Petascale Data Storage at Supercomputing\u201907","author":"Roth Philip C.","year":"2007","unstructured":"Philip C. Roth. 2007. Characterizing the I\/O behavior of scientific applications on the Cray XT. In 2nd International Workshop on Petascale Data Storage at Supercomputing\u201907. 50\u201355."},{"key":"e_1_3_1_105_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.procs.2012.04.093"},{"key":"e_1_3_1_106_2","first-page":"567","volume-title":"IEEE International Parallel and Distributed Processing Symposium (IPDPS)","author":"Rudyy Oleksandr","year":"2019","unstructured":"Oleksandr Rudyy, Marta Garcia-Gasulla, Filippo Mantovani, Alfonso Santiago, Ra\u00fcl Sirvent, and Mariano V\u00e1zquez. 2019. Containers in HPC: A scalability and portability study in production biological simulations. In IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, New York, NY, 567\u2013577."},{"key":"e_1_3_1_107_2","volume-title":"10th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage\u201918)","author":"Sevilla Michael A.","year":"2018","unstructured":"Michael A. Sevilla, Reza Nasirigerdeh, Carlos Maltzahn, Jeff LeFevre, Noah Watkins, Peter Alvaro, Margaret Lawson, Jay Lofstead, and Jim Pivarski. 2018. Tintenfisch: File system namespace schemas and generators. In 10th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage\u201918). USENIX Association, Berkeley, CA."},{"key":"e_1_3_1_108_2","doi-asserted-by":"publisher","DOI":"10.1006\/jvci.1993.1005"},{"key":"e_1_3_1_109_2","doi-asserted-by":"publisher","DOI":"10.1145\/3126908.3126929"},{"key":"e_1_3_1_110_2","unstructured":"Samuel W. Skillman Michael S. Warren Matthew J. Turk Risa H. Wechsler Daniel E. Holz and P. M. Sutter. 2014. Dark Sky Simulations: Early Data Release. (2014). arxiv:astro-ph.CO\/1407.2600."},{"key":"e_1_3_1_111_2","unstructured":"SQLite 2022. SQLite. Retrieved from http:\/\/www.sqlite.org\/."},{"key":"e_1_3_1_112_2","unstructured":"Starfish 2017. Starfish. Retrieved from https:\/\/storageconference.us\/2017\/Presentations\/Farmer.pdf."},{"key":"e_1_3_1_113_2","volume-title":"IEEE Visualization Conference","author":"Stockinger Kurt","year":"2005","unstructured":"Kurt Stockinger, John Shalf, Kesheng Wu, and E. Wes Bethel. 2005. Query-driven visualization of large data sets. In IEEE Visualization Conference. IEEE, New York, NY."},{"key":"e_1_3_1_114_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.2012.33"},{"key":"e_1_3_1_115_2","doi-asserted-by":"publisher","DOI":"10.1109\/CLUSTER.2017.53"},{"key":"e_1_3_1_116_2","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2004.57"},{"key":"e_1_3_1_117_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2012.56"},{"key":"e_1_3_1_118_2","doi-asserted-by":"publisher","DOI":"10.1109\/MCSE.2008.15"},{"key":"e_1_3_1_119_2","doi-asserted-by":"crossref","first-page":"182","DOI":"10.1109\/FMPC.1999.750599","volume-title":"7th Symposium on the Frontiers of Massively Parallel Computation (Frontiers\u201999)","author":"Thakur Rajeev","year":"1999","unstructured":"Rajeev Thakur, William Gropp, and Ewing Lusk. 1999. Data sieving and collective I\/O in ROMIO. In 7th Symposium on the Frontiers of Massively Parallel Computation (Frontiers\u201999). IEEE, New York, NY, 182\u2013189."},{"key":"e_1_3_1_120_2","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1109\/LDAV.2011.6092313","volume-title":"IEEE Symposium on Large Data Analysis and Visualization","author":"Thompson David","year":"2011","unstructured":"David Thompson, Joshua A. Levine, Janine C. Bennett, Peer-Timo Bremer, Attila Gyulassy, Valerio Pascucci, and Philippe P. P\u00e9bay. 2011. Analysis of large-scale scalar data using hixels. In IEEE Symposium on Large Data Analysis and Visualization. IEEE, New York, NY, 23\u201330."},{"key":"e_1_3_1_121_2","unstructured":"Top500 2022. TOP500 Lists. Retrieved from https:\/\/www.top500.org\/lists\/top500\/."},{"key":"e_1_3_1_122_2","unstructured":"Craig E. Tull Abdelilah Essiari Dan Gunter et\u00a0al. 2013. The SPOT Suite project. Retrieved from http:\/\/spot.nersc.gov\/."},{"key":"e_1_3_1_123_2","volume-title":"ACM\/IEEE Conference on Supercomputing","author":"Tzeng Fan-Yin","year":"2005","unstructured":"Fan-Yin Tzeng and Kwan-Liu Ma. 2005. Intelligent feature extraction and tracking for visualizing large-scale 4D flow simulations. In ACM\/IEEE Conference on Supercomputing. IEEE, New York, NY."},{"key":"e_1_3_1_124_2","doi-asserted-by":"publisher","DOI":"10.5194\/gmd-10-1069-2017"},{"key":"e_1_3_1_125_2","doi-asserted-by":"publisher","DOI":"10.5194\/gmd-14-5023-2021"},{"key":"e_1_3_1_126_2","volume-title":"Workshop on Infrastructure for Workflows and Application Composition (IWAC)","author":"Ulmer Craig","year":"2018","unstructured":"Craig Ulmer, Shyamali Mukherjee, Gary Templet, Scott Levy, Jay Lofstead, Patrick Widener, Todd Kordenbrock, and Margaret Lawson. 2018. Faodel: Data management for next-generation application workflows. In Workshop on Infrastructure for Workflows and Application Composition (IWAC)."},{"key":"e_1_3_1_127_2","unstructured":"U.S. Department of Energy. 2019. U.S. Department of Energy and Intel to Build First Exascale Supercomputer. Retrieved from https:\/\/www.energy.gov\/articles\/us-department-energy-and-intel-build-first-exascale-supercomputer."},{"key":"e_1_3_1_128_2","unstructured":"VIC 2022. VIC. Retrieved from https:\/\/zenodo.org\/badge\/DOI\/10.5281\/zenodo.5781377.svg."},{"key":"e_1_3_1_129_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2007.15"},{"key":"e_1_3_1_130_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2006.159"},{"key":"e_1_3_1_131_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2008.140"},{"key":"e_1_3_1_132_2","first-page":"307","volume-title":"7th Symposium on Operating Systems Design and Implementation","author":"Weil Sage A.","year":"2006","unstructured":"Sage A. Weil, Scott A. Brandt, Ethan L. Miller, Darrell D. E. Long, and Carlos Maltzahn. 2006. Ceph: A scalable, high-performance distributed file system. In 7th Symposium on Operating Systems Design and Implementation. USENIX Association, Berkeley, CA, 307\u2013320."},{"key":"e_1_3_1_133_2","first-page":"1","volume-title":"IEEE 29th Symposium on Mass Storage Systems and Technologies (MSST)","author":"Welch Brent","year":"2013","unstructured":"Brent Welch and Geoffrey Noer. 2013. Optimizing a hybrid SSD\/HDD HPC storage system based on file size distributions. In IEEE 29th Symposium on Mass Storage Systems and Technologies (MSST). IEEE, New York, NY, 1\u201312."},{"key":"e_1_3_1_134_2","first-page":"1","volume-title":"USENIX Conference on File and Storage Technologies","author":"Welch Brent","year":"2008","unstructured":"Brent Welch, Marc Unangst, Zainul Abbasi, Garth A. Gibson, Brian Mueller, Jason Small, Jim Zelenka, and Bin Zhou. 2008. Scalable performance of the Panasas parallel file system. In USENIX Conference on File and Storage Technologies. 1\u201317."},{"key":"e_1_3_1_135_2","volume-title":"Detecting Changes in Simulations","author":"Wendelberger Joanne Roth","year":"2017","unstructured":"Joanne Roth Wendelberger, Divya Banesh, Laura Jean Wendelberger, and James Paul Ahrens. 2017. Detecting Changes in Simulations. Technical Report. Los Alamos National Lab. (LANL), Los Alamos, NM."},{"key":"e_1_3_1_136_2","doi-asserted-by":"publisher","DOI":"10.1109\/LDAV.2012.6378962"},{"key":"e_1_3_1_137_2","doi-asserted-by":"publisher","DOI":"10.1088\/1742-6596\/180\/1\/012053"},{"key":"e_1_3_1_138_2","doi-asserted-by":"publisher","DOI":"10.1145\/3126908.3126934"},{"key":"e_1_3_1_139_2","doi-asserted-by":"publisher","DOI":"10.1109\/MCG.2010.55"},{"key":"e_1_3_1_140_2","first-page":"73","volume-title":"1st International Symposium on Network Cloud Computing and Applications","author":"Zaspel Peter","year":"2011","unstructured":"Peter Zaspel and Michael Griebel. 2011. Massively parallel fluid simulations on Amazon\u2019s HPC cloud. In 1st International Symposium on Network Cloud Computing and Applications. IEEE, New York, NY, 73\u201378."},{"key":"e_1_3_1_141_2","doi-asserted-by":"publisher","DOI":"10.1145\/3295500.3356146"},{"key":"e_1_3_1_142_2","doi-asserted-by":"publisher","DOI":"10.1175\/2009JCLI3049.1"},{"key":"e_1_3_1_143_2","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2018.00006"},{"key":"e_1_3_1_144_2","doi-asserted-by":"publisher","DOI":"10.1145\/2834976.2834984"},{"key":"e_1_3_1_145_2","first-page":"37","volume-title":"Computer Graphics Forum","author":"Zhou Bo","year":"2018","unstructured":"Bo Zhou and Yi-Jen Chiang. 2018. Key time steps selection for large-scale time-varying volume datasets using an information-theoretic storyboard. In Computer Graphics Forum, Vol. 37. Wiley Online Library, New York, NY, 37\u201349."}],"container-title":["ACM Transactions on Storage"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3523698","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3523698","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:30:36Z","timestamp":1750188636000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3523698"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,30]]},"references-count":144,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,11,30]]}},"alternative-id":["10.1145\/3523698"],"URL":"https:\/\/doi.org\/10.1145\/3523698","relation":{},"ISSN":["1553-3077","1553-3093"],"issn-type":[{"value":"1553-3077","type":"print"},{"value":"1553-3093","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,11,30]]},"assertion":[{"value":"2021-01-06","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-03-02","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-12-12","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}