{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,23]],"date-time":"2026-04-23T18:45:52Z","timestamp":1776969952887,"version":"3.51.4"},"reference-count":56,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGMOD Rec."],"published-print":{"date-parts":[[2026,4,23]]},"abstract":"<jats:p>Research advancements in storage formats continuously produce more efficient encodings and better compression rates. Despite this, new formats are not adopted due to high implementation cost, and existing formats cannot evolve because they need to maintain compatibility across systems. Can this problem be solved by introducing a new abstraction? We answer affirmatively with AnyBlox, a framework for reading arbitrary datasets using lightweight WebAssembly decoders bundled with the data. By decoupling decoders from both systems and file format specifications, AnyBlox allows transparent format evolution, instance-optimized encodings, and enables mainstream adoption of research advancements. 
It integrates seamlessly with modern systems like DuckDB, Spark, and Umbra, while delivering solid performance and security guarantees.<\/jats:p>","DOI":"10.1145\/3810900.3810912","type":"journal-article","created":{"date-parts":[[2026,4,23]],"date-time":"2026-04-23T18:16:38Z","timestamp":1776968198000},"page":"62-72","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["AnyBlox: Let Your Data Read Itself"],"prefix":"10.1145","volume":"55","author":[{"given":"Mateusz","family":"Gienieczko","sequence":"first","affiliation":[{"name":"Technical University of Munich, Munich, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Maximilian","family":"Kuschewski","sequence":"additional","affiliation":[{"name":"Technical University of Munich, Munich, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Thomas","family":"Neumann","sequence":"additional","affiliation":[{"name":"Technical University of Munich, Munich, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Viktor","family":"Leis","sequence":"additional","affiliation":[{"name":"Technical University of Munich, Munich, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jana","family":"Giceva","sequence":"additional","affiliation":[{"name":"Technical University of Munich, Munich, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2026,4,23]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.14778\/3598581.3598587"},{"issue":"11","key":"e_1_2_1_2_1","first-page":"4629","volume":"18","author":"Afroozeh A.","year":"2025","unstructured":"A. Afroozeh and P. Boncz. The FastLanes File Format. Proc. VLDB Endow., 18(11):4629-4643, July 2025.","journal-title":"The FastLanes File Format. Proc. 
VLDB Endow."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3626717"},{"key":"e_1_2_1_4_1","volume-title":"Conference on Innovative Data Systems Research","author":"Alotaibi R.","year":"2024","unstructured":"R. Alotaibi, Y. Tian, S. Grafberger, J. Camacho-Rodr\u00edguez, N. Bruno, B. Kroth, S. Matusevych, A. Agrawal, M. Behera, A. Gosalia, C. A. Galindo-Legaria, M. Joshi, M. Potocnik, B. Sezgin, X. Li, and C. Curino. Towards Query Optimizer as a Service (QOaaS) in a Unified LakeHouse Ecosystem: Can One QO Rule Them All? In Conference on Innovative Data Systems Research, Amsterdam, The Netherlands, 2024. www.cidrdb.org. arXiv: 2411.13704."},{"issue":"12","key":"e_1_2_1_5_1","first-page":"3515","volume":"16","author":"Anneser C.","year":"2023","unstructured":"C. Anneser, N. Tatbul, D. Cohen, Z. Xu, P. Pandian, N. Laptev, and R. Marcus. AutoSteer: Learned Query Optimization for Any SQL Database. Proc. VLDB Endow., 16(12):3515-3527, Aug. 2023. Publisher: VLDB Endowment.","journal-title":"AutoSteer: Learned Query Optimization for Any SQL Database. Proc. VLDB Endow."},{"key":"e_1_2_1_6_1","volume-title":"Recommended parquet file size on hdfs","author":"Parquet Apache","year":"2025","unstructured":"Apache Parquet. Recommended parquet file size on hdfs, 2025."},{"key":"e_1_2_1_7_1","unstructured":"Apache Software Foundation. Apache Arrow 2023."},{"key":"e_1_2_1_8_1","volume-title":"Apache Arrow Columnar Format - Variable-size Binary View Layout","author":"Foundation Apache Software","year":"2023","unstructured":"Apache Software Foundation. 
Apache Arrow Columnar Format - Variable-size Binary View Layout, 2023."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3526045"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.14778\/3476249.3476296"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41591-020-0935-z"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.14778\/3407790.3407851"},{"key":"e_1_2_1_13_1","first-page":"1975","volume-title":"31st USENIX Security Symposium (USENIX Security 22)","author":"Bosamiya J.","year":"2022","unstructured":"J. Bosamiya, W. S. Lim, and B. Parno. Provably-Safe Multilingual Software Sandboxing using WebAssembly. In 31st USENIX Security Symposium (USENIX Security 22), pages 1975-1992, Boston, MA, Aug. 2022. USENIX Association."},{"key":"e_1_2_1_14_1","volume-title":"An open source framework for analyzing ALICE\u2019s Open Data","author":"Bourjau C.","year":"2018","unstructured":"C. Bourjau. mALICE: An open source framework for analyzing ALICE\u2019s Open Data, 2018."},{"key":"e_1_2_1_15_1","unstructured":"Bytecode Alliance. Cranelift 2016."},{"key":"e_1_2_1_16_1","volume-title":"wasmtime","author":"Alliance Bytecode","year":"2019","unstructured":"Bytecode Alliance. wasmtime, 2019."},{"key":"e_1_2_1_17_1","volume-title":"Memory64 proposal for webassembly","author":"Alliance Bytecode","year":"2025","unstructured":"Bytecode Alliance. Memory64 proposal for webassembly, 2025."},{"key":"e_1_2_1_18_1","volume-title":"CERN","author":"CERN.","year":"1995","unstructured":"CERN. CERN ROOT Format. CERN, 1995."},{"key":"e_1_2_1_19_1","first-page":"1","volume-title":"Proceedings of the 26th International Conference on Very Large Data Bases, VLDB \u201900","author":"Chaudhuri S.","year":"2000","unstructured":"S. Chaudhuri and G. Weikum. Rethinking Database System Architecture: Towards a Self-Tuning RISC-Style Database System. 
In Proceedings of the 26th International Conference on Very Large Data Bases, VLDB \u201900, pages 1-10, San Francisco, CA, USA, 2000. Morgan Kaufmann Publishers Inc."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.epidem.2022.100576"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1186\/s12859-023-05364-3"},{"key":"e_1_2_1_22_1","unstructured":"ClickHouse. ClickBench 2022."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/362384.362685"},{"key":"e_1_2_1_24_1","first-page":"72","volume-title":"International Conference on Extending Database Technology","author":"Damme P.","year":"2017","unstructured":"P. Damme, D. Habich, J. Hildebrandt, and W. Lehner. Lightweight Data Compression Algorithms: An Experimental Survey (Experiments and Analyses). In International Conference on Extending Database Technology, pages 72-83, Venice, Italy, 2017. OpenProceedings."},{"key":"e_1_2_1_25_1","volume-title":"Common language infrastructure (cli)","author":"International ECMA","year":"2012","unstructured":"ECMA International. Common language infrastructure (cli), 2012."},{"key":"e_1_2_1_26_1","first-page":"293","volume-title":"USENIX 2008 Annual Technical Conference, ATC'08","author":"Ford B.","year":"2008","unstructured":"B. Ford and R. Cox. Vx32: lightweight user-level sandboxing on the x86. In USENIX 2008 Annual Technical Conference, ATC'08, pages 293-306, USA, 2008. USENIX Association. event-place: Boston, Massachusetts."},{"key":"e_1_2_1_27_1","volume-title":"White-box Compression: Learning and Exploiting Compact Table Representations. In Conference on Innovative Data Systems Research","author":"Ghita B. V.","year":"2020","unstructured":"B. V. Ghita, D. G. Tom\u00e9, and P. A. Boncz. White-box Compression: Learning and Exploiting Compact Table Representations. 
In Conference on Innovative Data Systems Research, 2020."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/93597.98720"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.14778\/3489496.3489498"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3062341.3062363"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3389734"},{"key":"e_1_2_1_32_1","first-page":"107","volume-title":"Native Code. In 2019 USENIX Annual Technical Conference (USENIX ATC 19)","author":"Jangda A.","year":"2019","unstructured":"A. Jangda, B. Powers, E. D. Berger, and A. Guha. Not So Fast: Analyzing the Performance of WebAssembly vs. Native Code. In 2019 USENIX Annual Technical Conference (USENIX ATC 19), pages 107-120, Renton, WA, July 2019. USENIX Association."},{"key":"e_1_2_1_33_1","first-page":"661","volume-title":"Revealing Performance Issues in Server-Side WebAssembly Runtimes Via Differential Testing. In 2023 38th IEEE\/ACM International Conference on Automated Software Engineering (ASE)","author":"Jiang S.","year":"2023","unstructured":"S. Jiang, R. Zeng, Z. Rao, J. Gu, Y. Zhou, and M. R. Lyu. Revealing Performance Issues in Server-Side WebAssembly Runtimes Via Differential Testing. In 2023 38th IEEE\/ACM International Conference on Automated Software Engineering (ASE), pages 661-672, 2023."},{"issue":"11","key":"e_1_2_1_34_1","first-page":"3461","volume":"16","author":"Jungmair M.","year":"2023","unstructured":"M. Jungmair and J. Giceva. Declarative Sub-Operators for Universal Data Processing. Proc. VLDB Endow., 16(11):3461-3474, July 2023. Publisher: VLDB Endowment.","journal-title":"Declarative Sub-Operators for Universal Data Processing. Proc. VLDB Endow."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.14778\/3275366.3284966"},{"key":"e_1_2_1_36_1","first-page":"36","article-title":"The Modern Data Architecture: The Deconstructed Database. 
log","volume":"43","author":"Khurana A.","year":"2018","unstructured":"A. Khurana and J. L. Dem. The Modern Data Architecture: The Deconstructed Database. login Usenix Mag., 43:36-40, 2018.","journal-title":"Usenix Mag."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41592-022-01444-z"},{"key":"e_1_2_1_38_1","series-title":"Series A","first-page":"369","volume-title":"On tables of random numbers. Sankhy\u0101: The Indian Journal of Statistics","author":"Kolmogorov A. N.","year":"1963","unstructured":"A. N. Kolmogorov. On tables of random numbers. Sankhy\u0101: The Indian Journal of Statistics, Series A, pages 369-376, 1963. Publisher: JSTOR."},{"key":"e_1_2_1_39_1","volume-title":"Jan.","author":"Kuiper L.","year":"2025","unstructured":"L. Kuiper. Query Engines: Gatekeepers of the Parquet File Format, Jan. 2025."},{"issue":"2","key":"e_1_2_1_40_1","first-page":"1","volume":"1","author":"Kuschewski M.","year":"2023","unstructured":"M. Kuschewski, D. Sauerwein, A. Alhomssi, and V. Leis. BtrBlocks: Efficient Columnar Compression for Data Lakes. Proc. ACM Manag. Data, 1(2):118:1-118:26, June 2023. Place: New York, NY, USA Publisher: Association for Computing Machinery.","journal-title":"BtrBlocks: Efficient Columnar Compression for Data Lakes. Proc. ACM Manag. Data"},{"key":"e_1_2_1_41_1","first-page":"5","volume-title":"Modular Analytic Query Engine. In Companion of the 2024 International Conference on Management of Data, SIGMOD\/PODS \u201924","author":"Lamb A.","year":"2024","unstructured":"A. Lamb, Y. Shen, D. Heres, J. Chakraborty, M. O. Kabak, L.-C. Hsieh, and C. Sun. Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query Engine. In Companion of the 2024 International Conference on Management of Data, SIGMOD\/PODS \u201924, pages 5-17, New York, NY, USA, 2024. Association for Computing Machinery. event-place: Santiago AA, Chile."},{"key":"e_1_2_1_42_1","unstructured":"Lance contributors. 
Lance 2025."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-019-00578-5"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/CGO.2004.1281665"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/2588555.2610507"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1002\/spe.2203"},{"key":"e_1_2_1_47_1","first-page":"334","volume-title":"Multi-Cloud Lakehouse. In Companion of the 2024 International Conference on Management of Data, SIGMOD\/PODS \u201924","author":"Levandoski J.","year":"2024","unstructured":"J. Levandoski, G. Casto, M. Deng, R. Desai, P. Edara, T. Hottelier, A. Hormati, A. Johnson, J. Johnson, D. Kurzyniec, S. McVeety, P. Ramanathan, G. Saxena, V. Shanmugan, and Y. Volobuev. BigLake: BigQuery\u2019s Evolution toward a Multi-Cloud Lakehouse. In Companion of the 2024 International Conference on Management of Data, SIGMOD\/PODS \u201924, pages 334-346, New York, NY, USA, 2024. Association for Computing Machinery. event-place: ."},{"key":"e_1_2_1_48_1","volume-title":"Matter Antimatter Differences (B meson decays to three hadrons) - Data Files","author":"Cb","year":"2017","unstructured":"LHCb collaboration (2017). Matter Antimatter Differences (B meson decays to three hadrons) - Data Files, 2011."},{"key":"e_1_2_1_49_1","volume-title":"userfaultfd(2) Linux User\u2019s Manual","author":"Developers Linux Kernel","year":"2024","unstructured":"Linux Kernel Developers. userfaultfd(2) Linux User\u2019s Manual, 2024."},{"key":"e_1_2_1_50_1","volume-title":"memfd_create(2) Linux User\u2019s Manual","author":"Developers Linux Kernel","year":"2025","unstructured":"Linux Kernel Developers. memfd_create(2) Linux User\u2019s Manual, 2025."},{"key":"e_1_2_1_51_1","volume-title":"Proceedings of Workshops at the 50th International Conference on Very Large Data Bases, VLDB 2024","author":"Liu H.","year":"2024","unstructured":"H. Liu, M. Stoian, A. v. Renen, and A. Kipf. 
Corra: Correlation-Aware Column Compression. In Proceedings of Workshops at the 50th International Conference on Very Large Data Bases, VLDB 2024, Guangzhou, China, August 26-30, 2024. VLDB.org, 2024."},{"key":"e_1_2_1_52_1","volume-title":"CorBit: Leveraging Correlations for Compressing Bitmap Indexes. In VLDB Workshops","author":"Lyu X.","year":"2023","unstructured":"X. Lyu, A. Kipf, P. Pfeil, D. Horn, J. Giceva, and T. Kraska. CorBit: Leveraging Correlations for Compressing Bitmap Indexes. In VLDB Workshops, 2023."},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.5555\/2831090.2831104"},{"key":"e_1_2_1_54_1","unstructured":"Microsoft. Microsoft LSP 2016."},{"key":"e_1_2_1_55_1","volume-title":"Umbra: A Disk-Based System with In-Memory Performance. In Conference on Innovative Data Systems Research","author":"Neumann T.","year":"2020","unstructured":"T. Neumann and M. J. Freitag. Umbra: A Disk-Based System with In-Memory Performance. In Conference on Innovative Data Systems Research, Amsterdam, The Netherlands, 2020. www.cidrdb.org."},{"key":"e_1_2_1_56_1","volume-title":"The java virtual machine specification","year":"2013","unstructured":"Oracle. 
The java virtual machine specification, 2013."}],"container-title":["ACM SIGMOD Record"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3810900.3810912","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,23]],"date-time":"2026-04-23T18:17:12Z","timestamp":1776968232000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3810900.3810912"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,4,23]]},"references-count":56,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2026,4,23]]}},"alternative-id":["10.1145\/3810900.3810912"],"URL":"https:\/\/doi.org\/10.1145\/3810900.3810912","relation":{},"ISSN":["0163-5808"],"issn-type":[{"value":"0163-5808","type":"print"}],"subject":[],"published":{"date-parts":[[2026,4,23]]},"assertion":[{"value":"2026-04-23","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}