{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,19]],"date-time":"2026-05-19T07:14:33Z","timestamp":1779174873496,"version":"3.51.4"},"reference-count":51,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2023,12,8]],"date-time":"2023-12-08T00:00:00Z","timestamp":1701993600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2023,12,8]]},"abstract":"<jats:p>IEEE 754 doubles do not exactly represent most real values, introducing rounding errors in computations and [de]serialization to text. These rounding errors inhibit the use of existing lightweight compression schemes such as Delta and Frame Of Reference (FOR), but recently new schemes were proposed: Gorilla, Chimp128, PseudoDecimals (PDE), Elf and Patas. However, their compression ratios are not better than those of general-purpose compressors such as Zstd; while [de]compression is much slower than Delta and FOR.<\/jats:p>\n          <jats:p>We propose and evaluate ALP, that significantly improves these previous schemes in both speed and compression ratio (Figure 1). We created ALP after carefully studying the datasets used to evaluate the previous schemes. To obtain speed, ALP is designed to fit vectorized execution. This turned out to be key for also improving the compression ratio, as we found in-vector commonalities to create compression opportunities. ALP is an adaptive scheme that uses a strongly enhanced version of PseudoDecimals [31] to losslessly encode doubles as integers if they originated as decimals, and otherwise uses vectorized compression of the doubles' front bits. Its high speeds stem from our implementation in scalar code that auto-vectorizes, using building blocks provided by our FastLanes library [6], and an efficient two-stage compression algorithm that first samples row-groups and then vectors.<\/jats:p>","DOI":"10.1145\/3626717","type":"journal-article","created":{"date-parts":[[2023,12,12]],"date-time":"2023-12-12T14:01:21Z","timestamp":1702389681000},"page":"1-26","source":"Crossref","is-referenced-by-count":25,"title":["ALP: Adaptive Lossless floating-Point Compression"],"prefix":"10.1145","volume":"1","author":[{"ORCID":"https:\/\/orcid.org\/0009-0004-1608-834X","authenticated-orcid":false,"given":"Azim","family":"Afroozeh","sequence":"first","affiliation":[{"name":"Centrum Wiskunde &amp; Informatica, Amsterdam, Netherlands"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3575-0528","authenticated-orcid":false,"given":"Leonardo X.","family":"Kuffo","sequence":"additional","affiliation":[{"name":"Centrum Wiskunde &amp; Informatica, Amsterdam, Netherlands"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6256-0140","authenticated-orcid":false,"given":"Peter","family":"Boncz","sequence":"additional","affiliation":[{"name":"Centrum Wiskunde &amp; Informatica, Amsterdam, Netherlands"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,12,12]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/IEEESTD.2019.8766229"},{"key":"e_1_2_2_2_1","unstructured":"2019. Public BI Benchmark. https:\/\/github.com\/cwida\/public_bi_benchmark. Accessed on: 2023-04--13."},{"key":"e_1_2_2_3_1","unstructured":"2023. FastLanes. https:\/\/github.com\/cwida\/FastLanes Accesed on: 2023-04--13."},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1142473.1142548"},{"key":"e_1_2_2_5_1","unstructured":"Azim Afroozeh and P Boncz. 2020. Towards a New File Format for Big Data: SIMD-Friendly Composable Compression."},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.14778\/3598581.3598587"},{"key":"e_1_2_2_7_1","first-page":"225","article-title":"MonetDB\/X100: Hyper-Pipelining Query Execution","volume":"5","author":"Boncz Peter A","year":"2005","unstructured":"Peter A Boncz, Marcin Zukowski, and Niels Nes. 2005. MonetDB\/X100: Hyper-Pipelining Query Execution.. In Cidr, Vol. 5. 225--237.","journal-title":"Cidr"},{"key":"e_1_2_2_8_1","unstructured":"Boudewijn Braams. 2018. Predicate Pushdown in Parquet and Apache Spark. MSc thesis (2018)."},{"key":"e_1_2_2_9_1","volume-title":"Giulio Ermanno Pibiri, Roberto Trani, and Rossano Venturini.","author":"Bruno Andrea","year":"2021","unstructured":"Andrea Bruno, Franco Maria Nardini, Giulio Ermanno Pibiri, Roberto Trani, and Rossano Venturini. 2021. TSXor: A Simple Time Series Compression Algorithm. In String Processing and Information Retrieval: 28th International Symposium, SPIRE 2021, Lille, France, October 4--6, 2021, Proceedings 28. Springer, 217--223."},{"key":"e_1_2_2_10_1","volume-title":"FPC: A high-speed compressor for double-precision floating-point data","author":"Burtscher Martin","year":"2008","unstructured":"Martin Burtscher and Paruj Ratanaworabhan. 2008. FPC: A high-speed compressor for double-precision floating-point data. IEEE transactions on computers 58, 1 (2008), 18--31."},{"key":"e_1_2_2_11_1","volume-title":"Emerging Properties in Self-Supervised Vision Transformers. CoRR abs\/2104.14294","author":"Caron Mathilde","year":"2021","unstructured":"Mathilde Caron, Hugo Touvron, Ishan Misra, Herv\u00e9 J\u00e9gou, Julien Mairal, Piotr Bojanowski, and Armand Joulin. 2021. Emerging Properties in Self-Supervised Vision Transformers. CoRR abs\/2104.14294 (2021). arXiv:2104.14294 https:\/\/arxiv.org\/abs\/2104.14294"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.14778\/3352063.3352121"},{"key":"e_1_2_2_13_1","unstructured":"Yann Collet. 2014. LZ4 - Extremely fast compression. https:\/\/github.com\/lz4\/lz4 Accesed on: 2023-04--13."},{"key":"e_1_2_2_14_1","unstructured":"Yann Collet. 2015. Zstandard - Fast real-time compression algorithm. https:\/\/github.com\/facebook\/zstd Accesed on: 2023-04--13."},{"key":"e_1_2_2_15_1","unstructured":"Patrick Damme Dirk Habich Juliana Hildebrandt and Wolfgang Lehner. 2017. Lightweight Data Compression Algorithms: An Experimental Survey (Experiments and Analyses).. In EDBT. 72--83."},{"key":"e_1_2_2_16_1","unstructured":"Vadim Engelson Peter Fritzson and Dag Fritzson. 2000. Lossless compression of high-volume numerical data from simulations."},{"key":"e_1_2_2_17_1","first-page":"110","article-title":"Instruction tables: Lists of instruction latencies, throughputs and micro-operation breakdowns for Intel, AMD and VIA CPUs","volume":"93","author":"Agner Fog","year":"2011","unstructured":"Agner Fog et al. 2011. Instruction tables: Lists of instruction latencies, throughputs and micro-operation breakdowns for Intel, AMD and VIA CPUs. Copenhagen University College of Engineering 93 (2011), 110. https:\/\/www.agner.org\/optimize\/instruction_tables.pdf","journal-title":"Copenhagen University College of Engineering"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2012.194"},{"key":"e_1_2_2_19_1","volume-title":"What every computer scientist should know about floating-point arithmetic. ACM computing surveys (CSUR) 23, 1","author":"Goldberg David","year":"1991","unstructured":"David Goldberg. 1991. What every computer scientist should know about floating-point arithmetic. ACM computing surveys (CSUR) 23, 1 (1991), 5--48."},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.1998.655800"},{"key":"e_1_2_2_21_1","volume-title":"Beating floating point at its own game: Posit arithmetic. Supercomputing frontiers and innovations 4, 2","author":"Gustafson John L","year":"2017","unstructured":"John L Gustafson and Isaac T Yonemoto. 2017. Beating floating point at its own game: Posit arithmetic. Supercomputing frontiers and innovations 4, 2 (2017), 71--86."},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cad.2004.09.015"},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.14778\/3275366.3284966"},{"key":"e_1_2_2_24_1","volume-title":"Patas Compression: Variation on Chimp. https:\/\/github.com\/duckdb\/duckdb\/pull\/5044. Accessed on: 2023-04--13.","author":"Labs DB","year":"2022","unstructured":"DuckDB Labs. 2022. Patas Compression: Variation on Chimp. https:\/\/github.com\/duckdb\/duckdb\/pull\/5044. Accessed on: 2023-04--13."},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2882925"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3341105.3374044"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1002\/spe.2203"},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.14778\/3587136.3587149"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.14778\/3551793.3551852"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2006.143"},{"key":"e_1_2_2_31_1","volume-title":"Proceedings of the 2023 ACM SIGMOD international conference on Management of data. https:\/\/www.cs.cit.tum.de\/fileadmin\/w00cfj\/dis\/papers\/btrblocks.pdf In press. Accessed on: 2023-04--13","author":"Maximilian Kuschewski Adnan Alhomssi","year":"2023","unstructured":"Adnan Alhomssi Maximilian Kuschewski, David Sauerwein and Viktor Leis. 2023. BtrBlocks: Efficient Columnar Com-pression for Data Lakes. Proceedings of the 2023 ACM SIGMOD international conference on Management of data. https:\/\/www.cs.cit.tum.de\/fileadmin\/w00cfj\/dis\/papers\/btrblocks.pdf In press. Accessed on: 2023-04--13."},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.48443\/S9YA-ZC81"},{"key":"e_1_2_2_33_1","volume-title":"Barometric pressure (DP1.00004.001). https:\/\/doi.org\/10. 48443\/RXR7-PP32","author":"National Ecological Observatory Network (NEON). 2021.","unstructured":"National Ecological Observatory Network (NEON). 2021. Barometric pressure (DP1.00004.001). https:\/\/doi.org\/10. 48443\/RXR7-PP32"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","unstructured":"National Ecological Observatory Network (NEON). 2021. Dust and particulate size distribution (DP1.00017.001). https:\/\/doi.org\/10.48443\/4E6X-V373","DOI":"10.48443\/4E6X-V373"},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","unstructured":"National Ecological Observatory Network (NEON). 2021. IR biological temperature (DP1.00005.001). https:\/\/doi.org\/ 10.48443\/JNWY-B177","DOI":"10.48443\/JNWY-B177"},{"key":"e_1_2_2_36_1","doi-asserted-by":"publisher","unstructured":"National Ecological Observatory Network (NEON). 2021. Relative humidity above water on-buoy (DP1.20271.001). https:\/\/doi.org\/10.48443\/Z99V-0502","DOI":"10.48443\/Z99V-0502"},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.14778\/3554821.3554829"},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.14778\/2824032.2824078"},{"key":"e_1_2_2_39_1","unstructured":"Johannes Pietrzyk Annett Ungeth\u00fcm Dirk Habich and Wolfgang Lehner. 2018. Beyond Straightforward Vectorization of Lightweight Data Compression Algorithms for Larger Vector Sizes.. In Grundlagen von Datenbanken. 71--76."},{"key":"e_1_2_2_40_1","unstructured":"Mark Raasveldt and Hannes Muehleisen. 2019. DuckDB. https:\/\/github.com\/duckdb\/duckdb Accesed on: 2023-04--13."},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3299869.3320212"},{"key":"e_1_2_2_42_1","unstructured":"Alec Radford Jeff Wu Rewon Child David Luan Dario Amodei and Ilya Sutskever. 2019. Language Models are Unsupervised Multitask Learners. (2019)."},{"key":"e_1_2_2_43_1","doi-asserted-by":"crossref","unstructured":"Vipul Raheja Dhruv Kumar Ryan Koo and Dongyeop Kang. 2023. CoEdIT: Text Editing by Task-Specific Instruction Tuning. (2023). arXiv:2305.09857 [cs.CL]","DOI":"10.18653\/v1\/2023.findings-emnlp.350"},{"key":"e_1_2_2_44_1","volume-title":"Proceedings of the 32nd international conference on Very large data bases. Citeseer, 858--869","author":"Raman Vijayshankar","year":"2006","unstructured":"Vijayshankar Raman and Garret Swart. 2006. How to wring a table dry: Entropy compression of relations and querying of compressed relations. In Proceedings of the 32nd international conference on Very large data bases. Citeseer, 858--869."},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/DCC.2006.35"},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/163090.163096"},{"key":"e_1_2_2_47_1","unstructured":"Ying Sheng Lianmin Zheng Binhang Yuan Zhuohan Li Max Ryabinin Daniel Y Fu Zhiqiang Xie Beidi Chen Clark Barrett Joseph E Gonzalez et al. 2023. High-throughput generative inference of large language models with a single gpu. arXiv preprint arXiv:2303.06865 (2023)."},{"key":"e_1_2_2_48_1","unstructured":"Aliaksandr Valialkin. 2019. VictoriaMetrics: achieving better compression than Gorilla for time series data. https: \/\/faun.pub\/victoriametrics-achieving-better-compression-for-time-series-data-than-gorilla-317bc1f95932. Accesed on: 2023-04--13."},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3209950.3209952"},{"key":"e_1_2_2_50_1","doi-asserted-by":"publisher","unstructured":"Deepak Vohra. 2016. Apache Parquet. 325--335. https:\/\/doi.org\/10.1007\/978--1--4842--2199-0_8","DOI":"10.1007\/978--1--4842--2199-0_8"},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2006.150"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3626717","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3626717","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,22]],"date-time":"2025-08-22T13:03:25Z","timestamp":1755867805000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3626717"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,8]]},"references-count":51,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,12,8]]}},"alternative-id":["10.1145\/3626717"],"URL":"https:\/\/doi.org\/10.1145\/3626717","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,12,8]]}}}