{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,6]],"date-time":"2025-12-06T05:05:52Z","timestamp":1764997552212,"version":"3.41.0"},"reference-count":46,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2023,6,13]],"date-time":"2023-06-13T00:00:00Z","timestamp":1686614400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Direct Grant for Research, The Chinese University of Hong Kong","award":["Project No. 4055151"],"award-info":[{"award-number":["Project No. 4055151"]}]},{"name":"the Research Grants Council of the Hong Kong Special Administrative Region, China","award":["GRF 14219422"],"award-info":[{"award-number":["GRF 14219422"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2023,6,13]]},"abstract":"<jats:p>Timeseries management systems play an important role in IoT and performance monitoring. As the data volume scales up, absorbing data memory efficiently with high throughput becomes a growing requirement for timeseries management systems. However, the designs of the existing systems, especially the in-memory data structures, suffer from two issues. First, they suffer from the trade-off between memory efficiency and performance. Second, they are not scalable because of lock contention where they cannot benefit from parallel insertion and querying.<\/jats:p>\n          <jats:p>In this paper, we propose ForestTI, a scalable inverted-index-oriented timeseries management system where the balance point between memory efficiency and performance can be flexibly adjusted under the increasing memory pressure. First, we present a two-level inverted index, which is scalable with optimistic lock coupling, and its internal structure can be gradually converted to more memory efficient representations. Second, we propose a two-level pointer swizzling mechanism to actively swap out the cold posting lists and in-memory timeseries objects as the number of timeseries increases. Finally, we further optimize the on-disk data structures (i.e. write-ahead logs and LSM-tree) to adapt to the high insertion throughput from the in-memory components. We prototype ForestTI with C++ from scratch, and compared to the storage engine of Prometheus, ForestTI achieves 1.79x higher insertion throughput, 52.1% lower query latency, and 56.9% lower memory occupation. We have released the open-source code of ForestTI for public access.<\/jats:p>","DOI":"10.1145\/3589260","type":"journal-article","created":{"date-parts":[[2023,6,20]],"date-time":"2023-06-20T20:26:45Z","timestamp":1687292805000},"page":"1-25","source":"Crossref","is-referenced-by-count":2,"title":["ForestTI: A Scalable Inverted-Index-Oriented Timeseries Management System with Flexible Memory Efficiency"],"prefix":"10.1145","volume":"1","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4536-6252","authenticated-orcid":false,"given":"Zhiqi","family":"Wang","sequence":"first","affiliation":[{"name":"The Chinese University of Hong Kong, Hong Kong, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2173-2847","authenticated-orcid":false,"given":"Zili","family":"Shao","sequence":"additional","affiliation":[{"name":"The Chinese University of Hong Kong, Hong Kong, China"}]}],"member":"320","published-online":{"date-parts":[[2023,6,20]]},"reference":[{"key":"e_1_2_2_1_1","unstructured":"2022. CPU cores and threads per CPU core per instance type. https:\/\/docs.aws.amazon.com\/AWSEC2\/latest\/UserGuide\/cpu-options-supported-instances-values.html."},{"key":"e_1_2_2_2_1","unstructured":"2022. Grafana: The open observability platform | Grafana Labs. https:\/\/grafana.com\/."},{"key":"e_1_2_2_3_1","unstructured":"2022. mlock(2) - Linux manual page. https:\/\/man7.org\/linux\/man-pages\/man2\/mlock.2.html."},{"key":"e_1_2_2_4_1","unstructured":"2022. mmap(2) - Linux manual page. https:\/\/www.man7.org\/linux\/man-pages\/man2\/mmap.2.html."},{"key":"e_1_2_2_5_1","unstructured":"2022. Transaction ID Wraparound in Postgres. https:\/\/blog.sentry.io\/2015\/07\/23\/transaction-id-wraparound-in-postgres."},{"key":"e_1_2_2_6_1","unstructured":"2022. VictoriaMetrics: Simple & Reliable Monitoring For Everyone. https:\/\/victoriametrics.com\/."},{"key":"e_1_2_2_7_1","volume-title":"or its affiliates","author":"Inc.","year":"2020","unstructured":"Inc. or its affiliates 2020, Amazon Web Services. 2022. Amazon EC2. https:\/\/aws.amazon.com\/ec2\/."},{"volume-title":"BTrDB: Optimizing Storage System Design for Timeseries Processing. In 14th USENIX Conference on File and Storage Technologies (FAST 16)","author":"Andersen Michael P","key":"e_1_2_2_8_1","unstructured":"Michael P Andersen and David E. Culler. 2016. BTrDB: Optimizing Storage System Design for Timeseries Processing. In 14th USENIX Conference on File and Storage Technologies (FAST 16). USENIX Association, Santa Clara, CA, 39--52. https:\/\/www.usenix.org\/conference\/fast16\/technical-sessions\/presentation\/andersen"},{"key":"e_1_2_2_9_1","volume-title":"USENIX Annual Technical Conference, FREENIX Track. 297--309","author":"Arcangeli Andrea","year":"2003","unstructured":"Andrea Arcangeli, Mingming Cao, Paul E McKenney, and Dipankar Sarma. 2003. Using Read-Copy-Update Techniques for System V IPC in the Linux 2.5 Kernel.. In USENIX Annual Technical Conference, FREENIX Track. 297--309."},{"key":"e_1_2_2_10_1","unstructured":"Prometheus Authors. 2020. Prometheus remote write. https:\/\/prometheus.io\/docs\/prometheus\/latest\/configuration\/configuration\/#remote_write."},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3386136"},{"key":"e_1_2_2_12_1","volume-title":"2018 USENIX Annual Technical Conference (USENIX ATC 18)","author":"Chan Helen H. W.","year":"2018","unstructured":"Helen H. W. Chan, Yongkun Li, Patrick P. C. Lee, and Yinlong Xu. 2018. HashKV: Enabling Efficient Updates in KV Storage via Hashing. In 2018 USENIX Annual Technical Conference (USENIX ATC 18). USENIX Association, Boston, MA, 1007--1019. https:\/\/www.usenix.org\/conference\/atc18\/presentation\/chan"},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1457838.1457895"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.14778\/2556549.2556575"},{"key":"e_1_2_2_15_1","unstructured":"Nagios Enterprises. 2020. Nagios - The Industry Standard In IT Infrastructure Monitoring. https:\/\/www.nagios.org\/."},{"key":"e_1_2_2_16_1","unstructured":"Cloud Native Computing Foundation. 2020. Horizontally scalable highly available multi-tenant long term Prometheus. https:\/\/cortexmetrics.io\/."},{"key":"e_1_2_2_18_1","unstructured":"2021 gRPC Authors. 2020. gPRC A high performance open source universal RPC framework. https:\/\/grpc.io\/."},{"key":"e_1_2_2_19_1","unstructured":"influxdata. 2020. InfluxDB 1.7 Documentation. https:\/\/docs.influxdata.com\/influxdb\/."},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.14778\/1454159.1454211"},{"key":"e_1_2_2_21_1","unstructured":"Alfons Kemper Donald Kossmann and Lehrstuhl Fur Informatik. 1993. Adaptable Pointer Swizzling Strategies in Object Bases: Design Realization and Quantitative Analysis."},{"key":"e_1_2_2_22_1","volume-title":"13th USENIX Conference on File and Storage Technologies (FAST 15)","author":"Lee Changman","year":"2015","unstructured":"Changman Lee, Dongho Sim, Jooyoung Hwang, and Sangyeun Cho. 2015. {F2FS}: A New File System for Flash Storage. In 13th USENIX Conference on File and Storage Technologies (FAST 15). 273--286."},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2018.00026"},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2013.6544812"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2933349.2933352"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3033273"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2933349.2933358"},{"key":"e_1_2_2_28_1","volume-title":"Freitag","author":"Neumann Thomas","year":"2020","unstructured":"Thomas Neumann and Michael J. Freitag. 2020. Umbra: A Disk-Based System with In-Memory Performance. In 10th Conference on Innovative Data Systems Research, CIDR 2020, Amsterdam, The Netherlands, January 12--15, 2020, Misc Proceedings. www.cidrdb.org. http:\/\/cidrdb.org\/cidr2020\/papers\/p29-neumann-cidr20.pdf"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/s002360050048"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.14778\/2824032.2824078"},{"key":"e_1_2_2_31_1","unstructured":"Prometheus. 2020. Prometheus - From metrics to insight power your metrics and alerting with a leading open-source monitoring solution. https:\/\/prometheus.io\/."},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3132747.3132765"},{"key":"e_1_2_2_33_1","unstructured":"Jeff Dean Sanjay Ghemawat. 2022. LevelDB. https:\/\/github.com\/google\/leveldb."},{"key":"e_1_2_2_34_1","unstructured":"Michael Kerrisk Serge Hallyn. 2021. cgroups - Linux manual page. https:\/\/man7.org\/linux\/man-pages\/man7\/cgroups.7.html."},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3419111.3421289"},{"key":"e_1_2_2_36_1","volume-title":"Zdonik","author":"Solleza Franco","year":"2021","unstructured":"Franco Solleza, Andrew Crotty, Suman Karumuri, Nesime Tatbul, and Stanley B. Zdonik. 2021. Mach: A Pluggable Metrics Storage Engine for the Age of Observability."},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485278.2485285"},{"key":"e_1_2_2_38_1","first-page":"21","article-title":"The VoltDB Main Memory DBMS","volume":"36","author":"Stonebraker Michael","year":"2013","unstructured":"Michael Stonebraker and Ariel Weisberg. 2013. The VoltDB Main Memory DBMS. IEEE Data Eng. Bull. 36 (2013), 21--27.","journal-title":"IEEE Data Eng. Bull."},{"key":"e_1_2_2_39_1","volume-title":"I. Mat. Nat. Kl.","author":"Thue Axel","year":"1912","unstructured":"Axel Thue. 1912. Uber die gegenseitige Lage gleicher Teile gewisser Zeichenreihen. Kra. Vidensk. Selsk. Skrifer, I. Mat. Nat. Kl. (1912), 1--67."},{"key":"e_1_2_2_40_1","unstructured":"Timescale. 2020. Time Series Benchmark Suite a tool for comparing and evaluating databases for time series data. https:\/\/github.com\/timescale\/tsbs."},{"key":"e_1_2_2_41_1","unstructured":"Inc. Timescale. 2020. Time-series data simplified | Timescale. https:\/\/www.timescale.com\/."},{"key":"e_1_2_2_42_1","volume-title":"2020 USENIX Annual Technical Conference (USENIX ATC 20)","author":"Visheratin Alexander","year":"2020","unstructured":"Alexander Visheratin, Alexey Struckov, Semen Yufa, Alexey Muratov, Denis Nasonov, Nikolay Butakov, Yury Kuznetsov, and Michael May. 2020. Peregreen -- modular database for efficient storage of historical time series in cloud environments. In 2020 USENIX Annual Technical Conference (USENIX ATC 20). USENIX Association, 589--601. https:\/\/www.usenix.org\/conference\/atc20\/presentation\/visheratin"},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514221.3526175"},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.14778\/3447689.3447710"},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/122576.122577"},{"key":"e_1_2_2_46_1","unstructured":"Naoki Yoshinaga. 2021. cedar - C implementation of efficiently-updatable double-array trie. http:\/\/www.tkl.iis.u-tokyo.ac.jp\/~ynaga\/cedar\/."},{"key":"e_1_2_2_47_1","volume-title":"Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers. Dublin City University and Association for Computational Linguistics","author":"Yoshinaga Naoki","year":"2014","unstructured":"Naoki Yoshinaga and Masaru Kitsuregawa. 2014. A Self-adaptive Classifier for Efficient Text-stream Processing. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers. Dublin City University and Association for Computational Linguistics, Dublin, Ireland, 1091--1102. https:\/\/aclanthology.org\/C14--1103"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3589260","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3589260","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:48:54Z","timestamp":1750182534000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3589260"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,13]]},"references-count":46,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,6,13]]}},"alternative-id":["10.1145\/3589260"],"URL":"https:\/\/doi.org\/10.1145\/3589260","relation":{},"ISSN":["2836-6573"],"issn-type":[{"type":"electronic","value":"2836-6573"}],"subject":[],"published":{"date-parts":[[2023,6,13]]}}}