{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T02:58:19Z","timestamp":1760237899329,"version":"build-2065373602"},"reference-count":29,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2022,8,18]],"date-time":"2022-08-18T00:00:00Z","timestamp":1660780800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Spanish Ministry of Science and Innovation","award":["857191","2017-SGR-1414"],"award-info":[{"award-number":["857191","2017-SGR-1414"]}]},{"name":"Generalitat de Catalunya","award":["857191","2017-SGR-1414"],"award-info":[{"award-number":["857191","2017-SGR-1414"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>Time series databases aim to handle big amounts of data in a fast way, both when introducing new data to the system, and when retrieving it later on. However, depending on the scenario in which these databases participate, reducing the number of requested resources becomes a further requirement. Following this goal, NagareDB and its Cascading Polyglot Persistence approach were born. They were not just intended to provide a fast time series solution, but also to find a great cost-efficiency balance. However, although they provided outstanding results, they lacked a natural way of scaling out in a cluster fashion. Consequently, monolithic approaches could extract the maximum value from the solution but distributed ones had to rely on general scalability approaches. In this research, we proposed a holistic approach specially tailored for databases following Cascading Polyglot Persistence to further maximize its inherent resource-saving goals. The proposed approach reduced the cluster size by 33%, in a setup with just three ingestion nodes and up to 50% in a setup with 10 ingestion nodes. Moreover, the evaluation shows that our scaling method is able to provide efficient cluster growth, offering scalability speedups greater than 85% in comparison to a theoretically 100% perfect scaling, while also ensuring data safety via data replication.<\/jats:p>","DOI":"10.3390\/bdcc6030086","type":"journal-article","created":{"date-parts":[[2022,8,18]],"date-time":"2022-08-18T21:39:21Z","timestamp":1660858761000},"page":"86","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["A Holistic Scalability Strategy for Time Series Databases Following Cascading Polyglot Persistence"],"prefix":"10.3390","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8185-3667","authenticated-orcid":false,"given":"Carlos Garcia","family":"Calatrava","sequence":"first","affiliation":[{"name":"Barcelona Supercomputing Center, Pla\u00e7a Eusebi G\u00fcell, 1-3, 08034 Barcelona, Spain"},{"name":"Department of Computer Architecture, Universitat Polit\u00e8cnica de Catalunya, BarcelonaTech. C. Jordi Girona, 31, 08034 Barcelona, Spain"}]},{"given":"Yolanda Becerra","family":"Fontal","sequence":"additional","affiliation":[{"name":"Barcelona Supercomputing Center, Pla\u00e7a Eusebi G\u00fcell, 1-3, 08034 Barcelona, Spain"},{"name":"Department of Computer Architecture, Universitat Polit\u00e8cnica de Catalunya, BarcelonaTech. C. Jordi Girona, 31, 08034 Barcelona, Spain"}]},{"given":"Fernando M.","family":"Cucchietti","sequence":"additional","affiliation":[{"name":"Barcelona Supercomputing Center, Pla\u00e7a Eusebi G\u00fcell, 1-3, 08034 Barcelona, Spain"}]}],"member":"1968","published-online":{"date-parts":[[2022,8,18]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"2581","DOI":"10.1109\/TKDE.2017.2740932","article-title":"Time Series Management Systems: A Survey","volume":"29","author":"Jensen","year":"2017","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_2","unstructured":"(2022, May 31). The DB-Engines Ranking, according to Their Popularity. Available online: https:\/\/db-engines.com\/en\/ranking."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Calatrava, G.C., Becerra, Y., Cucchietti, F., and Div\u00ed, C. (2021). NagareDB: A Resource-Efficient Document-Oriented Time-series Database. Data, 6.","DOI":"10.3390\/data6080091"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Gilbert, S., and Lynch, N. (2002). Brewer\u2019s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services. SIGACT News, Association for Computing Machinery.","DOI":"10.1145\/564585.564601"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Promberger, L., Schwemmer, R., and Fr\u00f6ning, H. (2022). Characterization of data compression across CPU platforms and accelerators. Concurrency and Computation: Practice and Experience, Wiley.","DOI":"10.1002\/cpe.6465"},{"key":"ref_6","unstructured":"(2022, May 24). Zstandard Benchmarking. Available online: https:\/\/facebook.github.io\/zstd\/."},{"key":"ref_7","unstructured":"Gunderson, S.H. (2022, May 24). Snappy: A Fast Compressor\/Decompressor. Available online: http:\/\/google.github.io\/snappy\/."},{"key":"ref_8","unstructured":"Gailly, J., and Adler, M. (2022, May 24). Zlib Compression Library. Available online: https:\/\/zlib.net\/."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"103149","DOI":"10.1109\/ACCESS.2020.2996661","article-title":"Lossless Compression of Data From Static and Mobile Dynamic Vision Sensors-Performance and Trade-Offs","volume":"8","author":"Khan","year":"2020","journal-title":"IEEE Access"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1922649.1922660","article-title":"What is the future of disk drives, death or rebirth?","volume":"23","author":"Deng","year":"2011","journal-title":"ACM Comput. Surv."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1109\/JPROC.2017.2678018","article-title":"Solid-State Drive (SSD): A Nonvolatile Storage System","volume":"105","author":"Micheloni","year":"2017","journal-title":"Proc. IEEE"},{"key":"ref_12","unstructured":"Kasavajhala, V. (2011). Solid State Drive vs. Hard Disk Drive Price and Performance Study, DELL. A DELL Technical White Paper."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Micheloni, R., Marelli, A., and Eshghi, K. (2012). Inside Solid State Drives (SSDs), Springer.","DOI":"10.1007\/978-94-007-5146-0"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"69398","DOI":"10.1109\/ACCESS.2022.3187405","article-title":"Introducing Polyglot-Based Data-Flow Awareness to Time-series Data Stores","volume":"10","author":"Calatrava","year":"2022","journal-title":"IEEE Access"},{"key":"ref_15","unstructured":"(2022, May 31). MongoDB Details and Popularity, according to the DB-Engines Ranking. Available online: https:\/\/db-engines.com\/en\/system\/MongoDB."},{"key":"ref_16","unstructured":"Yuhanna, N., Leganza, G., and Perdoni, R. (2019). The Forrester Wave\u2122: Big Data NoSQL, Forrester. Q1 2019 Report."},{"key":"ref_17","unstructured":"MongoDB Documentation (2022, May 31). Mongodb.com. Available online: https:\/\/docs.mongodb.com."},{"key":"ref_18","unstructured":"MongoDB Time-Series Documentation (2022, May 31). Mongodb.com. Available online: https:\/\/www.mongodb.com\/docs\/manual\/core\/timeseries-collections\/."},{"key":"ref_19","unstructured":"Time-Series Collection Schema (2022, May 31). MongoDB. Available online: https:\/\/github.com\/mongodb\/mongo\/tree\/master\/src\/mongo\/db\/timeseries."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3158661","article-title":"Survey on NoSQL Stores","volume":"51","author":"Davoudian","year":"2019","journal-title":"ACM Comput. Surv."},{"key":"ref_21","unstructured":"(2022, June 05). InfluxDB: Open Source Time Series Database. Available online: https:\/\/www.influxdata.com\/."},{"key":"ref_22","unstructured":"Hajek, V., Klapka, T., and Kudibal, O. (2021, June 05). Benchmarking InfluxDB vs. MongoDB for Time Series Data, Metrics & Management. An Influxdata Technical Paper. Available online: https:\/\/www.influxdata.com\/blog\/influxdb-is-27x-faster-vs-mongodb-for-Time-series-workloads\/."},{"key":"ref_23","unstructured":"(2022, June 05). InfluxDB Clustering Design. Available online: https:\/\/www.influxdata.com\/blog\/influxdb-clustering-design-neither-strictly-cp-or-ap\/."},{"key":"ref_24","unstructured":"Zhaofeng, Z. (2021, March 26). Key Concepts and Features of Time Series Databases. Available online: https:\/\/www.alibabacloud.com\/blog\/key-concepts-and-features-of-Time-series-databases_594734."},{"key":"ref_25","unstructured":"Canonical Ltd (2022, June 29). Ubuntu 18.04.6 LTS (Bionic Beaver). Available online: https:\/\/releases.ubuntu.com\/bionic\/."},{"key":"ref_26","unstructured":"Garcia, F.D., Garcia, R., Entrialgo, J., Garcia, J., and Garcia, M. (2008, January 20\u201322). Experimental Evaluation of Horizontal and Vertical Scalability of Cluster-Based Application Servers for Transactional Workloads. Proceedings of the International Conference on Applied Informatics and Communications, Rhodes, Greece."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Malitsky, N., Chaudhary, A., Jourdain, S., Cowan, M., O\u2019Leary, P., Hanwell, M., and Van Dam, K.K. (2017, January 6\u20139). Building near-real-time processing pipelines with the spark-MPI platform. Proceedings of the 2017 New York Scientific Data Summit (NYSDS), New York, NY, USA.","DOI":"10.1109\/NYSDS.2017.8085039"},{"key":"ref_28","unstructured":"(2022, June 15). Spark Streaming Programming Guide. Available online: https:\/\/spark.apache.org\/docs\/latest\/structured-streaming-programming-guide.html."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Zaharia, M., Das, T., Li, H., Hunter, T., Shenker, S., and Stoica, I. (2013, January 3\u20136). Discretized streams: Fault-tolerant streaming computation at scale. Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, Farmington, PA, USA.","DOI":"10.1145\/2517349.2522737"}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/6\/3\/86\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:11:30Z","timestamp":1760141490000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/6\/3\/86"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,18]]},"references-count":29,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2022,9]]}},"alternative-id":["bdcc6030086"],"URL":"https:\/\/doi.org\/10.3390\/bdcc6030086","relation":{},"ISSN":["2504-2289"],"issn-type":[{"type":"electronic","value":"2504-2289"}],"subject":[],"published":{"date-parts":[[2022,8,18]]}}}