{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T07:50:25Z","timestamp":1740124225215,"version":"3.37.3"},"reference-count":31,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,3,16]],"date-time":"2021-03-16T00:00:00Z","timestamp":1615852800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,3,16]],"date-time":"2021-03-16T00:00:00Z","timestamp":1615852800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100002347","name":"Bundesministerium f\u00fcr Bildung und Forschung","doi-asserted-by":"publisher","award":["01IS18081C","01IS18081D"],"award-info":[{"award-number":["01IS18081C","01IS18081D"]}],"id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["419942270"],"award-info":[{"award-number":["419942270"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"name":"HAW Promotion, MWK, Baden-W\u00fcrttemberg, Germany"},{"DOI":"10.13039\/501100005714","name":"Technische Universit\u00e4t Darmstadt","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100005714","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Distrib Parallel Databases"],"published-print":{"date-parts":[[2022,3]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Massive data transfers in modern data-intensive systems resulting from low data-locality and data-to-code system design hurt their performance and scalability. 
Near-Data processing (NDP) and a shift to <jats:italic>code-to-data<\/jats:italic> designs may represent a viable solution as packaging combinations of storage and compute elements on the same device has become feasible. The shift towards NDP system architectures calls for revision of established principles. Abstractions such as <jats:italic>data formats and layouts<\/jats:italic> typically spread multiple layers in traditional DBMS, the way they are processed is encapsulated within these layers of abstraction. The NDP-style processing requires an explicit definition of cross-layer data formats and accessors to ensure in-situ executions optimally utilizing the properties of the underlying NDP storage and compute elements. In this paper, we make the case for such data format definitions and investigate the performance benefits under RocksDB and the COSMOS hardware platform.<\/jats:p>","DOI":"10.1007\/s10619-021-07328-z","type":"journal-article","created":{"date-parts":[[2021,3,16]],"date-time":"2021-03-16T08:03:35Z","timestamp":1615881815000},"page":"27-45","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["On the necessity of explicit cross-layer data formats in near-data processing 
systems"],"prefix":"10.1007","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0621-1151","authenticated-orcid":false,"given":"Lukas","family":"Weber","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8306-9999","authenticated-orcid":false,"given":"Tobias","family":"Vin\u00e7on","sequence":"additional","affiliation":[]},{"given":"Christian","family":"Kn\u00f6dler","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6896-9879","authenticated-orcid":false,"given":"Leonardo","family":"Solis-Vasquez","sequence":"additional","affiliation":[]},{"given":"Arthur","family":"Bernhardt","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6042-9878","authenticated-orcid":false,"given":"Ilia","family":"Petrov","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1164-3082","authenticated-orcid":false,"given":"Andreas","family":"Koch","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,3,16]]},"reference":[{"key":"7328_CR1","doi-asserted-by":"crossref","unstructured":"Acharya, A., Uysal, M., Saltz, J.: Active disks: programming model, algorithms and evaluation. In: Proceedings of ASPLOS (1998)","DOI":"10.1145\/291069.291026"},{"key":"7328_CR2","unstructured":"Ailamaki, A., DeWitt, D.J., Hill, M.D., Skounakis, M.: Weaving relations for cache performance. In: Proceedings of VLDB, 01 (2001)"},{"key":"7328_CR3","unstructured":"Babarinsa, O.O., Idreos, S.: JAFAR: near-data processing for databases (2015)"},{"key":"7328_CR4","unstructured":"Boral, H., DeWitt, D.J.: Parallel architectures for database systems. In: Database Machines: An Idea Whose Time Has Passed? A Critique of the Future of Database Machines, pp. 11\u201328 (1989)"},{"key":"7328_CR5","doi-asserted-by":"crossref","unstructured":"Copeland, G.P., Khoshafian, S.N.: A decomposition storage model. 
In: Proceedings of SIGMOD (1985)","DOI":"10.1145\/318898.318923"},{"key":"7328_CR6","doi-asserted-by":"crossref","unstructured":"De, A., Gokhale, M., Swanson, S., et al.: Minerva: Accelerating data analysis in next-generation SSDS. In: Proceedings of FCCM (2013)","DOI":"10.1109\/FCCM.2013.46"},{"key":"7328_CR7","doi-asserted-by":"crossref","unstructured":"Do, J., Patel, J., DeWitt, D., et al.: Query processing on smart SSDs: opportunities and challenges. In: Proceedings of SIGMOD (2013)","DOI":"10.1145\/2463676.2465295"},{"key":"7328_CR8","doi-asserted-by":"crossref","unstructured":"Graefe, G., Petrov, I., Ivanov, T., Marinov, V.: A hybrid page layout integrating PAX and NSM. In: Proceedings of the 17th International Database Engineering & Applications Symposium, IDEAS \u201913, pp. 86\u201395 (2013)","DOI":"10.1145\/2513591.2513643"},{"key":"7328_CR9","doi-asserted-by":"crossref","unstructured":"Gu, B., Yoon, A.S., et al.: Biscuit: a framework for near-data processing of big data workloads. In: Proceedings of ISCA (2016)","DOI":"10.1109\/ISCA.2016.23"},{"key":"7328_CR10","doi-asserted-by":"crossref","unstructured":"Hardock, S., Petrov, I., Gottstein, R., Buchmann, A.: From in-place updates to in-place appends. In: Proceedings of SIGMOD \u201917 (2017)","DOI":"10.1145\/3035918.3035958"},{"key":"7328_CR11","doi-asserted-by":"publisher","unstructured":"He, Y., Lee, R., Huai, Y., Shao, Z., Jain, N., Zhang, X., Xu, Z.: Rcfile: a fast and space-efficient data placement structure in mapreduce-based warehouse systems. In: 2011 IEEE 27th International Conference on Data Engineering, pp. 1199\u20131208 (2011). https:\/\/doi.org\/10.1109\/ICDE.2011.5767933","DOI":"10.1109\/ICDE.2011.5767933"},{"key":"7328_CR12","doi-asserted-by":"crossref","unstructured":"Hemmatpour, M., Sadoghi, M., et\u00a0al.: Kanzi: a distributed, in-memory key-value store. 
In: Proceedings of Middleware (2016)","DOI":"10.1145\/3007592.3007594"},{"key":"7328_CR13","doi-asserted-by":"crossref","unstructured":"Istv\u00e1n, Z., Sidler, D., Alonso, G.: Caribou: intelligent distributed storage. In: Proceedings of VLDB (2017)","DOI":"10.14778\/3137628.3137632"},{"key":"7328_CR14","doi-asserted-by":"crossref","unstructured":"Jo, I., Bae, D.h., et al.: YourSQL: a high-performance database system leveraging in-storage computing. In: Proceedings of VLDB (2016)","DOI":"10.14778\/2994509.2994512"},{"key":"7328_CR15","doi-asserted-by":"crossref","unstructured":"Kang, Y., Kee, Y.S., et\u00a0al.: Enabling cost-effective data processing with smart SSD. In: Proceedings of MSST (2013)","DOI":"10.1109\/MSST.2013.6558444"},{"key":"7328_CR16","doi-asserted-by":"crossref","unstructured":"Keeton, K., Patterson, D.A., Hellerstein, J.M.: A case for intelligent disks (IDISKs). In: SIGMOD Rec (1998)","DOI":"10.1145\/290593.290602"},{"key":"7328_CR17","doi-asserted-by":"crossref","unstructured":"Kim, J., et\u00a0al.: PapyrusKV: A high-performance parallel key-value store for distributed NVM architectures. In: Proceedings of SC (2017)","DOI":"10.1145\/3126908.3126943"},{"key":"7328_CR18","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1016\/j.ins.2015.07.056","volume":"327","author":"S Kim","year":"2016","unstructured":"Kim, S., Lee, S.W., Moon, B., et al.: In-storage processing of database scans and joins. Inf. Sci. 327, 183 (2016)","journal-title":"Inf. Sci."},{"key":"7328_CR19","doi-asserted-by":"crossref","unstructured":"Lang, H., M\u00fchlbauer, T., Funke, F., et\u00a0al.: Data blocks: hybrid OLTP and OLAP on compressed storage using both vectorization and compilation. 
In: Proceedings of SIGMOD 16 (2016)","DOI":"10.1145\/2882903.2882925"},{"issue":"2","key":"7328_CR20","doi-asserted-by":"publisher","first-page":"110","DOI":"10.1109\/LCA.2020.3009347","volume":"19","author":"JH Lee","year":"2020","unstructured":"Lee, J.H., Zhang, H., Lagrange, V., Krishnamoorthy, P., Zhao, X., Ki, Y.S.: SmartSSD: FPGA accelerated near-storage data analytics on SSD. IEEE Comput. Architect. Lett. 19(2), 110\u2013113 (2020)","journal-title":"IEEE Comput. Architect. Lett."},{"key":"7328_CR21","unstructured":"Ming, S.W.J., Arvind, et\u00a0al.: BlueDBM: an appliance for big data analytics. In: Proceedings of ISCA (2015)"},{"key":"7328_CR22","unstructured":"OpenSSD Project: COSMOS Project Documentation (2019). http:\/\/www.openssd-project.org\/wiki\/Cosmos_OpenSSD_Technical_Resources"},{"key":"7328_CR23","unstructured":"Ramakrishnan, R., Gehrke, J.: Database Management Systems (2003)"},{"key":"7328_CR24","doi-asserted-by":"crossref","unstructured":"Raman, V., Attaluri, G., Barber, R.: DB2 with BLU acceleration: so much more than just a column store. In: Proceedings of VLDB, 13 (2013)","DOI":"10.14778\/2536222.2536233"},{"key":"7328_CR25","unstructured":"Riedel, E., Gibson, G.A., Faloutsos, C.: Active storage for large-scale data mining and multimedia. In: Proceedings of VLDB (1998)"},{"key":"7328_CR26","unstructured":"Seshadri, S., Swanson, S., et\u00a0al.: Willow: a user-programmable SSD, pp. 67\u201380. USENIX, OSDI (2014)"},{"key":"7328_CR27","doi-asserted-by":"publisher","unstructured":"Vin\u00e7on, T., Bernhardt, A., Petrov, I., Weber, L., Koch, A.: nKV: near-data processing with KV-stores on native computational storage. In: Proceedings of the 16th International Workshop on Data Management on New Hardware, DaMoN 20. Association for Computing Machinery, New York, USA (2020). 
https:\/\/doi.org\/10.1145\/3399666.3399934","DOI":"10.1145\/3399666.3399934"},{"key":"7328_CR28","unstructured":"Vin\u00e7on, T., Weber, L., Bernhardt, A., Riegger, C., Hardock, S., Knoedler, C., Stock, F., Solis-Vasquez, L., Tamimi, S., Koch, A.: nKV in action: accelerating KV-stores on native computation storage with near-data processing. In: Proceedings of the VLDB Endowment, vol. 13 (2020)"},{"key":"7328_CR29","unstructured":"Vincon, T., Hardock, S., Riegger, C., Oppermann, J., Koch, A., Petrov, I.: NoFTL-KV: tackling write-amplification on KV-stores with native storage management. In: Proceedings of EDBT (2018)"},{"key":"7328_CR30","doi-asserted-by":"crossref","unstructured":"Woods, L., Teubner, J., Alonso, G.: Less Watts, More performance: an intelligent storage engine for data appliances. In: Proceedings of SIGMOD (2013)","DOI":"10.1145\/2463676.2463685"},{"key":"7328_CR31","doi-asserted-by":"crossref","unstructured":"Xi, S., Babarinsa, O., Athanassoulis, M., Idreos, S.: Beyond the wall: near-data processing for databases. 
In: Proceedings of DAMON (2015)","DOI":"10.1145\/2771937.2771945"}],"container-title":["Distributed and Parallel Databases"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10619-021-07328-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10619-021-07328-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10619-021-07328-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,3,3]],"date-time":"2022-03-03T08:10:09Z","timestamp":1646295009000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10619-021-07328-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,16]]},"references-count":31,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,3]]}},"alternative-id":["7328"],"URL":"https:\/\/doi.org\/10.1007\/s10619-021-07328-z","relation":{},"ISSN":["0926-8782","1573-7578"],"issn-type":[{"type":"print","value":"0926-8782"},{"type":"electronic","value":"1573-7578"}],"subject":[],"published":{"date-parts":[[2021,3,16]]},"assertion":[{"value":"25 February 2021","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 March 2021","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Compliance with ethical standards"}},{"value":"Ilia Petrov was part of the program committee of Joint International Workshop on Big Data Management on Emerging Hardware and Data Management on Virtualized Active Systems (HardBD & Active 2020), but has not reviewed the original 
paper.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}