{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,23]],"date-time":"2026-03-23T08:02:29Z","timestamp":1774252949265,"version":"3.50.1"},"reference-count":42,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2026,1,13]],"date-time":"2026-01-13T00:00:00Z","timestamp":1768262400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,1,19]],"date-time":"2026-01-19T00:00:00Z","timestamp":1768780800000},"content-version":"vor","delay-in-days":6,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Data Sci. Eng."],"published-print":{"date-parts":[[2026,3]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Keyword search, which identifies transactions associated with specified keywords across historical blocks, is a critical query type in blockchain analytics. However, existing approaches, such as on-chain indexing and off-chain synchronization, may lead to significant space overhead or challenges in maintaining data freshness. To address these challenges, we propose BlockSketch, a novel probabilistic data structure (PDS) that adopts a differentiated encoding strategy, aimed at resolving the trade-off between query performance and storage overhead in blockchain indexing. BlockSketch features a hierarchical filtering architecture that combines Bloom filters and Sketches within a binary tree framework, enabling dynamic structural maintenance. Keywords are categorized as \u201chot\u201d or \u201ccold\u201d based on their on-chain frequency and encoded into the most suitable component to achieve resource-efficient storage and accurate querying. In addition, BlockSketch integrates two distinct query rules, namely \u201clevel-down\u201d and \u201cjump,\u201d to balance query accuracy and efficiency when processing keywords with varying frequencies. Furthermore, we enhance the query efficiency of BlockSketch by merging inefficient lower-level nodes into more compact ones and pruning redundant node checks during query execution. Extensive experiments on a real-world dataset demonstrate that BlockSketch delivers up to 73% faster query processing, achieves 44.56% of the average false positive rate of baselines at low multiplicity and as low as 1.52% at high multiplicity, and saves 79% in storage compared to state-of-the-art methods.<\/jats:p>","DOI":"10.1007\/s41019-025-00326-6","type":"journal-article","created":{"date-parts":[[2026,1,13]],"date-time":"2026-01-13T09:25:50Z","timestamp":1768296350000},"page":"177-198","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["BlockSketch: A Hybrid Tree-Based Sketch for Keyword Search in Blockchain Systems"],"prefix":"10.1007","volume":"11","author":[{"given":"Yushi","family":"Liu","sequence":"first","affiliation":[]},{"given":"Xiaodong","family":"Qi","sequence":"additional","affiliation":[]},{"given":"Zhao","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Yanqin","family":"Yang","sequence":"additional","affiliation":[]},{"given":"Cheqing","family":"Jin","sequence":"additional","affiliation":[]},{"given":"Aoying","family":"Zhou","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2026,1,13]]},"reference":[{"key":"326_CR1","unstructured":"Nakamoto S (2008) Bitcoin: A peer-to-peer electronic cash system. Decentralized Business Review"},{"key":"326_CR2","first-page":"22","volume":"1","author":"V Buterin","year":"2013","unstructured":"Buterin V et al (2013) Ethereum white paper GitHub repository 1:22\u201323","journal-title":"Ethereum white paper GitHub repository"},{"issue":"2","key":"326_CR3","doi-asserted-by":"publisher","first-page":"172","DOI":"10.1093\/jfr\/fjaa010","volume":"6","author":"DA Zetzsche","year":"2020","unstructured":"Zetzsche DA, Arner DW, Buckley RP (2020) Decentralized finance. J Financ Regul 6(2):172\u2013203","journal-title":"J Financ Regul"},{"key":"326_CR4","doi-asserted-by":"crossref","unstructured":"Bartoletti M, Chiang JH-y, Lafuente AL (2021) Sok: lending pools in decentralized finance. In: Financial Cryptography and Data Security. FC 2021 International Workshops: CoDecFin, DeFi, VOTING, and WTSC, Virtual Event, March 5, 2021, Revised Selected Papers 25, 553\u2013578","DOI":"10.1007\/978-3-662-63958-0_40"},{"key":"326_CR5","doi-asserted-by":"crossref","unstructured":"Zhou L, Qin K, Torres CF, Le DV, Gervais A (2021) High-frequency trading on decentralized on-chain exchanges. In: 2021 IEEE Symposium on Security and Privacy (SP), 428\u2013445","DOI":"10.1109\/SP40001.2021.00027"},{"issue":"2","key":"326_CR6","doi-asserted-by":"publisher","first-page":"196","DOI":"10.1108\/JFRC-08-2016-0068","volume":"25","author":"P Yeoh","year":"2017","unstructured":"Yeoh P (2017) Regulatory issues in blockchain technology. J Financ Regul Compl 25(2):196\u2013208","journal-title":"J Financ Regul Compl"},{"key":"326_CR7","doi-asserted-by":"publisher","first-page":"1994","DOI":"10.1109\/TIFS.2023.3346276","volume":"19","author":"J Wu","year":"2023","unstructured":"Wu J, Lin D, Fu Q, Yang S, Chen T, Zheng Z, Song B (2023) Toward understanding asset flows in crypto money laundering through the lenses of ethereum heists. IEEE Trans Inf Forensics Secur 19:1994\u20132009","journal-title":"IEEE Trans Inf Forensics Secur"},{"key":"326_CR8","doi-asserted-by":"crossref","unstructured":"Hristidis V, Papakonstantinou Y (2002) Discover: Keyword search in relational databases. In: VLDB\u201902: Proceedings of the 28th International Conference on Very Large Databases, 670\u2013681","DOI":"10.1016\/B978-155860869-6\/50065-2"},{"key":"326_CR9","doi-asserted-by":"crossref","unstructured":"Liu F, Yu C, Meng W, Chowdhury A (2006) Effective keyword search in relational databases. In: Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, 563\u2013574","DOI":"10.1145\/1142473.1142536"},{"key":"326_CR10","doi-asserted-by":"crossref","unstructured":"Qin L, Yu JX, Chang L (2009) Keyword search in databases: the power of rdbms. In: Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, 681\u2013694","DOI":"10.1145\/1559845.1559917"},{"key":"326_CR11","doi-asserted-by":"crossref","unstructured":"Li Y, Zheng K, Yan Y, Liu Q, Zhou X (2017) Etherql: a query layer for blockchain system. In: Database Systems for Advanced Applications: 22nd International Conference, DASFAA 2017, Suzhou, China, March 27-30, 2017, Proceedings, Part II 22, 556\u2013567","DOI":"10.1007\/978-3-319-55699-4_34"},{"key":"326_CR12","doi-asserted-by":"crossref","unstructured":"Zhu Y, Zhang Z, Jin C, Zhou A, Yan Y (2019) Sebdb: Semantics empowered blockchain database. In: 2019 IEEE 35th International Conference on Data Engineering (ICDE), 1820\u20131831","DOI":"10.1109\/ICDE.2019.00198"},{"key":"326_CR13","doi-asserted-by":"crossref","unstructured":"Xu, C., Zhang, C., Xu, J (2019) vchain: Enabling verifiable boolean range queries over blockchain databases. In: Proceedings of the 2019 International Conference on Management of Data, 141\u2013158","DOI":"10.1145\/3299869.3300083"},{"key":"326_CR14","doi-asserted-by":"crossref","unstructured":"Wang H, Xu C, Zhang C, Xu J, Peng Z, Pei J (2022) vchain+: Optimizing verifiable blockchain boolean range queries. In: 2022 IEEE 38th International Conference on Data Engineering (ICDE), 1927\u20131940","DOI":"10.1109\/ICDE53745.2022.00190"},{"key":"326_CR15","doi-asserted-by":"crossref","unstructured":"Zhang C, Xu C, Xu J, Tang Y, Choi B (2019) Gem$$\\hat{\\,}$$ 2-tree: A gas-efficient structure for authenticated range queries in blockchain. In: 2019 IEEE 35th International Conference on Data Engineering (ICDE), 842\u2013853","DOI":"10.1109\/ICDE.2019.00080"},{"issue":"2","key":"326_CR16","doi-asserted-by":"publisher","first-page":"121","DOI":"10.1145\/356770.356776","volume":"11","author":"D Comer","year":"1979","unstructured":"Comer D (1979) Ubiquitous b-tree. ACM Computing Surveys (CSUR) 11(2):121\u2013137","journal-title":"ACM Computing Surveys (CSUR)"},{"key":"326_CR17","doi-asserted-by":"crossref","unstructured":"Sch\u00fctze H, Manning CD, Raghavan P (2008) Introduction to Information Retrieval 39","DOI":"10.1017\/CBO9780511809071"},{"issue":"6","key":"326_CR18","doi-asserted-by":"publisher","first-page":"1393","DOI":"10.1109\/TPDS.2021.3113873","volume":"33","author":"H Wu","year":"2021","unstructured":"Wu H, Peng Z, Guo S, Yang Y, Xiao B (2021) Vql: efficient and verifiable cloud query services for blockchain systems. IEEE Trans Parallel Distrib Syst 33(6):1393\u20131406","journal-title":"IEEE Trans Parallel Distrib Syst"},{"key":"326_CR19","doi-asserted-by":"crossref","unstructured":"Tong X, Tang H, Jiang N, Fan W, Gao Y, Deng S, Zhang Z, Jin C, Yang Y, Qin G (2021) Sql-middleware: enabling the blockchain with sql. In: Database Systems for Advanced Applications: 26th International Conference, DASFAA 2021, Taipei, Taiwan, April 11\u201314, 2021, Proceedings, Part III 26, 622\u2013626","DOI":"10.1007\/978-3-030-73200-4_48"},{"key":"326_CR20","unstructured":"Day A, Medvedev E, Risdal M, Katesit T (2019) Ethereum in bigquery: a public dataset for smart contract analytics. Google Cloud Blog"},{"key":"326_CR21","unstructured":"Kalodner H, M\u00f6ser M, Lee K, Goldfeder S, Plattner M, Chator A, Narayanan A (2020) $$\\{$$BlockSci$$\\}$$: Design and applications of a blockchain analysis platform. In: 29th USENIX Security Symposium (USENIX Security 20), 2721\u20132738"},{"key":"326_CR22","doi-asserted-by":"crossref","unstructured":"Li R, Wang P, Zhu J, Zhao J, Di J, Yang X, Ye K (2021) Building fast and compact sketches for approximately multi-set multi-membership querying. In: Proceedings of the 2021 International Conference on Management of Data, 1077\u20131089","DOI":"10.1145\/3448016.3452829"},{"issue":"7","key":"326_CR23","doi-asserted-by":"publisher","first-page":"422","DOI":"10.1145\/362686.362692","volume":"13","author":"BH Bloom","year":"1970","unstructured":"Bloom BH (1970) Space\/time trade-offs in hash coding with allowable errors. Commun ACM 13(7):422\u2013426","journal-title":"Commun ACM"},{"key":"326_CR24","doi-asserted-by":"crossref","unstructured":"Fan B, Andersen DG, Kaminsky M, Mitzenmacher MD (2014) Cuckoo filter: Practically better than bloom. In: Proceedings of the 10th ACM International on Conference on Emerging Networking Experiments and Technologies, 75\u201388","DOI":"10.1145\/2674005.2674994"},{"key":"326_CR25","doi-asserted-by":"crossref","unstructured":"Gupta G, Yan M, Coleman B, Kille B, Elworth RL, Medini T, Treangen T, Shrivastava A (2021) Fast processing and querying of 170tb of genomics data via a repeated and merged bloom filter (rambo). In: Proceedings of the 2021 International Conference on Management of Data, 2226\u20132234","DOI":"10.1145\/3448016.3457333"},{"key":"326_CR26","doi-asserted-by":"crossref","unstructured":"Yoon MK, Son J, Shin S-H (2014) Bloom tree: A search tree based on bloom filters for multiple-set membership testing. In: IEEE INFOCOM 2014-IEEE Conference on Computer Communications, 1429\u20131437","DOI":"10.1109\/INFOCOM.2014.6848077"},{"issue":"2","key":"326_CR27","doi-asserted-by":"publisher","first-page":"152","DOI":"10.1038\/s41587-018-0010-1","volume":"37","author":"P Bradley","year":"2019","unstructured":"Bradley P, Den Bakker HC, Rocha EP, McVean G, Iqbal Z (2019) Ultrafast search of all deposited bacterial and viral genomic data. Nat Biotechnol 37(2):152\u2013159","journal-title":"Nat Biotechnol"},{"issue":"1","key":"326_CR28","doi-asserted-by":"publisher","first-page":"120","DOI":"10.1109\/TKDE.2009.57","volume":"22","author":"D Guo","year":"2009","unstructured":"Guo D, Wu J, Chen H, Yuan Y, Luo X (2009) The dynamic bloom filters. IEEE Trans Knowl Data Eng 22(1):120\u2013133","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"326_CR29","first-page":"173","volume":"99","author":"M Castro","year":"1999","unstructured":"Castro M, Liskov B et al (1999) Practical byzantine fault tolerance In OsDI 99:173\u2013186","journal-title":"Practical byzantine fault tolerance In OsDI"},{"key":"326_CR30","doi-asserted-by":"crossref","unstructured":"Liu H, Luo X, Liu H, Xia X (2021) Merkle tree: A fundamental component of blockchains. In: 2021 International Conference on Electronic Information Engineering and Computer Science (EIECS), 556\u2013561","DOI":"10.1109\/EIECS53707.2021.9588047"},{"key":"326_CR31","doi-asserted-by":"crossref","unstructured":"Li S, Zhang Z, Xiao J, Zhang M, Yuan Y, Wang G (2024) Authenticated keyword search on large-scale graphs in hybrid-storage blockchains. In: 2024 IEEE 40th International Conference on Data Engineering (ICDE), 1958\u20131971","DOI":"10.1109\/ICDE60146.2024.00155"},{"issue":"11","key":"326_CR32","doi-asserted-by":"publisher","first-page":"1597","DOI":"10.14778\/3342263.3342636","volume":"12","author":"M El-Hindi","year":"2019","unstructured":"El-Hindi M, Binnig C, Arasu A, Kossmann D, Ramamurthy R (2019) Blockchaindb: a shared database on blockchains. Proc VLDB Endow 12(11):1597\u20131609","journal-title":"Proc VLDB Endow"},{"issue":"3","key":"326_CR33","doi-asserted-by":"publisher","first-page":"281","DOI":"10.1109\/90.851975","volume":"8","author":"L Fan","year":"2000","unstructured":"Fan L, Cao P, Almeida J, Broder AZ (2000) Summary cache: a scalable wide-area web cache sharing protocol. IEEE ACM Trans Netw 8(3):281\u2013293","journal-title":"IEEE ACM Trans Netw"},{"key":"326_CR34","doi-asserted-by":"crossref","unstructured":"Shanmugasundaram K, Br\u00f6nnimann H, Memon N (2004) Payload attribution via hierarchical bloom filters. In: Proceedings of the 11th ACM Conference on Computer and Communications Security, 31\u201341","DOI":"10.1145\/1030083.1030089"},{"key":"326_CR35","doi-asserted-by":"crossref","unstructured":"Deng F, Rafiei D (2006) Approximately detecting duplicates for streaming data using stable bloom filters. In: Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, 25\u201336","DOI":"10.1145\/1142473.1142477"},{"issue":"6","key":"326_CR36","doi-asserted-by":"publisher","first-page":"255","DOI":"10.1016\/j.ipl.2006.10.007","volume":"101","author":"PS Almeida","year":"2007","unstructured":"Almeida PS, Baquero C, Pregui\u00e7a N, Hutchison D (2007) Scalable bloom filters. Inf Process Lett 101(6):255\u2013261","journal-title":"Inf Process Lett"},{"issue":"5","key":"326_CR37","doi-asserted-by":"publisher","first-page":"3116","DOI":"10.1109\/TNET.2017.2730227","volume":"25","author":"T Yang","year":"2017","unstructured":"Yang T, Liu AX, Shahzad M, Yang D, Fu Q, Xie G, Li X (2017) A shifting framework for set queries. IEEE ACM Trans Netw 25(5):3116\u20133131","journal-title":"IEEE ACM Trans Netw"},{"key":"326_CR38","doi-asserted-by":"crossref","unstructured":"Zhang H, Lim H, Leis V, Andersen DG, Kaminsky M, Keeton K, Pavlo A (2018) Surf: Practical range query filtering with fast succinct tries. In: Proceedings of the 2018 International Conference on Management of Data, 323\u2013336","DOI":"10.1145\/3183713.3196931"},{"key":"326_CR39","doi-asserted-by":"crossref","unstructured":"Yu M, Fabrikant A, Rexford J (2009) Buffalo: Bloom filter forwarding architecture for large organizations. In: Proceedings of the 5th International Conference on Emerging Networking Experiments and Technologies, 313\u2013324","DOI":"10.1145\/1658939.1658975"},{"key":"326_CR40","doi-asserted-by":"crossref","unstructured":"Hao F, Kodialam M, Lakshman T, Song H (2009) Fast multiset membership testing using combinatorial bloom filters. In: IEEE INFOCOM 2009, 513\u2013521","DOI":"10.1109\/INFCOM.2009.5061957"},{"key":"326_CR41","doi-asserted-by":"crossref","unstructured":"Dai H, Zhong Y, Liu AX, Wang W, Li M (2016) Noisy bloom filters for multi-set membership testing. In: Proceedings of the 2016 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Science, 139\u2013151","DOI":"10.1145\/2896377.2901451"},{"issue":"1","key":"326_CR42","doi-asserted-by":"publisher","first-page":"58","DOI":"10.1016\/j.jalgor.2003.12.001","volume":"55","author":"G Cormode","year":"2005","unstructured":"Cormode G, Muthukrishnan S (2005) An improved data stream summary: the count-min sketch and its applications. J Algorithms 55(1):58\u201375","journal-title":"J Algorithms"}],"container-title":["Data Science and Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s41019-025-00326-6","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41019-025-00326-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41019-025-00326-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,23]],"date-time":"2026-03-23T07:18:26Z","timestamp":1774250306000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s41019-025-00326-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1,13]]},"references-count":42,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2026,3]]}},"alternative-id":["326"],"URL":"https:\/\/doi.org\/10.1007\/s41019-025-00326-6","relation":{},"ISSN":["2364-1185","2364-1541"],"issn-type":[{"value":"2364-1185","type":"print"},{"value":"2364-1541","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,1,13]]},"assertion":[{"value":"11 July 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 October 2025","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 November 2025","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 January 2026","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors have no relevant financial or non-financial interests to disclose.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}}]}}