{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:24:10Z","timestamp":1750220650149,"version":"3.41.0"},"reference-count":28,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2020,2,9]],"date-time":"2020-02-09T00:00:00Z","timestamp":1581206400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Business Finland 5G-FORCE research project"},{"DOI":"10.13039\/501100002341","name":"Academy of Finland","doi-asserted-by":"crossref","award":["314008"],"award-info":[{"award-number":["314008"]}],"id":[{"id":"10.13039\/501100002341","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2020,4,30]]},"abstract":"<jats:p>\n            Bloom Filter is a space-efficient probabilistic data structure for checking the membership of elements in a set. Given multiple sets, a standard Bloom Filter is not sufficient when looking for the items to which an element or a set of input elements belong. An example case is searching for documents with keywords in a large text corpus, which is essentially a multiple set matching problem where the input is single or multiple keywords, and the result is a set of possible candidate documents. This article solves the multiple set matching problem by proposing two efficient Bloom Multifilters called Bloom Matrix and Bloom Vector, which generalize the standard Bloom Filter. Both structures are space-efficient and answer queries with a set of identifiers for multiple set matching problems. The space efficiency can be optimized according to the distribution of labels among multiple sets: Uniform and Zipf. Bloom Vector efficiently exploits the Zipf distribution of data for further space reduction. Indeed, both structures are much more space-efficient compared with the state-of-the-art, Bloofi. The results also highlight that a L\n            <jats:sc>ookup<\/jats:sc>\n            operation on Bloom Matrix is significantly faster than on Bloom Vector and Bloofi.\n          <\/jats:p>","DOI":"10.1145\/3372409","type":"journal-article","created":{"date-parts":[[2020,2,10]],"date-time":"2020-02-10T06:49:13Z","timestamp":1581317353000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Multiple Set Matching with Bloom Matrix and Bloom Vector"],"prefix":"10.1145","volume":"14","author":[{"given":"Francesco","family":"Concas","sequence":"first","affiliation":[{"name":"University of Helsinki, Finland"}]},{"given":"Pengfei","family":"Xu","sequence":"additional","affiliation":[{"name":"University of Helsinki, Finland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6135-932X","authenticated-orcid":false,"given":"Mohammad A.","family":"Hoque","sequence":"additional","affiliation":[{"name":"University of Helsinki, Finland"}]},{"given":"Jiaheng","family":"Lu","sequence":"additional","affiliation":[{"name":"University of Helsinki, Finland"}]},{"given":"Sasu","family":"Tarkoma","sequence":"additional","affiliation":[{"name":"University of Helsinki, Finland"}]}],"member":"320","published-online":{"date-parts":[[2020,2,9]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.5555\/1224252.1224501"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/362686.362692"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1151659.1159950"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipl.2008.05.018"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.comnet.2012.10.007"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFCOM.2004.1354643"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA\u201904)","author":"Chazelle Bernard","year":"2004","unstructured":"Bernard Chazelle , Joe Kilian , Ronitt Rubinfeld , and Ayellet Tal . 2004 . The Bloomier Filter: An efficient data structure for static support lookup tables . In Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA\u201904) . Society for Industrial and Applied Mathematics, Philadelphia, PA, 30--39. Bernard Chazelle, Joe Kilian, Ronitt Rubinfeld, and Ayellet Tal. 2004. The Bloomier Filter: An efficient data structure for static support lookup tables. In Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA\u201904). Society for Industrial and Applied Mathematics, Philadelphia, PA, 30--39."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/872757.872787"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2015.01.002"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the 12th IEEE International Symposium on High Performance Distributed Computing. 236--246","author":"Cuenca-Acuna F. M.","year":"2003","unstructured":"F. M. Cuenca-Acuna , C. Peery , R. P. Martin , and T. D. Nguyen . 2003. PlanetP: Using gossiping to build content addressable peer-to-peer information sharing communities . In Proceedings of the 12th IEEE International Symposium on High Performance Distributed Computing. 236--246 . DOI:https:\/\/doi.org\/10.1109\/HPDC. 2003 .1210033 10.1109\/HPDC.2003.1210033 F. M. Cuenca-Acuna, C. Peery, R. P. Martin, and T. D. Nguyen. 2003. PlanetP: Using gossiping to build content addressable peer-to-peer information sharing communities. In Proceedings of the 12th IEEE International Symposium on High Performance Distributed Computing. 236--246. DOI:https:\/\/doi.org\/10.1109\/HPDC.2003.1210033"},{"key":"e_1_2_1_12_1","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1109\/90.851975","article-title":"Summary cache: A scalable wide-area web cache sharing protocol","volume":"8","author":"Fan Li","year":"2000","unstructured":"Li Fan , Pei Cao , J. Almeida , and A. Z. Broder . 2000 . Summary cache: A scalable wide-area web cache sharing protocol . IEEE\/ACM Transactions on Networking 8 , 3 (Jun 2000), 281--293. DOI:https:\/\/doi.org\/10.1109\/90.851975 10.1109\/90.851975 Li Fan, Pei Cao, J. Almeida, and A. Z. Broder. 2000. Summary cache: A scalable wide-area web cache sharing protocol. IEEE\/ACM Transactions on Networking 8, 3 (Jun 2000), 281--293. DOI:https:\/\/doi.org\/10.1109\/90.851975","journal-title":"IEEE\/ACM Transactions on Networking"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1298306.1298310"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2009.57"},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of the 9th Biennial Conference on Innovative Data Systems Research (CIDR\u201919)","author":"Hellerstein Joseph M.","year":"2019","unstructured":"Joseph M. Hellerstein , Jose M. Faleiro , Joseph Gonzalez , Johann Schleier-Smith , Vikram Sreekanti , Alexey Tumanov , and Chenggang Wu . 2019 . Serverless computing: One step forward, two steps back . In Proceedings of the 9th Biennial Conference on Innovative Data Systems Research (CIDR\u201919) . Retrieved from http:\/\/cidrdb.org\/cidr 2019\/papers\/p119-hellerstein-cidr19.pdf. Joseph M. Hellerstein, Jose M. Faleiro, Joseph Gonzalez, Johann Schleier-Smith, Vikram Sreekanti, Alexey Tumanov, and Chenggang Wu. 2019. Serverless computing: One step forward, two steps back. In Proceedings of the 9th Biennial Conference on Innovative Data Systems Research (CIDR\u201919). Retrieved from http:\/\/cidrdb.org\/cidr2019\/papers\/p119-hellerstein-cidr19.pdf."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.5555\/1196085.1646532"},{"key":"e_1_2_1_17_1","volume-title":"Tollis","author":"Kakoulis Konstantinos G.","year":"2013","unstructured":"Konstantinos G. Kakoulis and Ioannis G . Tollis . 2013 . Labeling algorithms. In Handbook of Graph Drawing and Visualization. CRC Press , 489--515. Retrieved from http:\/\/cs.brown.edu\/&sim;rt\/gdhandbook\/chapters\/labeling.pdf. Konstantinos G. Kakoulis and Ioannis G. Tollis. 2013. Labeling algorithms. In Handbook of Graph Drawing and Visualization. CRC Press, 489--515. Retrieved from http:\/\/cs.brown.edu\/&sim;rt\/gdhandbook\/chapters\/labeling.pdf."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2716310"},{"key":"e_1_2_1_19_1","first-page":"12","article-title":"Space-code Bloom Filter for efficient per-flow traffic measurement","volume":"24","author":"Kumar A.","year":"2006","unstructured":"A. Kumar , J. Xu , and J. Wang . 2006 . Space-code Bloom Filter for efficient per-flow traffic measurement . IEEE J-SAC 24 , 12 (Dec 2006), 2327--2339. DOI:https:\/\/doi.org\/10.1109\/JSAC.2006.884032 10.1109\/JSAC.2006.884032 A. Kumar, J. Xu, and J. Wang. 2006. Space-code Bloom Filter for efficient per-flow traffic measurement. IEEE J-SAC 24, 12 (Dec 2006), 2327--2339. DOI:https:\/\/doi.org\/10.1109\/JSAC.2006.884032","journal-title":"IEEE J-SAC"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/COMST.2018.2889329"},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the 24th Annual International Computer Software and Applications Conference. 389--394","author":"Park Yong Woon","year":"2000","unstructured":"Yong Woon Park , Kun Hyo Baek , and Ki Dong Chung . 2000 . Reducing network traffic using two-layered cache servers for continuous media data on the Internet . In Proceedings of the 24th Annual International Computer Software and Applications Conference. 389--394 . DOI:https:\/\/doi.org\/10.1109\/CMPSAC.2000.884754 10.1109\/CMPSAC.2000.884754 Yong Woon Park, Kun Hyo Baek, and Ki Dong Chung. 2000. Reducing network traffic using two-layered cache servers for continuous media data on the Internet. In Proceedings of the 24th Annual International Computer Software and Applications Conference. 389--394. DOI:https:\/\/doi.org\/10.1109\/CMPSAC.2000.884754"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/62044.62050"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.5555\/1515915.1515918"},{"key":"e_1_2_1_24_1","volume-title":"Proceedings. of the 21st Annual Joint Conference of the IEEE Computer and Communications Societies","volume":"3","author":"Rhea S. C.","year":"2002","unstructured":"S. C. Rhea and J. Kubiatowicz . 2002. Probabilistic location and routing . In Proceedings. of the 21st Annual Joint Conference of the IEEE Computer and Communications Societies , Vol. 3 . 1248--1257. DOI:https:\/\/doi.org\/10.1109\/INFCOM. 2002 .1019375 10.1109\/INFCOM.2002.1019375 S. C. Rhea and J. Kubiatowicz. 2002. Probabilistic location and routing. In Proceedings. of the 21st Annual Joint Conference of the IEEE Computer and Communications Societies, Vol. 3. 1248--1257. DOI:https:\/\/doi.org\/10.1109\/INFCOM.2002.1019375"},{"key":"e_1_2_1_25_1","unstructured":"Behrooz A. Shirazi Krishna M. Kavi and Ali R. Hurson (Eds.). 1995. Scheduling and Load Balancing in Parallel and Distributed Systems. IEEE Computer Society Press Los Alamitos CA.  Behrooz A. Shirazi Krishna M. Kavi and Ali R. Hurson (Eds.). 1995. Scheduling and Load Balancing in Parallel and Distributed Systems. IEEE Computer Society Press Los Alamitos CA."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2014.2342155"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/SURV.2011.031611.00024"},{"key":"e_1_2_1_28_1","volume-title":"BMF: An indexing structure to support multi-element check. In Web-Age Information Management","author":"Xu Chenyang","year":"2016","unstructured":"Chenyang Xu , Qin Liu , and Weixiong Rao . 2016 . BMF: An indexing structure to support multi-element check. In Web-Age Information Management , Bin Cui, Nan Zhang, Jianliang Xu, Xiang Lian, and Dexi Liu (Eds.). Springer International Publishing , Cham , 441--453. Chenyang Xu, Qin Liu, and Weixiong Rao. 2016. BMF: An indexing structure to support multi-element check. In Web-Age Information Management, Bin Cui, Nan Zhang, Jianliang Xu, Xiang Lian, and Dexi Liu (Eds.). Springer International Publishing, Cham, 441--453."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.14778\/2876473.2876476"}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3372409","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3372409","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:02:21Z","timestamp":1750197741000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3372409"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,2,9]]},"references-count":28,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2020,4,30]]}},"alternative-id":["10.1145\/3372409"],"URL":"https:\/\/doi.org\/10.1145\/3372409","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"type":"print","value":"1556-4681"},{"type":"electronic","value":"1556-472X"}],"subject":[],"published":{"date-parts":[[2020,2,9]]},"assertion":[{"value":"2019-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-02-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}