{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,1]],"date-time":"2025-11-01T13:43:28Z","timestamp":1762004608590},"reference-count":42,"publisher":"Association for Computing Machinery (ACM)","issue":"8","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2013,6]]},"abstract":"<jats:p>The unparalleled growth and popularity of the Internet coupled with the advent of diverse modern applications such as search engines, on-line transactions, climate warning systems, etc., has catered to an unprecedented expanse in the volume of data stored world-wide. Efficient storage, management, and processing of such massively exponential amount of data has emerged as a central theme of research in this direction. Detection and removal of redundancies and duplicates in real-time from such multi-trillion record-set to bolster resource and compute efficiency constitutes a challenging area of study. The infeasibility of storing the entire data from potentially unbounded data streams, with the need for precise elimination of duplicates calls for intelligent approximate duplicate detection algorithms. The literature hosts numerous works based on the well-known probabilistic bitmap structure, Bloom Filter and its variants.<\/jats:p><jats:p>In this paper we propose a novel data structure, Streaming Quotient Filter, (SQF) for efficient detection and removal of duplicates in data streams. SQF intelligently stores the signatures of elements arriving on a data stream, and along with an eviction policy provides near zero false positive and false negative rates. We show that the near optimal performance of SQF is achieved with a very low memory requirement, making it ideal for real-time memory-efficient de-duplication applications having an extremely low false positive and false negative tolerance rates. 
We present detailed theoretical analysis of the working of SQF, providing a guarantee on its performance. Empirically, we compare SQF to alternate methods and show that the proposed method is superior in terms of memory and accuracy compared to the existing solutions. We also discuss Dynamic SQF for evolving streams and the parallel implementation of SQF.<\/jats:p>","DOI":"10.14778\/2536354.2536359","type":"journal-article","created":{"date-parts":[[2014,6,24]],"date-time":"2014-06-24T12:17:57Z","timestamp":1403612277000},"page":"589-600","source":"Crossref","is-referenced-by-count":27,"title":["Streaming quotient filter"],"prefix":"10.14778","volume":"6","author":[{"given":"Sourav","family":"Dutta","sequence":"first","affiliation":[{"name":"Max Planck Institute for Informatics, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ankur","family":"Narang","sequence":"additional","affiliation":[{"name":"IBM Research, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Suman K.","family":"Bera","sequence":"additional","affiliation":[{"name":"IBM Research, India"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2013,6]]},"reference":[{"key":"e_1_2_1_1_1","first-page":"1","volume-title":"OSDI","author":"Adya A.","year":"2002","unstructured":"A. Adya, W. J. Bolosky, M. Castro, G. Cermak, R. Chaiken, J. R. Douceur, J. Howell, R. J. Lorch, M. Theimer, and R. Wattenhofer. Farsite: Federated, available, and reliable storage for an incompletely trusted environment. In OSDI, pages 1-14, 2002."},{"key":"e_1_2_1_2_1","first-page":"20","volume-title":"STOC","author":"Alon N.","year":"1996","unstructured":"N. Alon, Y. 
Matias, and M. Szegedy. The space complexity of approximating the frequency moments. In STOC, pages 20-29, 1996."},{"key":"e_1_2_1_3_1","first-page":"350","volume-title":"ICDE","author":"Babcock B.","year":"2004","unstructured":"B. Babcock, S. Singh, and G. Varghese. Load shedding for aggregation queries over data streams. In ICDE, pages 350-361, 2004."},{"key":"e_1_2_1_4_1","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1145\/383059.383075","volume-title":"SIGCOMM","author":"Baboescu F.","year":"2001","unstructured":"F. Baboescu and G. Varghese. Scalable packet classification. In SIGCOMM, pages 199-210, 2001."},{"issue":"11","key":"e_1_2_1_5_1","first-page":"1627","article-title":"Don't thrash: How to cache your hash on flash","volume":"5","author":"Bender M. A.","year":"2012","unstructured":"M. A. Bender, M. Farach-Colton, R. Johnson, R. Kraner, B. C. Kuszmaul, D. Medjedovic, P. Montes, P. Shetty, R. P. Spillane, and E. Zadok. Don't thrash: How to cache your hash on flash. VLDB, 5(11):1627-1637, 2012.","journal-title":"VLDB"},{"key":"e_1_2_1_6_1","first-page":"39","volume-title":"SIGKDD","author":"Bilenko M.","year":"2003","unstructured":"M. Bilenko and R. J. Mooney. 
Adaptive duplicate detection using learnable string similarity measures. In SIGKDD, pages 39-48, 2003."},{"issue":"7","key":"e_1_2_1_7_1","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1145\/362686.362692","article-title":"trade-offs in hash coding with allowable errors","volume":"13","author":"Bloom B. H.","year":"1970","unstructured":"B. H. Bloom. Space\/time trade-offs in hash coding with allowable errors. Communications of the ACM, 13(7):422-426, 1970.","journal-title":"Communications of the ACM"},{"issue":"4","key":"e_1_2_1_8_1","doi-asserted-by":"crossref","first-page":"485","DOI":"10.1080\/15427951.2004.10129096","article-title":"Network applications of bloom filters: A survey","volume":"1","author":"Broder A. Z.","year":"2003","unstructured":"A. Z. Broder and M. Mitzenmacher. Network applications of bloom filters: A survey. Internet Mathematics, 1(4):485-509, 2003.","journal-title":"Internet Mathematics"},{"key":"e_1_2_1_9_1","first-page":"1","volume-title":"GLOBECOMM","author":"Chen Y.","year":"2007","unstructured":"Y. Chen, A. Kumar, and J. Xu. A new design of bloom filter for packet inspection speedup. In GLOBECOMM, pages 1-5, 2007."},{"issue":"2","key":"e_1_2_1_10_1","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1145\/506309.506311","article-title":"Collection statistics for fast duplicate document detection","volume":"20","author":"Chowdhury A.","year":"2002","unstructured":"A. Chowdhury, O. Frieder, D. Grossman, and M. McCabe. Collection statistics for fast duplicate document detection. 
ACM Transactions on Information Systems, 20(2):171-191, 2002.","journal-title":"ACM Transactions on Information Systems"},{"key":"e_1_2_1_11_1","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1145\/956863.956946","volume-title":"CIKM","author":"Conrad J.","year":"2003","unstructured":"J. Conrad, X. Guo, and C. Schriber. Online duplicate document detection: Signature reliability in a dynamic retrieval environment. In CIKM, pages 443-452, 2003."},{"key":"e_1_2_1_12_1","first-page":"25","volume-title":"SIGMOD","author":"Deng F.","year":"2006","unstructured":"F. Deng and D. Rafiei. Approximately detecting duplicates for streaming data using stable bloom filters. In SIGMOD, pages 25-36, 2006."},{"issue":"1","key":"e_1_2_1_13_1","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1109\/MM.2004.1268997","article-title":"Deep packet inspection using parallel bloom filters","volume":"24","author":"Dharmapurikar S.","year":"2004","unstructured":"S. Dharmapurikar, P. Krishnamurthy, T. S. Sproull, and J. W. Lockwood. Deep packet inspection using parallel bloom filters. 
IEEE Micro, 24(1):52-61, 2004.","journal-title":"IEEE Micro"},{"key":"e_1_2_1_14_1","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1145\/863955.863979","volume-title":"ACM SIGCOMM","author":"Dharmapurikar S.","year":"2003","unstructured":"S. Dharmapurikar, P. Krishnamurthy, and D. Taylor. Longest prefix matching using bloom filters. In ACM SIGCOMM, pages 201-212, 2003."},{"key":"e_1_2_1_15_1","first-page":"367","volume-title":"FMCAD","author":"Dillinger P. C.","year":"2004","unstructured":"P. C. Dillinger and P. Manolios. Bloom filters in probabilistic verification. In FMCAD, pages 367-381, 2004."},{"key":"e_1_2_1_16_1","first-page":"59","volume-title":"USENIX","author":"Douglis F.","year":"2004","unstructured":"F. Douglis, J. Lavoie, J. M. Tracey, P. Kulkarni, and P. Kulkarni. Redundancy elimination within large collections of files. In USENIX, pages 59-72, 2004."},{"key":"e_1_2_1_17_1","doi-asserted-by":"crossref","first-page":"228","DOI":"10.1145\/2247596.2247624","volume-title":"EDBT","author":"Dutta S.","year":"2012","unstructured":"S. Dutta, S. Bhattacherjee, and A. Narang. Towards \"intelligent compression\" in streams: A biased reservoir sampling based bloom filter approach. 
In EDBT, pages 228-238, 2012."},{"key":"e_1_2_1_18_1","first-page":"281","volume-title":"IEEE\/ACM Transaction on Networking","author":"Fan L.","year":"2000","unstructured":"L. Fan, P. Cao, J. Almeida, and Z. Broder. Summary cache: a scalable wide area web cache sharing protocol. In IEEE\/ACM Transaction on Networking, pages 281-293, 2000."},{"key":"e_1_2_1_19_1","first-page":"1520","volume-title":"IEEE INFOCOM","author":"Feng W.","year":"2001","unstructured":"W. Feng, D. Kandlur, D. Sahu, and K. Shin. Stochastic fair blue: A queue management algorithm for enforcing fairness. In IEEE INFOCOM, pages 1520-1529, 2001."},{"issue":"2","key":"e_1_2_1_20_1","doi-asserted-by":"crossref","first-page":"182","DOI":"10.1016\/0022-0000(85)90041-8","article-title":"Probabilistic counting algorithms for database applications","volume":"31","author":"Flajolet P.","year":"1985","unstructured":"P. Flajolet and G. N. Martin. Probabilistic counting algorithms for database applications. Computer and System Science, 31(2):182-209, 1985.","journal-title":"Computer and System Science"},{"key":"e_1_2_1_21_1","volume-title":"Database System Implementation","author":"Garcia-Molina H.","year":"1999","unstructured":"H. Garcia-Molina, J. D. Ullman, and J. Widom. Database System Implementation. Prentice Hall, 1999."},{"key":"e_1_2_1_22_1","first-page":"1259","volume-title":"CIKM","author":"Garg V. 
K.","year":"2010","unstructured":"V. K. Garg, A. Narang, and S. Bhattacherjee. Real-time memory efficient data redundancy removal algorithm. In CIKM, pages 1259-1268, 2010."},{"key":"e_1_2_1_23_1","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1145\/375663.375665","volume-title":"SIGMOD","author":"Gehrke J.","year":"2001","unstructured":"J. Gehrke, F. Korn, and J. Srivastava. On computing correlated aggregates over continual data streams. In SIGMOD, pages 13-24, 2001."},{"key":"e_1_2_1_24_1","first-page":"147","volume-title":"SIGCOMM","author":"Gupta P.","year":"1999","unstructured":"P. Gupta and N. McKeown. Packet classification on multiple fields. In SIGCOMM, pages 147-160, 1999."},{"key":"e_1_2_1_25_1","first-page":"219","volume-title":"World Wide Web","volume":"2","author":"Heydon A.","year":"1999","unstructured":"A. Heydon and M. Najork. Mercator: A scalable, extensive web crawler. In World Wide Web, volume 2, pages 219-229, 1999."},{"key":"e_1_2_1_26_1","volume-title":"Distributed Computing and Internet technology","author":"Hofmann T.","year":"2009","unstructured":"T. Hofmann. Optimizing distributed joins using bloom filters. 
Distributed Computing and Internet technology (Springer\/LNCS), 5375:145-156, 2009."},{"key":"e_1_2_1_27_1","first-page":"277","volume-title":"International Conference on High Performance Computing","author":"Hua Y.","year":"2006","unstructured":"Y. Hua and B. Xiao. A multi-attribute data structure with parallel bloom filters for network services. In International Conference on High Performance Computing, pages 277-288, 2006."},{"key":"e_1_2_1_28_1","first-page":"281","volume-title":"FAST","author":"Jain N.","year":"2005","unstructured":"N. Jain, M. Dahlin, and R. Tewari. Taper: Tiered approach for eliminating redundancy in replica synchronization. In FAST, pages 281-294, 2005."},{"key":"e_1_2_1_29_1","volume-title":"The Art of Computer Programming: Sorting and Searching","author":"Knuth D. E.","year":"1973","unstructured":"D. E. Knuth. The Art of Computer Programming: Sorting and Searching, volume 3. Addison Wesley, 1973."},{"key":"e_1_2_1_30_1","first-page":"1762","volume-title":"IEEE INFOCOM","author":"Kumar A.","year":"2004","unstructured":"A. Kumar, J. Xu, J. Wang, O. Spatschek, and L. Li. Space-code bloom filter for efficient per-flow traffic measurement. In IEEE INFOCOM, pages 1762-1773, 2004."},{"key":"e_1_2_1_31_1","first-page":"305","volume-title":"ICDAR","author":"Lee D.","year":"1999","unstructured":"D. Lee and J. Hull. 
Duplicate detection in symbolically compressed documents. In ICDAR, pages 305-308, 1999."},{"key":"e_1_2_1_32_1","volume-title":"Using bloom filters to speed-up name lookup in distributed systems","author":"Little M.","year":"2002","unstructured":"M. Little, N. Speirs, and S. Shrivastava. Using bloom filters to speed-up name lookup in distributed systems. The Computer Journal (Oxford University Press), 45(6):645-652, 2002."},{"key":"e_1_2_1_33_1","first-page":"12","volume-title":"WWW","author":"Metwally A.","year":"2005","unstructured":"A. Metwally, D. Agrawal, and A. E. Abbadi. Duplicate detection in click streams. In WWW, pages 12-21, 2005."},{"key":"e_1_2_1_34_1","first-page":"604","volume-title":"IEEE\/ACM Transaction on Networking","author":"Mitzenmacher M.","year":"2002","unstructured":"M. Mitzenmacher. Compressed bloom filters. In IEEE\/ACM Transaction on Networking, pages 604-612, 2002."},{"key":"e_1_2_1_35_1","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1145\/1498698.1594230","article-title":"Cache-, hash-, and space-efficient bloom filters","volume":"14","author":"Putze F.","year":"2009","unstructured":"F. Putze, P. Sanders, and J. Singler. Cache-, hash-, and space-efficient bloom filters. 
ACM Journal of Experimental Algorithmics, 14:4-18, 2009.","journal-title":"ACM Journal of Experimental Algorithmics"},{"key":"e_1_2_1_36_1","first-page":"89","volume-title":"FAST","author":"Quinlan S.","year":"2002","unstructured":"S. Quinlan and S. Dorward. Venti: A new approach to archival storage. In FAST, pages 89-101, 2002."},{"key":"e_1_2_1_38_1","first-page":"155","volume-title":"USENIX","author":"Reiter M.","year":"1998","unstructured":"M. Reiter, V. Anupam, and A. Mayer. Detecting hit-shaving in click-through payment schemes. In USENIX, pages 155-166, 1998."},{"key":"e_1_2_1_39_1","first-page":"241","volume-title":"ACM SIGMOD","author":"Saar C.","year":"2003","unstructured":"C. Saar and M. Yossi. Spectral bloom filters. In ACM SIGMOD, pages 241-252, 2003."},{"issue":"6","key":"e_1_2_1_40_1","doi-asserted-by":"crossref","first-page":"973","DOI":"10.1007\/s11390-008-9192-1","article-title":"Improved approximate detection of duplicates for data streams over sliding windows","volume":"23","author":"Shen H.","year":"2008","unstructured":"H. Shen and Y. Zhang. Improved approximate detection of duplicates for data streams over sliding windows. Journal of Computer Science and Technology, 23(6):973-987, 2008.","journal-title":"Journal of Computer Science and Technology"},{"key":"e_1_2_1_41_1","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1145\/1080091.1080114","volume-title":"ACM SIGCOMM","author":"Song H.","year":"2005","unstructured":"H. Song, S. 
Dharmapurikar, J. Turner, and J. Lockwood. Fast hash table lookup using extended bloom filter: An aid to network processing. In ACM SIGCOMM, pages 181-192, 2005."},{"key":"e_1_2_1_42_1","first-page":"127","volume-title":"USENIX","author":"Tolia N.","year":"2003","unstructured":"N. Tolia, M. Kozuch, M. Satyanarayanan, B. Karp, T. C. Bressoud, and A. Perrig. Opportunistic use of content addressable storage for distributed file systems. In USENIX, pages 127-140, 2003."},{"key":"e_1_2_1_43_1","first-page":"431","volume-title":"ACM SIGMOD","author":"Weis M.","year":"2005","unstructured":"M. Weis and F. Naumann. Dogmatrix tracks down duplicates in xml. 
In ACM SIGMOD, pages 431-442, 2005."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/2536354.2536359","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,7,14]],"date-time":"2023-07-14T14:45:28Z","timestamp":1689345928000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/2536354.2536359"}},"subtitle":["a near optimal approximate duplicate detection approach for data streams"],"short-title":[],"issued":{"date-parts":[[2013,6]]},"references-count":42,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2013,6]]}},"alternative-id":["10.14778\/2536354.2536359"],"URL":"https:\/\/doi.org\/10.14778\/2536354.2536359","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2013,6]]}}}