{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T06:54:12Z","timestamp":1760597652235},"reference-count":30,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2013,9]]},"abstract":"<jats:p>\n            In this paper, we address the problem of finding\n            <jats:italic>k<\/jats:italic>\n            -nearest neighbors (KNN) in sequence databases using the edit distance. Unlike most existing works using short and exact\n            <jats:italic>n<\/jats:italic>\n            -gram matchings together with a filter-and-refine framework for KNN sequence search, our new approach allows us to use longer but approximate\n            <jats:italic>n<\/jats:italic>\n            -gram matchings as a basis of KNN candidates pruning. Based on this new idea, we devise a pipeline framework over a two-level index for searching KNN in the sequence database. By coupling this framework together with several efficient filtering strategies, i.e. the frequency queue and the well-known Combined Algorithm (CA), our proposal brings various enticing advantages over existing works, including 1) huge reduction on false positive candidates to avoid large overheads on candidate verifications; 2) progressive result update and early termination; and 3) good extensibility to parallel computation. 
We conduct extensive experiments on three real datasets to verify the superiority of the proposed framework.\n          <\/jats:p>","DOI":"10.14778\/2732219.2732220","type":"journal-article","created":{"date-parts":[[2015,5,12]],"date-time":"2015-05-12T15:37:52Z","timestamp":1431445072000},"page":"1-12","source":"Crossref","is-referenced-by-count":20,"title":["Efficient and effective KNN sequence search with approximate n-grams"],"prefix":"10.14778","volume":"7","author":[{"given":"Xiaoli","family":"Wang","sequence":"first","affiliation":[{"name":"National University of Singapore"}]},{"given":"Xiaofeng","family":"Ding","sequence":"additional","affiliation":[{"name":"Huazhong University of Sci. & Tech. and University of South Australia"}]},{"given":"Anthony K. H.","family":"Tung","sequence":"additional","affiliation":[{"name":"National University of Singapore"}]},{"given":"Zhenjie","family":"Zhang","sequence":"additional","affiliation":[{"name":"Advanced Digital Sciences Center"}]}],"member":"320","published-online":{"date-parts":[[2013,9]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1869790.1869802"},{"key":"e_1_2_1_2_1","first-page":"918","volume-title":"VLDB","author":"Arasu A.","year":"2006","unstructured":"A. Arasu, V. Ganti, and R. Kaushik. Efficient exact set-similarity joins. In VLDB, pages 918--929, 2006."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1007\/11408079_4"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1559845.1559919"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2013.6544886"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/375551.375567"},{"key":"e_1_2_1_7_1","first-page":"491","volume-title":"VLDB","author":"Gravano L.","year":"2001","unstructured":"L. Gravano, P. G. Ipeirotis, H. V. Jagadish, N. Koudas, S. Muthukrishnan, and D. Srivastava. Approximate string joins in a database (almost) for free. In VLDB, pages 491--500, 2001."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1002\/nav.3800020109"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2008.4497434"},{"key":"e_1_2_1_10_1","first-page":"303","volume-title":"VLDB","author":"Li C.","year":"2007","unstructured":"C. Li, B. Wang, and X. Yang. Vgram: improving performance of approximate queries on string collections using variable-length grams. In VLDB, pages 303--314, 2007."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.14778\/2078331.2078340"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1016\/0022-0000(80)90002-1"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/375360.375365"},{"issue":"4","key":"e_1_2_1_14_1","first-page":"19","article-title":"Indexing methods for approximate string matching","volume":"24","author":"Navarro G.","year":"2001","unstructured":"G. Navarro, R. A. Baeza-Yates, E. Sutinen, and J. Tarhio. Indexing methods for approximate string matching. IEEE Data Eng. Bull., 24(4): 19--27, 2001.","journal-title":"IEEE Data Eng. 
Bull."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-03547-0_4"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1989323.1989431"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICICIC.2008.422"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.5555\/647815.738434"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2006.05.004"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1016\/0304-3975(92)90143-4"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1557670.1557677"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1920992"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2213836.2213847"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1559845.1559925"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2012.28"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01942606"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376616.1376655"},{"key":"e_1_2_1_28_1","volume-title":"AAAI","author":"Yang Z.","year":"2010","unstructured":"Z. Yang, J. Yu, and M. Kitsuregawa. Fast algorithms for top-k approximate string matching. In AAAI, 2010."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807167.1807266"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/AXMEDIS.2006.40"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/2732219.2732220","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T09:50:57Z","timestamp":1672221057000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/2732219.2732220"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,9]]},"references-count":30,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2013,9]]}},"alternative-id":["10.14778\/2732219.2732220"],"URL":"https:\/\/doi.org\/10.14778\/2732219.2732220","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2013,9]]}}}