{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T17:10:08Z","timestamp":1760202608707,"version":"3.41.0"},"reference-count":15,"publisher":"Association for Computing Machinery (ACM)","license":[{"start":{"date-parts":[[2009,12,1]],"date-time":"2009-12-01T00:00:00Z","timestamp":1259625600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["ACM J. Exp. Algorithmics"],"published-print":{"date-parts":[[2009,12]]},"abstract":"<jats:p>The intersection of large ordered sets is a common problem in the context of the evaluation of boolean queries to a search engine. In this article, we propose several improved algorithms for computing the intersection of sorted arrays, and in particular for searching sorted arrays in the intersection context. We perform an experimental comparison with the algorithms from the previous studies from Demaine, L\u00f3pez-Ortiz, and Munro [ALENEX 2001] and from Baeza-Yates and Salinger [SPIRE 2005]; in addition, we implement and test the intersection algorithm from Barbay and Kenyon [SODA 2002] and its randomized variant [SAGA 2003]. We consider both the random data set from Baeza-Yates and Salinger, the Google queries used by Demaine et al., a corpus provided by Google, and a larger corpus from the TREC Terabyte 2006 efficiency query stream, along with its own query log. We measure the performance both in terms of the number of comparisons and searches performed, and in terms of the CPU time on two different architectures. Our results confirm or improve the results from both previous studies in their respective context (comparison model on real data, and CPU measures on random data) and extend them to new contexts. In particular, we show that value-based search algorithms perform well in posting lists in terms of the number of comparisons performed.<\/jats:p>","DOI":"10.1145\/1498698.1564507","type":"journal-article","created":{"date-parts":[[2010,4,7]],"date-time":"2010-04-07T02:56:32Z","timestamp":1270608992000},"source":"Crossref","is-referenced-by-count":39,"title":["An experimental investigation of set intersection algorithms for text searching"],"prefix":"10.1145","volume":"14","author":[{"given":"J\u00e9r\u00e9my","family":"Barbay","sequence":"first","affiliation":[{"name":"Universidad de Chile, Santiago, Chile"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alejandro","family":"L\u00f3pez-Ortiz","sequence":"additional","affiliation":[{"name":"University of Waterloo, Waterloo, ON, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tyler","family":"Lu","sequence":"additional","affiliation":[{"name":"University of Waterloo, Waterloo, ON, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alejandro","family":"Salinger","sequence":"additional","affiliation":[{"name":"University of Waterloo, Waterloo, ON, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2010,1,5]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-27801-6_30"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1007\/11575832_2"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-39816-5_3"},{"volume-title":"Proceedings of the 13th ACM-SIAM Symposium on Discrete Algorithms (SODA). Society for Industrial and Applied Mathematics","author":"Barbay J.","key":"e_1_2_1_4_1","unstructured":"Barbay , J. and Kenyon , C . 2002. Adaptive intersection and t-threshold problems . In Proceedings of the 13th ACM-SIAM Symposium on Discrete Algorithms (SODA). Society for Industrial and Applied Mathematics , Philadelphia, 390--399. Barbay, J. and Kenyon, C. 2002. Adaptive intersection and t-threshold problems. In Proceedings of the 13th ACM-SIAM Symposium on Discrete Algorithms (SODA). Society for Industrial and Applied Mathematics, Philadelphia, 390--399."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1328911.1328915"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/11764298_13"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/0020-0190(76)90071-5"},{"volume-title":"Cse 413 -- analysis of algorithms -- fall. Department of Computer Science and Engineering","author":"Chen D. Z.","key":"e_1_2_1_8_1","unstructured":"Chen , D. Z. 2003. Cse 413 -- analysis of algorithms -- fall. Department of Computer Science and Engineering , University of Notre Dame , Notre Dame, IN . Chen, D. Z. 2003. Cse 413 -- analysis of algorithms -- fall. Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/PL00009235"},{"volume-title":"Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). SIAM","author":"Demaine E. D.","key":"e_1_2_1_10_1","unstructured":"Demaine , E. D. , Jones , T. R. , and Patrascu , M . 2004. Interpolation search for non-independent data . In Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). SIAM , Philadelphia, 529--530. Demaine, E. D., Jones, T. R., and Patrascu, M. 2004. Interpolation search for non-independent data. In Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). SIAM, Philadelphia, 529--530."},{"volume-title":"Proceedings of the 11th ACM-SIAM Symposium on Discrete Algorithms (SODA). SIAM","author":"Demaine E. D., L","key":"e_1_2_1_11_1","unstructured":"Demaine , E. D., L &amp;#243;pez- Ortiz , A. , and Munro , J. I . 2000. Adaptive set intersections, unions, and differences . In Proceedings of the 11th ACM-SIAM Symposium on Discrete Algorithms (SODA). SIAM , Philadelphia, 743--752. Demaine, E. D., L&amp;#243;pez-Ortiz, A., and Munro, J. I. 2000. Adaptive set intersections, unions, and differences. In Proceedings of the 11th ACM-SIAM Symposium on Discrete Algorithms (SODA). SIAM, Philadelphia, 743--752."},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the 3rd Workshop on Algorithm Engineering and Experiments. SIAM","volume":"2153","author":"Demaine E. D., L","unstructured":"Demaine , E. D., L &amp;#243;pez-ortiz, A., and Munro , J. I . 2001. Experiments on adaptive set intersections for text retrieval systems . In Proceedings of the 3rd Workshop on Algorithm Engineering and Experiments. SIAM , Philadelphia. Lecture Notes in Computer Science (LNCS) , vol. 2153 . Washington DC, 5--6. Demaine, E. D., L&amp;#243;pez-ortiz, A., and Munro, J. I. 2001. Experiments on adaptive set intersections for text retrieval systems. In Proceedings of the 3rd Workshop on Algorithm Engineering and Experiments. SIAM, Philadelphia. Lecture Notes in Computer Science (LNCS), vol. 2153. Washington DC, 5--6."},{"key":"e_1_2_1_13_1","doi-asserted-by":"crossref","unstructured":"Hwang F. K. and Lin S. 1971. Optimal merging of 2 elements with n elements. Acta Inf. 145--158.  Hwang F. K. and Lin S. 1971. Optimal merging of 2 elements with n elements. Acta Inf. 145--158.","DOI":"10.1007\/BF00289521"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1137\/0201004"},{"volume-title":"The Algorithm Design Manual. TELOS","author":"Skiena S. S.","key":"e_1_2_1_15_1","unstructured":"Skiena , S. S. 1997. The Algorithm Design Manual. TELOS , State University of New York , Stony Brook, NY . Skiena, S. S. 1997. The Algorithm Design Manual. TELOS, State University of New York, Stony Brook, NY."}],"container-title":["ACM Journal of Experimental Algorithmics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1498698.1564507","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1498698.1564507","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T12:45:43Z","timestamp":1750250743000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1498698.1564507"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,12]]},"references-count":15,"alternative-id":["10.1145\/1498698.1564507"],"URL":"https:\/\/doi.org\/10.1145\/1498698.1564507","relation":{},"ISSN":["1084-6654","1084-6654"],"issn-type":[{"type":"print","value":"1084-6654"},{"type":"electronic","value":"1084-6654"}],"subject":[],"published":{"date-parts":[[2009,12]]}}}