{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T15:03:38Z","timestamp":1775055818734,"version":"3.50.1"},"reference-count":101,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2010,12,1]],"date-time":"2010-12-01T00:00:00Z","timestamp":1291161600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Inf. Syst."],"published-print":{"date-parts":[[2010,12]]},"abstract":"<jats:p>\n            Inverted index data structures are the key to fast text search engines. We first investigate one of the predominant operation on inverted indexes, which asks for intersecting two sorted lists of document IDs of different lengths. We explore compression and performance of different inverted list data structures. In particular, we present\n            <jats:italic>Lookup<\/jats:italic>\n            , a new data structure that allows intersection in expected time linear in the smaller list.\n          <\/jats:p>\n          <jats:p>Based on this result, we present the algorithmic core of a full text data base that allows fast Boolean queries, phrase queries, and document reporting using less space than the input text. The system uses a carefully choreographed combination of classical data compression techniques and inverted-index-based search data structures. Our experiments show that inverted indexes are preferable over purely suffix-array-based techniques for in-memory (English) text search engines.<\/jats:p>\n          <jats:p>A similar system is now running in practice in each core of the distributed data base engine TREX of SAP.<\/jats:p>","DOI":"10.1145\/1877766.1877768","type":"journal-article","created":{"date-parts":[[2010,12,22]],"date-time":"2010-12-22T14:41:31Z","timestamp":1293028891000},"page":"1-37","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":26,"title":["Engineering basic algorithms of an in-memory text search engine"],"prefix":"10.1145","volume":"29","author":[{"given":"Frederik","family":"Transier","sequence":"first","affiliation":[{"name":"University of Karlsruhe, SAP NetWeaver EIM TREX, Germany"}]},{"given":"Peter","family":"Sanders","sequence":"additional","affiliation":[{"name":"University of Karlsruhe, Germany"}]}],"member":"320","published-online":{"date-parts":[[2010,12,27]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/290941.291011"},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the 9th Australasian Database Conference (ADC'98)","author":"Anh V. N."},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the 15th Australasian Database Conference (ADC'04)","author":"Anh V. N."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1023\/B:INRT.0000048490.99518.5c"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2006.99"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148235"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/2111648.2111682"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/11780441_29"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-27801-6_30"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1409220.1409223"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/11575832_2"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/11764298_13"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the 26th International Symposium on Theoretical Aspects of Computer Science (STACS'09)","author":"Barbay J."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2003.1196112"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10791-008-9048-x"},{"key":"e_1_2_1_16_1","unstructured":"Bast H. et al. 2007. Seminar: Searching with suffix arrays. http:\/\/search.mpi-inf.mpg. de\/wiki\/IrSeminarWs06.  Bast H. et al. 2007. Seminar: Searching with suffix arrays. http:\/\/search.mpi-inf.mpg. de\/wiki\/IrSeminarWs06."},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the Workshop on Large Scale Distributed Systems for Information Retrieval. (Collocated with SIGIR.)","author":"Bender M., S."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1016\/0020-0190(76)90071-5"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10791-006-6614-y"},{"key":"e_1_2_1_20_1","unstructured":"Blandford D. 2005. Compact data structures with fast queries. Ph.D. thesis Carnegie Mellon University Pittsburgh PA.   Blandford D. 2005. Compact data structures with fast queries. Ph.D. thesis Carnegie Mellon University Pittsburgh PA."},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the Data Compression Conference (DCC'02)","author":"Blandford D."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/256163.256166"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-89097-3_13"},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the 4th International Conference on Extending Database Technology (EDBT'94)","author":"Brown E. W."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/383952.383958"},{"key":"e_1_2_1_26_1","unstructured":"Claude F. Fari\u00f1a A. and Navarro G. 2008. Re-pair compression of inverted lists. Tech. rep. TR\/DCC-2008-16 Department of Computer Science University of Chile.  Claude F. Fari\u00f1a A. and Navarro G. 2008. Re-pair compression of inverted lists. Tech. rep. TR\/DCC-2008-16 Department of Computer Science University of Chile."},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the 14th International Symposium on String Processing and Information Retrieval (SPIRE'07)","author":"Culpepper J. S."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/96749.98245"},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the 11th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'00)","author":"Demaine E. D."},{"key":"e_1_2_1_30_1","volume-title":"Proceedings of the 3rd Workshop on Algorithm Engineering and Experimentation (ALENEX'01)","volume":"2153","author":"Demaine E. D."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1125857.1125859"},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the 18th Annual Symposium on Combinatorial Pattern Matching (CPM'07)","author":"Ferragina P."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/1082036.1082039"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1240233.1240243"},{"key":"e_1_2_1_35_1","unstructured":"Ferragina P. and Navarro G. 2005. homepage. http:\/\/pizzachili.dcc.uchile.cl.  Ferragina P. and Navarro G. 2005. homepage. http:\/\/pizzachili.dcc.uchile.cl."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1277741.1277833"},{"key":"e_1_2_1_37_1","unstructured":"Gailly J. and Adler M. 2005. zlib compression library. http:\/\/www.zlib.net.  Gailly J. and Adler M. 2005. zlib compression library. http:\/\/www.zlib.net."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/1526709.1526768"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-73437-6_23"},{"key":"e_1_2_1_40_1","volume-title":"Proceedings of the 7th Text Retrieval Conference (TREC-7).","author":"Hawking D."},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of the 8th Text Retrieval Conference (TREC-8). http:\/\/ir.dcs.gla.ac.uk\/test_collections\/access_to_data.html.","author":"Hawking D."},{"key":"e_1_2_1_42_1","unstructured":"Hopp D. Hector H.-W. Plattner H. Tschira K. E. and Wellenreuther C. 1972. SAP. http:\/\/www.sap.com.  Hopp D. Hector H.-W. Plattner H. Tschira K. E. and Wellenreuther C. 1972. SAP. http:\/\/www.sap.com."},{"key":"e_1_2_1_43_1","unstructured":"Hsieh P. 2004. Superfasthash. http:\/\/www.azillionmonkeys.com\/qed\/hash.html.  Hsieh P. 2004. Superfasthash. http:\/\/www.azillionmonkeys.com\/qed\/hash.html."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00289521"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1137\/0201004"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/281250.281253"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/PDCAT.2008.61"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/276304.276316"},{"key":"e_1_2_1_49_1","unstructured":"Kuhn M. 2000. homepage. http:\/\/www.cl.cam.ac.uk\/~mgk25\/download\/transtab.tar.gz.  Kuhn M. 2000. homepage. http:\/\/www.cl.cam.ac.uk\/~mgk25\/download\/transtab.tar.gz."},{"key":"e_1_2_1_50_1","unstructured":"Kurz R. et al. 1998. TREX. http:\/\/en.wikipedia.org\/wiki\/TREX_search_engine.  Kurz R. et al. 1998. TREX. http:\/\/en.wikipedia.org\/wiki\/TREX_search_engine."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.tcs.2004.05.018"},{"key":"e_1_2_1_52_1","unstructured":"Libxml2. 1999. homepage. http:\/\/xmlsoft.org.  Libxml2. 1999. homepage. http:\/\/xmlsoft.org."},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/1060745.1060785"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2006.06.001"},{"key":"e_1_2_1_55_1","doi-asserted-by":"crossref","unstructured":"M\u00e4kinen V. and Navarro G. 2004a. Compressed compact suffix arrays. In CPM. 420--433.  M\u00e4kinen V. and Navarro G. 2004a. Compressed compact suffix arrays. In CPM. 420--433.","DOI":"10.1007\/978-3-540-27801-6_32"},{"key":"e_1_2_1_56_1","volume-title":"Run-length FM-index. In Proceedings of DIMACS Workshop: The Burrows-Wheeler Transform: Ten Years Later. 17--19","author":"M\u00e4kinen V."},{"key":"e_1_2_1_57_1","first-page":"40","article-title":"Succinct suffix arrays based on run-length encoding","volume":"12","author":"M\u00e4kinen V.","year":"2005","journal-title":"Nordic J. Comput."},{"key":"e_1_2_1_58_1","doi-asserted-by":"crossref","unstructured":"Manning C. D. Raghavan P. and Sch\u00fctze H. 2008. Introduction to Information Retrieval. Cambridge University Press.   Manning C. D. Raghavan P. and Sch\u00fctze H. 2008. Introduction to Information Retrieval. Cambridge University Press.","DOI":"10.1017\/CBO9780511809071"},{"key":"e_1_2_1_59_1","volume-title":"Proceedings of the 5th International Web Caching and Content Delivery Workshop.","author":"Markatos E. P.","year":"2000"},{"key":"e_1_2_1_60_1","volume-title":"Proceedings of the 12th Australasian Document Computing Symposium (ADCS'07)","author":"Moffat A."},{"key":"e_1_2_1_61_1","volume-title":"Proceedings of the 6th Conference on Data Compression (DCC'96}. IEEE Computer Society","author":"Moffat A."},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1013002601898"},{"key":"e_1_2_1_63_1","doi-asserted-by":"crossref","unstructured":"Moffat A. and Turpin A. 2002. Compression and Coding Algorithms. Kluwer Academic Publishers Norwell MA.   Moffat A. and Turpin A. 2002. Compression and Coding Algorithms. Kluwer Academic Publishers Norwell MA.","DOI":"10.1007\/978-1-4615-0935-6"},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10791-006-9014-4"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/237496.237497"},{"key":"e_1_2_1_66_1","volume-title":"Proceedings of the 6th Australasian Database Conference (ADC'95)","author":"Moffat A."},{"key":"e_1_2_1_67_1","unstructured":"Mucci P. et al. 2005. Performance API. http:\/\/icl.cs.utk.edu\/papi\/.  Mucci P. et al. 2005. Performance API. http:\/\/icl.cs.utk.edu\/papi\/."},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1016\/S1570-8667(03)00066-2"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/1216370.1216372"},{"key":"e_1_2_1_70_1","volume-title":"Proceedings of ACM SIGIR'06 Workshop on Open Source Information Retrieval (OSIR'06)","author":"Ounis I."},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1007\/11880561_11"},{"key":"e_1_2_1_72_1","volume-title":"R: A Language and Environment for Statistical Computing","author":"Development Core Team","year":"2009"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0196-6774(03)00087-7"},{"key":"e_1_2_1_74_1","volume-title":"Data Compression: The Complete Reference","author":"Salomon D.","year":"2006"},{"key":"e_1_2_1_75_1","volume-title":"Proceedings of the 9th Workshop on Algorithm Engineering and Experiments (ALENEX'07)","author":"Sanders P."},{"key":"e_1_2_1_76_1","doi-asserted-by":"publisher","DOI":"10.1145\/383952.383959"},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1145\/564376.564416"},{"key":"e_1_2_1_78_1","unstructured":"Silvers C. e. a. 2009. Sparsehash. http:\/\/code.google.com\/p\/google-sparsehash\/.  Silvers C. e. a. 2009. Sparsehash. http:\/\/code.google.com\/p\/google-sparsehash\/."},{"key":"e_1_2_1_79_1","doi-asserted-by":"publisher","DOI":"10.5555\/1763653.1763668"},{"key":"e_1_2_1_80_1","doi-asserted-by":"publisher","DOI":"10.1145\/1008992.1009046"},{"key":"e_1_2_1_81_1","doi-asserted-by":"publisher","DOI":"10.1002\/1097-4571(2000)9999:9999%3C::AID-ASI1591%3E3.3.CO;2-I"},{"key":"e_1_2_1_82_1","unstructured":"Strohman T. 2005. Dynamic collections in indri. Tech. rep. IR-426 Center for Intelligent Information Retrieval University of Massachusetts Amherst.  Strohman T. 2005. Dynamic collections in indri. Tech. rep. IR-426 Center for Intelligent Information Retrieval University of Massachusetts Amherst."},{"key":"e_1_2_1_83_1","doi-asserted-by":"publisher","DOI":"10.1145\/1277741.1277774"},{"key":"e_1_2_1_84_1","volume-title":"Proceedings of the 10th Workshop on Algorithm Engineering and Experiments (ALENEX'08)","author":"Transier F."},{"key":"e_1_2_1_85_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-89097-3_20"},{"key":"e_1_2_1_86_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1022949613039"},{"key":"e_1_2_1_87_1","doi-asserted-by":"publisher","DOI":"10.1145\/1321440.1321592"},{"key":"e_1_2_1_88_1","doi-asserted-by":"publisher","DOI":"10.1145\/1277741.1277766"},{"key":"e_1_2_1_89_1","doi-asserted-by":"crossref","unstructured":"Venables W. N. and Ripley B. D. 2002. Modern Applied Statistics with S 4th Ed. Springer New York.  Venables W. N. and Ripley B. D. 2002. Modern Applied Statistics with S 4th Ed. Springer New York.","DOI":"10.1007\/978-0-387-21706-2"},{"key":"e_1_2_1_90_1","volume-title":"Proceedings of the 35th Conference on Very Large Data Bases (VLDB'09)","author":"Willhalm T."},{"key":"e_1_2_1_91_1","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/42.3.193"},{"key":"e_1_2_1_92_1","unstructured":"Williams H. E. et al. 2004. The zettair search engine. http:\/\/www.seg.rmit.edu.au\/zettair.  Williams H. E. et al. 2004. The zettair search engine. http:\/\/www.seg.rmit.edu.au\/zettair."},{"key":"e_1_2_1_93_1","volume-title":"Managing Gigabytes: Compressing and Indexing Documents and Images. Morgan Kaufmann.","author":"Witten I. H.","year":"1999"},{"key":"e_1_2_1_94_1","volume-title":"Proceedings of the Annual Joint Conference of the IEEE Computer and Communications Societies IEEE (InfoCom). 1238--1247","author":"Xie Y.","year":"2002"},{"key":"e_1_2_1_95_1","doi-asserted-by":"publisher","DOI":"10.1145\/1526709.1526764"},{"key":"e_1_2_1_96_1","doi-asserted-by":"publisher","DOI":"10.1145\/1367497.1367550"},{"key":"e_1_2_1_97_1","unstructured":"Zipf G. K. 1932. Selected Studies of the Principle of Relative Frequency in Language. Harvard University Press.  Zipf G. K. 1932. Selected Studies of the Principle of Relative Frequency in Language. Harvard University Press."},{"key":"e_1_2_1_98_1","unstructured":"Zipf G. K. 1949. Human Behavior and the Principle of Least Effort. Addison-Wesley Cambridge MA.  Zipf G. K. 1949. Human Behavior and the Principle of Least Effort. Addison-Wesley Cambridge MA."},{"key":"e_1_2_1_99_1","doi-asserted-by":"publisher","DOI":"10.1145\/1132956.1132959"},{"key":"e_1_2_1_100_1","volume-title":"Proceedings of the 4th Australian Database Conference. World Scientific, 26--38","author":"Zobel J."},{"key":"e_1_2_1_101_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2006.150"}],"container-title":["ACM Transactions on Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1877766.1877768","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1877766.1877768","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T10:52:47Z","timestamp":1750243967000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1877766.1877768"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,12]]},"references-count":101,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2010,12]]}},"alternative-id":["10.1145\/1877766.1877768"],"URL":"https:\/\/doi.org\/10.1145\/1877766.1877768","relation":{},"ISSN":["1046-8188","1558-2868"],"issn-type":[{"value":"1046-8188","type":"print"},{"value":"1558-2868","type":"electronic"}],"subject":[],"published":{"date-parts":[[2010,12]]},"assertion":[{"value":"2009-03-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2010-04-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2010-12-27","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}