{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,19]],"date-time":"2026-02-19T11:44:44Z","timestamp":1771501484113,"version":"3.50.1"},"reference-count":40,"publisher":"Association for Computing Machinery (ACM)","issue":"2","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Inf. Syst."],"published-print":{"date-parts":[[2001,4]]},"abstract":"<jats:p>\n            The proliferation of searchable text databases on corporate networks and the Internet causes a database selection problem for many people. Algorithms such as gGLOSS and CORI can automatically select which text databases to search for a given information need, but only if given a set of resource descriptions that accurately represent the contents of each database. The existing techniques for acquiring resource descriptions have significant limitations when used in wide-area networks controlled by many parties. This paper presents\n            <jats:italic>query-based sampling<\/jats:italic>\n            , a new technique for acquiring accurate resource descriptions. Query-based sampling does not require the cooperation of resource providers, nor does it require that resource providers use a particular search engine or representation technique.
An extensive set of experimental results demonstrates that accurate resource descriptions are created, that computation and communication costs are reasonable, and that the resource descriptions do in fact enable accurate automatic database selection.\n          <\/jats:p>","DOI":"10.1145\/382979.383040","type":"journal-article","created":{"date-parts":[[2002,7,27]],"date-time":"2002-07-27T11:29:00Z","timestamp":1027769340000},"page":"97-130","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":233,"title":["Query-based sampling of text databases"],"prefix":"10.1145","volume":"19","author":[{"given":"Jamie","family":"Callan","sequence":"first","affiliation":[{"name":"Carnegie Mellon Univ."}]},{"given":"Margaret","family":"Connell","sequence":"additional","affiliation":[{"name":"Univ. of Massachusetts"}]}],"member":"320","published-online":{"date-parts":[[2001,4]]},"reference":[{"key":"e_1_2_1_1_1","first-page":"49","volume-title":"Proceedings of the 4th Text Retrieval Conference (TREC-4, Washington, D.C., Nov.), D. K. Harman, Ed. 
National Institute of Standards and Technology","author":"ALLAN J.","year":"1995"},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the 7th Conference on Text Retrieval (TREC-7","author":"ALLAN J.","year":"1999"},{"key":"e_1_2_1_3_1","first-page":"258","volume-title":"Proceedings of the 20th Annual International ACM Conference on Research and Development in Information Retrieval (SIGIR '97","author":"BAUMGARTEN C.","year":"1997"},{"key":"e_1_2_1_4_1","volume-title":"Advances in Information Retrieval","author":"CALLAN J."},{"key":"e_1_2_1_5_1","doi-asserted-by":"crossref","first-page":"479","DOI":"10.1145\/304182.304224","volume-title":"Proceedings of the 1999 ACM International Conference on Management of Data (SIGMOD '99","author":"CALLAN J.","year":"1999"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(94)00050-D"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/215206.215328"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the ICSI Workshop on Design Issues in Anonymity and Unobservability","author":"CLARKE I.","year":"2000"},{"key":"e_1_2_1_9_1","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1145\/336597.336628","volume-title":"Proceedings of the 5th ACM Conference on Digital Libraries. 
ACM","author":"CRASWELL N.","year":"2000"},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '99","author":"FRENCH J.","year":"1999"},{"key":"e_1_2_1_11_1","first-page":"121","volume-title":"Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '98","author":"FRENCH J.C.","year":"1998"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/314516.314517"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the 21st International Conference on Very Large Data Bases (VLDB '95","author":"GRAVANO L.","year":"1995"},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the International ACM Conference on Management of Data (SIGMOD '97","author":"GRAVANO L.","year":"1997"},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of the 1994 ACM SIGMOD International Conference on Management of Data (SIGMOD '94","author":"GRAVANO L.","year":"1994"},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of the 3rd IEEE International Conference on Parallel and Distributed Information Systems (PDIS","author":"GRAVANO L.","year":"1994"},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the 2nd Conference on Text Retrieval. (TREC-2). National Institute of Standards and Technology","author":"HARMAN D.K.","year":"1994"},{"key":"e_1_2_1_18_1","doi-asserted-by":"crossref","DOI":"10.6028\/NIST.SP.500-225","volume-title":"Proceedings of the 3rd Conference on Text Retrieval. (TREC-3","author":"HARMAN D.","year":"1995"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/297117.297123"},{"key":"e_1_2_1_20_1","unstructured":"HEAPS J. 1978. Information Retrieval-Computational and Theoretical Aspects. Academic Press Inc. New York NY."},
{"key":"e_1_2_1_21_1","unstructured":"KROVETZ R. J. 1995. Word sense disambiguation for large text databases. Ph.D. Dissertation. Computer and Information Science Department University of Massachusetts Amherst MA."},{"key":"e_1_2_1_22_1","first-page":"282","volume-title":"Proceedings of the 9th International Conference on Information and Knowledge Management (CIKM '00)","author":"LARKEY L.","year":"2000"},{"key":"e_1_2_1_24_1","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1147\/rd.22.0159","article-title":"The automatic creation of literature abstracts","volume":"2","author":"LUHN H. P.","year":"1958","journal-title":"IBM J. Res. Dev."},{"key":"e_1_2_1_25_1","doi-asserted-by":"crossref","first-page":"381","DOI":"10.1002\/asi.4630340605","article-title":"An experimental comparison of the effectiveness of computers and humans as search intermediaries","volume":"34","author":"MARCUS R. S.","year":"1983","journal-title":"J. Am. Soc. Inf. Sci."},{"key":"e_1_2_1_26_1","first-page":"14","volume-title":"Proceedings of the 24th International Conference on Very Large Data Bases, A. Gupta, O. Shmueli, and J. Widom, Eds. Morgan Kaufmann","author":"MENG W.","year":"1998"},{"key":"e_1_2_1_27_1","first-page":"146","volume-title":"Proceedings of the 15th International IEEE Conference on Data Engineering","author":"MENG W.","year":"1999"},{"key":"e_1_2_1_28_1","unstructured":"MORONEY M. J. 1951. Facts from Figures. Penguin Books New York NY."},{"key":"e_1_2_1_30_1","first-page":"232","volume-title":"Proceedings of the 23rd Annual International ACM Conference on Research and Development in Information Retrieval (SIGIR '00)","author":"POWELL A.","year":"2000"},{"key":"e_1_2_1_31_1","unstructured":"PRESS W.H. TEUKOLSKY S.A. VETTERLING W.T. AND FLANNERY B. P. 1992. Numerical Recipes in C: The Art of Scientific Computing. 2nd ed. Cambridge University Press New York NY."},
{"key":"e_1_2_1_32_1","doi-asserted-by":"crossref","unstructured":"TURTLE H. R. 1991. Inference networks for document retrieval. Ph.D. Dissertation. Computer and Information Science Department University of Massachusetts Amherst MA.","DOI":"10.1145\/96749.98006"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/125187.125188"},{"key":"e_1_2_1_34_1","first-page":"12","volume-title":"Proceedings of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '95","author":"VILES C.L.","year":"1995"},{"key":"e_1_2_1_35_1","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1145\/263690.263800","volume-title":"Proceedings of the 2nd ACM International Conference on Digital Libraries (DL '97","author":"VOORHEES E.M.","year":"1997"},{"key":"e_1_2_1_36_1","first-page":"172","volume-title":"Proceedings of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '95","author":"VOORHEES E.M.","year":"1995"},{"key":"e_1_2_1_37_1","doi-asserted-by":"crossref","first-page":"180","DOI":"10.1145\/234828.234846","volume-title":"Proceedings of the Seventh ACM Conference on Hypertext '96 (Washington, D.C., Mar. 
16-20)","author":"WEISS R.","year":"1996"},{"key":"e_1_2_1_38_1","first-page":"112","volume-title":"Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '98","author":"XU J.","year":"1998"},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '99","author":"XU J.","year":"1999"},{"key":"e_1_2_1_40_1","first-page":"164","volume-title":"Proceedings of the 12th IEEE International Conference on Data Engineering (ICDE '97","author":"YUWONO B.","year":"1996"},{"key":"e_1_2_1_41_1","first-page":"41","volume-title":"Proceedings of the 5th International Conference on Database Systems for Advanced Applications","author":"YUWONO B.","year":"1997"},{"key":"e_1_2_1_42_1","unstructured":"ZIPF G. K. 1949. Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology. Addison-Wesley Reading MA."}],
"container-title":["ACM Transactions on Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/382979.383040","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,2]],"date-time":"2023-01-02T20:57:16Z","timestamp":1672693036000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/382979.383040"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2001,4]]},"references-count":40,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2001,4]]}},"alternative-id":["10.1145\/382979.383040"],"URL":"https:\/\/doi.org\/10.1145\/382979.383040","relation":{},"ISSN":["1046-8188","1558-2868"],"issn-type":[{"value":"1046-8188","type":"print"},{"value":"1558-2868","type":"electronic"}],"subject":[],"published":{"date-parts":[[2001,4]]},"assertion":[{"value":"2001-04-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}