{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,4]],"date-time":"2026-05-04T00:27:31Z","timestamp":1777854451574,"version":"3.51.4"},"reference-count":54,"publisher":"SAGE Publications","issue":"5","license":[{"start":{"date-parts":[[2011,7,29]],"date-time":"2011-07-29T00:00:00Z","timestamp":1311897600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Information Science"],"published-print":{"date-parts":[[2011,10]]},"abstract":"<jats:p>A search engine user with a well-defined information need is not interested in getting thousands of hits, but a few hits that are all highly relevant to their search. Often search words need to be refined and augmented to narrow results to more relevant pages. However, an overly specific query may lead to no hits at all, while most typical queries lead to thousands or even millions of them, both undesirable outcomes. This paper suggests a query rewriting method for generating alternative query strings and proposes a hit count prediction model for predicting the number of search engine hits for each alternative query string, based on the English language frequencies of the words in the search terms. Using the hit count prediction model, different types of search strategies, such as a lowest hit count query preference, can be utilized to improve users\u2019 search experience. We present an evaluation experiment of the hit count prediction model for three major search engines. We also discuss and quantify how far the Google, Yahoo! and Bing search engines diverge from monotonic behaviour, considering negative and positive search terms separately.<\/jats:p>","DOI":"10.1177\/0165551511415183","type":"journal-article","created":{"date-parts":[[2011,7,29]],"date-time":"2011-07-29T22:56:07Z","timestamp":1311980167000},"page":"462-475","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":4,"title":["A prediction model for web search hit counts using word frequencies"],"prefix":"10.1177","volume":"37","author":[{"given":"Tian","family":"Tian","sequence":"first","affiliation":[{"name":"New Jersey Institute of Technology, USA"}]},{"given":"Soon Ae","family":"Chun","sequence":"additional","affiliation":[{"name":"City University of New York, USA"}]},{"given":"James","family":"Geller","sequence":"additional","affiliation":[{"name":"New Jersey Institute of Technology, USA"}]}],"member":"179","published-online":{"date-parts":[[2011,7,29]]},"reference":[{"key":"bibr1-0165551511415183","volume-title":"NAACL workshop on automatic summarization","author":"Radev DR","year":"2001"},{"key":"bibr2-0165551511415183","first-page":"951","volume-title":"CHI\u201910: proceeding of the 28th international conference on human factors in computer systems","author":"Capra R"},{"key":"bibr3-0165551511415183","first-page":"49","volume-title":"Proceeding of the 28th annual international computer science software and applications conference (COMPSAC\u201904)","volume":"2","author":"Yuen L"},{"key":"bibr4-0165551511415183","doi-asserted-by":"publisher","DOI":"10.1002\/asi.20704"},{"key":"bibr5-0165551511415183","doi-asserted-by":"publisher","DOI":"10.1002\/asi.20079"},{"key":"bibr6-0165551511415183","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2007.289"},{"key":"bibr7-0165551511415183","unstructured":"\u2018iProspect Search Engine User Behavior Study\u2019, http:\/\/www.iprospect.com\/premiumPDFs\/WhitePaper_2006_SearchEngineUserBehavior.pdf (2006, accessed February 2011)."},{"key":"bibr8-0165551511415183","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-85654-2_9"},{"key":"bibr9-0165551511415183","first-page":"1667","volume-title":"Proceedings of the 2007 ACM symposium on applied computing (ACM-SAC)","author":"An Y"},{"key":"bibr10-0165551511415183","volume-title":"Proceeding of ODBASE: OTM confederated international conferences","author":"Fu G"},{"key":"bibr11-0165551511415183","volume-title":"workshop on adaptive text extraction and mining (ATEM 2003). 14th European conference on machine learning (ECML 2003)","author":"Navigli R"},{"key":"bibr12-0165551511415183","unstructured":"Andreou A. Ontologies and query expansion. M.S. thesis. School of Informatics, University of Edinburgh, Edinburgh, UK, 2005."},{"key":"bibr13-0165551511415183","first-page":"382","volume-title":"10th IEEE conference one-commerce technology and the fifth IEEE conference on enterprise computing, e-commerce and eservices","author":"An Y"},{"key":"bibr14-0165551511415183","volume-title":"Spinning the semantic web: bringing the world wide web to its full potential","author":"McGuinness DL","year":"2003"},{"key":"bibr15-0165551511415183","doi-asserted-by":"publisher","DOI":"10.1177\/0165551509103598"},{"key":"bibr16-0165551511415183","first-page":"162","volume-title":"2010 IEEE\/WIC\/ACM International Conference On web intelligence (WI 2010)","author":"Tian T"},{"key":"bibr17-0165551511415183","first-page":"175","volume-title":"10th international conference on web engineering, icwe 2010 workshops: second international workshop on semantic web information management (SWIM), LNCS","author":"Tian T"},{"key":"bibr18-0165551511415183","doi-asserted-by":"publisher","DOI":"10.2200\/S00176ED1V01Y200903ICR004"},{"key":"bibr19-0165551511415183","volume-title":"Handbook of natural language processing","author":"Cilibrasi RL","year":"2010","edition":"2"},{"issue":"1","key":"bibr20-0165551511415183","first-page":"39","volume":"4","author":"S\u00e1nchez D","year":"2010","journal-title":"International journal of software and informatics"},{"key":"bibr21-0165551511415183","first-page":"1089","volume-title":"19th ACM international conference on information and knowledge management (CIKM)","author":"Mirizzi R"},{"key":"bibr22-0165551511415183","first-page":"337","volume-title":"10th international conference on web engineering (ICWE)","author":"Mirizzi R"},{"key":"bibr23-0165551511415183","first-page":"367","volume-title":"WWW 2006: 15th International World Wide Web conference","author":"Bar-Yossef Z"},{"key":"bibr24-0165551511415183","unstructured":"British National Corpus, http:\/\/www.natcorp.ox.ac.uk\/ (accessed February 2011)."},{"key":"bibr25-0165551511415183","first-page":"1395","volume-title":"WWW 2007:16th international World Wide Web conference","author":"Matsuo Y"},{"key":"bibr26-0165551511415183","volume-title":"RANLP 2005: proceedings of recent advances in natural language processing","author":"Nakov P"},{"key":"bibr27-0165551511415183","unstructured":"Pollard J. \u2018Google result counts are a meaningless metric\u2019, http:\/\/homepage.ntlworld.com\/jonathan.deboynepollard\/FGA\/google-result-counts-are-a-meaningless-metric.html (accessed February 2011)."},{"issue":"1","key":"bibr28-0165551511415183","first-page":"107","volume":"30","author":"Brin S","year":"1998","journal-title":"Computer networks"},{"key":"bibr29-0165551511415183","first-page":"114","volume-title":"1st international workshop on quality in web engineering","author":"Funahashi T"},{"key":"bibr30-0165551511415183","unstructured":"\u2018Google Web Search API\u2019, http:\/\/code.google.com\/apis\/websearch\/ (accessed May 2011)."},{"key":"bibr31-0165551511415183","unstructured":"\u2018Google Custom Search API\u2019, http:\/\/code.google.com\/apis\/customsearch\/v1\/overview.html (accessed May 2011)."},{"key":"bibr32-0165551511415183","unstructured":"\u2018Google SOAP Search API\u2019, http:\/\/code.google.com\/apis\/soapsearch\/ (accessed February 2011)."},{"key":"bibr33-0165551511415183","volume-title":"10th international conference of the International Society for Scientometrics and Informetrics: proceeding of the ISSI 2005 conference","author":"Mayr P"},{"key":"bibr34-0165551511415183","doi-asserted-by":"publisher","DOI":"10.1162\/coli.2007.33.1.147"},{"key":"bibr35-0165551511415183","first-page":"767","volume-title":"16th international world Wide Web conference","author":"Gligorov R"},{"key":"bibr36-0165551511415183","doi-asserted-by":"publisher","DOI":"10.1162\/089120103322711604"},{"key":"bibr37-0165551511415183","unstructured":"\u2018Top 5000 most frequent English words from Brown Corpus\u2019, http:\/\/www.edict.com.hk\/lexiconindex\/frequencylists\/words2000.htm, http:\/\/www.edict.com.hk\/lexiconindex\/frequencylists\/words2-5k.htm (accessed March 2010)."},{"key":"bibr38-0165551511415183","unstructured":"\u2018Stop words\u2019, http:\/\/ir.dcs.gla.ac.uk\/resources\/linguistic_utils\/stop_words (accessed February 2011)."},{"key":"bibr39-0165551511415183","doi-asserted-by":"publisher","DOI":"10.4159\/harvard.9780674434929"},{"key":"bibr40-0165551511415183","volume-title":"Human behavior and the principle of least-effort","author":"Zipf G","year":"1949"},{"key":"bibr41-0165551511415183","volume-title":"The psycho-biology of language: an introduction to dynamic philology","author":"Zipf G","year":"1935"},{"key":"bibr42-0165551511415183","first-page":"206","volume-title":"LNCS: proceedings of workshop WADS\u201989","author":"Szpankowski W"},{"key":"bibr43-0165551511415183","volume-title":"Proceedings of the 31st annual Boston University conference on language development","author":"Goldwater S"},{"key":"bibr44-0165551511415183","first-page":"36","volume-title":"LNCS 2001: Proceedings of the 14th biennial conference of the Canadian Society on Computational Studies of Intelligence: advances in artificial intelligence","author":"Pantel P"},{"key":"bibr45-0165551511415183","unstructured":"Srikanth M. Exploiting query features in language modeling approach for information retrieval. PhD thesis, State University of New York at Buffalo, 2004."},{"key":"bibr46-0165551511415183","first-page":"214","volume-title":"Proceedings of sigir\u201999","author":"Miller D"},{"key":"bibr47-0165551511415183","first-page":"316","volume-title":"Proceedings of SIGIR\u201999","author":"Song F"},{"key":"bibr48-0165551511415183","doi-asserted-by":"publisher","DOI":"10.1002\/0470013192.bsa689"},{"key":"bibr49-0165551511415183","volume-title":"Data mining: practical machine learning tools and techniques","author":"Witten IH","year":"2005","edition":"2"},{"key":"bibr50-0165551511415183","unstructured":"\u2018Google Search Basics: Basic Search Help\u2019, http:\/\/www.google.com\/support\/websearch\/bin\/answer.py?hl=en&answer=134479 (accessed February 2011)."},{"key":"bibr51-0165551511415183","unstructured":"\u2018Yahoo Help: Search Tips\u2019, http:\/\/help.yahoo.com\/l\/us\/yahoo\/search\/narrowyoursearch\/basics04.html;_ylt=AggxCl0pmWi9tjzBuKskoYh6YXhG (accessed February 2011)."},{"key":"bibr52-0165551511415183","unstructured":"\u2018Bing Help: Search Tips\u2019, http:\/\/help.live.com\/help.aspx?project=wl_searchv1&market=en-US (accessed February 2011)."},{"key":"bibr53-0165551511415183","unstructured":"\u2018American National Corpus (ANC)\u2019, http:\/\/www.americannationalcorpus.org\/ (accessed February 2011)."},{"key":"bibr54-0165551511415183","unstructured":"\u2018Google plus operator\u2019, http:\/\/www.google.com\/support\/websearch\/bin\/answer.py?answer=136861 (accessed May 2011)."}],"container-title":["Journal of Information Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551511415183","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/0165551511415183","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551511415183","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T23:08:07Z","timestamp":1777504087000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/0165551511415183"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,7,29]]},"references-count":54,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2011,10]]}},"alternative-id":["10.1177\/0165551511415183"],"URL":"https:\/\/doi.org\/10.1177\/0165551511415183","relation":{},"ISSN":["0165-5515","1741-6485"],"issn-type":[{"value":"0165-5515","type":"print"},{"value":"1741-6485","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,7,29]]}}}