{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,4]],"date-time":"2026-04-04T06:18:43Z","timestamp":1775283523141,"version":"3.50.1"},"reference-count":44,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2010,10,1]],"date-time":"2010-10-01T00:00:00Z","timestamp":1285891200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000153","name":"Division of Biological Infrastructure","doi-asserted-by":"publisher","award":["DBI-0640543"],"award-info":[{"award-number":["DBI-0640543"]}],"id":[{"id":"10.13039\/100000153","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2010,10]]},"abstract":"<jats:p>A fundamental challenge in utilizing Web search click data is to infer user-perceived relevance from the search log. Not only is the inference a difficult problem involving statistical reasonings but the bulky size, together with the ever-increasing nature, of the log data imposes extra requirements on scalability. In this paper, we propose the Bayesian Browsing Model (BBM), which performs exact inference of the document relevance, only requires a single pass of the data (i.e., the optimal scalability), and is shown effective.<\/jats:p>\n          <jats:p>\n            We present two sets of experiments to evaluate the model effectiveness and scalability. On the first set of over 50 million search instances of 1.1 million distinct queries, BBM outperforms the state-of-the-art competitor by 29.2% in log-likelihood while being\n            <jats:italic>57 times<\/jats:italic>\n            faster. 
On the second click log set, spanning a quarter of petabyte, we showcase the scalability of BBM: we implemented it on a commercial MapReduce cluster, and it took only 3 hours to compute the relevance for 1.15 billion distinct query-URL pairs.\n          <\/jats:p>","DOI":"10.1145\/1857947.1857951","type":"journal-article","created":{"date-parts":[[2010,11,1]],"date-time":"2010-11-01T13:32:33Z","timestamp":1288618353000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":18,"title":["Bayesian Browsing Model"],"prefix":"10.1145","volume":"4","author":[{"given":"Chao","family":"Liu","sequence":"first","affiliation":[{"name":"Microsoft Research"}]},{"given":"Fan","family":"Guo","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University"}]},{"given":"Christos","family":"Faloutsos","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University"}]}],"member":"320","published-online":{"date-parts":[[2010,10]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-92185-1_68"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148177"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148175"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/543613.543615"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-31865-1_2"},{"key":"e_1_2_1_6_1","volume-title":"Proceedings of the EDBT Workshops on Clustering Information Over the Web. 588--596","author":"Baeza-Yates R."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1367497.1367505"},{"key":"e_1_2_1_8_1","unstructured":"Bishop C. M. 2006. Pattern Recognition and Machine Learning (Information Science and Statistics). Springer-Verlag Berlin."},
{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1102351.1102363"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1645953.1646033"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1526709.1526711"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1718487.1718531"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1341531.1341545"},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation (OSDI\u201904)","author":"Dean J."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/347090.347107"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390334.1390392"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.5555\/945365.964285"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1083784.1083789"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1571941.1572003"},{"key":"e_1_2_1_20_1","unstructured":"Giannella C. Han J. Pei J. Yan X. and Yu P. S. 2003. Mining frequent patterns in data streams at multiple time granularities. In Data Mining: Next Generation Challenges and Future Directions."},
{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2003.1198387"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1507509.1507523"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1526709.1526712"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1498759.1498818"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/775047.775067"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1076034.1076063"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1229179.1229181"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-92185-1_65"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1645953.1646035"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1561\/1500000016"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.5555\/647235.720257"},{"key":"e_1_2_1_32_1","unstructured":"Murphy K. P. 2001. An introduction to graphical models. www.cs.iastate.edu\/~honavar\/bayes0.pdf."},
{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/1081870.1081899"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1242572.1242643"},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the 30th European Conference on Information Retrieval (ECIR\u201908)","author":"Shokouhi M."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/331403.331405"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/1277741.1277808"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1561\/2200000001"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1557019.1557164"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/1150402.1150497"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/1031171.1031192"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.2005.850085"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/1135777.1136004"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/1718487.1718528"}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1857947.1857951","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1857947.1857951","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T10:59:51Z","timestamp":1750244391000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1857947.1857951"}},"subtitle":["Exact Inference of Document Relevance from Petabyte-Scale 
Data"],"short-title":[],"issued":{"date-parts":[[2010,10]]},"references-count":44,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2010,10]]}},"alternative-id":["10.1145\/1857947.1857951"],"URL":"https:\/\/doi.org\/10.1145\/1857947.1857951","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2010,10]]},"assertion":[{"value":"2010-01-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2010-06-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2010-10-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}