{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,5]],"date-time":"2025-10-05T19:45:59Z","timestamp":1759693559628,"version":"3.41.0"},"reference-count":68,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2016,8,29]],"date-time":"2016-08-29T00:00:00Z","timestamp":1472428800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"IBM Shared University (SUR) Award"},{"name":"Discovery grant and CREATE award from the Natural Sciences & Engineering Research Council (NSERC) of Canada"},{"name":"Early Researcher Award\/Premiers Research Excellence Award"},{"name":"Information Retrieval and Knowledge Management Research Laboratory"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Inf. Syst."],"published-print":{"date-parts":[[2016,9,14]]},"abstract":"<jats:p>\n            Traditional pseudo relevance feedback (PRF) models choose top\n            <jats:italic>k<\/jats:italic>\n            feedback documents for query expansion and treat those documents equally. When\n            <jats:italic>k<\/jats:italic>\n            is determined, feedback terms are selected without considering the reliability of these documents for relevance. Because the performance of PRF is sensitive to the selection of feedback terms, noisy terms imported from these irrelevant documents or partially relevant documents will harm the final results extensively. Intuitively, terms in these documents should be considered less important for feedback term selection. Nonetheless, how to measure the reliability of feedback documents is a difficult problem.\n          <\/jats:p>\n          <jats:p>Recently, topic modeling has become more and more popular in the information retrieval (IR) area. 
To estimate how likely a feedback document is to be relevant, we attempt to incorporate topical information into PRF. However, topics are hard to quantify, and therefore the identification of topics is usually fuzzy. Integrating the obtained topical information effectively into IR and other text-processing-related areas is very challenging. Current research mainly focuses on mining relevant information from particular topics, which is extremely difficult when the boundaries between topics are hard to define. In this article, we investigate a key factor of this problem: the topic number used for topic modeling and how it makes topics \u201cfuzzy.\u201d To apply topical information effectively and efficiently, we propose a new probabilistic framework, \u201cTopPRF,\u201d and three models, TS-COS, TS-EU, and TS-Entropy, that integrate \u201cTopic Space\u201d (TS) information into pseudo relevance feedback. These methods estimate how reliably a document can be considered relevant using both term and topical information. When selecting feedback terms, candidate terms in more reliable feedback documents should receive extra weight. Experimental results on various public collections show that our proposed methods can significantly reduce the influence of \u201cfuzzy topics\u201d and obtain stable, good results over strong baseline models. Our proposed probabilistic framework, TopPRF, and the three topic-space-based models are capable of retrieving documents beyond traditional term matching and provide a promising avenue for constructing better topic-space-based IR systems. 
Moreover, in-depth discussions and conclusions are made to help other researchers apply topical information effectively.<\/jats:p>","DOI":"10.1145\/2956234","type":"journal-article","created":{"date-parts":[[2016,8,31]],"date-time":"2016-08-31T12:30:21Z","timestamp":1472646621000},"page":"1-36","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":25,"title":["TopPRF"],"prefix":"10.1145","volume":"34","author":[{"given":"Jun","family":"Miao","sequence":"first","affiliation":[{"name":"Information Retrieval &amp; Knowledge Management Research Lab, York University, Canada"}]},{"given":"Jimmy Xiangji","family":"Huang","sequence":"additional","affiliation":[{"name":"Information Retrieval &amp; Knowledge Management Research Lab, York University, Canada"}]},{"given":"Jiashu","family":"Zhao","sequence":"additional","affiliation":[{"name":"Information Retrieval &amp; Knowledge Management Research Lab, York University, Canada"}]}],"member":"320","published-online":{"date-parts":[[2016,8,29]]},"reference":[{"volume-title":"Proceedings of the 9th Text REtrieval Conference, 13","author":"Allan J.","key":"e_1_2_2_1_1","unstructured":"J. Allan , M. E. Connell , W. B. Croft , F. Feng , D. Fisher , and X. Li . 2000. INQUERY and TREC-9 . In Proceedings of the 9th Text REtrieval Conference, 13 . J. Allan, M. E. Connell, W. B. Croft, F. Feng, D. Fisher, and X. Li. 2000. INQUERY and TREC-9. In Proceedings of the 9th Text REtrieval Conference, 13."},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2020408.2020503"},{"volume-title":"Proceedings of the 5th Text REtrieval Conference. NIST Special Publication SP, 143166","author":"Beaulieu M.","key":"e_1_2_2_3_1","unstructured":"M. Beaulieu , M. Gatford , X. Huang , S. Robertson , S. Walker , and P. Williams . 1997. Okapi at TREC-5 . In Proceedings of the 5th Text REtrieval Conference. NIST Special Publication SP, 143166 . M. Beaulieu, M. Gatford, X. Huang, S. 
Robertson, S. Walker, and P. Williams. 1997. Okapi at TREC-5. In Proceedings of the 5th Text REtrieval Conference. NIST Special Publication SP, 143166."},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2505515.2505652"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.5555\/944919.944937"},{"key":"e_1_2_2_6_1","first-page":"17","article-title":"Hierarchical topic models and the nested chinese restaurant process","volume":"16","author":"Blei G.","year":"2004","unstructured":"G. Blei and J. Tenenbaum . 2004 . Hierarchical topic models and the nested chinese restaurant process . Advances in Neural Information Processing Systems 16 : 17 -- 25 . G. Blei and J. Tenenbaum. 2004. Hierarchical topic models and the nested chinese restaurant process. Advances in Neural Information Processing Systems 16:17--25.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2348283.2348485"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390334.1390377"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/366836.366860"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2484028.2484057"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390334.1390446"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1645953.1646059"},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2600428.2609531"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/2746231"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.1984.4767596"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0307752101"},{"key":"e_1_2_2_17_1","unstructured":"J. He 2011. Exploring Topic Structure: Coherence Diversity and Relatedness. ISBN 9789490371814.  J. He 2011. Exploring Topic Structure: Coherence Diversity and Relatedness. 
ISBN 9789490371814."},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/312624.312649"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2012.08.002"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1571941.1571995"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2006.22"},{"volume-title":"Proceedings of the 14th Text REtrieval Conference.","author":"Huang X.","key":"e_1_2_2_22_1","unstructured":"X. Huang , M. Zhong , and L. Si . 2005. York University at TREC 2005: Genomics track . In Proceedings of the 14th Text REtrieval Conference. X. Huang, M. Zhong, and L. Si. 2005. York University at TREC 2005: Genomics track. In Proceedings of the 14th Text REtrieval Conference."},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2911451.2914748"},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2487788.2487861"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/383952.383972"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1143844.1143917"},{"key":"e_1_2_2_27_1","volume-title":"Proceedings of the 29th Association for the Advancement of Artificial Intelligence (AAAI) Conference on Artificial Intelligence","author":"Liu Y.","year":"2015","unstructured":"Y. Liu , Z. Liu , T. Chua , and M. Sun . 2015. Topical word embeddings . In Proceedings of the 29th Association for the Advancement of Artificial Intelligence (AAAI) Conference on Artificial Intelligence , January 25-30, 2015 , Austin, Texas,USA.. 2418--2424. Y. Liu, Z. Liu, T. Chua, and M. Sun. 2015. Topical word embeddings. In Proceedings of the 29th Association for the Advancement of Artificial Intelligence (AAAI) Conference on Artificial Intelligence, January 25-30, 2015, Austin, Texas,USA.. 
2418--2424."},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1645953.1646259"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1835449.1835546"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2009916.2009942"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1281192.1281246"},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2348283.2348356"},{"key":"e_1_2_2_33_1","volume-title":"Proceedings of the OSIR Workshop. 18--25","author":"Ounis I.","year":"2006","unstructured":"I. Ounis , G. Amati , V. Plachouras , B. He , C. Macdonald , and C. Lioma 2006 . Terrier: A high performance and scalable information retrieval platform . In Proceedings of the OSIR Workshop. 18--25 . I. Ounis, G. Amati, V. Plachouras, B. He, C. Macdonald, and C. Lioma 2006. Terrier: A high performance and scalable information retrieval platform. In Proceedings of the OSIR Workshop. 18--25."},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1401890.1401960"},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1108\/eb046814"},{"key":"e_1_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-12275-0_50"},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1561\/1500000019"},{"volume-title":"Proceedings of the 3rd Text REtrieval Conference.","author":"Robertson S. E.","key":"e_1_2_2_38_1","unstructured":"S. E. Robertson , S. Walker , S. Jones , Hancock- M. Beaulieu , and Gatford, M . 1994. Okapi at TREC-3 . In Proceedings of the 3rd Text REtrieval Conference. S. E. Robertson, S. Walker, S. Jones, Hancock-M. Beaulieu, and Gatford, M. 1994. Okapi at TREC-3. In Proceedings of the 3rd Text REtrieval Conference."},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.5555\/188490.188561"},{"volume-title":"Relevance feedback in information retrieval, 313--323","author":"Rocchio J.","key":"e_1_2_2_40_1","unstructured":"J. Rocchio . 1971. 
Relevance feedback in information retrieval, 313--323 . Prentice-Hall Englewood Cliffs . J. Rocchio. 1971. Relevance feedback in information retrieval, 313--323. Prentice-Hall Englewood Cliffs."},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/361219.361220"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/361219.361220"},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-37256-8_31"},{"key":"e_1_2_2_44_1","unstructured":"H. Stark Y. Yang and Y. Yang. 1998. Vector space projections: A numerical approach to signal and image processing neural nets and optics. John Wiley & Sons Inc. ISBN:0471241407.   H. Stark Y. Yang and Y. Yang. 1998. Vector space projections: A numerical approach to signal and image processing neural nets and optics. John Wiley & Sons Inc. ISBN:0471241407."},{"key":"e_1_2_2_45_1","volume-title":"Proceedings of the International Conference on Intelligent Analysis.","volume":"2","author":"Strohman T.","year":"2005","unstructured":"T. Strohman , D. Metzler , H. Turtle , and W. B. Croft 2005 . Indri: A language model-based search engine for complex queries . In Proceedings of the International Conference on Intelligent Analysis. Vol. 2 . 2--6. T. Strohman, D. Metzler, H. Turtle, and W. B. Croft 2005. Indri: A language model-based search engine for complex queries. In Proceedings of the International Conference on Intelligent Analysis. Vol. 2. 2--6."},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2008.71"},{"key":"e_1_2_2_47_1","article-title":"Hierarchical Dirichlet processes","volume":"2006","author":"Teh Y. W.","year":"2012","unstructured":"Y. W. Teh , M. I. Jordan , M. J. Beal , and D. M. Blei . 2012 . Hierarchical Dirichlet processes . Journal of the American Statistical Association , 2006. 101{476}:1566--1581. Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei. 2012. Hierarchical Dirichlet processes. Journal of the American Statistical Association, 2006. 
101{476}:1566--1581.","journal-title":"Journal of the American Statistical Association"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0306-4573(99)00043-6"},{"key":"e_1_2_2_49_1","series-title":"Lecture Notes for EEB 581","volume-title":"Markov chain Monte Carlo and Gibbs sampling","author":"Walsh B.","unstructured":"B. Walsh . 2004. Markov chain Monte Carlo and Gibbs sampling . Lecture Notes for EEB 581 . http:\/\/citeseerx.ist.psu.edu\/viewdoc\/summary?doi&equals;10.1.1.131.4064. B. Walsh. 2004. Markov chain Monte Carlo and Gibbs sampling. Lecture Notes for EEB 581. http:\/\/citeseerx.ist.psu.edu\/viewdoc\/summary?doi&equals;10.1.1.131.4064."},{"key":"e_1_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/2020408.2020480"},{"key":"e_1_2_2_51_1","first-page":"1511","article-title":"LDA based pseudo relevance feedback for cross language information retrieval","volume":"03","author":"Wang X.","year":"2012","unstructured":"X. Wang , Q. Zhang , X. Wang , and Y. Sun . 2012 . LDA based pseudo relevance feedback for cross language information retrieval . In Cloud Computing and Intelligent Systems (CCIS) , volume 03 , 1511 -- 1516 . X. Wang, Q. Zhang, X. Wang, and Y. Sun. 2012. LDA based pseudo relevance feedback for cross language information retrieval. 
In Cloud Computing and Intelligent Systems (CCIS), volume 03, 1511--1516.","journal-title":"Cloud Computing and Intelligent Systems (CCIS)"},{"key":"e_1_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148204"},{"key":"e_1_2_2_53_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2006.06.005"},{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/333135.333138"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-17187-1_14"},{"key":"e_1_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/2600428.2609636"},{"key":"e_1_2_2_57_1","doi-asserted-by":"publisher","DOI":"10.1002\/asi.23430"},{"key":"e_1_2_2_58_1","doi-asserted-by":"publisher","DOI":"10.1002\/asi.21501"},{"key":"e_1_2_2_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/2348283.2348451"},{"key":"e_1_2_2_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/1458082.1458317"},{"key":"e_1_2_2_61_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-00958-7_6"},{"key":"e_1_2_2_62_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2012.24"},{"key":"e_1_2_2_63_1","doi-asserted-by":"publisher","DOI":"10.1561\/1500000008"},{"key":"e_1_2_2_64_1","doi-asserted-by":"publisher","DOI":"10.1145\/383952.384019"},{"key":"e_1_2_2_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/984321.984322"},{"key":"e_1_2_2_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/2009916.2009941"},{"key":"e_1_2_2_67_1","doi-asserted-by":"publisher","DOI":"10.1145\/2590988"},{"key":"e_1_2_2_68_1","doi-asserted-by":"publisher","DOI":"10.1145\/2505515.2507868"}],"container-title":["ACM Transactions on Information 
Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2956234","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2956234","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T03:39:43Z","timestamp":1750217983000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2956234"}},"subtitle":["A Probabilistic Framework for Integrating Topic Space into Pseudo Relevance Feedback"],"short-title":[],"issued":{"date-parts":[[2016,8,29]]},"references-count":68,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2016,9,14]]}},"alternative-id":["10.1145\/2956234"],"URL":"https:\/\/doi.org\/10.1145\/2956234","relation":{},"ISSN":["1046-8188","1558-2868"],"issn-type":[{"type":"print","value":"1046-8188"},{"type":"electronic","value":"1558-2868"}],"subject":[],"published":{"date-parts":[[2016,8,29]]},"assertion":[{"value":"2015-11-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-06-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-08-29","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}