{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T20:44:10Z","timestamp":1761597850041,"version":"3.41.2"},"reference-count":45,"publisher":"Emerald","issue":"6","license":[{"start":{"date-parts":[[2017,10,9]],"date-time":"2017-10-09T00:00:00Z","timestamp":1507507200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["OIR"],"published-print":{"date-parts":[[2017,10,9]]},"abstract":"<jats:sec><jats:title content-type=\"abstract-subheading\">Purpose<\/jats:title><jats:p>The purpose of this paper is to find a representative subset from large-scale online reviews for consumers. The subset is significantly small in size, but covers the majority amount of information in the original reviews and contains little redundant information.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Design\/methodology\/approach<\/jats:title><jats:p>A heuristic approach named RewSel is proposed to successively select representatives until the number of representatives meets the requirement. To reveal the advantages of the approach, extensive data experiments and a user study are conducted on real data.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Findings<\/jats:title><jats:p>The proposed approach has the advantage over the benchmarks in terms of coverage and redundancy. People show preference to the representative subsets provided by RewSel. 
The proposed approach also has good scalability, and is more adaptive to big data applications.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Research limitations\/implications<\/jats:title><jats:p>The paper contributes to the literature of review selection, by proposing a heuristic approach which achieves both high coverage and low redundancy. This study can be applied as the basis for conducting further analysis of large-scale online reviews.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Practical implications<\/jats:title><jats:p>The proposed approach offers a novel way to select a representative subset of online reviews to facilitate consumer decision making. It can also enhance the existing information retrieval system to provide representative information to users rather than a large amount of results.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Originality\/value<\/jats:title><jats:p>The proposed approach finds the representative subset by adopting the concept of relative entropy and sentiment analysis methods. 
Compared with state-of-the-art approaches, it offers a more effective and efficient way for users to handle a large amount of online information.<\/jats:p><\/jats:sec>","DOI":"10.1108\/oir-05-2016-0125","type":"journal-article","created":{"date-parts":[[2017,8,25]],"date-time":"2017-08-25T07:23:15Z","timestamp":1503645795000},"page":"877-899","source":"Crossref","is-referenced-by-count":4,"title":["Providing consumers with a representative subset from online reviews"],"prefix":"10.1108","volume":"41","author":[{"given":"Jin","family":"Zhang","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ming","family":"Ren","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xian","family":"Xiao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jilong","family":"Zhang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"140","reference":[{"issue":"8","key":"key2020120502295871100_ref001","doi-asserted-by":"crossref","first-page":"1485","DOI":"10.1287\/mnsc.1110.1370","article-title":"Deriving the pricing power of product features by mining consumer reviews","volume":"57","year":"2011","journal-title":"Management Science"},{"key":"key2020120502295871100_ref002","doi-asserted-by":"crossref","first-page":"453","DOI":"10.1016\/j.proeng.2013.02.059","article-title":"Opinion mining of movie review using hybrid method of support vector machine and particle swarm optimization","volume":"53","year":"2013","journal-title":"Procedia Engineering"},{"issue":"2","key":"key2020120502295871100_ref003","doi-asserted-by":"crossref","first-page":"180","DOI":"10.1177\/0165551508095781","article-title":"The dark side of information: overload, anxiety and other paradoxes and pathologies","volume":"35","year":"2009","journal-title":"Journal of Information 
Science"},{"issue":"1","key":"key2020120502295871100_ref004","first-page":"1","article-title":"Understanding satisfied and dissatisfied hotel customers: text mining of online hotel reviews","volume":"21","year":"2016","journal-title":"Journal of Hospitality Marketing & Management"},{"key":"key2020120502295871100_ref005","doi-asserted-by":"crossref","first-page":"808","DOI":"10.1016\/j.procs.2015.03.159","article-title":"Sentiment analysis: measuring opinions","volume":"45","year":"2015","journal-title":"Procedia Computer Science"},{"issue":"3","key":"key2020120502295871100_ref006","doi-asserted-by":"crossref","first-page":"316","DOI":"10.1108\/OIR-07-2015-0225","article-title":"Can two-sided messages increase the helpfulness of online reviews?","volume":"40","year":"2016","journal-title":"Online Information Review"},{"issue":"5","key":"key2020120502295871100_ref007","doi-asserted-by":"crossref","first-page":"1977","DOI":"10.1016\/j.chb.2007.08.004","article-title":"Herd behavior in purchasing books online","volume":"24","year":"2008","journal-title":"Computers in Human Behavior"},{"issue":"3","key":"key2020120502295871100_ref008","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1108\/IntR-03-2013-0046","article-title":"Whose online reviews have the most influences on consumers in cultural offerings? 
Professional vs consumer commentators","volume":"24","year":"2014","journal-title":"Internet Research"},{"issue":"2","key":"key2020120502295871100_ref009","doi-asserted-by":"crossref","first-page":"196","DOI":"10.1017\/S193029750000303X","article-title":"The effect of incomplete information on the compromise effect","volume":"7","year":"2012","journal-title":"Judgment and Decision Making"},{"first-page":"231","article-title":"A holistic lexicon-based approach to opinion mining","year":"2008","key":"key2020120502295871100_ref010"},{"key":"key2020120502295871100_ref011","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1016\/j.dss.2014.10.004","article-title":"Sentiment analysis: Bayesian ensemble learning","volume":"68","year":"2014","journal-title":"Decision Support Systems"},{"issue":"10","key":"key2020120502295871100_ref012","doi-asserted-by":"crossref","first-page":"1498","DOI":"10.1109\/TKDE.2010.188","article-title":"Estimating the helpfulness and economic impact of product reviews: mining text and reviewer characteristics","volume":"23","year":"2011","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"first-page":"478","article-title":"Eye-tracking analysis of user behavior in WWW search","year":"2004","key":"key2020120502295871100_ref013"},{"issue":"4","key":"key2020120502295871100_ref014","doi-asserted-by":"crossref","first-page":"598","DOI":"10.1108\/14684521111161954","article-title":"Relevance ranking on Google: are top ranked results really considered more relevant by the users?","volume":"35","year":"2011","journal-title":"Online Information Review"},{"issue":"1","key":"key2020120502295871100_ref015","doi-asserted-by":"crossref","first-page":"192","DOI":"10.1111\/j.1467-985X.2004.298_11.x","article-title":"The elements of statistical learning","volume":"167","year":"2004","journal-title":"Journal of the Royal Statistical Society"},{"first-page":"168","article-title":"Mining and summarizing customer 
reviews","year":"2004","key":"key2020120502295871100_ref016"},{"key":"key2020120502295871100_ref017","unstructured":"iResearch Consulting Group (2015), \u201ceCommerce in 2015Q2\u201d, available at: www.iresearch.com.cn\/data\/256178.html (accessed April 26, 2016)."},{"issue":"8","key":"key2020120502295871100_ref018","doi-asserted-by":"crossref","first-page":"344","DOI":"10.1016\/j.im.2011.09.005","article-title":"A feedback control approach to maintain consumer information load in online shopping environments","volume":"48","year":"2011","journal-title":"Information & Management"},{"key":"key2020120502295871100_ref019","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1007\/978-3-642-15883-4_13","article-title":"Efficient confident search in large review corpora","volume":"6322","year":"2010","journal-title":"Lecture Notes in Computer Science"},{"first-page":"832","article-title":"Selecting a characteristic set of reviews","year":"2012","key":"key2020120502295871100_ref020"},{"issue":"3","key":"key2020120502295871100_ref021","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1108\/IntR-11-2013-0238","article-title":"The role of online product reviews on information adoption of new product development professionals","volume":"25","year":"2015","journal-title":"Internet Research"},{"issue":"1","key":"key2020120502295871100_ref022","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1177\/1938965512464513","article-title":"An analysis of one-star online reviews and responses in the Washington, DC, lodging market","volume":"54","year":"2013","journal-title":"Cornell Hospitality Quarterly"},{"volume-title":"Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data","year":"2007","key":"key2020120502295871100_ref023"},{"first-page":"1085","article-title":"CRO: a system for online review structurization","year":"2008","key":"key2020120502295871100_ref024"},{"first-page":"691","article-title":"Exploiting social context for review quality 
prediction","year":"2010","key":"key2020120502295871100_ref025"},{"issue":"4","key":"key2020120502295871100_ref026","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1108\/17410391111148567","article-title":"A combined measure for representative information retrieval in enterprise information systems","volume":"24","year":"2011","journal-title":"Journal of Enterprise Information Management"},{"issue":"7","key":"key2020120502295871100_ref027","doi-asserted-by":"crossref","first-page":"1090","DOI":"10.1108\/OIR-11-2015-0373","article-title":"The influence of EWOM characteristics on online repurchase intention: mediating roles of trust and perceived usefulness","volume":"40","year":"2016","journal-title":"Online Information Review"},{"issue":"11","key":"key2020120502295871100_ref028","doi-asserted-by":"crossref","first-page":"2288","DOI":"10.1002\/asi.21400","article-title":"Fine-grained opinion mining by integrating multiple review sources","volume":"61","year":"2010","journal-title":"Journal of American Society for Information Science and Technology"},{"first-page":"531","article-title":"The tagAdvisor: luring the lurkers to review web items","year":"2015","key":"key2020120502295871100_ref029"},{"issue":"6","key":"key2020120502295871100_ref030","doi-asserted-by":"crossref","first-page":"1133","DOI":"10.1016\/j.ijinfomgt.2016.02.010","article-title":"Are customers\u2019 reviews creating value in the hospitality industry? 
Exploring the moderating effects of market positioning","volume":"36","year":"2016","journal-title":"International Journal of Information Management"},{"first-page":"1067","article-title":"Using micro-reviews to select an efficient set of reviews","year":"2013","key":"key2020120502295871100_ref031"},{"issue":"4","key":"key2020120502295871100_ref032","doi-asserted-by":"crossref","first-page":"1098","DOI":"10.1109\/TKDE.2014.2356456","article-title":"Review selection using micro-reviews","volume":"27","year":"2015","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"first-page":"516","article-title":"When more is less: the paradox of choice in search engine use","year":"2009","key":"key2020120502295871100_ref033"},{"first-page":"338","article-title":"Finding representative set from massive data","year":"2005","key":"key2020120502295871100_ref034"},{"issue":"4","key":"key2020120502295871100_ref035","doi-asserted-by":"crossref","first-page":"598","DOI":"10.1016\/j.jretai.2011.05.002","article-title":"Born unequal: a study of the helpfulness of user-generated product reviews","volume":"87","year":"2011","journal-title":"Journal of Retailing"},{"first-page":"79","article-title":"Thumbs up? 
Sentiment classification using machine learning techniques","year":"2002","key":"key2020120502295871100_ref036"},{"issue":"4","key":"key2020120502295871100_ref037","doi-asserted-by":"crossref","first-page":"386","DOI":"10.1016\/j.elerap.2007.11.004","article-title":"eWOM overload and its effect on consumer behavioral intention depending on consumer involvement","volume":"7","year":"2008","journal-title":"Electronic Commerce Research and Applications"},{"issue":"8","key":"key2020120502295871100_ref038","doi-asserted-by":"crossref","first-page":"951","DOI":"10.1016\/j.im.2016.06.002","article-title":"Mining customer requirements from online reviews: a product improvement perspective","volume":"53","year":"2016","journal-title":"Information & Management"},{"first-page":"959","article-title":"Hidden sentiment association in Chinese web opinion mining","year":"2008","key":"key2020120502295871100_ref039"},{"first-page":"168","article-title":"Selecting a comprehensive set of reviews","year":"2011","key":"key2020120502295871100_ref040"},{"issue":"1","key":"key2020120502295871100_ref041","doi-asserted-by":"crossref","first-page":"62","DOI":"10.1108\/OIR-03-2015-0073","article-title":"Discovering shilling groups in a real e-commerce platform","volume":"40","year":"2016","journal-title":"Online Information Review"},{"issue":"6","key":"key2020120502295871100_ref042","doi-asserted-by":"crossref","first-page":"831","DOI":"10.1108\/OIR-01-2015-0033","article-title":"Online word-of-mouth as a predictor of television rating","volume":"39","year":"2015","journal-title":"Online Information Review"},{"issue":"6","key":"key2020120502295871100_ref043","doi-asserted-by":"crossref","first-page":"928","DOI":"10.1109\/TNNLS.2012.2193415","article-title":"Extracting representative information to enhance flexible data queries","volume":"23","year":"2012","journal-title":"IEEE Transactions on Neural Networks and Learning 
Systems"},{"issue":"3","key":"key2020120502295871100_ref044","doi-asserted-by":"crossref","first-page":"762","DOI":"10.1016\/j.joi.2016.05.003","article-title":"Finding a representative subset from large-scale documents","volume":"10","year":"2016","journal-title":"Journal of Informetrics"},{"key":"key2020120502295871100_ref045","doi-asserted-by":"crossref","first-page":"825","DOI":"10.1016\/j.ins.2014.03.017","article-title":"A heuristic approach for \u03bb-representative information retrieval from large-scale data","volume":"277","year":"2014","journal-title":"Information Sciences"}],"container-title":["Online Information Review"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/OIR-05-2016-0125\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/OIR-05-2016-0125\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T22:42:59Z","timestamp":1753396979000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/oir\/article\/41\/6\/877-899\/315230"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,10,9]]},"references-count":45,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2017,10,9]]}},"alternative-id":["10.1108\/OIR-05-2016-0125"],"URL":"https:\/\/doi.org\/10.1108\/oir-05-2016-0125","relation":{},"ISSN":["1468-4527"],"issn-type":[{"type":"print","value":"1468-4527"}],"subject":[],"published":{"date-parts":[[2017,10,9]]}}}