{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,22]],"date-time":"2026-01-22T23:55:12Z","timestamp":1769126112639,"version":"3.49.0"},"reference-count":72,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2023,7,11]],"date-time":"2023-07-11T00:00:00Z","timestamp":1689033600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000006","name":"Office of Naval Research","doi-asserted-by":"crossref","award":["N00014-18-1-2670 and N00014-20-1-2407"],"award-info":[{"award-number":["N00014-18-1-2670 and N00014-20-1-2407"]}],"id":[{"id":"10.13039\/100000006","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Web"],"published-print":{"date-parts":[[2023,11,30]]},"abstract":"<jats:p>\n            Review platforms are viral online services where users share and read opinions about products (e.g., a smartphone) or experiences (e.g., a meal at a restaurant). Other users may be influenced by such opinions when deciding what to buy. The usability of review platforms is currently limited by the massive number of opinions on many products. Therefore, showing only the most\n            <jats:italic>helpful<\/jats:italic>\n            reviews for each product is in the best interest of both users and the platform (e.g., Amazon). The current state of the art is far from accurate in predicting how helpful a review is. First, most existing works lack compelling comparisons as many studies are conducted on datasets that are not publicly available. As a consequence, new studies are not always built on top of prior baselines. 
Second, most existing research focuses only on features derived from the review text, ignoring other fundamental aspects of the review platforms (e.g., the other reviews of a product, the order in which they were submitted).\n          <\/jats:p>\n          <jats:p>In this article, we first carefully review the most relevant works in the area published during the last 20 years. We then propose the User-Review-Item (URI) paradigm, a novel abstraction for modeling the problem that moves the focus of the feature engineering from the review to the platform level. We empirically validate the URI paradigm on a dataset of products from six Amazon categories with 270 trained models: on average, classifiers gain +4% in F1-score when considering the whole review platform context. In our experiments, we further emphasize some problems with the helpfulness prediction task: (1) the users\u2019 writing style changes over time (i.e., concept drift), (2) past models do not generalize well across different review categories, and (3) past methods to generate the ground truth produced unreliable helpfulness scores, affecting the model evaluation phase.<\/jats:p>","DOI":"10.1145\/3585280","type":"journal-article","created":{"date-parts":[[2023,2,23]],"date-time":"2023-02-23T13:08:29Z","timestamp":1677157709000},"page":"1-31","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["A Novel Review Helpfulness Measure Based on the User-Review-Item Paradigm"],"prefix":"10.1145","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6749-6608","authenticated-orcid":false,"given":"Luca","family":"Pajola","sequence":"first","affiliation":[{"name":"University of Padua, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9248-5265","authenticated-orcid":false,"given":"Dongkai","family":"Chen","sequence":"additional","affiliation":[{"name":"Dartmouth College, 
USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3612-1934","authenticated-orcid":false,"given":"Mauro","family":"Conti","sequence":"additional","affiliation":[{"name":"University of Padua, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7191-0296","authenticated-orcid":false,"given":"V.S.","family":"Subrahmanian","sequence":"additional","affiliation":[{"name":"Northwestern University, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,7,11]]},"reference":[{"key":"e_1_3_2_2_2","volume-title":"Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC\u201910)","author":"Baccianella Stefano","year":"2010","unstructured":"Stefano Baccianella, Andrea Esuli, and Fabrizio Sebastiani. 2010. Sentiwordnet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining. In Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC\u201910). European Language Resources Association (ELRA)."},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.5555\/944919.944937"},{"key":"e_1_3_2_4_2","first-page":"440","volume-title":"Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics","author":"Blitzer John","year":"2007","unstructured":"John Blitzer, Mark Dredze, and Fernando Pereira. 2007. Biographies, Bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. 
440\u2013447."},{"key":"e_1_3_2_5_2","first-page":"108","volume-title":"ECML PKDD Workshop: Languages for Data Mining and Machine Learning","author":"Buitinck Lars","year":"2013","unstructured":"Lars Buitinck, Gilles Louppe, Mathieu Blondel, Fabian Pedregosa, Andreas Mueller, Olivier Grisel, Vlad Niculae, Peter Prettenhofer, Alexandre Gramfort, Jaques Grobler, Robert Layton, Jake VanderPlas, Arnaud Joly, Brian Holt, and Ga\u00ebl Varoquaux. 2013. API design for machine learning software: Experiences from the scikit-learn project. In ECML PKDD Workshop: Languages for Data Mining and Machine Learning. 108\u2013122."},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-2029"},{"key":"e_1_3_2_7_2","first-page":"602","volume-title":"Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)","author":"Chen Cen","year":"2018","unstructured":"Cen Chen, Yinfei Yang, Jun Zhou, Xiaolong Li, and Forrest Sheng Bao. 2018. Cross-domain review helpfulness prediction based on convolutional neural networks with auxiliary domain discriminators. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers). Association for Computational Linguistics, 602\u2013607."},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.2139\/ssrn.918083"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1509\/jmkr.43.3.345"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1145\/1526709.1526729"},{"key":"e_1_3_2_11_2","doi-asserted-by":"crossref","first-page":"698","DOI":"10.18653\/v1\/P18-1065","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Diaz Gerardo Ocampo","year":"2018","unstructured":"Gerardo Ocampo Diaz and Vincent Ng. 2018. 
Modeling and prediction of online product review helpfulness: A survey. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, 698\u2013708."},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0226902"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jretai.2008.04.005"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.dss.2018.06.012"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41591-018-0316-z"},{"key":"e_1_3_2_16_2","first-page":"2715","volume-title":"The World Wide Web Conference (WWW\u201919)","author":"Fan Miao","year":"2019","unstructured":"Miao Fan, Chao Feng, Lin Guo, Mingming Sun, and Ping Li. 2019. Product-aware helpfulness prediction of online reviews. In The World Wide Web Conference (WWW\u201919). Association for Computing Machinery, New York, NY, 2715\u20132721."},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbusres.2014.11.006"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/2523813"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2010.188"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1287\/mksc.1110.0653"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1002\/rob.21918"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/2872427.2883037"},{"key":"e_1_3_2_23_2","first-page":"495","volume-title":"Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201912)","author":"Hong Yu","year":"2012","unstructured":"Yu Hong, Jun Lu, Jianmin Yao, Qiaoming Zhu, and Guodong Zhou. 2012. What reviews are satisfactory: Novel features for automatic helpfulness voting. 
In Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR\u201912). Association for Computing Machinery, New York, NY, 495\u2013504."},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1031"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/1014052.1014073"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.chb.2015.01.010"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1609\/icwsm.v8i1.14550"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1145\/1341531.1341560"},{"key":"e_1_3_2_29_2","doi-asserted-by":"crossref","first-page":"132","DOI":"10.1007\/978-3-319-99073-6_7","volume-title":"European Symposium on Research in Computer Security","author":"Juuti Mika","year":"2018","unstructured":"Mika Juuti, Bo Sun, Tatsuya Mori, and N. Asokan. 2018. Stay on-topic: Generating context-specific fake restaurant reviews. In European Symposium on Research in Computer Security. Springer International Publishing, Cham, 132\u2013151."},{"key":"e_1_3_2_30_2","unstructured":"Shashank Kapadia. 2019. Topic Modeling in Python: Latent Dirichlet Allocation (LDA). 
Retrieved 2021-12-01 from https:\/\/towardsdatascience.com\/end-to-end-topic-modeling-in-python-latent-dirichlet-allocation-lda-35ce4ed6b3e0."},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.5555\/1610075.1610135"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.elerap.2011.10.003"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2014.12.044"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2013.10.034"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1145\/1506250.1506254"},{"key":"e_1_3_2_36_2","first-page":"334","volume-title":"Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL\u201907)","author":"Liu Jingjing","year":"2007","unstructured":"Jingjing Liu, Yunbo Cao, Chin-Yew Lin, Yalou Huang, and Ming Zhou. 2007. Low-quality product review detection in opinion summarization. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL\u201907). Association for Computational Linguistics, 334\u2013342."},{"key":"e_1_3_2_37_2","first-page":"443","volume-title":"IEEE International Conference on Data Mining","author":"Liu Yang","year":"2008","unstructured":"Yang Liu, Xiangji Huang, Aijun An, and Xiaohui Yu. 2008. Modeling and predicting the helpfulness of online reviews. In IEEE International Conference on Data Mining. 443\u2013452."},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cad.2012.07.008"},{"key":"e_1_3_2_39_2","article-title":"NLTK: The natural language toolkit","volume":"0205028","author":"Loper Edward","year":"2002","unstructured":"Edward Loper and Steven Bird. 2002. NLTK: The natural language toolkit. 
CoRR cs.CL\/0205028 (2002).","journal-title":"CoRR"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772761"},{"issue":"1","key":"e_1_3_2_41_2","article-title":"Prediction of helpful reviews using emotions extraction","volume":"28","author":"Martin Lionel","year":"2014","unstructured":"Lionel Martin and Pearl Pu. 2014. Prediction of helpful reviews using emotions extraction. Proceedings of the AAAI Conference on Artificial Intelligence 28, 1 (June 2014), 1551\u20131557.","journal-title":"Proceedings of the AAAI Conference on Artificial Intelligence"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.5555\/2999792.2999959"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2012.01.116"},{"key":"e_1_3_2_44_2","volume-title":"20th USENIX Security Symposium (USENIX Security\u201911)","author":"Motoyama Marti","year":"2011","unstructured":"Marti Motoyama, Damon McCoy, Kirill Levchenko, Stefan Savage, and Geoffrey M. Voelker. 2011. Dirty jobs: The role of freelance labor in web service abuse. In 20th USENIX Security Symposium (USENIX Security\u201911). USENIX Association, San Francisco, CA."},{"key":"e_1_3_2_45_2","doi-asserted-by":"crossref","first-page":"185","DOI":"10.2307\/20721420","article-title":"Research note: What makes a helpful online review? A study of customer reviews on amazon.com","author":"Mudambi Susan M.","year":"2010","unstructured":"Susan M. Mudambi and David Schuff. 2010. Research note: What makes a helpful online review? A study of customer reviews on amazon.com. MIS Quarterly 34, 1 (2010), 185\u2013200.","journal-title":"MIS Quarterly"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.dss.2014.01.011"},{"key":"e_1_3_2_47_2","first-page":"305","volume-title":"Proceedings of the 3rd ACM Conference on Recommender Systems (RecSys\u201909)","author":"O\u2019Mahony Michael P.","year":"2009","unstructured":"Michael P. O\u2019Mahony and Barry Smyth. 2009. 
Learning to recommend helpful hotel reviews. In Proceedings of the 3rd ACM Conference on Recommender Systems (RecSys\u201909). Association for Computing Machinery, New York, NY, 305\u2013308."},{"key":"e_1_3_2_48_2","first-page":"164","volume-title":"Adaptivity, Personalization and Fusion of Heterogeneous Information (RIAO\u201910)","author":"O\u2019Mahony Michael P.","year":"2010","unstructured":"Michael P. O\u2019Mahony and Barry Smyth. 2010. Using readability tests to predict helpful product reviews. In Adaptivity, Personalization and Fusion of Heterogeneous Information (RIAO\u201910). Le Centre De Hautes Etudes Internationales D\u2019Informatique Documentaire, Paris, FRA, 164\u2013167."},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jretai.2011.05.002"},{"key":"e_1_3_2_50_2","volume-title":"The Development and Psychometric Properties of LIWC2015","author":"Pennebaker James W.","year":"2015","unstructured":"James W. Pennebaker, Ryan L. Boyd, Kayla Jordan, and Kate Blackburn. 2015. The Development and Psychometric Properties of LIWC2015. Technical Report."},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1162"},{"key":"e_1_3_2_52_2","unstructured":"Marketplace Pulse. 2022. Amazon Number of Prime Members. Retrieved 2022-04-08 from https:\/\/www.marketplacepulse.com\/stats\/amazon\/amazon-number-of-prime-members-50."},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.dss.2012.08.020"},{"key":"e_1_3_2_54_2","doi-asserted-by":"crossref","first-page":"836","DOI":"10.1007\/978-3-030-45439-5_55","article-title":"An attention model of customer expectation to improve review helpfulness prediction","volume":"12035","author":"Qu Xianshan","year":"2020","unstructured":"Xianshan Qu, Xiaopeng Li, Csilla Farkas, and John Rose. 2020. An attention model of customer expectation to improve review helpfulness prediction. 
Advances in Information Retrieval 12035 (2020), 836.","journal-title":"Advances in Information Retrieval"},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1145\/3340531.3412691"},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.elerap.2012.06.003"},{"key":"e_1_3_2_57_2","first-page":"45","volume-title":"Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks","author":"\u0158eh\u016f\u0159ek Radim","year":"2010","unstructured":"Radim \u0158eh\u016f\u0159ek and Petr Sojka. 2010. Software framework for topic modelling with large corpora. In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks. ELRA, 45\u201350."},{"key":"e_1_3_2_58_2","article-title":"DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter","author":"Sanh Victor","year":"2019","unstructured":"Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf. 2019. DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019).","journal-title":"arXiv preprint arXiv:1910.01108"},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","DOI":"10.1177\/0539018405058216"},{"issue":"1","key":"e_1_3_2_60_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1080\/14792779143000006","article-title":"The linguistic category model, its bases, applications and range","volume":"2","author":"Semin G\u00fcn R.","year":"1991","unstructured":"G\u00fcn R. Semin and Klaus Fiedler. 1991. The linguistic category model, its bases, applications and range. European Review of Social Psychology 2, 1 (1991), 1\u201330.","journal-title":"European Review of Social Psychology"},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbusres.2016.08.008"},{"key":"e_1_3_2_62_2","unstructured":"Statista. 2022. Total Number of User Reviews and Opinions on Tripadvisor Worldwide from 2014 to 2021. 
Retrieved 2022-03-11 from https:\/\/www.statista.com\/statistics\/684862\/tripadvisor-number-of-reviews\/."},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.1002\/bs.3830070412"},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.1145\/2507157.2507183"},{"key":"e_1_3_2_65_2","doi-asserted-by":"publisher","DOI":"10.1002\/asi.21662"},{"issue":"1","key":"e_1_3_2_66_2","doi-asserted-by":"crossref","first-page":"154","DOI":"10.1609\/icwsm.v3i1.13945","article-title":"RevRank: A fully unsupervised algorithm for selecting the most helpful book reviews","volume":"3","author":"Tsur Oren","year":"2009","unstructured":"Oren Tsur and Ari Rappoport. 2009. RevRank: A fully unsupervised algorithm for selecting the most helpful book reviews. Proceedings of the International AAAI Conference on Web and Social Media 3, 1 (March 2009), 154\u2013161.","journal-title":"Proceedings of the International AAAI Conference on Web and Social Media"},{"issue":"2","key":"e_1_3_2_67_2","first-page":"58","article-title":"The problem of concept drift: Definitions and related work","volume":"106","author":"Tsymbal Alexey","year":"2004","unstructured":"Alexey Tsymbal. 2004. The problem of concept drift: Definitions and related work. 
Computer Science Department, Trinity College Dublin 106, 2 (2004), 58.","journal-title":"Computer Science Department, Trinity College Dublin"},{"key":"e_1_3_2_68_2","doi-asserted-by":"publisher","DOI":"10.1145\/2187836.2187928"},{"key":"e_1_3_2_69_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},{"key":"e_1_3_2_70_2","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P15-2007"},{"key":"e_1_3_2_71_2","doi-asserted-by":"publisher","DOI":"10.1145\/3133956.3133990"},{"key":"e_1_3_2_72_2","doi-asserted-by":"publisher","DOI":"10.1145\/1183614.1183626"},{"key":"e_1_3_2_73_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.dss.2016.09.016"}],"container-title":["ACM Transactions on the Web"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3585280","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3585280","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:37:56Z","timestamp":1750178276000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3585280"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,11]]},"references-count":72,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,11,30]]}},"alternative-id":["10.1145\/3585280"],"URL":"https:\/\/doi.org\/10.1145\/3585280","relation":{},"ISSN":["1559-1131","1559-114X"],"issn-type":[{"value":"1559-1131","type":"print"},{"value":"1559-114X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,11]]},"assertion":[{"value":"2022-07-06","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-01-27","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2023-07-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}