{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,30]],"date-time":"2025-10-30T22:41:31Z","timestamp":1761864091657,"version":"3.37.3"},"reference-count":45,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,2,28]],"date-time":"2020-02-28T00:00:00Z","timestamp":1582848000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,2,28]],"date-time":"2020-02-28T00:00:00Z","timestamp":1582848000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Big Data"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Launching new products in the consumer electronics market is challenging. Developing and marketing the same in limited time affect the sustainability of such companies. This research work introduces a model that can predict the success of a product. A Feature Information Gain (FIG) measure is used for significant feature identification and Distributed Memory-based Resilient Dataset Filter (DMRDF) is used to eliminate duplicate reviews, which in turn improves the reliability of the product reviews. The pre-processed dataset is used for prediction of product pre-launch in the market using classifiers such as Logistic regression and Support vector machine. DMRDF method is fault-tolerant because of its resilience property and also reduces the dataset redundancy; hence, it increases the prediction accuracy of the model. The proposed model works in a distributed environment to handle a massive volume of the dataset and therefore, it is scalable. The output of this feature modelling and prediction allows the manufacturer to optimize the design of his new product.<\/jats:p>","DOI":"10.1186\/s40537-020-00292-y","type":"journal-article","created":{"date-parts":[[2020,2,28]],"date-time":"2020-02-28T13:02:45Z","timestamp":1582894965000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Improving prediction with enhanced Distributed Memory-based Resilient Dataset Filter"],"prefix":"10.1186","volume":"7","author":[{"given":"Sandhya","family":"Narayanan","sequence":"first","affiliation":[]},{"given":"Philip","family":"Samuel","sequence":"additional","affiliation":[]},{"given":"Mariamma","family":"Chacko","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,2,28]]},"reference":[{"issue":"4","key":"292_CR1","first-page":"25","volume":"2","author":"RY Lau","year":"2011","unstructured":"Lau RY, Liao SY, Kwok RC, Xu K, Xia Y, Li Y. Text mining and probabilistic modeling for online review spam detection. ACM Trans Manag Inform Syst. 2011;2(4):25.","journal-title":"ACM Trans Manag Inform Syst"},{"key":"292_CR2","doi-asserted-by":"publisher","first-page":"190","DOI":"10.1016\/j.ijinfomgt.2016.06.006","volume":"37","author":"X Lin","year":"2017","unstructured":"Lin X, Li Y, Wang X. Social commerce research: definition, research themes and the trends. Int J Inform Manag. 2017;37:190\u2013201.","journal-title":"Int J Inform Manag"},{"issue":"4","key":"292_CR3","doi-asserted-by":"publisher","first-page":"578","DOI":"10.1007\/s11747-008-0121-1","volume":"36","author":"CAD Matos","year":"2008","unstructured":"Matos CAD, Rossi CAV. Word-of-mouth communications in marketing: a meta-analytic review of the antecedents and moderators. J Acad Market Sci. 2008;36(4):578\u201396.","journal-title":"J Acad Market Sci."},{"key":"292_CR4","first-page":"427","volume":"7.4","author":"S Jeon","year":"2013","unstructured":"Jeon S, et al. Redundant data removal technique for efficient big data search processing. Int J Softw Eng Appl. 2013;7.4:427\u201336.","journal-title":"Int J Softw Eng Appl."},{"key":"292_CR5","doi-asserted-by":"crossref","unstructured":"Dave K, Lawrence S, and Pennock D. Mining the peanut gallery: opinion extraction and semantic classification of product reviews. WWW\u20192003.","DOI":"10.1145\/775152.775226"},{"key":"292_CR6","doi-asserted-by":"publisher","unstructured":"Zhou Y, Wilkinson D, Schreiber R, Pan R. Large-scale parallel collaborative filtering for the netflix prize. 2008. p. 337\u201348. https:\/\/doi.org\/10.1007\/978-3-540-68880-8_32.","DOI":"10.1007\/978-3-540-68880-8_32"},{"key":"292_CR7","doi-asserted-by":"publisher","first-page":"95","DOI":"10.1016\/j.dss.2016.04.001","volume":"86","author":"KZK Zhang","year":"2016","unstructured":"Zhang KZK, Benyoucef M. Consumer behavior in social commerce: a literature review. Dec Support Syst. 2016;86:95\u2013108.","journal-title":"Dec Support Syst."},{"issue":"1","key":"292_CR8","doi-asserted-by":"publisher","first-page":"39","DOI":"10.2753\/JEC1086-4415170102","volume":"17","author":"Geng Cui","year":"2012","unstructured":"Cui Geng, Lui Hon-Kwong, Guo Xiaoning. The effect of online consumer reviews on new product sales. Int J Electron Comm. 2012;17(1):39\u201358.","journal-title":"Int J Electron Comm"},{"issue":"3","key":"292_CR9","doi-asserted-by":"publisher","first-page":"171","DOI":"10.1007\/s11280-015-0381-x","volume":"5","author":"AS Manek","year":"2016","unstructured":"Manek AS, Shenoy PD, Mohan MC, et al. Detection of fraudulent and malicious websites by analysing user reviews for online shopping websites. Int J Knowl Web Intell. 2016;5(3):171\u201389. https:\/\/doi.org\/10.1007\/s11280-015-0381-x.","journal-title":"Int J Knowl Web Intell."},{"key":"292_CR10","doi-asserted-by":"crossref","unstructured":"Singh S, and Singh N. Big data analytics. In: Proceedings of the 2012 international conference on communication, information & computing technology (ICCICT), institute of electrical and electronics engineers (IEEE). 2012. p. 1\u20134. http:\/\/dx.doi.org\/10.1109\/iccict.2012.6398180.","DOI":"10.1109\/ICCICT.2012.6398180"},{"key":"292_CR11","doi-asserted-by":"crossref","unstructured":"Demchenko Yuri et al. Addressing big data challenges for scientific data infrastructure. In: IEEE 4th Int. conference cloud computing technology and science (CloudCom). 2012.","DOI":"10.1109\/CloudCom.2012.6427494"},{"key":"292_CR12","unstructured":"Sihong Xie, Guan Wang, Shuyang Lin and Yu Philip S. Review spam detection via time-series pattern discovery. In: ACM Proceedings of the 21st international conference companion on World Wide Web. 2012. p. 635\u20136."},{"key":"292_CR13","doi-asserted-by":"publisher","first-page":"30","DOI":"10.1109\/MC.2009.263","volume":"8","author":"Y Koren","year":"2009","unstructured":"Koren Y, Bell R, Volinsky C. matrix factorization technique for recommender systems. Computer. 2009;8:30\u20137.","journal-title":"Computer."},{"key":"292_CR14","doi-asserted-by":"crossref","unstructured":"Salakhutdinov R, Mnih A, & Hinton G. Restricted boltzmann machines for collaborative filtering. In: Proc. of the 24th Int. conference on machine learning. 2007. p. 791\u20138.","DOI":"10.1145\/1273496.1273596"},{"issue":"3","key":"292_CR15","first-page":"29","volume":"2","author":"MA Hao","year":"2011","unstructured":"Hao MA, King I, Lyu MR. Learning to recommend with explicit and implicit social relations. ACM Trans Intell Syst Technol. 2011;2(3):29.","journal-title":"ACM Trans Intell Syst Technol."},{"issue":"2","key":"292_CR16","first-page":"310","volume":"2","author":"V Bandakkanavar","year":"2014","unstructured":"Bandakkanavar V, Ramesh M, Geeta V. A survey on detection of reviews using sentiment classification of methods. IJRITCC. 2014;2(2):310\u20134.","journal-title":"IJRITCC"},{"key":"292_CR17","doi-asserted-by":"publisher","unstructured":"Gu V, and Li H. Memory or time\u2014performance evaluation for iterative operation on hadoop and spark. In: Proc. of the 2013 IEEE 10th Int. Con. on high-performance computing and communications. 2013. https:\/\/doi.org\/10.1109\/hpcc.and.euc.2013.106.","DOI":"10.1109\/hpcc.and.euc.2013.106"},{"issue":"2","key":"292_CR18","doi-asserted-by":"publisher","first-page":"185","DOI":"10.1016\/j.im.2018.05.001","volume":"56","author":"Hanpeng Zhang","year":"2019","unstructured":"Zhang Hanpeng, Wang Zhaohua, Chen Shengjun, Guo Chengqi. Product recommendation in online social networking communities\u2014an empirical study of antecedents and a mediator. J Inform Manag. 2019;56(2):185\u201395.","journal-title":"J Inform Manag"},{"key":"292_CR19","doi-asserted-by":"crossref","unstructured":"Ghose A, Ipeirotis PG. Designing novel review ranking systems: predicting the usefulness and impact of reviews. In: Int Conference Electron Comm ACM. 2007. p. 303\u201310.","DOI":"10.1145\/1282100.1282158"},{"key":"292_CR20","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1080\/00207543.2015.1066519","volume":"55","author":"AY Chong","year":"2015","unstructured":"Chong AY, Ch\u2019ng E, Liu MJ, Li B. Predicting consumer product demands via Big Data: the roles of online promotional marketing and online reviews. Int J Prod Res. 2015;55:1\u201315. https:\/\/doi.org\/10.1080\/00207543.2015.1066519.","journal-title":"Int J Prod Res."},{"key":"292_CR21","doi-asserted-by":"publisher","unstructured":"Yang H, Fujimaki R, Kusumura Y, & Liu J. Online Feature Selection. In: Proceedings of the 22nd ACM SIGKDD Int. Conference on KDD \u201816, 2016.\u00a0https:\/\/doi.org\/10.1145\/2939672.2939881.","DOI":"10.1145\/2939672.2939881"},{"key":"292_CR22","unstructured":"Breese JS, Heckerman D, and Kadie C. Empirical analysis of predictive algorithms for collaborative filtering. In: Proc. of the 14th Conf. on Uncertainty in Artifical Intelligence, 1998."},{"key":"292_CR23","doi-asserted-by":"crossref","unstructured":"Mukherjee A, Kumar A, Liu B, Wang J, Hsu M, Castellanos M, Ghosh R. Spotting opinion spammers using behavioral footprints. In: Proc. of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining Chicago, ACM. 2013. p. 632\u201340.","DOI":"10.1145\/2487575.2487580"},{"issue":"3","key":"292_CR24","doi-asserted-by":"publisher","first-page":"e0194889","DOI":"10.1371\/journal.pone.0194889","volume":"13","author":"S Makridakis","year":"2018","unstructured":"Makridakis S, Spiliotis E, Assimakopoulos V. Statistical and Machine Learning forecasting methods: concerns and ways forward. PLoS ONE. 2018;13(3):e0194889. https:\/\/doi.org\/10.1371\/journal.pone.0194889.","journal-title":"PLoS ONE"},{"key":"292_CR25","doi-asserted-by":"publisher","DOI":"10.18187\/pjsor.v8i3.535","author":"A Imon","year":"2012","unstructured":"Imon A, Roy C, Manos C, Bhattacharjee S. Prediction of rainfall using logistic regression. Pak J Stat Oper Res. 2012. https:\/\/doi.org\/10.18187\/pjsor.v8i3.535.","journal-title":"Pak J Stat Oper Res."},{"issue":"1","key":"292_CR26","first-page":"3619","volume":"13","author":"T Chen","year":"2012","unstructured":"Chen T, Zhang W, Lu Q, Chen K, Zheng Z, Yu Y. SVD Feature: a toolkit for feature-based collaborative filtering. J Mach Learn Res. 2012;13(1):3619\u201322.","journal-title":"J Mach Learn Res"},{"issue":"1","key":"292_CR27","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1145\/2556270","volume":"47","author":"Y Shi","year":"2014","unstructured":"Shi Y, Larson M, Hanjalic A. Collaborative filtering beyond the user-item matrix\u2014a survey of the state of art and future challenges. ACM Comput Surv. 2014;47(1):3.","journal-title":"ACM Comput Surv"},{"key":"292_CR28","doi-asserted-by":"crossref","unstructured":"Shan H, & Banerjee A. Generalized probabilistic matrix factorizations for collaborative filtering, In Data mining (ICDM), IEEE 10th international conference. 2010. p. 1025\u201330.","DOI":"10.1109\/ICDM.2010.116"},{"key":"292_CR29","doi-asserted-by":"crossref","unstructured":"Salakhutdinov R, & Mnih A. Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In: Proc. of the 25th int. conference on machine learning. 2008. p. 880\u20137.","DOI":"10.1145\/1390156.1390267"},{"issue":"1","key":"292_CR30","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1186\/s40537-015-0029-9","volume":"2","author":"M Crawford","year":"2015","unstructured":"Crawford M, Khoshgoftaar TM, Prusa JD, Richter AN, Al Najada H. Survey of review spam detection using machine learning techniques. J Big Data. 2015;2(1):23.","journal-title":"J Big Data."},{"key":"292_CR31","first-page":"15","volume-title":"Product reviews in mobile decision aid systems","author":"TA Wietsma","year":"2005","unstructured":"Wietsma TA, Ricci F. Product reviews in mobile decision aid systems. Francesco: PERMID; 2005. p. 15\u20138."},{"key":"292_CR32","doi-asserted-by":"publisher","first-page":"124","DOI":"10.1016\/j.ins.2018.01.001","volume":"435","author":"C Jianguo","year":"2018","unstructured":"Jianguo C, et al. A disease diagnosis and treatment recommendation system based on big data mining and cloud computing. Inform Sci. 2018;435:124\u201349.","journal-title":"Inform Sci."},{"key":"292_CR33","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1007\/s11280-015-0381-x","volume":"20","author":"AS Manek","year":"2017","unstructured":"Manek AS, Shenoy PD, Mohan MC, Venugopal KR. Aspect term extraction for sentiment analysis in large movie reviews using Gini-index feature selection method and SVM classifier. World Wide Web. 2017;20:135\u201354. https:\/\/doi.org\/10.1007\/s11280-015-0381-x.","journal-title":"World Wide Web."},{"key":"292_CR34","first-page":"1871","volume":"9","author":"RE Fan","year":"2008","unstructured":"Fan RE, Chang K-W, Hsieh C-J, Wang X-R, Lin C-J. LIBLINEAR: A library for large linear classification. J Mach Learn Res. 2008;9:1871\u20134.","journal-title":"J Mach Learn Res"},{"key":"292_CR35","doi-asserted-by":"crossref","unstructured":"Ribeiro MT, Singh S, and Guestrin C. Why should I trust you?: Explaining the predictions of any classifier. In: Proc. ACMSIGKDD Int. Conf. Knowl. Discov. Data Mining. 2016. p. 1135\u201344.","DOI":"10.1145\/2939672.2939778"},{"issue":"4","key":"292_CR36","doi-asserted-by":"publisher","first-page":"503","DOI":"10.1109\/TSC.2016.2597829","volume":"12","author":"X Luo","year":"2019","unstructured":"Luo X, et al. An effective scheme for QoS estimation via alternating direction method-based matrix factorization. IEEE Trans Serv Comput. 2019;12(4):503\u201318.","journal-title":"IEEE Trans Serv Comput"},{"issue":"3","key":"292_CR37","doi-asserted-by":"publisher","first-page":"397","DOI":"10.1109\/TSMCC.2011.2136334","volume":"42","author":"Chien-Liang Liu","year":"2012","unstructured":"Liu CL, Hsaio WH, Lee CH, Lu GC and Jou E. Movie rating and review summarization in mobile environment. In: IEEE trans. systems, man and cybernetics, Part C: applications and reviews. 2012. p. 397\u2013407.","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews)"},{"key":"292_CR38","doi-asserted-by":"crossref","unstructured":"Vapnik, VN. The nature of statistical learning theory, Springer, 2nd ed, 1999. Translated by Xu Jianghua, Zhang Xuegong. Beijing: China Machine Press; 2000.","DOI":"10.1007\/978-1-4757-3264-1"},{"key":"292_CR39","unstructured":"[Dataset] Flipkart-products. http:\/\/www.kaggle.com\/PromptCloudHQ\/flipkart-products."},{"key":"292_CR40","unstructured":"[Dataset] https:\/\/snap.stanford.edu\/data\/web-Amazon.html."},{"key":"292_CR41","doi-asserted-by":"crossref","unstructured":"[Dataset] He R, McAuley J. Ups and downs: modeling the visual evolution of fashion trends with one-class collaborative filtering. WWW; 2016.","DOI":"10.1145\/2872427.2883037"},{"key":"292_CR42","doi-asserted-by":"crossref","unstructured":"Popescu AM, Etzioni O. Extracting product features and opinions from reviews. 2005; EMNLP.","DOI":"10.3115\/1220575.1220618"},{"key":"292_CR43","volume-title":"Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing Technical Report UCB\/EECS-2011-82","author":"M Zaharia","year":"2011","unstructured":"Zaharia M, Chowdhury M, Das T, Dave A, Ma J, McCauley M, Franklin M, Shenker S, Stoica I. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing Technical Report UCB\/EECS-2011-82. UC Berkeley: EECS Department; 2011."},{"key":"292_CR44","doi-asserted-by":"crossref","unstructured":"Davis J, Goadrich M. The relationship between precision-recall and ROC curves, In ICML. 2006. p. 233\u201340.","DOI":"10.1145\/1143844.1143874"},{"key":"292_CR45","doi-asserted-by":"publisher","DOI":"10.1016\/j.sbspro.2014.04.451","author":"JS Lee","year":"2014","unstructured":"Lee JS, Lee ES. Exploring the usefulness of predicting people\u2019s locations. Procedia Soc Beh Sci. 2014. https:\/\/doi.org\/10.1016\/j.sbspro.2014.04.451.","journal-title":"Procedia Soc Beh Sci."}],"container-title":["Journal of Big Data"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-020-00292-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s40537-020-00292-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-020-00292-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,2,27]],"date-time":"2021-02-27T00:29:16Z","timestamp":1614385756000},"score":1,"resource":{"primary":{"URL":"https:\/\/journalofbigdata.springeropen.com\/articles\/10.1186\/s40537-020-00292-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,2,28]]},"references-count":45,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["292"],"URL":"https:\/\/doi.org\/10.1186\/s40537-020-00292-y","relation":{},"ISSN":["2196-1115"],"issn-type":[{"type":"electronic","value":"2196-1115"}],"subject":[],"published":{"date-parts":[[2020,2,28]]},"assertion":[{"value":"25 October 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 February 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 February 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare that they have no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"13"}}