{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,2]],"date-time":"2025-08-02T17:34:42Z","timestamp":1754156082597,"version":"3.41.2"},"reference-count":33,"publisher":"Emerald","issue":"4","license":[{"start":{"date-parts":[[2019,10,7]],"date-time":"2019-10-07T00:00:00Z","timestamp":1570406400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IJWIS"],"published-print":{"date-parts":[[2019,10,7]]},"abstract":"<jats:sec><jats:title content-type=\"abstract-subheading\">Purpose<\/jats:title><jats:p>Data crawling in e-commerce for market research often come with the risk of poor authenticity due to modification attacks. The purpose of this paper is to propose a novel data authentication model for such systems.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Design\/methodology\/approach<\/jats:title><jats:p>The data modification problem requires careful examinations in which the data are re-collected to verify their reliability by overlapping the two datasets. This approach is to use different anomaly detection techniques to determine which data are potential for frauds and to be re-collected. The paper also proposes a data selection model using their weights of importance in addition to anomaly detection. The target is to significantly reduce the amount of data in need of verification, but still guarantee that they achieve their high authenticity. Empirical experiments are conducted with real-world datasets to evaluate the efficiency of the proposed scheme.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Findings<\/jats:title><jats:p>The authors examine several techniques for detecting anomalies in the data of users and products, which give the accuracy of 80 per cent approximately. The integration with the weight selection model is also proved to be able to detect more than 80 per cent of the existing fraudulent ones while being careful not to accidentally include ones which are not, especially when the proportion of frauds is high.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Originality\/value<\/jats:title><jats:p>With the rapid development of e-commerce fields, fraud detection on their data, as well as in Web crawling systems is new and necessary for research. This paper contributes a novel approach in crawling systems data authentication problem which has not been studied much.<\/jats:p><\/jats:sec>","DOI":"10.1108\/ijwis-10-2018-0075","type":"journal-article","created":{"date-parts":[[2019,6,3]],"date-time":"2019-06-03T04:36:36Z","timestamp":1559536596000},"page":"454-473","source":"Crossref","is-referenced-by-count":5,"title":["On verifying the authenticity of e-commercial crawling data by a semi-crosschecking method"],"prefix":"10.1108","volume":"15","author":[{"given":"Tran Khanh","family":"Dang","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Duc Minh Chau","family":"Pham","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Duc Dan","family":"Ho","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"140","reference":[{"key":"key2019092010555477600_ref001","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1155\/2014\/614342","article-title":"Comparison of ARIMA and artificial neural networks models for stock price prediction","volume":"2014","year":"2014","journal-title":"Journal of Applied Mathematics"},{"volume-title":"Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques: A Guide to Data Science for Fraud Detection","year":"2015","key":"key2019092010555477600_ref002"},{"key":"key2019092010555477600_ref003","doi-asserted-by":"crossref","first-page":"1827","DOI":"10.1016\/S2212-5671(15)01485-9","article-title":"Detecting and preventing fraud with data analytics","volume":"32","year":"2015","journal-title":"Procedia Economics and Finance"},{"issue":"2","key":"key2019092010555477600_ref004","first-page":"779","article-title":"A bayesian dichotomous model with asymmetric link for fraud in insurance","volume":"42","year":"2008","journal-title":"Insurance: Mathematics and Economics"},{"issue":"3","key":"key2019092010555477600_ref005","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1007\/BF00994018","article-title":"Support-vector networks","volume":"20","year":"1995","journal-title":"Machine Learning"},{"key":"key2019092010555477600_ref006","first-page":"313","article-title":"A NoSQL data-based personalized recommendation system for C2C e-commerce","volume-title":"International Conference on Database and Expert Systems Applications","year":"2017"},{"issue":"5","key":"key2019092010555477600_ref007","doi-asserted-by":"publisher","first-page":"1","DOI":"10.9734\/AIR\/2016\/24175","article-title":"Time-series modeling and short term prediction of annual temperature trend on Coast Libya using the Box-Jenkins ARIMA model","volume":"6","year":"2016","journal-title":"Advances in Research"},{"issue":"2","key":"key2019092010555477600_ref008","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1080\/07350015.1993.10509952","article-title":"Arima processes with arima parameters","volume":"11","year":"1993","journal-title":"Journal of Business and Economic Statistics"},{"issue":"1","key":"key2019092010555477600_ref009","article-title":"A solution for preventing fraudulent financial reporting using descriptive data mining techniques","volume":"58","year":"2012","journal-title":"International Journal of Computer Applications"},{"key":"key2019092010555477600_ref031","first-page":"350","article-title":"Bayes classification methods","volume-title":"Data Mining: Concepts and Techniques","year":"2011"},{"key":"key2019092010555477600_ref010","first-page":"137","article-title":"Text categorization with support vector machines: learning with many relevant features","volume-title":"European Conference on Machine Learning","year":"1998"},{"issue":"4","key":"key2019092010555477600_ref011","doi-asserted-by":"crossref","first-page":"995","DOI":"10.1016\/j.eswa.2006.02.016","article-title":"Data mining techniques for the detection of fraudulent financial statements","volume":"32","year":"2007","journal-title":"Expert Systems with Applications"},{"issue":"2","key":"key2019092010555477600_ref032","first-page":"372","article-title":"Neural network model vs. SARIMA model in forecasting Korean stock price index (KOSPI)","volume":"8","year":"2007","journal-title":"Issues in Information System"},{"issue":"5","key":"key2019092010555477600_ref012","doi-asserted-by":"crossref","first-page":"351","DOI":"10.1016\/j.elerap.2015.05.001","article-title":"A secure m-commerce system based on credit card transaction","volume":"14","year":"2015","journal-title":"Electronic Commerce Research and Applications"},{"volume-title":"Introduction to Information Retrieval","year":"2008","key":"key2019092010555477600_ref013"},{"issue":"2","key":"key2019092010555477600_ref014","first-page":"23","article-title":"A comparison between hybrid approaches of ann and arima for Indian stock trend forecasting","volume":"3","year":"2010","journal-title":"Business Intelligence Journal"},{"issue":"3","key":"key2019092010555477600_ref015","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1007\/s10660-013-9120-5","article-title":"Enhanced security in internet voting protocol using blind signature and dynamic ballots","volume":"13","year":"2013","journal-title":"Electronic Commerce Research"},{"first-page":"55","article-title":"A survey of focused web crawling algorithms","year":"2004","key":"key2019092010555477600_ref016"},{"issue":"4","key":"key2019092010555477600_ref017","doi-asserted-by":"crossref","first-page":"192","DOI":"10.1108\/02686900210424358","article-title":"An empirical analysis of the likelihood of detecting fraud in New Zealand","volume":"17","year":"2002","journal-title":"Managerial Auditing Journal"},{"issue":"5","key":"key2019092010555477600_ref018","doi-asserted-by":"crossref","first-page":"838","DOI":"10.1002\/bimj.201300149","article-title":"Outlier detection method in GEEs","volume":"56","year":"2014","journal-title":"Biometrical Journal. Biometrische Zeitschrift"},{"issue":"3","key":"key2019092010555477600_ref019","doi-asserted-by":"crossref","first-page":"1215","DOI":"10.2466\/pr0.1994.75.3.1215","article-title":"Professors\u2019 interactional attributes: how do they relate to one another?","volume":"75","year":"1994","journal-title":"Psychological Reports"},{"issue":"8","key":"key2019092010555477600_ref020","doi-asserted-by":"crossref","first-page":"7054","DOI":"10.1166\/asl.2017.9288","article-title":"Financial statement fraud detection using published data based on fraud triangle theory","volume":"23","year":"2017","journal-title":"Advanced Science Letters"},{"issue":"22","key":"key2019092010555477600_ref021","first-page":"41","article-title":"An empirical study of the naive bayes classifier","volume":"3","year":"2001","journal-title":"IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence"},{"issue":"5","key":"key2019092010555477600_ref022","doi-asserted-by":"publisher","first-page":"513","DOI":"10.1016\/0306-4573(88)90021-0","article-title":"Term-weighting approaches in automatic text retrieval","volume":"24","year":"1988","journal-title":"Inf. Process. Manage"},{"issue":"2","key":"key2019092010555477600_ref023","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1007\/s10514-015-9431-6","article-title":"Improving multi-modal data fusion by anomaly detection","volume":"39","year":"2015","journal-title":"Autonomous Robots"},{"issue":"12","key":"key2019092010555477600_ref024","doi-asserted-by":"publisher","first-page":"4011","DOI":"10.1016\/j.jspi.2007.04.018","article-title":"Weighted dickey-fuller processes for detecting stationarity","volume":"137","year":"2007","journal-title":"Journal of Statistical Planning and Inference"},{"key":"key2019092010555477600_ref025","first-page":"32","article-title":"A cross-checking based method for fraudulent detection on e-commercial crawling data","volume-title":"Advanced Computing and Applications (ACOMP)2016 International Conference on","year":"2016"},{"issue":"5","key":"key2019092010555477600_ref027","doi-asserted-by":"crossref","first-page":"612","DOI":"10.1109\/TKDE.2004.1277822","article-title":"A case study of applying boosting naive bayes to claim fraud diagnosis","volume":"16","year":"2004","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"issue":"3","key":"key2019092010555477600_ref026","doi-asserted-by":"crossref","first-page":"653","DOI":"10.1016\/j.eswa.2005.04.030","article-title":"Auto claim fraud detection using bayesian learning neural networks","volume":"29","year":"2005","journal-title":"Expert Systems with Applications"},{"issue":"1","key":"key2019092010555477600_ref028","doi-asserted-by":"publisher","first-page":"1","DOI":"10.14445\/22312803\/IJCTT-V21P101","article-title":"Web crawl detection and analysis of semantic data","volume":"21","year":"2015","journal-title":"International Journal of Computer Trends and Technology"},{"first-page":"412","article-title":"A comparative study on feature selection in text categorization","year":"1997","key":"key2019092010555477600_ref029"},{"key":"key2019092010555477600_ref030","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1016\/j.elerap.2016.03.003","article-title":"An analysis of popularity information effects: field experiments in an online marketplace","volume":"17","year":"2016","journal-title":"Electronic Commerce Research and Applications"},{"issue":"2","key":"key2019092010555477600_ref033","first-page":"322","article-title":"The effects of manager compensation and market competition on financial fraud in public companies: an empirical study in China","volume":"25","year":"2008","journal-title":"International Journal of Management"}],"container-title":["International Journal of Web Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/IJWIS-10-2018-0075\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/IJWIS-10-2018-0075\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T22:24:20Z","timestamp":1753395860000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/ijwis\/article\/15\/4\/454-473\/163730"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,10,7]]},"references-count":33,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2019,10,7]]}},"alternative-id":["10.1108\/IJWIS-10-2018-0075"],"URL":"https:\/\/doi.org\/10.1108\/ijwis-10-2018-0075","relation":{},"ISSN":["1744-0084","1744-0084"],"issn-type":[{"type":"print","value":"1744-0084"},{"type":"print","value":"1744-0084"}],"subject":[],"published":{"date-parts":[[2019,10,7]]}}}