{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T03:37:33Z","timestamp":1760240253120,"version":"build-2065373602"},"reference-count":80,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2019,4,19]],"date-time":"2019-04-19T00:00:00Z","timestamp":1555632000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Customer Relationship Management (CRM) is a fundamental tool in the hospitality industry nowadays, which can be seen as a big-data scenario due to the large amount of recordings which are annually handled by managers. Data quality is crucial for the success of these systems, and one of the main issues to be solved by businesses in general and by hospitality businesses in particular in this setting is the identification of duplicated customers, which has not received much attention in recent literature, probably and partly because it is not an easy-to-state problem in statistical terms. In the present work, we address the problem statement of duplicated customer identification as a large-scale data analysis, and we propose and benchmark a general-purpose solution for it. Our system consists of four basic elements: (a) A generic feature representation for the customer fields in a simple table-shape database; (b) An efficient distance for comparison among feature values, in terms of the Wagner-Fischer algorithm to calculate the Levenshtein distance; (c) A big-data implementation using basic map-reduce techniques to readily support the comparison of strategies; (d) An X-from-M criterion to identify those possible neighbors to a duplicated-customer candidate. 
We analyze the mass density function of the distances in the CRM text-based fields and characterize their behavior and consistency in terms of the entropy and of the mutual information for these fields. Our experiments in a large CRM from a multinational hospitality chain show that the distance distributions are statistically consistent for each feature, and that neighbourhood thresholds are automatically adjusted by the system at a first step and they can be subsequently more-finely tuned according to the manager's experience. The entropy distributions for the different variables, as well as the mutual information between pairs, are characterized by multimodal profiles, where a wide gap between close and far fields is often present. This motivates the proposal of the so-called X-from-M strategy, which is shown to be computationally affordable, and can provide the expert with a reduced number of duplicated candidates to supervise, with low X values being enough to warrant the sensitivity required at the automatic detection stage. 
The proposed system again encourages and supports the benefits of big-data technologies in CRM scenarios for hotel chains, and rather than the use of ad-hoc heuristic rules, it promotes the research and development of theoretically principled approaches.<\/jats:p>","DOI":"10.3390\/e21040419","type":"journal-article","created":{"date-parts":[[2019,4,22]],"date-time":"2019-04-22T03:15:53Z","timestamp":1555902953000},"page":"419","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Entropic Statistical Description of Big Data Quality in Hotel Customer Relationship Management"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5203-3806","authenticated-orcid":false,"given":"Lydia","family":"Gonz\u00e1lez-Serrano","sequence":"first","affiliation":[{"name":"Department of Business and Management, Rey Juan Carlos University, 28943 Madrid, Spain"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0171-901X","authenticated-orcid":false,"given":"Pilar","family":"Tal\u00f3n-Ballestero","sequence":"additional","affiliation":[{"name":"Department of Business and Management, Rey Juan Carlos University, 28943 Madrid, Spain"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1356-2646","authenticated-orcid":false,"given":"Sergio","family":"Mu\u00f1oz-Romero","sequence":"additional","affiliation":[{"name":"Department of Business and Management, Rey Juan Carlos University, 28943 Madrid, Spain"},{"name":"Department of Theory and Comunications, Telematics and Computing Systems, Rey Juan Carlos University, 28943 Madrid, Spain"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5817-989X","authenticated-orcid":false,"given":"Cristina","family":"Soguero-Ruiz","sequence":"additional","affiliation":[{"name":"Department of Business and Management, Rey Juan Carlos University, 28943 Madrid, Spain"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0426-8912","authenticated-orcid":false,"given":"Jos\u00e9 
Luis","family":"Rojo-\u00c1lvarez","sequence":"additional","affiliation":[{"name":"Department of Business and Management, Rey Juan Carlos University, 28943 Madrid, Spain"},{"name":"Department of Theory and Comunications, Telematics and Computing Systems, Rey Juan Carlos University, 28943 Madrid, Spain"}]}],"member":"1968","published-online":{"date-parts":[[2019,4,19]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1016\/j.engappai.2016.08.012","article-title":"Evolutionary computing applied to customer relationship management: A survey","volume":"56","author":"Krishna","year":"2016","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1007\/s11747-007-0028-2","article-title":"Measuring and maximizing customer equity: A critical analysis","volume":"35","author":"Kumar","year":"2007","journal-title":"J. Acad. Mark. Sci."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1509\/jmkg.72.1.027","article-title":"Interaction orientation and firm performance","volume":"72","author":"Ramani","year":"2008","journal-title":"J. Mark."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1170","DOI":"10.1016\/j.indmarman.2010.02.001","article-title":"A process-oriented perspective on customer relationship management and organizational performance: An empirical investigation","volume":"39","author":"Keramati","year":"2010","journal-title":"Ind. Mark. Manag."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1111\/j.1530-9134.2010.00256.x","article-title":"Customer information sharing: Strategic incentives and new implications","volume":"19","author":"Kim","year":"2010","journal-title":"J. Econ. Manag. 
Strategy"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"391","DOI":"10.1016\/j.ijhm.2004.08.008","article-title":"Integrating customer relationship management in hotel operations: Managerial and operational implications","volume":"24","author":"Sigala","year":"2005","journal-title":"Int. J. Hosp. Manag."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1108\/08876041111149676","article-title":"Satisfaction, inertia, and customer loyalty in the varying levels of the zone of tolerance and alternative attractiveness","volume":"25","author":"Wu","year":"2011","journal-title":"J. Serv. Mark."},{"key":"ref_8","first-page":"297","article-title":"Linking CRM strategy, customer performance measures and performance in the hotel industry","volume":"3","author":"Kasim","year":"2009","journal-title":"Int. J. Econ. Manag."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.12816\/0018976","article-title":"Case Study of Hotel Taj in the Context of CRM and Customer Retention","volume":"4","author":"Chadha","year":"2015","journal-title":"Kuwait Chapter Arab. J. Bus. Manag. Rev."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1177\/001088040004100122","article-title":"Marketing challenges for the next decade","volume":"41","author":"Dev","year":"2000","journal-title":"Cornell Hotel Restaur. Adm. Q."},{"key":"ref_11","unstructured":"Kotler, P. (2002, January 30). When to use CRM and When to forget it. Paper Presented at the Academy of Marketing Science, Sanibel Harbour Resort and Spa, Fort Myers, FL, USA."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"715","DOI":"10.1080\/1478336032000053843","article-title":"Strategic analysis of customer relationship management-a field study on hotel enterprises","volume":"14","author":"Lin","year":"2003","journal-title":"Total Qual. Manag. Bus. 
Excell."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"477","DOI":"10.1108\/03090560810853020","article-title":"Organisational capabilities: Antecedents and implications for customer value","volume":"42","author":"Nasution","year":"2008","journal-title":"Eur. J. Mark."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"102","DOI":"10.1108\/09685220710748001","article-title":"Strategies for successful CRM implementation","volume":"15","author":"Nguyen","year":"2007","journal-title":"Inf. Manag. Comput. Secur."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1080\/13683500.2013.805734","article-title":"Customer relationship management in hotels: Examining critical success factors","volume":"17","year":"2014","journal-title":"Curr. Issues Tour."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"326","DOI":"10.1007\/s11747-009-0164-y","article-title":"Customer relationship management and firm performance: The mediating role of business strategy","volume":"38","author":"Reimann","year":"2010","journal-title":"J. Acad. Mark. Sci."},{"key":"ref_17","unstructured":"Beg, J., and Hussain, S. (2003). Data Quality\u2014A Problem and An Approach, Wipro Technologies. White paper."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"376","DOI":"10.1016\/j.indmarman.2010.08.006","article-title":"Organisational, technical and data quality factors in CRM adoption-SMEs perspective","volume":"40","author":"Alshawi","year":"2011","journal-title":"Ind. Mark. Manag."},{"key":"ref_19","unstructured":"Moore, C. (2019, April 04). How to Create a Business Case for Data Quality Improvement. Available online: http:\/\/www.gartner.com\/smarterwithgartner\/howto-create-a-business-case-for-data-quality-improvement\/."},{"key":"ref_20","unstructured":"Turban, E., Leidner, D., McLean, E., and Wetherbe, J. (2008). 
Information Technology for Management, John Wiley & Sons."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"667","DOI":"10.1016\/j.chb.2016.03.008","article-title":"Customer relationship management mechanisms: A systematic review of the state of the art literature and recommendations for future research","volume":"61","author":"Soltani","year":"2016","journal-title":"Comput. Hum. Behav."},{"key":"ref_22","unstructured":"Akoka, J., Berti-Equille, L., Boucelma, O., Bouzeghoub, M., Comyn-Wattiau, I., Cosquer, M., Goasdou\u00e9-Thion, V., Kedad, Z., Nugier, S., and Peralta, V. (2007, January 12\u201316). A framework for quality evaluation in data integration systems. Proceedings of the 9th International Conference on Enterprise Information Systems, Madeira, Portugal."},{"key":"ref_23","unstructured":"Thompson, E., and Sarner, A. (2009). Key Issues for CRM Strategy and Implementations, Gartner Research. Technical Report."},{"key":"ref_24","unstructured":"Alonso, \u00d3., Delgado, A., and Pedrosa, P. (2008). Las Soluciones CRM en Espa\u00f1a, Penteo, ESADE Business School. Technical Report."},{"key":"ref_25","unstructured":"Eckerson, W.W. (2002). Data Quality and Bottom Line: Achieving Business Success through High Quality Data (TDWI Report Series), The Data Warehousing Institute."},{"key":"ref_26","unstructured":"Missi, F., Alshawi, S., and Fitzgerald, G. (2005, January 3\u20136). Why CRM efforts fail? A study of the impact of data quality and data integration. Proceedings of the 38th Annual Hawaii International Conference on System Sciences, Big Island, HI, USA."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1108\/02635570210414668","article-title":"Data quality issues in implementing an ERP","volume":"102","author":"Xu","year":"2002","journal-title":"Ind. Manag. Data Syst."},{"key":"ref_28","unstructured":"Moss, L., Abai, M., and Adelman, S. (2005). How to improve data quality. 
Data Strategy, Addison-Wesley Professional."},{"key":"ref_29","unstructured":"Goga, O. (2014). Matching User Accounts Across Online Social Networks: Methods and Applications. [Ph.D. Thesis, LIP6-Laboratoire d\u2019Informatique de Paris 6]."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TKDE.2007.250581","article-title":"Duplicate record detection: A survey","volume":"19","author":"Elmagarmid","year":"2007","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"356","DOI":"10.1016\/j.eswa.2018.07.012","article-title":"Interactive feature selection for efficient customer recognition in contact centers: Dealing with common names","volume":"113","author":"Saberi","year":"2018","journal-title":"Expert Syst. Appl."},{"key":"ref_32","unstructured":"Helander, D. (2019, February 12). Solving the Hotel Data Management Problem in 3 Steps-Revinate. Available online: https:\/\/www.revinate.com\/es\/blog\/solving-hotel-data-management-problem-3-steps\/."},{"key":"ref_33","unstructured":"Schutz, T. (2019, April 01). The State of Data Quality. An Experian Data Quality White Paper. Available online: https:\/\/www.experian.com\/assets\/decision-analytics\/white-papers\/the%20state%20of%20data%20quality.pdf."},{"key":"ref_34","unstructured":"Pinto, F., Santos, M.F., Cortez, P., and Quintela, H. (2004). Data pre-processing for database marketing. Data Gadgets, Workshop."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1091","DOI":"10.1109\/TPAMI.2007.1078","article-title":"A normalized Levenshtein distance metric","volume":"29","author":"Yujian","year":"2007","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1016\/0001-8708(76)90202-4","article-title":"Some biological sequence metrics","volume":"20","author":"Waterman","year":"1976","journal-title":"Adv. 
Math."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"482","DOI":"10.1016\/0196-8858(81)90046-4","article-title":"Comparison of biosequences","volume":"2","author":"Smith","year":"1981","journal-title":"Adv. Appl. Math."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"414","DOI":"10.1080\/01621459.1989.10478785","article-title":"Advances in record-linkage methodology as applied to matching the 1985 census of Tampa, Florida","volume":"84","author":"Jaro","year":"1989","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"72","DOI":"10.1145\/1378727.1378745","article-title":"Information integration in the enterprise","volume":"51","author":"Bernstein","year":"2008","journal-title":"Commun. ACM"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Villaverde, A.F., Ross, J., Moran, F., and Banga, J.R. (2014). MIDER: Network inference with mutual information distance and entropy reduction. PLoS ONE, 9.","DOI":"10.1371\/journal.pone.0096732"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1016\/j.neucom.2018.09.077","article-title":"Theoretical foundations of forward feature selection methods based on mutual information","volume":"325","author":"Macedo","year":"2019","journal-title":"Neurocomputing"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1016\/j.eswa.2017.03.010","article-title":"Entity reconciliation in big data sources: A systematic mapping study","volume":"80","author":"Escalona","year":"2017","journal-title":"Expert Syst. Appl."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"118","DOI":"10.1016\/j.ijar.2017.01.003","article-title":"ERBlox: Combining matching dependencies with machine learning for entity resolution","volume":"83","author":"Bahmani","year":"2017","journal-title":"Int. J. Approx. Reason."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Maddodi, S., Attigeri, G.V., and Karunakar, A. 
(2010, January 19\u201321). Data deduplication techniques and analysis. Proceedings of the Third International Conference on Emerging Trends in Engineering and Technology, Goa, India.","DOI":"10.1109\/ICETET.2010.42"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Gaikwad, S., and Bogiri, N. (2015, January 8\u201310). A survey analysis on duplicate detection in hierarchical data. Proceedings of the 2015 International Conference on Pervasive Computing (ICPC), Pune, India.","DOI":"10.1109\/PERVASIVE.2015.7087099"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1007\/s00607-016-0490-0","article-title":"A systematic review and comparative analysis of cross-document coreference resolution methods and tools","volume":"99","author":"Beheshti","year":"2017","journal-title":"Computing"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"684","DOI":"10.14778\/2947618.2947624","article-title":"Comparative analysis of approximate blocking techniques for entity resolution","volume":"9","author":"Papadakis","year":"2016","journal-title":"Proc. VLDB Endow."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1016\/j.jss.2016.02.022","article-title":"Enhancements for duplication detection in bug reports with manifold correlation features","volume":"121","author":"Lin","year":"2016","journal-title":"J. Syst. Softw."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Daniel, C., Serre, P., Orlova, N., Br\u00e9ant, S., Paris, N., and Griffon, N. (2018). Initializing a hospital-wide data quality program. The AP-HP experience. Comput. Methods Prog. Biomed.","DOI":"10.1016\/j.cmpb.2018.10.016"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Faed, A. (2013). 
An Intelligent Customer Complaint Management System with Application to the Transport and Logistics Industry, Springer Science & Business Media.","DOI":"10.1007\/978-3-319-00324-5"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"950","DOI":"10.1016\/j.compedu.2009.05.010","article-title":"Dropout prediction in e-learning courses through the combination of machine learning techniques","volume":"53","author":"Lykourentzou","year":"2009","journal-title":"Comput. Educ."},{"key":"ref_52","unstructured":"Chandran, K., Veeraraghavan, K., and Tb, A. (2016). Inquire management for hospital websystem using SaaS. Int. J. Adv. Res. Comput. Sci., 7."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"398","DOI":"10.1016\/j.fcij.2018.11.003","article-title":"A systematic review for the determination and classification of the CRM critical success factors supporting with their metrics","volume":"3","author":"Farhan","year":"2018","journal-title":"Future Comput. Inform. J."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Reid, A., and Catterall, M. (2015). Hidden data quality problems in CRM implementation. Marketing, Technology and Customer Commitment in the New Economy, Springer.","DOI":"10.1007\/978-3-319-11779-9_67"},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Anshari, M., Almunawar, M.N., Lim, S.A., and Al-mudimigh, A. (2018). Customer Relationship Management and Big Data Enabled: Personalization & Customization of Services. Appl. Comput. Inform.","DOI":"10.1016\/j.aci.2018.05.004"},{"key":"ref_56","unstructured":"Maguire, E. (2019, February 08). The Data Differentiator. How Improving Data Quality Improves Business. 
Available online: https:\/\/www.forbes.com\/forbes-insights\/our-work\/data-differentiator-report\/."},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1016\/j.websem.2013.06.001","article-title":"Active learning of expressive linkage rules using genetic programming","volume":"23","author":"Isele","year":"2013","journal-title":"Web Semant."},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1145\/1327452.1327492","article-title":"MapReduce: Simplified Data Processing on Large Clusters","volume":"51","author":"Dean","year":"2008","journal-title":"Commun. ACM"},{"key":"ref_59","first-page":"707","article-title":"Binary Codes Capable of Correcting Deletions, Insertions and Reversals","volume":"10","author":"Levenshtein","year":"1966","journal-title":"Sov. Phys. Doklady"},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1145\/375360.375365","article-title":"A Guided Tour to Approximate String Matching","volume":"33","author":"Navarro","year":"2001","journal-title":"ACM Comput. Surv."},{"key":"ref_61","doi-asserted-by":"crossref","first-page":"168","DOI":"10.1145\/321796.321811","article-title":"The String-to-String Correction Problem","volume":"21","author":"Wagner","year":"1974","journal-title":"J. ACM"},{"key":"ref_62","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1057\/palgrave.jt.5740086","article-title":"Marketing data analysis and data quality management","volume":"11","author":"Courtheoux","year":"2003","journal-title":"J. Target. Meas. Anal. Mark."},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1057\/palgrave.jdm.3240105","article-title":"Managing the quality and completeness of customer data","volume":"10","author":"Foss","year":"2002","journal-title":"J. Database Mark. Cust. 
Strategy Manag."},{"key":"ref_64","first-page":"26","article-title":"Relationship marketing and data quality management","volume":"64","author":"Khalil","year":"1999","journal-title":"SAM Adv. Manag. J."},{"key":"ref_65","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1016\/j.tourman.2018.03.017","article-title":"Using big data from Customer Relationship Management information systems to determine the client profile in the hotel sector","volume":"68","year":"2018","journal-title":"Tour. Manag."},{"key":"ref_66","first-page":"94","article-title":"Rethinking marketing","volume":"88","author":"Rust","year":"2010","journal-title":"Harv. Bus. Rev."},{"key":"ref_67","doi-asserted-by":"crossref","first-page":"305","DOI":"10.2307\/20721429","article-title":"A multi-project model of key factors affecting organizational benefits from enterprise systems","volume":"34","author":"Seddon","year":"2010","journal-title":"MIS Q."},{"key":"ref_68","doi-asserted-by":"crossref","first-page":"657","DOI":"10.1016\/j.indmarman.2003.10.002","article-title":"Sources, uses, and forms of data in the new product development process","volume":"33","author":"Zahay","year":"2004","journal-title":"Ind. Mark. Manag."},{"key":"ref_69","unstructured":"Aloini, D., Dulmin, R., Mininno, V., and Zerbino, P. (2016, January 15\u201317). Big Data: A proposal for enabling factors in Customer Relationship Management. Proceedings of the 11th International Forum on Knowledge Asset Dynamics, Dresden, Germany."},{"key":"ref_70","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1109\/TKDE.2013.109","article-title":"Data mining with big data","volume":"26","author":"Wu","year":"2014","journal-title":"IEEE Trans. Knowl. 
Data Eng."},{"key":"ref_71","doi-asserted-by":"crossref","first-page":"1662","DOI":"10.1016\/j.neucom.2017.10.010","article-title":"Semi-supervised learning for big social data analysis","volume":"275","author":"Hussain","year":"2018","journal-title":"Neurocomputing"},{"key":"ref_72","doi-asserted-by":"crossref","unstructured":"Huh, J.H. (2018). Big data analysis for personalized health activities: Machine learning processing for automatic keyword extraction approach. Symmetry, 10.","DOI":"10.3390\/sym10040093"},{"key":"ref_73","unstructured":"Oliver, A., Odena, A., Raffel, C.A., Cubuk, E.D., and Goodfellow, I. (2018, January 2\u20138). Realistic evaluation of deep semi-supervised learning algorithms. Proceedings of the 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montr\u00e9al, QC, Canada."},{"key":"ref_74","doi-asserted-by":"crossref","first-page":"925","DOI":"10.1007\/s11548-018-1772-0","article-title":"Exploiting the potential of unlabeled endoscopic video data with self-supervised learning","volume":"13","author":"Ross","year":"2018","journal-title":"Int. J. Comput. Assisted Radiol. Surg."},{"key":"ref_75","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1016\/j.imavis.2016.08.006","article-title":"Random multi-graphs: A semi-supervised learning framework for classification of high dimensional data","volume":"60","author":"Zhang","year":"2017","journal-title":"Image Vision Comput."},{"key":"ref_76","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1016\/j.engappai.2016.01.007","article-title":"A comparison between semi-supervised and supervised text mining techniques on detecting irony in greek political tweets","volume":"51","author":"Charalampakis","year":"2016","journal-title":"Eng. Appl. Artif. 
Intell."},{"key":"ref_77","doi-asserted-by":"crossref","first-page":"3019","DOI":"10.1080\/00207543.2016.1154208","article-title":"Understanding big consumer opinion data for market-driven product design","volume":"54","author":"Jin","year":"2016","journal-title":"Int. J. Prod. Res."},{"key":"ref_78","first-page":"342","article-title":"Survey on intrusion detection using data mining methods","volume":"3","author":"Parihar","year":"2016","journal-title":"Int. J. Sci. Adv. Res. Technol."},{"key":"ref_79","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1016\/j.media.2016.06.032","article-title":"Machine Learning Approaches in Medical Image Analysis: From Detection to Diagnosis","volume":"33","year":"2016","journal-title":"Med. Image Anal."},{"key":"ref_80","doi-asserted-by":"crossref","first-page":"152","DOI":"10.1016\/j.dss.2010.07.011","article-title":"Evaluating a model for cost-effective data quality management in a real-world CRM setting","volume":"50","author":"Even","year":"2010","journal-title":"Decis. Support Syst."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/21\/4\/419\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:46:50Z","timestamp":1760186810000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/21\/4\/419"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,4,19]]},"references-count":80,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2019,4]]}},"alternative-id":["e21040419"],"URL":"https:\/\/doi.org\/10.3390\/e21040419","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2019,4,19]]}}}