{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,19]],"date-time":"2025-12-19T15:39:23Z","timestamp":1766158763417,"version":"3.40.5"},"reference-count":56,"publisher":"Wiley","license":[{"start":{"date-parts":[[2021,7,5]],"date-time":"2021-07-05T00:00:00Z","timestamp":1625443200000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Security and Communication Networks"],"published-print":{"date-parts":[[2021,7,5]]},"abstract":"<jats:p>Nearly most of the organizations store massive amounts of data in large databases for research, statistics, and mining purposes. In most cases, much of the accumulated data contain sensitive information belonging to individuals which may breach privacy. Hence, ensuring privacy in big data is considered a very important issue. The concept of privacy aims to protect sensitive information from various attacks that may violate the identity of individuals. Anonymization techniques are considered the best way to ensure privacy in big data. Various works have been already realized, taking into account horizontal clustering. The L-diversity technique is one of those techniques dealing with sensitive numerical and categorical attributes. However, the majority of anonymization techniques using L-diversity principle for hierarchical data cannot resist the similarity attack and therefore cannot ensure privacy carefully. In order to prevent the similarity attack while preserving data utility, a hybrid technique dealing with categorical attributes is proposed in this paper. Furthermore, we highlighted all the steps of our proposed algorithm with detailed comments. Moreover, the algorithm is implemented and evaluated according to a well-known information loss-based criterion which is Normalized Certainty Penalty (NCP). The obtained results show a good balance between privacy and data utility.<\/jats:p>","DOI":"10.1155\/2021\/6612923","type":"journal-article","created":{"date-parts":[[2021,7,6]],"date-time":"2021-07-06T18:50:07Z","timestamp":1625597407000},"page":"1-17","source":"Crossref","is-referenced-by-count":6,"title":["Proximity Measurement for Hierarchical Categorical Attributes in Big Data"],"prefix":"10.1155","volume":"2021","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4775-5967","authenticated-orcid":true,"given":"Zakariae","family":"El Ouazzani","sequence":"first","affiliation":[{"name":"Rabat-IT Center, ENSIAS, Mohammed V University in Rabat, Rabat, Morocco"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9965-915X","authenticated-orcid":true,"given":"An","family":"Braeken","sequence":"additional","affiliation":[{"name":"Industrial Engineering Department (INDI), Vrije Universiteit Brussel (VUB), Brussels, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2941-3768","authenticated-orcid":true,"given":"Hanan","family":"El Bakkali","sequence":"additional","affiliation":[{"name":"Rabat-IT Center, ENSIAS, Mohammed V University in Rabat, Rabat, Morocco"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"311","reference":[{"doi-asserted-by":"publisher","key":"1","DOI":"10.1109\/ICC.2017.7997113"},{"doi-asserted-by":"publisher","key":"2","DOI":"10.1186\/s40537-017-0110-7"},{"doi-asserted-by":"publisher","key":"3","DOI":"10.1504\/ijbdi.2016.073904"},{"doi-asserted-by":"publisher","key":"4","DOI":"10.1109\/ISCO.2015.7282237"},{"doi-asserted-by":"publisher","key":"5","DOI":"10.1016\/j.procs.2019.01.190"},{"doi-asserted-by":"publisher","key":"6","DOI":"10.2991\/icence-16.2016.114"},{"doi-asserted-by":"publisher","key":"7","DOI":"10.1145\/2976738"},{"year":"2019","author":"X. Huang","article-title":"K-anonymity and L-diversity data anonymization in an in-memory database","key":"8"},{"author":"Z. El Ouazzani","first-page":"47","article-title":"Variable distinct L-diversity algorithm applied on highly sensitive correlated attributes","key":"9"},{"doi-asserted-by":"publisher","key":"10","DOI":"10.3390\/sym10080333"},{"author":"Z. El Ouazzani","first-page":"1","article-title":"New technique ensuring privacy in big data: variable t-closeness for sensitive numerical attributes","key":"11"},{"doi-asserted-by":"publisher","key":"12","DOI":"10.1109\/TVCG.2017.2745139"},{"doi-asserted-by":"publisher","key":"13","DOI":"10.5120\/12179-8291"},{"doi-asserted-by":"publisher","key":"14","DOI":"10.1007\/978-3-030-05054-2-36"},{"doi-asserted-by":"publisher","key":"15","DOI":"10.1145\/3164541.3164583"},{"author":"M. Orooji","first-page":"1415","article-title":"Improving suppression to reduce disclosure risk and enhance data utility","key":"16"},{"doi-asserted-by":"publisher","key":"17","DOI":"10.1007\/s10844-015-0373-4"},{"doi-asserted-by":"publisher","key":"18","DOI":"10.1109\/ICMLC.2017.8108955"},{"doi-asserted-by":"publisher","key":"19","DOI":"10.1016\/j.procs.2018.01.097"},{"doi-asserted-by":"publisher","key":"20","DOI":"10.1007\/978-3-030-00202-2-24"},{"doi-asserted-by":"publisher","key":"21","DOI":"10.1145\/3286606.3286793"},{"doi-asserted-by":"publisher","key":"22","DOI":"10.1109\/IBIGDELFT.2018.8625358"},{"doi-asserted-by":"publisher","key":"23","DOI":"10.1007\/978-3-030-10543-3_2"},{"doi-asserted-by":"publisher","key":"24","DOI":"10.3390\/e20050373"},{"author":"Y. Sei","first-page":"596","article-title":"(l1,\u2026lq)-diversity for anonymizing sensitive quasi-identifiers","key":"25"},{"doi-asserted-by":"publisher","key":"26","DOI":"10.1109\/TVT.2017.2738018"},{"doi-asserted-by":"publisher","key":"27","DOI":"10.1109\/ACCESS.2016.2577036"},{"doi-asserted-by":"publisher","key":"28","DOI":"10.1007\/978-3-319-98539-8_16"},{"doi-asserted-by":"publisher","key":"29","DOI":"10.1186\/s40537-018-0130-y"},{"doi-asserted-by":"publisher","key":"30","DOI":"10.32628\/CSEIT19516"},{"issue":"11","key":"31","first-page":"40","article-title":"A review of big data in the healthcare sector: evaluation and analysis of cervical cancer data","volume":"10","author":"W. Iftikhar","year":"2018","journal-title":"Journal of Advanced Research in Dynamical and Control Systems"},{"doi-asserted-by":"publisher","key":"32","DOI":"10.1186\/s40537-018-0141-8"},{"doi-asserted-by":"publisher","key":"33","DOI":"10.4018\/IJITWE.2019040102"},{"doi-asserted-by":"publisher","key":"34","DOI":"10.1007\/978-981-10-7641-1-12"},{"doi-asserted-by":"publisher","key":"35","DOI":"10.1109\/MITP.2018.032501750"},{"issue":"12","key":"36","first-page":"172","article-title":"A study on k-anonymity, l-diversity, and t-closeness techniques focusing medical data","volume":"17","author":"K. Rajendran","year":"2017","journal-title":"International Journal of Computer Science and Network Security (IJCSNS)"},{"author":"E. Poovammal","first-page":"1","article-title":"Preserving micro data release: categorical and numerical data","key":"37"},{"doi-asserted-by":"publisher","key":"38","DOI":"10.1016\/j.jvcir.2018.12.052"},{"doi-asserted-by":"publisher","key":"39","DOI":"10.1109\/KST.2016.7440495"},{"doi-asserted-by":"publisher","key":"40","DOI":"10.1109\/ITNEC.2017.8284835"},{"doi-asserted-by":"publisher","key":"41","DOI":"10.1109\/ICOMET.2018.8346323"},{"doi-asserted-by":"publisher","key":"42","DOI":"10.1007\/978-3-319-64471-4_3"},{"issue":"5","key":"43","first-page":"6","article-title":"Enhancing utility and privacy using t-closeness for multiple sensitive attributes","volume":"10","author":"S. Saraswathi","year":"2016","journal-title":"Advances in Natural and Applied Sciences"},{"doi-asserted-by":"publisher","key":"44","DOI":"10.1007\/s11390-018-1884-6"},{"doi-asserted-by":"publisher","key":"45","DOI":"10.1109\/CompComm.2017.8322783"},{"doi-asserted-by":"publisher","key":"46","DOI":"10.1007\/978-3-662-58384-5_2"},{"doi-asserted-by":"publisher","key":"47","DOI":"10.4304\/jcp.9.1.59-64"},{"doi-asserted-by":"publisher","key":"48","DOI":"10.1109\/SSCI.2017.8280973"},{"doi-asserted-by":"publisher","key":"49","DOI":"10.1093\/jamia\/ocx079"},{"doi-asserted-by":"publisher","key":"50","DOI":"10.1007\/s10207-017-0392-y"},{"doi-asserted-by":"publisher","key":"51","DOI":"10.1051\/matecconf\/201818903007"},{"doi-asserted-by":"publisher","key":"52","DOI":"10.1007\/s10586-017-0795-6"},{"doi-asserted-by":"publisher","key":"53","DOI":"10.1145\/1538909.1538911"},{"doi-asserted-by":"publisher","key":"54","DOI":"10.1016\/j.eswa.2012.02.179"},{"doi-asserted-by":"publisher","key":"55","DOI":"10.1109\/CSCWD.2016.7565966"},{"year":"2018","author":"C. Hebert","article-title":"Anonymization techniques to protect data","key":"56"}],"container-title":["Security and Communication Networks"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/scn\/2021\/6612923.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/scn\/2021\/6612923.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/scn\/2021\/6612923.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,7,6]],"date-time":"2021-07-06T18:50:12Z","timestamp":1625597412000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.hindawi.com\/journals\/scn\/2021\/6612923\/"}},"subtitle":[],"editor":[{"given":"Fulvio","family":"Valenza","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2021,7,5]]},"references-count":56,"alternative-id":["6612923","6612923"],"URL":"https:\/\/doi.org\/10.1155\/2021\/6612923","relation":{},"ISSN":["1939-0122","1939-0114"],"issn-type":[{"type":"electronic","value":"1939-0122"},{"type":"print","value":"1939-0114"}],"subject":[],"published":{"date-parts":[[2021,7,5]]}}}