{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,6]],"date-time":"2026-04-06T07:34:57Z","timestamp":1775460897848,"version":"3.50.1"},"reference-count":35,"publisher":"Emerald","issue":"8","license":[{"start":{"date-parts":[[2017,9,11]],"date-time":"2017-09-11T00:00:00Z","timestamp":1505088000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IMDS"],"published-print":{"date-parts":[[2017,9,11]]},"abstract":"<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Purpose<\/jats:title>\n<jats:p>As the number of users on social network services (SNSs) continues to increase at a remarkable rate, privacy and security issues are consistently arising. Although users may not want to disclose their private attributes, these can be inferred from their public behavior on social media. In order to investigate the severity of the leakage of private information in this manner, the purpose of this paper is to present a method to infer undisclosed personal attributes of users based only on the data available on their public profiles on Facebook.<\/jats:p>\n<\/jats:sec>\n<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Design\/methodology\/approach<\/jats:title>\n<jats:p>Facebook profile data consisting of 32 attributes were collected for 111,123 Korean users. 
Inferences were made for four private attributes (gender, age, marital status, and relationship status) based on five machine learning-based classification algorithms and three regression algorithms.<\/jats:p>\n<\/jats:sec>\n<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Findings<\/jats:title>\n<jats:p>Experimental results showed that users\u2019 gender can be inferred very accurately, whereas marital status and relationship status can be predicted more accurately with the authors\u2019 algorithms than with a random model. Moreover, the average difference between the actual and predicted ages of users was only 0.5 years. The results show that some private attributes can be easily inferred from only a few pieces of user profile information, which can jeopardize personal information and may increase the risk to dignity.<\/jats:p>\n<\/jats:sec>\n<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Research limitations\/implications<\/jats:title>\n<jats:p>In this paper, the authors only utilized each user\u2019s own profile data, especially text information. Since users in SNSs are directly or indirectly connected, inference performance can be improved if the profile data of the friends of a given user are additionally considered. Moreover, utilizing non-text profile information, such as profile images, can help increase inference accuracy. The authors can also provide a more generalized inference performance if a larger data set of Facebook users is available.<\/jats:p>\n<\/jats:sec>\n<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Practical implications<\/jats:title>\n<jats:p>A private attribute leakage alarm system based on the inference model would be helpful for users not desirous of the disclosure of their private attributes on SNSs. 
SNS service providers can measure and monitor the risk of privacy leakage in their system to protect their users and optimize the target marketing based on the inferred information if users agree to use it.<\/jats:p>\n<\/jats:sec>\n<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Originality\/value<\/jats:title>\n<jats:p>This paper investigates whether private attributes of SNS users can be inferred with a few pieces of publicly available information although users are not willing to disclose them. The experimental results showed that gender, age, marital status, and relationship status, can be inferred by machine-learning algorithms. Based on these results, an early warning system was designed to help both service providers and users to protect the users\u2019 privacy.<\/jats:p>\n<\/jats:sec>","DOI":"10.1108\/imds-07-2016-0276","type":"journal-article","created":{"date-parts":[[2017,9,4]],"date-time":"2017-09-04T09:25:54Z","timestamp":1504517154000},"page":"1687-1706","source":"Crossref","is-referenced-by-count":10,"title":["Private attribute inference from Facebook\u2019s public text metadata: a case study of Korean users"],"prefix":"10.1108","volume":"117","author":[{"given":"Daeseon","family":"Choi","sequence":"first","affiliation":[]},{"given":"Younho","family":"Lee","sequence":"additional","affiliation":[]},{"given":"Seokhyun","family":"Kim","sequence":"additional","affiliation":[]},{"given":"Pilsung","family":"Kang","sequence":"additional","affiliation":[]}],"member":"140","reference":[{"key":"key2020120519595419000_ref001","first-page":"302","article-title":"Predicting personality with social behavior","year":"2012"},{"key":"key2020120519595419000_ref002","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1016\/j.cose.2014.04.004","article-title":"Unintended disclosure of information: inference attacks by third-party extensions to social network systems","volume":"44","year":"2014","journal-title":"Computers & 
Security"},{"key":"key2020120519595419000_ref003","first-page":"739","article-title":"Language independent gender classification on Twitter","year":"2013"},{"key":"key2020120519595419000_ref004","first-page":"492","article-title":"Predicting the future with social media","year":"2010"},{"key":"key2020120519595419000_ref005","first-page":"1","article-title":"Privacy leakage in mobile online social networks","year":"2010"},{"key":"key2020120519595419000_ref006","volume-title":"Network Science","year":"2016"},{"key":"key2020120519595419000_ref007","volume-title":"Linked: The New Science of Networks","year":"2002"},{"key":"key2020120519595419000_ref008","volume-title":"Neural Network for Pattern Recognition","year":"1995"},{"issue":"1","key":"key2020120519595419000_ref009","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.jocs.2010.12.007","article-title":"Twitter mood predicts the stock market","volume":"2","year":"2011","journal-title":"Journal of Computational Science"},{"key":"key2020120519595419000_ref010","volume-title":"Analyzing Social Networks","year":"2013"},{"key":"key2020120519595419000_ref011","volume-title":"Classification and Regression Tree","year":"1984"},{"issue":"2","key":"key2020120519595419000_ref012","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1023\/A:1009715923555","article-title":"A tutorial on support vector machines for pattern recognition","volume":"2","year":"1998","journal-title":"Data Mining and Knowledge Discovery"},{"key":"key2020120519595419000_ref013","doi-asserted-by":"crossref","first-page":"644","DOI":"10.1038\/449644a","article-title":"Data sharing threatens privacy","volume":"449","year":"2007","journal-title":"Nature"},{"key":"key2020120519595419000_ref014","first-page":"428","article-title":"Leveraging online social networks and external data sources to predict personality","year":"2011"},{"key":"key2020120519595419000_ref015","first-page":"2836","article-title":"Estimating age privacy leakage in 
online social networks","year":"2012"},{"issue":"6","key":"key2020120519595419000_ref016","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1016\/S1361-3723(10)70066-X","article-title":"Social media: opportunity or risk?","volume":"2010","year":"2010","journal-title":"Computer Fraud & Security"},{"key":"key2020120519595419000_ref017","volume-title":"Statistical Models: Theory and Practice","year":"2009","edition":"2nd ed."},{"key":"key2020120519595419000_ref019","unstructured":"Hymowitz, K., Carroll, J.S., Wilcox, W.B. and Kaye, K. (2013), \u201cKnot yet: the benefits and costs of delayed marriage in America\u201d, technical report, The National Campaign to Prevent Teen and Unplanned Pregnancy, the National Marriage Project at the University of Virginia, and the RELATE Institute."},{"issue":"11","key":"key2020120519595419000_ref021","doi-asserted-by":"crossref","first-page":"3507","DOI":"10.1016\/j.patcog.2008.04.009","article-title":"Locally linear reconstruction for instance-based learning","volume":"41","year":"2008","journal-title":"Pattern Recognition"},{"issue":"2","key":"key2020120519595419000_ref022","doi-asserted-by":"crossref","first-page":"364","DOI":"10.1016\/j.ijforecast.2014.05.006","article-title":"Box office forecasting using machine learning algorithms based on SNS data","volume":"31","year":"2015","journal-title":"International Journal of Forecasting"},{"issue":"15","key":"key2020120519595419000_ref023","doi-asserted-by":"crossref","first-page":"5802","DOI":"10.1073\/pnas.1218772110","article-title":"Private traits and attributes are predictable from digital records of human behavior","volume":"110","year":"2013","journal-title":"PNAS"},{"key":"key2020120519595419000_ref020","first-page":"361","article-title":"A machine learning based approach for predicting undisclosed attributes in social networks","year":"2012"},{"key":"key2020120519595419000_ref024","first-page":"239","article-title":"Privacy leakage analysis in online social 
networks","volume":"49","year":"2014","journal-title":"Computers & Security"},{"key":"key2020120519595419000_ref025","first-page":"1","article-title":"Age and gender identification in social media","year":"2014"},{"key":"key2020120519595419000_ref026","first-page":"66","article-title":"The small world problem","volume":"2","year":"1967","journal-title":"Psychology Today"},{"key":"key2020120519595419000_ref027","first-page":"251","article-title":"You are who you know: inferring user profiles in online social networks","year":"2010"},{"key":"key2020120519595419000_ref028","first-page":"111","article-title":"Robust de-anonymization of large sparse datasets","year":"2008"},{"key":"key2020120519595419000_ref029","first-page":"439","article-title":"\u2018How old do you think i am?\u2019 A study of language and age in Twitter","year":"2013"},{"key":"key2020120519595419000_ref030","first-page":"563","article-title":"Inferring user personality in social networks: a case study in Facebook","year":"2011"},{"key":"key2020120519595419000_ref031","first-page":"266","article-title":"Does age make a difference in the behaviour of online social network users","year":"2011"},{"key":"key2020120519595419000_ref032","first-page":"37","article-title":"Classifying latent user attributes in Twitter","year":"2010"},{"key":"key2020120519595419000_ref033","volume-title":"Introduction to Probability and Statistic for Engineers and Scientists","year":"2004"},{"key":"key2020120519595419000_ref034","first-page":"53","article-title":"Predicting the 2011 Dutch Senate election results with Twitter","year":"2012"},{"issue":"3","key":"key2020120519595419000_ref035","doi-asserted-by":"crossref","first-page":"277","DOI":"10.1080\/13600869.2014.913874","article-title":"Privacy principles, risks and harms","volume":"28","year":"2014","journal-title":"International Review of Law, Computers & 
Technology"},{"issue":"4","key":"key2020120519595419000_ref036","doi-asserted-by":"crossref","first-page":"1036","DOI":"10.1073\/pnas.1418680112","article-title":"Computer-based personality judgments are more accurate than those made by humans","volume":"112","year":"2015","journal-title":"PNAS"}],"container-title":["Industrial Management &amp; Data Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/IMDS-07-2016-0276\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/IMDS-07-2016-0276\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T21:52:43Z","timestamp":1753393963000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/imds\/article\/117\/8\/1687-1706\/179974"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,9,11]]},"references-count":35,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2017,9,11]]}},"alternative-id":["10.1108\/IMDS-07-2016-0276"],"URL":"https:\/\/doi.org\/10.1108\/imds-07-2016-0276","relation":{},"ISSN":["0263-5577"],"issn-type":[{"value":"0263-5577","type":"print"}],"subject":[],"published":{"date-parts":[[2017,9,11]]}}}