{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,2]],"date-time":"2025-08-02T17:39:27Z","timestamp":1754156367116,"version":"3.41.2"},"reference-count":39,"publisher":"Emerald","issue":"1","license":[{"start":{"date-parts":[[2019,12,17]],"date-time":"2019-12-17T00:00:00Z","timestamp":1576540800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["OIR"],"published-print":{"date-parts":[[2019,12,17]]},"abstract":"<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Purpose<\/jats:title>\n<jats:p>This work studies automated user classification on Twitter in the public health domain, a task that is essential to many public health-related research works on social media but has not been addressed. The purpose of this paper is to obtain empirical knowledge on how to optimise the classifier performance on this task.<\/jats:p>\n<\/jats:sec>\n<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Design\/methodology\/approach<\/jats:title>\n<jats:p>A sample of 3,100 Twitter users who tweeted about different health conditions were manually coded into six most common stakeholders. The authors propose new, simple features extracted from the short Twitter profiles of these users, and compare a large set of classification models (including state-of-the-art) that use more complex features and with different algorithms on this data set.<\/jats:p>\n<\/jats:sec>\n<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Findings<\/jats:title>\n<jats:p>The authors show that user classification in the public health domain is a very challenging task, as the best result the authors can obtain on this data set is only 59 per cent in terms of F1 score. Compared to state-of-the-art, the methods can obtain significantly better (10 percentage points in F1 on a \u201cbest-against-best\u201d basis) results when using only a small set of 40 features extracted from the short Twitter user profile texts.<\/jats:p>\n<\/jats:sec>\n<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Originality\/value<\/jats:title>\n<jats:p>The work is the first to study the different types of users that engage in health-related communication on social media, applicable to a broad range of health conditions rather than specific ones studied in the previous work. The methods are implemented as open source tools, and together with data, are the first of this kind. The authors believe these will encourage future research to further improve this important task.<\/jats:p>\n<\/jats:sec>","DOI":"10.1108\/oir-05-2019-0143","type":"journal-article","created":{"date-parts":[[2019,12,31]],"date-time":"2019-12-31T08:16:02Z","timestamp":1577780162000},"page":"213-237","source":"Crossref","is-referenced-by-count":10,"title":["\u201cLess is more\u201d"],"prefix":"10.1108","volume":"44","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8587-8618","authenticated-orcid":false,"given":"Ziqi","family":"Zhang","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Georgica","family":"Bors","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"140","reference":[{"article-title":"A new model for classifying social media users according to their behaviors","year":"2015","key":"key2020012211505245800_ref001","doi-asserted-by":"publisher","DOI":"10.1109\/WSWAN.2015.7209085"},{"issue":"9","key":"key2020012211505245800_ref002","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1016\/j.urolonc.2016.02.021","article-title":"Activity, content, contributors, and influencers of the twitter discussion on urologic oncology","volume":"34","year":"2016","journal-title":"Urologic Oncology: Seminars and Original Investigations"},{"key":"key2020012211505245800_ref003","doi-asserted-by":"crossref","first-page":"1470","DOI":"10.1002\/bjs.10615","article-title":"#colorectalsurgery","volume":"104","year":"2017","journal-title":"British Journal of Surgery"},{"issue":"6","key":"key2020012211505245800_ref004","doi-asserted-by":"crossref","first-page":"811","DOI":"10.1109\/TDSC.2012.75","article-title":"Detecting automation of Twitter accounts: are you a human, bot, or cyborg?","volume":"9","year":"2012","journal-title":"IEEE Transactions on Dependable and Secure Computing"},{"first-page":"91","article-title":"Classifying political orientation on Twitter: it\u2019s not easy!","year":"2013","key":"key2020012211505245800_ref005"},{"key":"key2020012211505245800_ref006","article-title":"Social media for arthritis-related comparative effectiveness and safety research and the impact of direct-to-consumer advertising","volume":"19","year":"2017","journal-title":"Arthritis Research & Therapy"},{"issue":"2","key":"key2020012211505245800_ref007","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1016\/j.colegn.2014.03.002","article-title":"Social media: a tool to spread information: a case study analysis of Twitter conversation at the cardiac society of Australia & New Zealand 61st Annual Scientific Meeting 2013","volume":"21","year":"2014","journal-title":"Collegian"},{"year":"2014","key":"key2020012211505245800_ref008","article-title":"Inferring user social class in online social networks"},{"year":"2014","key":"key2020012211505245800_ref009","article-title":"Mining Twitter for adverse drug reaction mentions: a corpus and classification benchmark"},{"year":"2015","key":"key2020012211505245800_ref010","article-title":"User profiling trends, techniques and applications"},{"issue":"3","key":"key2020012211505245800_ref011","article-title":"Classification of Twitter users who tweet about e-cigarettes","volume":"3","year":"2017","journal-title":"Journal of Medical Internet Research"},{"key":"key2020012211505245800_ref012","doi-asserted-by":"crossref","unstructured":"Kursuncu, U., Gaur, M., Lokala, U., Illendula, A., Thirunarayan, K., Daniulaityte, R. and Arpinar, I.B. (2018), \u201c\u2018What\u2019s ur type?\u2019 contextualized classification of user types in marijuana-related communications using compositional multiview embedding\u201d Proceedings of the 2018 IEEE\/WIC\/ACM International Conference on Web Intelligence, Santiago, December 3-6, available at: https:\/\/doi.org\/10.1109\/WI.2018.00-50","DOI":"10.1109\/WI.2018.00-50"},{"first-page":"2267","article-title":"Recurrent convolutional neural networks for text classification","year":"2015","key":"key2020012211505245800_ref013"},{"journal-title":"BJU International","article-title":"Tweet this: how advocacy for breast and prostate cancers stacks up on social media","year":"2017","key":"key2020012211505245800_ref014"},{"year":"2015","key":"key2020012211505245800_ref015","article-title":"Organizations are users too: characterizing and detecting the presence of organizations on Twitter"},{"issue":"4","key":"key2020012211505245800_ref016","article-title":"A new dimension of health care: systematic review of the uses, benefits, and limitations of social media for health communication","volume":"15","year":"2013","journal-title":"Journal of Medical Internet Research"},{"issue":"6","key":"key2020012211505245800_ref017","doi-asserted-by":"publisher","first-page":"1032","DOI":"10.1136\/amiajnl-2014-00265","article-title":"Tweeting it off: characteristics of adults who tweet about a weight loss attempt","volume":"21","year":"2014","journal-title":"Journal of the American Medical Informatics Association"},{"issue":"2","key":"key2020012211505245800_ref018","doi-asserted-by":"crossref","first-page":"188","DOI":"10.1080\/10810730.2015.1058435","article-title":"Tweeting as health communication: health organizations\u2019 use of Twitter for health promotion and public engagement","volume":"21","year":"2016","journal-title":"Journal of Health Communication"},{"year":"2011","key":"key2020012211505245800_ref019","article-title":"You are what you Tweet: analyzing twitter for public health"},{"year":"2011","key":"key2020012211505245800_ref020","article-title":"A machine learning approach to Twitter user classification"},{"first-page":"1754","article-title":"An analysis of the user occupational class through Twitter content","year":"2015","key":"key2020012211505245800_ref021"},{"year":"2017","key":"key2020012211505245800_ref022","article-title":"Beyond binary labels: political ideology prediction of Twitter users"},{"issue":"2","key":"key2020012211505245800_ref023","article-title":"Measuring audience engagement for public health Twitter chats: insights from #LiveFitNOLA","volume":"3","year":"2017","journal-title":"JMIR Public Health Surveillance"},{"first-page":"37","article-title":"Classifying latent user attributes in Twitter","year":"2010","key":"key2020012211505245800_ref024"},{"key":"key2020012211505245800_ref025","article-title":"Use of Twitter to monitor attitudes toward depression and schizophrenia: an exploratory study","volume":"2","year":"2014","journal-title":"PeerJ"},{"issue":"11","key":"key2020012211505245800_ref026","doi-asserted-by":"crossref","first-page":"1367","DOI":"10.1016\/j.acra.2016.07.012","article-title":"What do patients tweet about their mammography experience?","volume":"23","year":"2016","journal-title":"Academic Radiology"},{"key":"key2020012211505245800_ref027","doi-asserted-by":"crossref","unstructured":"Singh, K. and John, A. (2015), \u201cA study of tweet chats for breast cancer patients\u201d, in Gruzd, A., Jacobson, J., Mai, P. and Wellman, B. (Eds), Proceedings of the 2015 International Conference on Social Media & Society, ACM, New York, NY, p. 6.","DOI":"10.1145\/2789187.2789193"},{"first-page":"18","article-title":"#Swineflu: Twitter predicts swine flu outbreak in 2009","year":"2012","key":"key2020012211505245800_ref300"},{"first-page":"1505","article-title":"Structural aspects of user roles in information cascades","year":"2017","key":"key2020012211505245800_ref028"},{"key":"key2020012211505245800_ref029","article-title":"Adoption and use of social media among public health departments","volume":"12","year":"2012","journal-title":"BMC Public Health"},{"first-page":"1161","article-title":"Identifying communicator roles in Twitter","year":"2012","key":"key2020012211505245800_ref030"},{"issue":"5","key":"key2020012211505245800_ref031","article-title":"Do cancer patients tweet? Examining the Twitter use of cancer patients in Japan","volume":"16","year":"2014","journal-title":"Journal of Medical Internet Research"},{"year":"2014","key":"key2020012211505245800_ref032","article-title":"Understanding types of users on Twitter"},{"issue":"5","key":"key2020012211505245800_ref033","first-page":"360","article-title":"Understanding interobserver agreement: the kappa statistic","volume":"37","year":"2005","journal-title":"Family Medicine"},{"issue":"1","key":"key2020012211505245800_ref034","article-title":"Leveraging social media to promote public health knowledge: example of cancer awareness via Twitter","volume":"2","year":"2016","journal-title":"JMIR Public Health Surveillance"},{"first-page":"684","article-title":"Steeler nation, 12th man, and boo birds: classifying Twitter user interests using time series","year":"2013","key":"key2020012211505245800_ref035"},{"issue":"3","key":"key2020012211505245800_ref036","first-page":"431","article-title":"A comparison of information sharing behaviours across 379 health conditions on Twitter","volume":"64","year":"2018","journal-title":"International Journal of Public Health"},{"key":"key2020012211505245800_ref037","first-page":"925","article-title":"Hate speech detection: a solved problem? The challenging case of long tail on Twitter","volume-title":"Semantic Web","year":"2018"},{"first-page":"85","article-title":"Entity deduplication on ScholarlyData","year":"2017","key":"key2020012211505245800_ref038"}],"container-title":["Online Information Review"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/OIR-05-2019-0143\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/OIR-05-2019-0143\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T22:43:02Z","timestamp":1753396982000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/oir\/article\/44\/1\/213-237\/323524"}},"subtitle":["Mining useful features from Twitter user profiles for Twitter user classification in the public health domain"],"short-title":[],"issued":{"date-parts":[[2019,12,17]]},"references-count":39,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2019,12,17]]}},"alternative-id":["10.1108\/OIR-05-2019-0143"],"URL":"https:\/\/doi.org\/10.1108\/oir-05-2019-0143","relation":{},"ISSN":["1468-4527"],"issn-type":[{"type":"print","value":"1468-4527"}],"subject":[],"published":{"date-parts":[[2019,12,17]]}}}