{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,15]],"date-time":"2025-11-15T10:27:58Z","timestamp":1763202478993,"version":"build-2065373602"},"reference-count":25,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2021,1,16]],"date-time":"2021-01-16T00:00:00Z","timestamp":1610755200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],"abstract":"<jats:p>Social media sites are considered one of the most important sources of data in many fields, such as health, education, and politics. While surveys provide explicit answers to specific questions, posts in social media have the same answers implicitly occurring in the text. This research aims to develop a method for extracting implicit answers from large tweet collections, and to demonstrate this method for an important concern: the problem of heart attacks. The approach is to collect tweets containing \u201cheart attack\u201d and then select from those the ones with useful information. Informational tweets are those which express real heart attack issues, e.g., \u201cYesterday morning, my grandfather had a heart attack while he was walking around the garden.\u201d On the other hand, there are non-informational tweets such as \u201cDropped my iPhone for the first time and almost had a heart attack.\u201d The starting point was to manually classify around 7000 tweets as either informational (11%) or non-informational (89%), thus yielding a labeled dataset to use in devising a machine learning classifier that can be applied to our large collection of over 20 million tweets. 
Tweets were cleaned and converted to a vector representation, suitable to be fed into different machine-learning algorithms: Deep neural networks, support vector machine (SVM), J48 decision tree and na\u00efve Bayes. Our experimentation aimed to find the best algorithm to use to build a high-quality classifier. This involved splitting the labeled dataset, with 2\/3 used to train the classifier and 1\/3 used for evaluation besides cross-validation methods. The deep neural network (DNN) classifier obtained the highest accuracy (95.2%). In addition, it obtained the highest F1-scores with (73.6%) and (97.4%) for informational and non-informational classes, respectively.<\/jats:p>","DOI":"10.3390\/fi13010019","type":"journal-article","created":{"date-parts":[[2021,1,18]],"date-time":"2021-01-18T05:17:34Z","timestamp":1610947054000},"page":"19","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":15,"title":["A Classifier to Detect Informational vs. 
Non-Informational Heart Attack Tweets"],"prefix":"10.3390","volume":"13","author":[{"given":"Ola","family":"Karajeh","sequence":"first","affiliation":[{"name":"Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dirar","family":"Darweesh","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Jordan University of Science and Technology, 3030 Irbid, Jordan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8346-7148","authenticated-orcid":false,"given":"Omar","family":"Darwish","sequence":"additional","affiliation":[{"name":"Computer Technology and Information Systems, Ferrum College, Ferrum, VA 24088, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5260-7734","authenticated-orcid":false,"given":"Noor","family":"Abu-El-Rub","sequence":"additional","affiliation":[{"name":"Kansas Medical Center, Kansas City, MO 67002, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0316-3641","authenticated-orcid":false,"given":"Belal","family":"Alsinglawi","sequence":"additional","affiliation":[{"name":"School of Computer Data and Mathematical Sciences, Western Sydney University, Rydalmere, NSW 2116, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nasser","family":"Alsaedi","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Taibah University, 2003 Medina, Saudi Arabia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,1,16]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Shahare, F.F. (2017, January 15\u201316). Sentiment analysis for the news data based on the social media. 
Proceedings of the 2017 International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India.","DOI":"10.1109\/ICCONS.2017.8250692"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"156","DOI":"10.1016\/j.ijinfomgt.2017.12.002","article-title":"Social media analytics\u2014Challenges in topic discovery, data collection, and data preparation","volume":"39","author":"Stieglitz","year":"2018","journal-title":"Int. J. Inf. Manag."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1111\/hir.12192","article-title":"Consumer health information seeking in social media: A literature review","volume":"34","author":"Zhao","year":"2017","journal-title":"Health Inf. Libr. J."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Sidana, S., Mishra, S., Amer-Yahia, S., Clausel, M., and Amini, M.-R. (2016, January 17\u201321). Health Monitoring on Social Media over Time. Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, New York, NY, USA.","DOI":"10.1145\/2911451.2914697"},{"key":"ref_5","unstructured":"Sarker, A., and Gonzalez, G. (2016, January 12). Data, tools and resources for mining social media drug chatter. Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM2016), Osaka, Japan."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Pershad, Y., Hangge, P., Albadawi, H., and Oklu, R. (2018). Social medicine: Twitter in healthcare. J. Clin. Med., 7.","DOI":"10.3390\/jcm7060121"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Kale, S., and Padmadas, V. (2017, January 17\u201318). Sentiment Analysis of Tweets Using Semantic Analysis. 
Proceedings of the 2017 International Conference on Computing, Communication, Control and Automation (ICCUBEA), Pune, India.","DOI":"10.1109\/ICCUBEA.2017.8464011"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Harish, B.N., Reena, K., Kumar, S., and Zhong, J. (2018). How much do you care? Mining and Analysis of Tweets Pertaining to Health Issues. SoutheastCon 2018, IEEE.","DOI":"10.1109\/SECON.2018.8478865"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"20617","DOI":"10.1109\/ACCESS.2017.2740982","article-title":"A Pattern-Based Approach for Multi-Class Sentiment Analysis in Twitter","volume":"5","author":"Bouazizi","year":"2017","journal-title":"IEEE Access"},{"key":"ref_10","unstructured":"(2019, July 31). Heart Disease: Facts, Statistics, and You. Available online: https:\/\/www.healthline.com\/health\/heart-disease\/statisticsn#1."},{"key":"ref_11","first-page":"49","article-title":"Role of social media in health-care domain: An integrated review","volume":"7","author":"Sridevi","year":"2017","journal-title":"Int. J. Eng. Res. Appl."},{"key":"ref_12","first-page":"1","article-title":"Effect of social media on human health","volume":"2","author":"Tripathi","year":"2018","journal-title":"Virol. Immunol. J."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1016\/j.socscimed.2014.08.019","article-title":"Social networks and health: A systematic review of sociocentric network studies in low- and middle-income countries","volume":"125","author":"Perkins","year":"2015","journal-title":"Soc. Sci. Med."},{"key":"ref_14","unstructured":"Paul, M.J., Sarker, A., Brownstein, J.S., Nikfarjam, A., Scotch, M., Smith, K.L., and Gonzalez, G. (2016, January 4\u20138). Social media mining for public health monitoring and surveillance. Proceedings of the Pacific Symposium on Biocomputing, Kohala Coast, HI, USA."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Sutar, S.G. (2017, January 15\u201316). 
Intelligent data mining technique of social media for improving health care. Proceedings of the 2017 International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India.","DOI":"10.1109\/ICCONS.2017.8250690"},{"key":"ref_16","first-page":"599","article-title":"Applying Data Mining to Healthcare: A Study of Social Network of Physicians and Patient Journeys","volume":"Volume 9729","author":"Perner","year":"2016","journal-title":"Computer Vision"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"e109","DOI":"10.2196\/jmir.7087","article-title":"Understanding health care social media use from different stakeholder perspectives: A content analysis of an on-line health community","volume":"19","author":"Lu","year":"2017","journal-title":"J. Med. Internet Res."},{"key":"ref_18","unstructured":"(2020, December 12). Twitter Can Predict Rates of Coronary Heart Disease|Authentic Happiness. Available online: https:\/\/www.authentichappiness.sas.upenn.edu\/news\/twitter-can-predict-rates-coronary-heart-disease."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"e5656","DOI":"10.7717\/peerj.5656","article-title":"Does Twitter language reliably predict heart disease? A commentary on Eichstaedt et al. (2015a)","volume":"6","author":"Brown","year":"2018","journal-title":"PeerJ"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"340","DOI":"10.1016\/j.procs.2017.08.009","article-title":"Comparative study of word embedding methods in topic segmentation","volume":"112","author":"Naili","year":"2017","journal-title":"Procedia Comput. Sci."},{"key":"ref_21","unstructured":"(2020, November 01). Document-Term Matrix. Available online: https:\/\/en.wikipedia.org\/wiki\/Document-term-matrix."},{"key":"ref_22","first-page":"62","article-title":"An introduction to dimensionality reduction using matlab","volume":"1201","year":"2007","journal-title":"Report"},{"key":"ref_23","unstructured":"(2020, November 01). DLRL. 
Available online: https:\/\/dlib.vt.edu\/index.html."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Alsinglawi, B., Alnajjar, F., Mubin, O., Novoa, M., Alorjani, M., Karajeh, O., and Darwish, O. (2020, January 20\u201324). Predicting Length of Stay for Cardiovascular Hospitalizations in the Intensive Care Unit: Machine Learning Approach. Proceedings of the 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Montreal, QC, Canada.","DOI":"10.1109\/EMBC44109.2020.9175889"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Darwish, O., Al-Fuqaha, A., Brahim, G.B., Jenhani, I., and Anan, M. (2018, January 25\u201329). Towards a streaming approach to the mitigation of covert timing channels. Proceedings of the 2018 14th International Wireless Communications Mobile Computing Conference (IWCMC), Limassol, Cyprus.","DOI":"10.1109\/IWCMC.2018.8450468"}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/13\/1\/19\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:11:50Z","timestamp":1760159510000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/13\/1\/19"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,1,16]]},"references-count":25,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2021,1]]}},"alternative-id":["fi13010019"],"URL":"https:\/\/doi.org\/10.3390\/fi13010019","relation":{},"ISSN":["1999-5903"],"issn-type":[{"type":"electronic","value":"1999-5903"}],"subject":[],"published":{"date-parts":[[2021,1,16]]}}}