{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T04:30:13Z","timestamp":1772166613768,"version":"3.50.1"},"reference-count":42,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,2,27]],"date-time":"2021-02-27T00:00:00Z","timestamp":1614384000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,2,27]],"date-time":"2021-02-27T00:00:00Z","timestamp":1614384000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Big Data"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>The widespread influence of social media impacts every aspect of life, including the healthcare sector. Although medics and health professionals are the final decision makers, the advice and recommendations obtained from fellow patients are significant. In this context, the present paper explores the topics of discussion posted by breast cancer patients and survivors on online forums. The study examines an online forum, Breastcancer.org, maps the discussion entries to several topics, and proposes a machine learning model based on a classification algorithm to characterize the topics. To explore the topics of breast cancer patients and survivors, approximately 1000 posts are selected and manually labeled with annotations. In contrast, millions of posts are available to build the labels. A semi-supervised learning technique is used to build the labels for the unlabeled data; hence, the large data are classified using a deep learning algorithm. The deep learning algorithm BiLSTM with BERT word embedding technique provided a better f1-score of 79.5%. This method is able to classify the following topics: medication reviews, clinician knowledge, various treatment options, seeking and providing support, diagnostic procedures, financial issues and implications for everyday life. What matters the most for the patients is coping with everyday living as well as seeking and providing emotional and informational support. The approach and findings show the potential of studying social media to provide insight into patients' experiences with cancer like critical health problems.<\/jats:p>","DOI":"10.1186\/s40537-021-00429-7","type":"journal-article","created":{"date-parts":[[2021,2,27]],"date-time":"2021-02-27T06:02:53Z","timestamp":1614405773000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Annotating and detecting topics in social media forum and modelling the annotation to derive directions-a case study"],"prefix":"10.1186","volume":"8","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7830-2461","authenticated-orcid":false,"given":"B.","family":"Athira","sequence":"first","affiliation":[]},{"given":"Josette","family":"Jones","sequence":"additional","affiliation":[]},{"given":"Sumam Mary","family":"Idicula","sequence":"additional","affiliation":[]},{"given":"Anand","family":"Kulanthaivel","sequence":"additional","affiliation":[]},{"given":"Enming","family":"Zhang","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,2,27]]},"reference":[{"issue":"2","key":"429_CR1","doi-asserted-by":"publisher","first-page":"192","DOI":"10.1007\/s10278-013-9653-0","volume":"27","author":"BJ Kolowitz","year":"2014","unstructured":"Kolowitz BJ, Lauro GR, Venturella J, Georgiev V, Barone M, Deible C, Shrestha R. Clinical social networking\u2014a new revolution in provider communication and delivery of clinical information across providers of care? J Digital Imag. 2014;27(2):192\u20139.","journal-title":"J Digital Imag"},{"issue":"1","key":"429_CR2","first-page":"80","volume":"29","author":"EL Medina","year":"2016","unstructured":"Medina EL, Mesquita CT, Loques Filho O. Healthcare social networks for patients with cardiovascular diseases and recommendation systems. Int J Cardiovasc Sci. 2016;29(1):80\u20135.","journal-title":"Int J Cardiovasc Sci."},{"issue":"3","key":"429_CR3","doi-asserted-by":"publisher","first-page":"e13060","DOI":"10.1111\/ecc.13060","volume":"28","author":"\u0160 Miro\u0161evi\u010d","year":"2019","unstructured":"Miro\u0161evi\u010d \u0160, Prins JB, Seli\u010d P, Zaletel Kragelj L, Klemenc Keti\u0161 Z. Prevalence and factors associated with unmet needs in post-treatment cancer survivors: a systematic review. Eur J Cancer Care. 2019;28(3):e13060.","journal-title":"Eur J Cancer Care"},{"issue":"1","key":"429_CR4","doi-asserted-by":"publisher","first-page":"103","DOI":"10.1093\/fampra\/cmz043","volume":"37","author":"DN Lo-Fo-Wong","year":"2020","unstructured":"Lo-Fo-Wong DN, de Haes HC, Aaronson NK, van Abbema DL, Admiraal JM, den Boer MD, van Hezewijk M, Immink M, Kaptein AA, Menke-Pluijmers MB, Russell NS. Health care use and remaining needs for support among women with breast cancer in the first 15 months after diagnosis: the role of the GP. Family Pract. 2020;37(1):103\u20139.","journal-title":"Family Pract."},{"issue":"3","key":"429_CR5","doi-asserted-by":"publisher","first-page":"e13086","DOI":"10.1111\/ecc.13086","volume":"28","author":"D Brandenbarg","year":"2019","unstructured":"Brandenbarg D, Maass SW, Geerse OP, Stegmann ME, Handberg C, Schroevers MJ, Duijts SF. A systematic review on the prevalence of symptoms of depression, anxiety and distress in long-term cancer survivors: implications for primary care. Eur J Cancer Care. 2019;28(3):e13086.","journal-title":"Eur J Cancer Care"},{"issue":"3","key":"429_CR6","doi-asserted-by":"publisher","first-page":"895","DOI":"10.1007\/s00520-016-3479-5","volume":"25","author":"R Selove","year":"2017","unstructured":"Selove R, Foster M, Wujcik D, Sanderson M, Hull PC, Shen-Miller D, Wolff S, Friedman D. Psychosocial concerns and needs of cancer survivors treated at a comprehensive cancer center and a community safety net hospital. Supportive Care Cancer. 2017;25(3):895\u2013904.","journal-title":"Supportive Care Cancer."},{"issue":"4","key":"429_CR7","doi-asserted-by":"publisher","first-page":"e45","DOI":"10.2196\/medinform.9162","volume":"6","author":"J Jones","year":"2018","unstructured":"Jones J, Pradhan M, Hosseini M, Kulanthaivel A, Hosseini M. Novel approach to cluster patient-generated data into actionable topics: case study of a web-based breast cancer forum. JMIR Med Inform. 2018;6(4):e45.","journal-title":"JMIR Med Inform"},{"key":"429_CR8","doi-asserted-by":"crossref","unstructured":"Nakikj D, Mamykina L. A park or a highway: Overcoming tensions in designing for socio-emotional and informational needs in online health communities. InProceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing 2017\u00a0(pp.\u00a01304\u20131319).","DOI":"10.1145\/2998181.2998339"},{"issue":"12","key":"429_CR9","doi-asserted-by":"publisher","first-page":"1397","DOI":"10.1007\/s40264-018-0707-6","volume":"41","author":"K Smith","year":"2018","unstructured":"Smith K, Golder S, Sarker A, Loke Y, O\u2019Connor K, Gonzalez-Hernandez G. Methods to compare adverse events in twitter to faers, drug information databases, and systematic reviews: proof of concept with adalimumab. Drug Safety. 2018;41(12):1397\u2013410.","journal-title":"Drug Safety."},{"issue":"3","key":"429_CR10","doi-asserted-by":"publisher","first-page":"496","DOI":"10.1093\/jamia\/ocv175","volume":"23","author":"AL Hartzler","year":"2016","unstructured":"Hartzler AL, Taylor MN, Park A, Griffiths T, Backonja U, McDonald DW, Wahbeh S, Brown C, Pratt W. Leveraging cues from person-generated health data for peer matching in online communities. J Am Med Inform Assoc. 2016;23(3):496\u2013507.","journal-title":"J Am Med Inform Assoc."},{"issue":"02","key":"429_CR11","doi-asserted-by":"publisher","first-page":"160","DOI":"10.3414\/ME12-02-0003","volume":"52","author":"T Chomutare","year":"2013","unstructured":"Chomutare T, \u00c5rsand E, Fernandez-Luque L, Lauritzen J, Hartvigsen G. Inferring community structure in healthcare forums. Methods Inf Med. 2013;52(02):160\u20137.","journal-title":"Methods Inf Med"},{"key":"429_CR12","doi-asserted-by":"publisher","first-page":"26","DOI":"10.1016\/j.dss.2018.10.005","volume":"116","author":"X Wang","year":"2019","unstructured":"Wang X, Zhao K, Cha S, Amato MS, Cohn AM, Pearson JL, Papandonatos GD, Graham AL. Mining user-generated content in an online smoking cessation community to identify smoking status: a machine learning approach. Decision Support Syst. 2019;116:26\u201334.","journal-title":"Decision Support Syst."},{"key":"429_CR13","doi-asserted-by":"crossref","unstructured":"Durant KT, McCray AT, Safran C. Modeling the temporal evolution of an online cancer forum. InProceedings of the 1st ACM International Health Informatics Symposium 2010\u00a0(pp.\u00a0356\u2013365).","DOI":"10.1145\/1882992.1883042"},{"key":"429_CR14","doi-asserted-by":"crossref","unstructured":"Vlahovic TA, Wang YC, Kraut RE, Levine JM. Support matching and satisfaction in an online breast cancer support community. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems 2014\u00a0(pp.\u00a01625\u20131634).","DOI":"10.1145\/2556288.2557108"},{"issue":"2","key":"429_CR15","doi-asserted-by":"publisher","first-page":"e48","DOI":"10.2196\/jmir.6895","volume":"19","author":"D Mowery","year":"2017","unstructured":"Mowery D, Smith H, Cheney T, Stoddard G, Coppersmith G, Bryan C, Conway M. Understanding depressive symptoms and psychosocial stressors on Twitter: a corpus-based study. J Med Internet Res. 2017;19(2):e48.","journal-title":"J Med Internet Res"},{"issue":"9","key":"429_CR16","doi-asserted-by":"publisher","first-page":"1158","DOI":"10.1080\/10410236.2017.1339370","volume":"33","author":"ML Cabling","year":"2018","unstructured":"Cabling ML, Turner JW, Hurtado-de-Mendoza A, Zhang Y, Jiang X, Drago F, Sheppard VB. Sentiment analysis of an online breast cancer support group: communicating about tamoxifen. Health Commun. 2018;33(9):1158\u201365.","journal-title":"Health Commun."},{"key":"429_CR17","unstructured":"Elhadad N, Zhang S, Driscoll P, Brody S. Characterizing the sublanguage of online breast cancer forums for medications, symptoms, and emotions. InAMIA Annual Symposium Proceedings 2014 (Vol.\u00a02014, p.\u00a0516). American Medical Informatics Association."},{"issue":"4","key":"429_CR18","doi-asserted-by":"publisher","first-page":"1049","DOI":"10.1109\/TCSS.2018.2879044","volume":"5","author":"CC Yang","year":"2018","unstructured":"Yang CC, Jiang L. Enriching user experience in online health communities through thread recommendations and heterogeneous information network mining. IEEE Trans Comput Social Syst. 2018;5(4):1049\u201360.","journal-title":"IEEE Trans Comput Social Syst"},{"key":"429_CR19","unstructured":"Liu Y, Xu S, Yoon HJ, Tourassi G. Extracting patient demographics and personal medical information from online health forums. InAMIA Annual Symposium Proceedings 2014 (Vol.\u00a02014, p.\u00a01825). American Medical Informatics Association."},{"key":"429_CR20","doi-asserted-by":"crossref","unstructured":"Nguyen LH, Salopek A, Zhao L, Jin F. A natural language normalization approach to enhance social media text reasoning. In2017 IEEE International Conference on Big Data (Big Data) 2017\u00a0(pp.\u00a02019\u20132026). IEEE, New York.","DOI":"10.1109\/BigData.2017.8258148"},{"key":"429_CR21","doi-asserted-by":"crossref","unstructured":"Lee K, Hasan SA, Farri O, Choudhary A, Agrawal A. Medical concept normalization for online user-generated texts. In2017 IEEE International Conference on Healthcare Informatics (ICHI) 2017 Aug 23 (pp.\u00a0462\u2013469). IEEE, New York.","DOI":"10.1109\/ICHI.2017.59"},{"key":"429_CR22","doi-asserted-by":"publisher","first-page":"2","DOI":"10.1016\/j.sbspro.2011.10.577","volume":"1","author":"E Clark","year":"2011","unstructured":"Clark E, Araki K. Text normalization in social media: progress, problems and applications for a pre-processing system of casual English. Procedia Social Behav Sci. 2011;1:2\u201311.","journal-title":"Procedia Social Behav Sci"},{"issue":"1","key":"429_CR23","doi-asserted-by":"publisher","first-page":"208","DOI":"10.1055\/s-0039-1677918","volume":"28","author":"M Conway","year":"2019","unstructured":"Conway M, Hu M, Chapman WW. Recent advances in using natural language processing to address public health research questions using social media and consumergenerated data. Yearbook Med Inform. 2019;28(1):208.","journal-title":"Yearbook Med Inform"},{"key":"429_CR24","unstructured":"Mikolov T, Chen K, Corrado G, Dean J. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781. 2013."},{"key":"429_CR25","unstructured":"Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J. Distributed representations of words and phrases and their compositionality. InAdvances in neural information processing systems 2013;\u00a0pp.\u00a03111\u20133119."},{"issue":"3","key":"429_CR26","first-page":"443","volume":"7","author":"S Momtazi","year":"2019","unstructured":"Momtazi S, Rahbar A, Salami D, Khanijazani I. A joint semantic vector representation model for text clustering and classification. J AI Data Mining. 2019;7(3):443\u201350.","journal-title":"J AI Data Mining."},{"key":"429_CR27","unstructured":"Devlin J, Chang MW, Lee K, Toutanova K. Bert. Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. 2018."},{"key":"429_CR28","doi-asserted-by":"crossref","unstructured":"Singh AK, Shashi M. Vectorization of Text Documents for Identifying Unifiable News Articles. Int J Adv Comput Sci Appl. 2019;10.","DOI":"10.14569\/IJACSA.2019.0100742"},{"key":"429_CR29","doi-asserted-by":"crossref","first-page":"100057","DOI":"10.1016\/j.yjbinx.2019.100057","volume":"1","author":"FK Khattak","year":"2019","unstructured":"Khattak FK, Jeblee S, Pou-Prom C, Abdalla M, Meaney C, Rudzicz F. A survey of word embeddings for clinical text. J Biomed Inform X. 2019;1:100057.","journal-title":"J Biomed Inform X"},{"key":"429_CR30","doi-asserted-by":"crossref","unstructured":"Chen L. \u201cA Classification Framework for Online Social Support Using Deep Learning.\u201c International Conference on Human-Computer Interaction. Springer, Cham, 2019.","DOI":"10.1007\/978-3-030-22338-0_14"},{"key":"429_CR31","doi-asserted-by":"crossref","unstructured":"Zhu B, Cai X, Cai R. \u201cAnswer Quality Evaluation in Online Health Care Community.\u201c 2018 International Conference on Network, Communication, Computer Engineering (NCCE 2018). Atlantis Press, 2018.","DOI":"10.2991\/ncce-18.2018.143"},{"key":"429_CR32","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.jbi.2017.03.012","volume":"69","author":"S Zhang","year":"2017","unstructured":"Zhang S, Grave E, Sklar E, Elhadad N. Longitudinal analysis of discussion topics in an online breast cancer community using convolutional neural networks. J Biomed Inform. 2017;69:1\u20139.","journal-title":"J Biomed Inform"},{"issue":"3","key":"429_CR33","doi-asserted-by":"publisher","first-page":"367","DOI":"10.3233\/IDA-130584","volume":"17","author":"MR Keyvanpour","year":"2013","unstructured":"Keyvanpour MR, Imani MB. Semi-supervised text categorization: exploiting unlabeled data using ensemble learning algorithms. Intelligent Data Anal. 2013;17(3):367\u201385.","journal-title":"Intelligent Data Anal."},{"key":"429_CR34","doi-asserted-by":"crossref","unstructured":"Sigdel M, et al. \u201cEvaluation of semi-supervised learning for classification of protein crystallization imagery.\u201c IEEE SOUTHEASTCON 2014. IEEE,\u00a0 New York.\u00a02014.","DOI":"10.1109\/SECON.2014.6950649"},{"key":"429_CR35","doi-asserted-by":"crossref","unstructured":"Gowda HS, Suhil M, Guru DS, Raju LN. Semi-supervised text categorization using recursive K-means clustering. InInternational Conference on Recent Trends in Image Processing and Pattern Recognition 2016 Dec 16 (pp.\u00a0217\u2013227). Springer, Singapore.","DOI":"10.1007\/978-981-10-4859-3_20"},{"key":"429_CR36","unstructured":"Seeger M. Learning with labeled and unlabeled data. Technical report, University of Edinburgh, Tech. Rep. 2001."},{"key":"429_CR37","doi-asserted-by":"crossref","unstructured":"Jalan R, Gupta M, Varma V. Medical forum question classification using deep learning. In European Conference on Information Retrieval 2018 Mar 26 (pp.\u00a045\u201358). Springer, Cham.","DOI":"10.1007\/978-3-319-76941-7_4"},{"issue":"2","key":"429_CR38","first-page":"2","volume":"1","author":"P Mayring","year":"2000","unstructured":"Mayring P. Qualitative content analysis forum qualitative sozialforschung. InForum Qual Social Res. 2000;1(2):2.","journal-title":"InForum Qual Social Res"},{"key":"429_CR39","doi-asserted-by":"publisher","first-page":"158","DOI":"10.1016\/j.ijmedinf.2017.10.006","volume":"108","author":"RJ Holden","year":"2017","unstructured":"Holden RJ, Kulanthaivel A, Purkayastha S, Goggins KM, Kripalani S. Know thy eHealth user: development of biopsychosocial personas from a study of older adults with heart failure. Int J Med Inform. 2017;108:158\u201367.","journal-title":"Int J Med Inform"},{"issue":"1","key":"429_CR40","doi-asserted-by":"publisher","first-page":"170","DOI":"10.1016\/j.neunet.2018.11.009","volume":"110","author":"Y Chen","year":"2019","unstructured":"Chen Y, Chang H, Meng J, Zhang D. Ensemble Neural Networks (ENN): a gradient-free stochastic method. Neural Netw. 2019;110(1):170\u201385.","journal-title":"Neural Netw"},{"issue":"1\u20132","key":"429_CR41","doi-asserted-by":"publisher","first-page":"239","DOI":"10.1016\/S0004-3702(02)00190-X","volume":"1;137","author":"ZH Zhou","year":"2002","unstructured":"Zhou ZH, Wu J, Tang W. Ensembling neural networks: many could be better than all. Artif Intell. 2002 May;1;137(1\u20132):239\u201363.","journal-title":"Artif Intell"},{"issue":"4","key":"429_CR42","doi-asserted-by":"publisher","first-page":"497","DOI":"10.1109\/5326.983933","volume":"31","author":"R Polikar","year":"2001","unstructured":"Polikar R, Upda L, Upda SS, Honavar V. Learn++. An incremental learning algorithm for supervised neural networks. IEEE Trans Syst Man Cybern Part C. 2001;31(4):497\u2013508.","journal-title":"IEEE Trans Syst Man Cybern Part C."}],"container-title":["Journal of Big Data"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-021-00429-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s40537-021-00429-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-021-00429-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,19]],"date-time":"2022-12-19T02:11:57Z","timestamp":1671415917000},"score":1,"resource":{"primary":{"URL":"https:\/\/journalofbigdata.springeropen.com\/articles\/10.1186\/s40537-021-00429-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,27]]},"references-count":42,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["429"],"URL":"https:\/\/doi.org\/10.1186\/s40537-021-00429-7","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-132773\/v1","asserted-by":"object"},{"id-type":"doi","id":"10.21203\/rs.3.rs-132773\/v2","asserted-by":"object"}]},"ISSN":["2196-1115"],"issn-type":[{"value":"2196-1115","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,2,27]]},"assertion":[{"value":"17 December 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 February 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 February 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"41"}}