{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,7]],"date-time":"2026-03-07T19:53:42Z","timestamp":1772913222666,"version":"3.50.1"},"reference-count":54,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2017,2,22]],"date-time":"2017-02-22T00:00:00Z","timestamp":1487721600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"Leonard David Institute"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Objective<\/jats:title><jats:p>Social media is an important pharmacovigilance data source for adverse drug reaction (ADR) identification. Human review of social media data is infeasible due to data quantity, thus natural language processing techniques are necessary. Social media includes informal vocabulary and irregular grammar, which challenge natural language processing methods. Our objective is to develop a scalable, deep-learning approach that exceeds state-of-the-art ADR detection performance in social media.<\/jats:p><\/jats:sec><jats:sec><jats:title>Materials and Methods<\/jats:title><jats:p>We developed a recurrent neural network (RNN) model that labels words in an input sequence with ADR membership tags. The only input features are word-embedding vectors, which can be formed through task-independent pretraining or during ADR detection training.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Our best-performing RNN model used pretrained word embeddings created from a large, non\u2013domain-specific Twitter dataset. 
It achieved an approximate match F-measure of 0.755 for ADR identification on the dataset, compared to 0.631 for a baseline lexicon system and 0.65 for the state-of-the-art conditional random field model. Feature analysis indicated that semantic information in pretrained word embeddings boosted sensitivity and, combined with contextual awareness captured in the RNN, precision.<\/jats:p><\/jats:sec><jats:sec><jats:title>Discussion<\/jats:title><jats:p>Our model required no task-specific feature engineering, suggesting generalizability to additional sequence-labeling tasks. Learning curve analysis showed that our model reached optimal performance with fewer training examples than the other models.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>ADR detection performance in social media is significantly improved by using a contextually aware model and word embeddings formed from large, unlabeled datasets. The approach reduces manual data-labeling requirements and is scalable to large social media datasets.<\/jats:p><\/jats:sec>","DOI":"10.1093\/jamia\/ocw180","type":"journal-article","created":{"date-parts":[[2016,12,22]],"date-time":"2016-12-22T20:19:37Z","timestamp":1482437977000},"page":"813-821","source":"Crossref","is-referenced-by-count":171,"title":["Deep learning for pharmacovigilance: recurrent neural network architectures for labeling adverse drug reactions in Twitter posts"],"prefix":"10.1093","volume":"24","author":[{"given":"Anne","family":"Cocos","sequence":"first","affiliation":[{"name":"Department of Biomedical and Health Informatics, The Children\u2019s Hospital of Philadelphia Philadelphia, PA, USA"}]},{"given":"Alexander G","family":"Fiks","sequence":"additional","affiliation":[{"name":"Department of Biomedical and Health Informatics, The Children\u2019s Hospital of Philadelphia Philadelphia, PA, USA"}]},{"given":"Aaron J","family":"Masino","sequence":"additional","affiliation":[{"name":"Department of Biomedical and 
Health Informatics, The Children\u2019s Hospital of Philadelphia Philadelphia, PA, USA"}]}],"member":"286","published-online":{"date-parts":[[2017,2,22]]},"reference":[{"issue":"3","key":"2020110612443612900_ocw180-B1","doi-asserted-by":"crossref","first-page":"e33236","DOI":"10.1371\/journal.pone.0033236","article-title":"Percentage of patients with preventable adverse drug reactions and preventability of adverse drug reactions: a meta-analysis","volume":"7","author":"Hakkarainen","year":"2012","journal-title":"PLoS One."},{"key":"2020110612443612900_ocw180-B2","doi-asserted-by":"crossref","first-page":"S73","DOI":"10.4103\/0976-500X.120957","article-title":"Clinical and economic burden of adverse drug reactions","volume":"4","author":"Sultana","year":"2013","journal-title":"J Pharmacol Pharmacother."},{"issue":"1","key":"2020110612443612900_ocw180-B3","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1046\/j.1525-1497.2003.20130.x","article-title":"Adverse drug event monitoring at the Food and Drug Administration","volume":"18","author":"Ahmad","year":"2003","journal-title":"J Gen Intern Med."},{"issue":"2","key":"2020110612443612900_ocw180-B4","doi-asserted-by":"crossref","first-page":"e89829","DOI":"10.1371\/journal.pone.0089829","article-title":"Adverse drug reactions of spontaneous reports in Shanghai pediatric population","volume":"9","author":"Li","year":"2014","journal-title":"PLoS One."},{"issue":"5","key":"2020110612443612900_ocw180-B5","doi-asserted-by":"crossref","first-page":"409","DOI":"10.1177\/009286150804200501","article-title":"VigiBase, the WHO global ICSR database system: basic facts","volume":"42","author":"Lindquist","year":"2008","journal-title":"Drug Inform J."},{"issue":"6","key":"2020110612443612900_ocw180-B6","doi-asserted-by":"crossref","first-page":"498","DOI":"10.1056\/NEJMp1014427","article-title":"Developing the Sentinel System: a national resource for evidence 
development","volume":"364","author":"Behrman","year":"2011","journal-title":"N Engl J Med."},{"key":"2020110612443612900_ocw180-B7","doi-asserted-by":"crossref","first-page":"652","DOI":"10.1136\/jamia.2009.002477","article-title":"Development and evaluation of a common data model enabling active drug safety surveillance using disparate healthcare databases","volume":"17(6)","author":"Reisinger","year":"2010","journal-title":"J Am Med Inform Assoc."},{"issue":"6","key":"2020110612443612900_ocw180-B8","doi-asserted-by":"crossref","first-page":"671","DOI":"10.1136\/jamia.2010.008607","article-title":"Drug safety surveillance using de-identified EMR and claims data: issues and challenges","volume":"17","author":"Nadkarni","year":"2010","journal-title":"J Am Med Inform Assoc."},{"issue":"3","key":"2020110612443612900_ocw180-B9","doi-asserted-by":"crossref","first-page":"328","DOI":"10.1197\/jamia.M3028","article-title":"Active computerized pharmacovigilance using natural language processing, statistics, and electronic health records: a feasibility study","volume":"16","author":"Wang","year":"2009","journal-title":"J Am Med Inform Assoc."},{"issue":"6","key":"2020110612443612900_ocw180-B10","doi-asserted-by":"crossref","first-page":"547","DOI":"10.1038\/clpt.2013.47","article-title":"Pharmacovigilance using clinical notes","volume":"93","author":"LePendu","year":"2013","journal-title":"Clin Pharmacol Ther."},{"issue":"3","key":"2020110612443612900_ocw180-B11","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1136\/amiajnl-2012-000930","article-title":"Combing signals from spontaneous reports and electronic health records for detection of adverse drug reactions","volume":"20","author":"Harpaz","year":"2013","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612443612900_ocw180-B12","article-title":"Towards internet-age pharmacovigilance: extracting adverse drug reactions from user posts to health-related social 
networks","author":"Leaman"},{"issue":"6","key":"2020110612443612900_ocw180-B13","doi-asserted-by":"crossref","first-page":"989","DOI":"10.1016\/j.jbi.2011.07.005","article-title":"Identifying potential adverse effects using the web: a new approach to medical hypothesis generation","volume":"44","author":"Benton","year":"2011","journal-title":"J Biomed Inform."},{"key":"2020110612443612900_ocw180-B14","article-title":"Detecting signals of adverse drug reactions from health consumer contributed content in social media","author":"Yang"},{"key":"2020110612443612900_ocw180-B15","doi-asserted-by":"crossref","first-page":"816","DOI":"10.1007\/978-3-642-36973-5_92","article-title":"ADRTrace: detecting expected and unexpected adverse drug reactions from user reviews on social media sites","volume":"7814LNCS","author":"Yates","year":"2013","journal-title":"Adv Inform Retrieval."},{"issue":"3","key":"2020110612443612900_ocw180-B16","doi-asserted-by":"crossref","first-page":"404","DOI":"10.1136\/amiajnl-2012-001482","article-title":"Web-scale pharmacovigilance: listening to signals from the crowd","volume":"20","author":"White","year":"2013","journal-title":"J Am Med Inform Assoc."},{"issue":"5","key":"2020110612443612900_ocw180-B17","doi-asserted-by":"crossref","first-page":"343","DOI":"10.1007\/s40264-014-0155-x","article-title":"Digital drug safety surveillance: monitoring pharmaceutical products in Twitter","volume":"37","author":"Freifeld","year":"2014","journal-title":"Drug Saf."},{"key":"2020110612443612900_ocw180-B18","article-title":"Mining Twitter for adverse drug reaction mentions: a corpus and classification benchmark","author":"Ginn"},{"key":"2020110612443612900_ocw180-B19","doi-asserted-by":"crossref","article-title":"Identifying adverse drug events from health social media: a case study on heart disease discussion 
forums","author":"Liu","DOI":"10.1007\/978-3-319-08416-9_3"},{"key":"2020110612443612900_ocw180-B20","first-page":"924","article-title":"Pharmacovigilance on Twitter? Mining tweets for adverse drug reactions","author":"O\u2019Connor","year":"2014","journal-title":"AMIA Annu Symp Proc"},{"issue":"3","key":"2020110612443612900_ocw180-B21","doi-asserted-by":"crossref","first-page":"671","DOI":"10.1093\/jamia\/ocu041","article-title":"Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features","volume":"22","author":"Nikfarjam","year":"2015","journal-title":"J Am Med Inform Assoc."},{"key":"2020110612443612900_ocw180-B22","article-title":"Conditional random fields: probabilistic models for segmenting and labeling sequence data","author":"Lafferty"},{"key":"2020110612443612900_ocw180-B23","unstructured":"Wang W. Mining adverse drug reaction mentions in Twitter with word embeddings. In: Online Proceedings of the Pacific Symposium on Biocomputing Social Media Mining Shared Task Workshop 2016. http:\/\/diego.asu.edu\/psb2016\/acceptedpapers\/DLIR.pdf. 
Accessed August 1, 2016."},{"key":"2020110612443612900_ocw180-B24","first-page":"581","article-title":"Online Proceedings of the Social Media Mining Shared Task Workshop","volume":"21","author":"Sarker","year":"2016","journal-title":"Pacific Symposium on Biocomputing"},{"key":"2020110612443612900_ocw180-B25","first-page":"2524","article-title":"Recurrent neural networks for language understanding","author":"Yao","year":"2013"},{"key":"2020110612443612900_ocw180-B26","first-page":"1","article-title":"Supervised sequence labeling with recurrent neural networks (doctoral dissertation)","volume":"Springer; 2012","author":"Graves","journal-title":"Studies in Computational Intelligence 385"},{"key":"2020110612443612900_ocw180-B27","doi-asserted-by":"crossref","article-title":"Investigation of recurrent-neural-network architectures and learning methods for language understanding","author":"Mesnil","DOI":"10.21437\/Interspeech.2013-596"},{"key":"2020110612443612900_ocw180-B28","article-title":"Multimedia Lab @ ACL W-NUT NER Shared Task: named entity recognition for Twitter microposts using distributed word representations","author":"Godin"},{"key":"2020110612443612900_ocw180-B29","article-title":"Learning to diagnose with LSTM recurrent neural networks","author":"Lipton"},{"issue":"8","key":"2020110612443612900_ocw180-B30","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"issue":"11","key":"2020110612443612900_ocw180-B31","first-page":"2673","article-title":"Bidirectional recurrent neural networks","volume":"45","author":"Schuster","year":"1997","journal-title":"IEEE Trans Audio Speech Lang Process."},{"issue":"5\u20136","key":"2020110612443612900_ocw180-B32","doi-asserted-by":"crossref","first-page":"602","DOI":"10.1016\/j.neunet.2005.06.042","article-title":"Framewise phoneme classification with bidirectional 
LSTM and other neural network architectures","volume":"18","author":"Graves","year":"2005","journal-title":"Neural Netw."},{"key":"2020110612443612900_ocw180-B33","unstructured":"Ott M. ark-twokenize-py. GitHub Repository. 2016. github.com\/myleott\/ark-twokenize-py. Accessed August 1, 2016."},{"key":"2020110612443612900_ocw180-B34","unstructured":"Owoputi O, O\u2019Connor B, Dyer C, et al. Part-of-speech tagging for Twitter: word clusters and other advances. Carnegie Mellon University. CMU-ML-12-107. 2012. www.cs.cmu.edu\/\u223cark\/TweetNLP\/owoputi+etal.tr12.pdf. Accessed August 2016."},{"key":"2020110612443612900_ocw180-B35","doi-asserted-by":"crossref","article-title":"Text chunking using transformation-based learning","author":"Ramshaw","DOI":"10.1007\/978-94-017-2390-9_10"},{"key":"2020110612443612900_ocw180-B36","unstructured":"Mikolov T, Chen K, Corrado G, et al. Efficient Estimation of Word Representations in Vector Space. arXiv preprint arXiv:1301.3781. 2013. arxiv.org\/pdf\/1301.3781v3.pdf. Accessed August 1, 2016."},{"key":"2020110612443612900_ocw180-B37","unstructured":"Chollet F. Keras. GitHub Repository. 2016. github.com\/fchollet\/keras. Accessed August 1, 2016."},{"key":"2020110612443612900_ocw180-B38","article-title":"Theano: a CPU and GPU math expression compiler","author":"Bergstra"},{"key":"2020110612443612900_ocw180-B39","article-title":"Theano: new features and speed improvements","author":"Bastien"},{"issue":"10","key":"2020110612443612900_ocw180-B40","doi-asserted-by":"crossref","first-page":"1550","DOI":"10.1109\/5.58337","article-title":"Backpropagation through time: what it does and how to do it","volume":"78","author":"Werbos","year":"1990","journal-title":"Proc IEEE."},{"key":"2020110612443612900_ocw180-B41","volume-title":"Open Source Collaborative Consumer Health Vocabulary Initiative"},{"key":"2020110612443612900_ocw180-B42","unstructured":"Okazaki N. CRFsuite: A Fast Implementation of Conditional Random Fields. Software Package. 
2007. www.chokkan.org\/software\/crfsuite\/. Accessed June 1, 2016."},{"key":"2020110612443612900_ocw180-B43","article-title":"Part-of-speech tagging for Twitter: annotation, features, and experiments","author":"Gimpel"},{"key":"2020110612443612900_ocw180-B44","unstructured":"Guo Z. DepND. GitHub Repository. 2016. github.com\/zachguo\/DepND. Accessed August 1, 2016."},{"key":"2020110612443612900_ocw180-B45","unstructured":"Sagae K. GDep (GENIA dependency parser). Software package. 2016. sagae.bitbucket.org\/gdep\/. Accessed August 1, 2016."},{"key":"2020110612443612900_ocw180-B46","article-title":"Dependency parsing and domain adaptation with LR models and parser ensembles","author":"Sagae"},{"key":"2020110612443612900_ocw180-B47","doi-asserted-by":"crossref","first-page":"92","DOI":"10.1186\/1471-2105-7-92","article-title":"Various criteria in the evaluation of biomedical named entity recognition","volume":"7","author":"Tsai","year":"2006","journal-title":"BMC Bioinformatics."},{"key":"2020110612443612900_ocw180-B48","first-page":"9","article-title":"Approximate randomization tests","volume-title":"Computer-Intensive Methods for Testing Hypotheses: An Introduction","author":"Noreen","year":"1989"},{"key":"2020110612443612900_ocw180-B49","first-page":"165","volume-title":"Empirical Methods for Artificial Intelligence","author":"Cohen","year":"1995"},{"key":"2020110612443612900_ocw180-B50","unstructured":"Pad\u00f3 S. User\u2019s guide to sigf: Significance Testing by Approximate Randomization. 2006. http:\/\/www.nlpado.de\/\u223csebastian\/software\/sigf.shtml. 
Accessed August 2016."},{"key":"2020110612443612900_ocw180-B51","doi-asserted-by":"crossref","first-page":"160035","DOI":"10.1038\/sdata.2016.35","article-title":"MIMIC-III, a freely accessible critical care database","volume":"3","author":"Johnson","year":"2016","journal-title":"Scientific Data."},{"key":"2020110612443612900_ocw180-B52","article-title":"Mining adverse drug reaction signals from social media: going beyond extraction","author":"Patki"},{"key":"2020110612443612900_ocw180-B53","doi-asserted-by":"crossref","first-page":"196","DOI":"10.1016\/j.jbi.2014.11.002","article-title":"Portable automatic text classification for adverse drug reaction detection via multi-corpus training","volume":"53","author":"Sarker","year":"2015","journal-title":"J Biomed Inform."},{"key":"2020110612443612900_ocw180-B54","doi-asserted-by":"crossref","first-page":"148","DOI":"10.1016\/j.jbi.2016.06.007","article-title":"Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts","volume":"62","author":"Korkontzelos","year":"2016","journal-title":"J Biomed Inform."}],"container-title":["Journal of the American Medical Informatics 
Association"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/24\/4\/813\/34148877\/ocw180.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/jamia\/article-pdf\/24\/4\/813\/34148877\/ocw180.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,7,19]],"date-time":"2022-07-19T20:02:12Z","timestamp":1658260932000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/jamia\/article\/24\/4\/813\/3041102"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,2,22]]},"references-count":54,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2017,2,22]]},"published-print":{"date-parts":[[2017,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/jamia\/ocw180","relation":{},"ISSN":["1527-974X"],"issn-type":[{"value":"1527-974X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2017,7]]},"published":{"date-parts":[[2017,2,22]]}}}