{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,3]],"date-time":"2026-05-03T23:48:17Z","timestamp":1777852097820,"version":"3.51.4"},"reference-count":31,"publisher":"SAGE Publications","issue":"4","license":[{"start":{"date-parts":[[2018,11,29]],"date-time":"2018-11-29T00:00:00Z","timestamp":1543449600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Health Informatics J"],"published-print":{"date-parts":[[2019,12]]},"abstract":"<jats:p>Data on disease burden are often used for assessing population health, evaluating the effectiveness of interventions, formulating health policies, and planning future resource allocation. We investigated whether Internet usage and social media data, specifically the search volume on Google, page view count on Wikipedia, and disease mentioning frequency on Twitter, correlated with the disease burden, measured by prevalence and treatment cost, for 1633 diseases over an 11-year period. We also applied least absolute shrinkage and selection operator to predict the burden of diseases. We found that Google search volume is relatively strongly correlated with the burdens for 39 of 1633 diseases, including viral hepatitis, diabetes mellitus, multiple sclerosis, and hemorrhoids. Wikipedia and Twitter data strongly correlated with the burdens of 15 and 7 diseases, respectively. However, an accurate analysis must consider each condition\u2019s characteristics, including acute\/chronic nature, severity, familiarity to the public, and the presence of stigma.<\/jats:p>","DOI":"10.1177\/1460458218810743","type":"journal-article","created":{"date-parts":[[2018,11,29]],"date-time":"2018-11-29T05:37:15Z","timestamp":1543469835000},"page":"1863-1877","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":14,"title":["Estimating disease burden using Internet data"],"prefix":"10.1177","volume":"25","author":[{"given":"Riyi","family":"Qiu","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mirsad","family":"Hadzikadic","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sha","family":"Yu","sequence":"additional","affiliation":[{"name":"The University of North Carolina at Charlotte, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lixia","family":"Yao","sequence":"additional","affiliation":[{"name":"The University of North Carolina at Charlotte, USA; Mayo Clinic, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2018,11,29]]},"reference":[{"key":"bibr1-1460458218810743","unstructured":"Australian Institute of Health and Welfare (AIHW). Burden of disease, 2016, http:\/\/www.aihw.gov.au\/burden-of-disease\/"},{"key":"bibr2-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1377\/hlthaff.28.1.15"},{"key":"bibr3-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1007\/s00125-003-1116-6"},{"key":"bibr4-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1001\/archinte.163.9.1009"},{"key":"bibr5-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1016\/S0140-6736(13)61953-4"},{"key":"bibr6-1460458218810743","doi-asserted-by":"publisher","DOI":"10.2105\/AJPH.90.8.1241"},{"issue":"11","key":"bibr7-1460458218810743","first-page":"1076","volume":"79","author":"Mathers CD","year":"2001","journal-title":"Bull World Health Organ"},{"key":"bibr8-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1177\/003335490612100107"},{"key":"bibr9-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1001\/jama.285.5.535"},{"key":"bibr10-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1001\/jama.1993.03510180077038"},{"issue":"3","key":"bibr11-1460458218810743","first-page":"429","volume":"72","author":"Murray CJ.","year":"1994","journal-title":"Bull World Health Organ"},{"key":"bibr12-1460458218810743","volume-title":"Methods of Collecting Morbidity Statistics. Revised Report to the Eurostat Task Force on \u2018Health and Health-Related Survey Data\u2019","author":"Mason V","year":"1997"},{"key":"bibr13-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1002\/mds.10087"},{"key":"bibr14-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1002\/mds.10362"},{"key":"bibr15-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1002\/1531-8257(199907)14:4<596::AID-MDS1008>3.0.CO;2-U"},{"key":"bibr16-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1038\/nature07634"},{"key":"bibr17-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1038\/srep01801"},{"key":"bibr18-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0019467"},{"key":"bibr19-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1038\/nbt.3276"},{"issue":"2","key":"bibr20-1460458218810743","first-page":"217","volume":"81","author":"Schuyler PL","year":"1993","journal-title":"Bull Med Libr Assoc"},{"key":"bibr21-1460458218810743","volume-title":"Health research data for the real world: the MarketScan databases","author":"Adamson DM","year":"2008"},{"key":"bibr22-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btq126"},{"key":"bibr23-1460458218810743","author":"Weisstein EW","year":"2004","journal-title":"Wolfram Research, Inc"},{"issue":"2","key":"bibr24-1460458218810743","first-page":"65","volume":"6","author":"Holm S.","year":"1979","journal-title":"Scand J Stat"},{"key":"bibr25-1460458218810743","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","volume":"58","author":"Tibshirani R.","year":"1996","journal-title":"J Roy Stat Soc B Met"},{"key":"bibr26-1460458218810743","unstructured":"Friedman J, Hastie T, Tibshirani R. glmnet: Lasso and Elastic-Net Regularized Generalized Linear Models. package version 1.5.2. 2011, http:\/\/CRAN.R-project.org\/package=glmnet"},{"key":"bibr27-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1212\/WNL.0000000000001608"},{"key":"bibr28-1460458218810743","unstructured":"Aslam S. Twitter by the numbers: stats, demographics & fun facts, 2017, https:\/\/www.omnicoreagency.com\/twitter-statistics\/"},{"key":"bibr29-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0115545"},{"key":"bibr30-1460458218810743","unstructured":"Centers for Disease Control and Prevention, 2012, https:\/\/www.cdc.gov\/media\/releases\/2012\/p0516_higher_education.html"},{"key":"bibr31-1460458218810743","doi-asserted-by":"publisher","DOI":"10.1001\/jama.2016.16885"}],"container-title":["Health Informatics Journal"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1460458218810743","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1460458218810743","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1460458218810743","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T22:30:29Z","timestamp":1777501829000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1460458218810743"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,11,29]]},"references-count":31,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2019,12]]}},"alternative-id":["10.1177\/1460458218810743"],"URL":"https:\/\/doi.org\/10.1177\/1460458218810743","relation":{},"ISSN":["1460-4582","1741-2811"],"issn-type":[{"value":"1460-4582","type":"print"},{"value":"1741-2811","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,11,29]]}}}