{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2022,12,29]],"date-time":"2022-12-29T05:20:58Z","timestamp":1672291258948},"reference-count":40,"publisher":"Association for Computing Machinery (ACM)","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2016,11]]},"abstract":"<jats:p>\n            <jats:italic>Nowcasting<\/jats:italic>\n            is the practice of using social media data to quantify ongoing real-world phenomena. It has been used by researchers to measure flu activity, unemployment behavior, and more. However, the typical nowcasting workflow requires either slow and tedious manual searching of relevant social media messages or automated statistical approaches that are prone to spurious and low-quality results.\n          <\/jats:p>\n          <jats:p>\n            In this paper, we propose a method for declaratively specifying a nowcasting model; this method involves processing a user query over a very large social media database, which can take hours. Due to the human-in-the-loop nature of constructing nowcasting models, slow runtimes place an extreme burden on the user. Thus we also propose a novel set of query optimization techniques, which allow users to quickly construct nowcasting models over very large datasets. Further, we propose a novel query quality alarm that helps users estimate phenomena even when historical ground truth data is not available. These contributions allow us to build a\n            <jats:italic>declarative nowcasting data management system,<\/jats:italic>\n            R\n            <jats:sc>accoon<\/jats:sc>\n            DB, which yields high-quality results in interactive time.\n          <\/jats:p>\n          <jats:p>\n            We evaluate R\n            <jats:sc>accoon<\/jats:sc>\n            DB using 40 billion tweets collected over five years. 
We show that our automated system saves work over traditional manual approaches while improving result quality---57% more accurate in our user study---and that its query optimizations yield a 424x speedup, allowing it to process queries 123x faster than a 300-core Spark cluster, using only 10% of the computational resources.\n          <\/jats:p>","DOI":"10.14778\/3021924.3021931","type":"journal-article","created":{"date-parts":[[2017,1,24]],"date-time":"2017-01-24T15:29:41Z","timestamp":1485271781000},"page":"145-156","source":"Crossref","is-referenced-by-count":2,"title":["A declarative query processing system for nowcasting"],"prefix":"10.14778","volume":"10","author":[{"given":"Dolan","family":"Antenucci","sequence":"first","affiliation":[{"name":"University of Michigan"}]},{"given":"Michael R.","family":"Anderson","sequence":"additional","affiliation":[{"name":"University of Michigan"}]},{"given":"Michael","family":"Cafarella","sequence":"additional","affiliation":[{"name":"University of Michigan"}]}],"member":"320","published-online":{"date-parts":[[2016,11]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"http:\/\/www.boxofficemojo.com\/weekly\/","author":"Mojo Weekly Box Office Box Office","year":"2015"},{"key":"e_1_2_1_2_1","volume-title":"http:\/\/www.google.org\/flutrends\/data.txt","author":"Influenza Surveillance CDC","year":"2015"},{"key":"e_1_2_1_3_1","volume-title":"http:\/\/fbi.gov\/services\/cjis\/nics","author":"FBI","year":"2015"},{"key":"e_1_2_1_4_1","volume-title":"http:\/\/www.google.com\/finance\/domestic_trends","author":"Google Finance Google Domestic","year":"2015"},{"key":"e_1_2_1_5_1","unstructured":"US Dept. of Labor - Unemployment Insurance Weekly Claims Data. 
http:\/\/research.stlouisfed.org 2015."},{"key":"e_1_2_1_6_1","volume-title":"http:\/\/eia.gov\/petroleum\/gasdiesel\/","author":"Gasoline U.S.","year":"2015"},{"key":"e_1_2_1_7_1","volume-title":"http:\/\/wunderground.com\/history\/airport\/KNYC\/","author":"Weather History New York Weather Underground","year":"2015"},{"key":"e_1_2_1_8_1","volume-title":"Runtime support for human-in-the-loop feature engineering systems","author":"Anderson M. R."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2016.7498336"},{"key":"e_1_2_1_10_1","volume-title":"NBER","author":"Antenucci D.","year":"2014"},{"key":"e_1_2_1_11_1","volume-title":"WebDB","author":"Antenucci D.","year":"2013"},{"key":"e_1_2_1_12_1","volume-title":"IZA","author":"Askitas N.","year":"2011"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jocs.2010.12.007"},{"key":"e_1_2_1_14_1","volume-title":"Google","author":"Choi H.","year":"2011"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.5555\/89086.89095"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/237661.237715"},{"key":"e_1_2_1_17_1","volume-title":"Nature","author":"Ginsberg J.","year":"2009"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.5555\/944919.944968"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2038916.2038934"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1964858.1964870"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-004-0128-2"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.14778\/1880172.1880175"},{"key":"e_1_2_1_23_1","volume-title":"VLDB","author":"Kossmann 
D.","year":"2002"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1126\/science.1248506"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/PASSAT\/SocialCom.2011.98"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-71496-5_5"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2011.5767828"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.5555\/1690219.1690287"},{"key":"e_1_2_1_29_1","volume-title":"Google","author":"Mohebbi M.","year":"2011"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1920865"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2078195"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2007.367934"},{"key":"e_1_2_1_33_1","volume-title":"Information Retrieval. Butterworth-Heinemann","author":"Rijsbergen C. J. V.","year":"1979","edition":"2"},{"key":"e_1_2_1_34_1","volume-title":"Economics of Digitization","author":"Scott S. 
L.","year":"2014"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.5555\/1861751.1861752"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.14778\/2831360.2831371"},{"key":"e_1_2_1_37_1","volume-title":"Economics of Digitization","author":"Wu L.","year":"2014"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33486-3_41"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2012.76"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2588555.2593678"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3021924.3021931","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T09:28:21Z","timestamp":1672219701000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3021924.3021931"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,11]]},"references-count":40,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2016,11]]}},"alternative-id":["10.14778\/3021924.3021931"],"URL":"https:\/\/doi.org\/10.14778\/3021924.3021931","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2016,11]]}}}