{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,19]],"date-time":"2025-09-19T08:18:02Z","timestamp":1758269882640},"reference-count":19,"publisher":"Association for Computing Machinery (ACM)","issue":"11","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2017,8]]},"abstract":"<jats:p>Research on data visualization aims at finding the best way to present data via visual interfaces. We introduce the complementary problem of \"data vocalization\". Our goal is to present relational data in the most efficient way via voice output. This problem setting is motivated by emerging tools and devices (e.g., Google Home, Amazon Echo, Apple's Siri, or voice-based SQL interfaces) that communicate data primarily via audio output to their users.<\/jats:p>\n          <jats:p>We treat voice output generation as an optimization problem. The goal is to minimize speaking time while transmitting an approximation of a relational table to the user. We consider constraints on the precision of the transmitted data as well as on the cognitive load placed on the listener. We formalize voice output optimization and show that it is NP-hard. We present three approaches to solve that problem. First, we show how the problem can be translated into an integer linear program which enables us to apply corresponding solvers. Second, we present a two-phase approach that forms groups of similar rows in a pre-processing step, using a variant of the apriori algorithm. Then, we select an optimal combination of groups to generate a speech. Finally, we present a greedy algorithm that runs in polynomial time. Under simplifying assumptions, we prove that it generates near-optimal output by leveraging the sub-modularity property of our cost function. 
We compare our algorithms experimentally and analyze their complexity.<\/jats:p>","DOI":"10.14778\/3137628.3137663","type":"journal-article","created":{"date-parts":[[2017,9,7]],"date-time":"2017-09-07T13:35:53Z","timestamp":1504791353000},"page":"1574-1585","source":"Crossref","is-referenced-by-count":6,"title":["Data vocalization"],"prefix":"10.14778","volume":"10","author":[{"given":"Immanuel","family":"Trummer","sequence":"first","affiliation":[{"name":"Cornell University"}]},{"given":"Jiancheng","family":"Zhu","sequence":"additional","affiliation":[{"name":"Cornell University"}]},{"given":"Mark","family":"Bryan","sequence":"additional","affiliation":[{"name":"Cornell University"}]}],"member":"320","published-online":{"date-parts":[[2017,8]]},"reference":[{"key":"e_1_2_1_1_1","first-page":"487","volume-title":"VLDB","volume":"1215","author":"Agrawal R.","year":"1994"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1017\/S135132490000005X"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/1622655.1622667"},{"key":"e_1_2_1_4_1","first-page":"65","volume-title":"EACL","author":"Demberg V.","year":"2006"},{"key":"e_1_2_1_5_1","unstructured":"T. Hermann A. Hunt and J. G. Neuhoff. The sonification handbook. 2011."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4939-2092-1_38"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.artmed.2012.09.002"},{"key":"e_1_2_1_8_1","first-page":"178","volume-title":"ACL","author":"Jing H.","year":"2000"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2213836.2213929"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2010.5447824"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.3115\/981311.981340"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2949741.2949744"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2899394"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(95)00026-D"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1037\/h0043158"},{"key":"e_1_2_1_16_1","first-page":"1358","volume-title":"ICML","author":"Mirzasoleiman B.","year":"2016"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1287\/moor.3.3.177"},{"key":"e_1_2_1_18_1","doi-asserted-by":"crossref","unstructured":"T. V. Raman. Audio system for technical readings. PhD thesis 1998.","DOI":"10.1007\/BFb0054977"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1353343.1353396"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3137628.3137663","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T09:59:39Z","timestamp":1672221579000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3137628.3137663"}},"subtitle":["optimizing voice output of relational data"],"short-title":[],"issued":{"date-parts":[[2017,8]]},"references-count":19,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2017,8]]}},"alternative-id":["10.14778\/3137628.3137663"],"URL":"https:\/\/doi.org\/10.14778\/3137628.3137663","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2017,8]]}}}