{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:33:28Z","timestamp":1750221208423,"version":"3.41.0"},"reference-count":58,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2018,12,8]],"date-time":"2018-12-08T00:00:00Z","timestamp":1544227200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["ACI-1640864,CAREER IIS-1750460, CAREER IIS-1762268"],"award-info":[{"award-number":["ACI-1640864,CAREER IIS-1750460, CAREER IIS-1762268"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Database Syst."],"published-print":{"date-parts":[[2018,12,31]]},"abstract":"<jats:p>\n            Tuple-independent and disjoint-independent probabilistic databases (TI- and DI-PDBs) represent uncertain data in a factorized form as a product of independent random variables that represent either tuples (TI-PDBs) or sets of tuples (DI-PDBs). When the user submits a query, the database derives the marginal probabilities of each output-tuple, exploiting the underlying assumptions of statistical independence. While query processing in TI- and DI-PDBs has been studied extensively, limited research has been dedicated to the problems of\n            <jats:italic>updating or deriving the parameters from observations of query results<\/jats:italic>\n            . Addressing this problem is the main focus of this article. 
We first introduce\n            <jats:italic>Beta Probabilistic Databases<\/jats:italic>\n            (B-PDBs), a generalization of TI-PDBs designed to support both (i)\n            <jats:italic>belief updating<\/jats:italic>\n            and (ii)\n            <jats:italic>parameter learning<\/jats:italic>\n            in a principled and scalable way. The key idea of B-PDBs is to treat each parameter as a latent, Beta-distributed random variable. We show how this simple expedient enables both belief updating and parameter learning in a principled way, without imposing any burden on regular query processing. Building on B-PDBs, we then introduce\n            <jats:italic>Dirichlet Probabilistic Databases<\/jats:italic>\n            (D-PDBs), a generalization of DI-PDBs with similar properties. We provide the following key contributions for both B- and D-PDBs: (i) We study the complexity of performing Bayesian belief updates and devise efficient algorithms for certain tractable classes of queries; (ii) we propose a soft-EM algorithm for computing maximum-likelihood estimates of the parameters; (iii) we present an algorithm for efficiently computing conditional probabilities, allowing us to efficiently implement B- and D-PDBs via a standard relational engine; and (iv) we support our conclusions with extensive experimental results.\n          <\/jats:p>","DOI":"10.1145\/3277503","type":"journal-article","created":{"date-parts":[[2018,12,10]],"date-time":"2018-12-10T13:09:16Z","timestamp":1544447356000},"page":"1-41","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Learning From 
Query-Answers"],"prefix":"10.1145","volume":"43","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4107-1759","authenticated-orcid":false,"given":"Niccol\u00f2","family":"Meneghetti","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0632-1668","authenticated-orcid":false,"given":"Oliver","family":"Kennedy","sequence":"additional","affiliation":[{"name":"University at Buffalo, Davis Hall, Buffalo, NY, USA"}]},{"given":"Wolfgang","family":"Gatterbauer","sequence":"additional","affiliation":[{"name":"Northeastern University, Huntington Avenue, Boston, MA, USA"}]}],"member":"320","published-online":{"date-parts":[[2018,12,8]]},"reference":[{"volume-title":"Foundations of Databases","author":"Abiteboul Serge","key":"e_1_2_1_1_1"},{"volume-title":"Proceedings of the International Conference on Very Large Databases (VLDB\u201906)","year":"2006","author":"Agrawal Parag","key":"e_1_2_1_2_1"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2008.4497507"},{"volume-title":"Proceedings of the International Conference on Very Large Databases (VLDB\u201906)","year":"2006","author":"Benjelloun Omar","key":"e_1_2_1_4_1"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1066157.1066277"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.5555\/645504.656274"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2465283"},{"volume-title":"Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI\u201918)","year":"2018","author":"Chou Li","key":"e_1_2_1_8_1"},{"volume-title":"Proceedings of the International Conference on Very Large Databases 
(VLDB\u201904)","author":"Nilesh","key":"e_1_2_1_9_1"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1265530.1265531"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2395116.2395119"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.2517-6161.1977.tb01600.x"},{"volume-title":"Proceedings of the 8th International Workshop on Statistical Relational AI (StarAI\u201918)","year":"2018","author":"den Heuvel Maarten Van","key":"e_1_2_1_13_1"},{"key":"e_1_2_1_14_1","unstructured":"Fr\u00e9d\u00e9ric Devernay. 2007. C\/C++ Minpack. Retrieved from http:\/\/devernay.free.fr\/hacks\/cminpack\/."},{"volume-title":"Proceedings of the Conference on Innovative Data Systems Research (CIDR\u201915)","author":"Duggan Jennie","key":"e_1_2_1_15_1"},{"key":"e_1_2_1_17_1","doi-asserted-by":"crossref","unstructured":"Maximilian Dylla Martin Theobald and Iris Miliaraki. 2014. Querying and learning in probabilistic databases. In Reasoning Web.","DOI":"10.1007\/978-3-319-10587-1_8"},
{"volume-title":"Tricomi","year":"1954","author":"Erd\u00e9lyi Arthur","key":"e_1_2_1_18_1"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1989323.1989481"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-013-0310-5"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/239041.239045"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2532641"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.14778\/2735479.2735494"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.5555\/1148074.1705152"},{"key":"e_1_2_1_25_1","first-page":"9","article-title":"Provenance in ORCHESTRA","volume":"33","author":"Green Todd J.","year":"2010","journal-title":"IEEE Data Eng. Bull."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1265530.1265535"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1989323.1989490"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-87479-9_49"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.5555\/2034063.2034113"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.2307\/2527783"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1559845.1559984"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376616.1376686"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376616.1376701"},{"key":"e_1_2_1_35_1","unstructured":"Norman L. Johnson Samuel Kotz and N. Balakrishnan. 1995. Continuous univariate distributions vol. 2."},
{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1989323.1989411"},{"volume-title":"Goldin","year":"1994","author":"Kanellakis Paris C.","key":"e_1_2_1_37_1"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1016\/0196-6774(89)90038-2"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2010.5447879"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.14778\/1453856.1453894"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-48751-4_1"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2007.09.006"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1978.1055832"},{"key":"e_1_2_1_44_1","unstructured":"Yann LeCun and Corinna Cortes. 2010. MNIST handwritten digit database. Retrieved from http:\/\/yann.lecun.com\/exdb\/mnist\/."},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.14778\/3402755.3402803"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3064026"},{"volume-title":"Hinton","year":"1998","author":"Neal Radford M.","key":"e_1_2_1_47_1"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-87993-0_26"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/1559845.1559887"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2010.5447826"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1016\/0004-3702(93)90061-F"},{"volume-title":"Proceedings of the International Joint Conferences on Artificial Intelligence (IJCAI\u201907)","year":"2007","author":"Raedt Luc De","key":"e_1_2_1_52_1"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-006-5833-1"},{"volume-title":"IPUMS USA: Version 8.0 {dataset}","year":"2018","author":"Ruggles 
Steven","key":"e_1_2_1_54_1"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-009-0153-2"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376616.1376744"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2011.5767854"},{"key":"e_1_2_1_58_1","doi-asserted-by":"crossref","unstructured":"Dan Suciu Dan Olteanu Christopher R\u00e9 and Christoph Koch. 2011. Probabilistic Databases. Morgan & Claypool Publishers.","DOI":"10.1007\/978-3-031-01879-4"},{"key":"e_1_2_1_59_1","unstructured":"TPC. 2017. TPC-H benchmark. Retrieved from http:\/\/www.tpc.org\/tpch\/."},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.14778\/2824032.2824055"}],"container-title":["ACM Transactions on Database Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3277503","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3277503","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3277503","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:39:38Z","timestamp":1750210778000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3277503"}},"subtitle":["A Scalable Approach to Belief Updating and Parameter 
Learning"],"short-title":[],"issued":{"date-parts":[[2018,12,8]]},"references-count":58,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2018,12,31]]}},"alternative-id":["10.1145\/3277503"],"URL":"https:\/\/doi.org\/10.1145\/3277503","relation":{},"ISSN":["0362-5915","1557-4644"],"issn-type":[{"type":"print","value":"0362-5915"},{"type":"electronic","value":"1557-4644"}],"subject":[],"published":{"date-parts":[[2018,12,8]]},"assertion":[{"value":"2017-12-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-09-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-12-08","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}