{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,14]],"date-time":"2026-03-14T09:49:18Z","timestamp":1773481758345,"version":"3.50.1"},"reference-count":101,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2023,6,13]],"date-time":"2023-06-13T00:00:00Z","timestamp":1686614400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Israel Ministry of Science and Technology"},{"name":"ANR","award":["18-CE23-0003-02 and 19-CE48-0019"],"award-info":[{"award-number":["18-CE23-0003-02 and 19-CE48-0019"]}]},{"DOI":"10.13039\/501100003977","name":"Israel Science Foundation","doi-asserted-by":"publisher","award":["2015\/21"],"award-info":[{"award-number":["2015\/21"]}],"id":[{"id":"10.13039\/501100003977","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2023,6,13]]},"abstract":"<jats:p>We present a novel framework for uncertain data management. We start with a database whose tuple correctness is uncertain and an oracle that can resolve the uncertainty, i.e., decide if a tuple is correct or not. Such an oracle may correspond, e.g., to a data expert or to a crowdsourcing platform. We wish to use the oracle to clean the database with the goal of ensuring the correct answer for specific mission-critical queries. To avoid the prohibitive cost of cleaning the entire database and to minimize the expected number of calls to the oracle, we must carefully select tuples whose resolution would suffice to resolve the uncertainty in query results. 
In other words, we need a query-guided process for the resolution of uncertain data.<\/jats:p>\n          <jats:p>We develop an end-to-end solution to this problem, based on the derivation of query answers and on correctness probabilities for the uncertain data. At a high level, we first track Boolean provenance to identify which input tuples contribute to the derivation of each output tuple, and in what ways. We then design an active learning solution for iteratively choosing tuples to resolve, based on the provenance structure and on an evolving estimation of tuple correctness probabilities. We conduct an extensive experimental study to validate our framework in different use cases.<\/jats:p>","DOI":"10.1145\/3589325","type":"journal-article","created":{"date-parts":[[2023,6,20]],"date-time":"2023-06-20T20:26:45Z","timestamp":1687292805000},"page":"1-27","source":"Crossref","is-referenced-by-count":3,"title":["Query-Guided Resolution in Uncertain Databases"],"prefix":"10.1145","volume":"1","author":[{"ORCID":"https:\/\/orcid.org\/0009-0000-6639-9219","authenticated-orcid":false,"given":"Osnat","family":"Drien","sequence":"first","affiliation":[{"name":"Bar-Ilan University, Ramat-Gan, Israel"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9998-4403","authenticated-orcid":false,"given":"Matanya","family":"Freiman","sequence":"additional","affiliation":[{"name":"Bar-Ilan University, Ramat-Gan, Israel"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7977-4441","authenticated-orcid":false,"given":"Antoine","family":"Amarilli","sequence":"additional","affiliation":[{"name":"LTCI, T\u00e9l\u00e9com Paris, Institut Polytechnique de Paris, Paris, France"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8032-9962","authenticated-orcid":false,"given":"Yael","family":"Amsterdamer","sequence":"additional","affiliation":[{"name":"Bar-Ilan University, Ramat-Gan, 
Israel"}]}],"member":"320","published-online":{"date-parts":[[2023,6,20]]},"reference":[{"key":"e_1_2_2_1_1","volume-title":"Foundations of Databases","author":"Abiteboul Serge","unstructured":"Serge Abiteboul, Richard Hull, and Victor Vianu. 1995. Foundations of Databases. Addison-Wesley."},{"key":"e_1_2_2_2_1","volume-title":"Kolaitis","author":"Afrati Foto N.","year":"2009","unstructured":"Foto N. Afrati and Phokion G. Kolaitis. 2009. Repair Checking in Inconsistent Databases: Algorithms and Complexity. In ICDT. 31--41."},{"key":"e_1_2_2_3_1","volume-title":"Chris Hayworth, Shubha U. Nabar, Tomoe Sugihara, and Jennifer Widom.","author":"Agrawal Parag","year":"2006","unstructured":"Parag Agrawal, Omar Benjelloun, Anish Das Sarma, Chris Hayworth, Shubha U. Nabar, Tomoe Sugihara, and Jennifer Widom. 2006. Trio: A System for Data, Uncertainty, and Lineage. In PVLDB. 1151--1154."},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00453-015-0092-9"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.14778\/2095686.2095693"},{"key":"e_1_2_2_6_1","doi-asserted-by":"crossref","unstructured":"Yael Amsterdamer Daniel Deutch and Val Tannen. 2011. Provenance for Aggregate Queries. In PODS. 153--164.","DOI":"10.1145\/1989284.1989302"},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1017\/S1471068403001832"},{"key":"e_1_2_2_8_1","first-page":"1","article-title":"Cleaning Data with Constraints and Experts","volume":"1","author":"Assadi Ahmad","year":"2018","unstructured":"Ahmad Assadi, Tova Milo, and Slava Novgorodov. 2018. Cleaning Data with Constraints and Experts. In WebDB. 1:1--1:6.","journal-title":"WebDB."},{"key":"e_1_2_2_9_1","volume-title":"Butler","author":"Bates Adam","year":"2013","unstructured":"Adam Bates, Benjamin Mood, Masoud Valafar, and Kevin R. B. Butler. 2013. Towards Secure Provenance-Based Access Control in Cloud Environments. In CODASPY. 
277--284."},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-007-0080-z"},{"key":"e_1_2_2_11_1","doi-asserted-by":"crossref","unstructured":"Moria Bergman Tova Milo Slava Novgorodov and Wang Chiew Tan. 2015. Query-Oriented Data Cleaning with Oracles. In SIGMOD. 1199--1214.","DOI":"10.1145\/2723372.2737786"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00224-012-9402-7"},{"key":"e_1_2_2_13_1","doi-asserted-by":"crossref","unstructured":"Michael Bloodgood. 2018. Support Vector Machine Active Learning Algorithms with Query-by-Committee Versus Closest-to-Hyperplane Selection. In ICSC. 148--155.","DOI":"10.1109\/ICSC.2018.00029"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2020.103389"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4615-4567-5_3"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1010933404324"},{"key":"e_1_2_2_17_1","doi-asserted-by":"crossref","unstructured":"Peter Buneman Sanjeev Khanna and Wang Chiew Tan. 2001. Why and Where: A Characterization of Data Provenance. In ICDT. 316--330.","DOI":"10.1007\/3-540-44503-X_20"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1561\/1900000006"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.14778\/1453856.1453935"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1920945"},{"key":"e_1_2_2_21_1","doi-asserted-by":"crossref","unstructured":"Xu Chu Ihab F. Ilyas and Paolo Papotti. 2013. Holistic Data Cleaning: Putting Violations Into Context. In ICDE. 458--469.","DOI":"10.1109\/ICDE.2013.6544847"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2749431"},{"key":"e_1_2_2_23_1","volume-title":"Eduardo Sany Laber, and Aline Medeiros Saettler","author":"Cicalese Ferdinando","year":"2014","unstructured":"Ferdinando Cicalese, Eduardo Sany Laber, and Aline Medeiros Saettler. 2014. 
Diagnosis determination: decision trees optimizing simultaneously worst and expected testing cost. In ICML. 414--422."},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-006-0004-3"},{"key":"e_1_2_2_25_1","unstructured":"Dave DeBarr and H. Wechsler. 2009. Spam Detection using Clustering Random Forests and Active Learning. In CEAS."},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.5555\/2503308.2503327"},{"key":"e_1_2_2_27_1","first-page":"3","article-title":"Query Processing on Probabilistic Data: A Survey","volume":"7","author":"den Broeck Guy Van","year":"2017","unstructured":"Guy Van den Broeck and Dan Suciu. 2017. Query Processing on Probabilistic Data: A Survey. Found. Trends DBs 7, 3--4 (2017), 197--341.","journal-title":"Found. Trends DBs"},{"key":"e_1_2_2_28_1","doi-asserted-by":"crossref","unstructured":"Amol Deshpande Lisa Hellerstein and Devorah Kletenik. 2014. Approximation Algorithms for Stochastic Boolean Function Evaluation and Stochastic Submodular Set Cover. In SODA. 1453--1466.","DOI":"10.1137\/1.9781611973402.107"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.14778\/3055540.3055550"},{"key":"e_1_2_2_30_1","unstructured":"Daniel Deutch Tova Milo Sudeepa Roy and Val Tannen. 2014. Circuits for Datalog Provenance. In ICDT. 201--212."},{"key":"e_1_2_2_31_1","doi-asserted-by":"crossref","unstructured":"Osnat Drien Antoine Amarilli and Yael Amsterdamer. 2021. Managing Consent for Data Access in Shared Databases. In ICDE. 2012--2015.","DOI":"10.1109\/ICDE51399.2021.00182"},{"key":"e_1_2_2_32_1","doi-asserted-by":"crossref","unstructured":"Anna Fariha Ashish Tiwari Alexandra Meliou Arjun Radhakrishna and Sumit Gulwani. 2021. CoCo: Interactive Exploration of Conformance Constraints for Data Understanding and Data Cleaning. In SIGMOD. 2706--2710.","DOI":"10.1145\/3448016.3452750"},{"key":"e_1_2_2_33_1","unstructured":"Kaniz Fatema Ensar Hadziselimovic Harshvardhan J. 
Pandit Christophe Debruyne Dave Lewis and Declan O'Sullivan. 2017. Compliance through Informed Consent: Semantic Based Consent Permission and Data Management Model. In PrivOn at ISWC."},{"key":"e_1_2_2_34_1","volume-title":"Decision Trees: More Theoretical Justification for Practical Algorithms. In ALT. 156--170.","author":"Fiat Amos","year":"2004","unstructured":"Amos Fiat and Dmitry Pechyony. 2004. Decision Trees: More Theoretical Justification for Practical Algorithms. In ALT. 156--170."},{"key":"e_1_2_2_35_1","doi-asserted-by":"crossref","unstructured":"J. Nathan Foster Todd J. Green and Val Tannen. 2008. Annotated XML: Queries and Provenance. In PODS. 271--280.","DOI":"10.1145\/1376916.1376954"},{"key":"e_1_2_2_36_1","doi-asserted-by":"crossref","unstructured":"Michael J. Franklin Donald Kossmann Tim Kraska Sukriti Ramesh and Reynold Xin. 2011. CrowdDB: Answering Queries with Crowdsourcing. In SIGMOD. 61--72.","DOI":"10.1145\/1989323.1989331"},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-019-00586-5"},{"key":"e_1_2_2_38_1","volume-title":"Mark James Carman, and F. Crestani","author":"Gerani Shima","year":"2010","unstructured":"Shima Gerani, Mark James Carman, and F. Crestani. 2010. Proximity-Based Opinion Retrieval. SIGIR (2010)."},{"key":"e_1_2_2_39_1","doi-asserted-by":"crossref","unstructured":"Amir Gilad Daniel Deutch and Sudeepa Roy. 2020. On Multiple Semantics for Declarative Database Repairs. In SIGMOD. 817--831.","DOI":"10.1145\/3318464.3389721"},{"key":"e_1_2_2_40_1","volume-title":"Perm: Processing Provenance and Data on the Same Data Model through Query Rewriting. In ICDE. 174--185.","author":"Glavic Boris","year":"2009","unstructured":"Boris Glavic and Gustavo Alonso. 2009. Perm: Processing Provenance and Data on the Same Data Model through Query Rewriting. In ICDE. 174--185."},{"key":"e_1_2_2_41_1","doi-asserted-by":"crossref","unstructured":"Boris Glavic and Gustavo Alonso. 2009. Provenance for Nested Subqueries. 
In EDBT. 982--993.","DOI":"10.1145\/1516360.1516472"},{"key":"e_1_2_2_42_1","volume-title":"Brief Announcement: A Consent Management Solution for Enterprises. In CSCML. 189--192.","author":"Goldsteen Abigail","year":"2017","unstructured":"Abigail Goldsteen, Shelly Garion, Sima Nadler, Natalia Razinkov, Yosef Moatti, and Paula Ta-Shma. 2017. Brief Announcement: A Consent Management Solution for Enterprises. In CSCML. 189--192."},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.5555\/2208436.2208448"},{"key":"e_1_2_2_44_1","volume-title":"NeurIPS. Curran Associates","author":"Golovin Daniel","unstructured":"Daniel Golovin, Andreas Krause, and Debajyoti Ray. 2010. Near-Optimal Bayesian Active Learning with Noisy Observations. In NeurIPS. Curran Associates, Inc., 766--774."},{"key":"e_1_2_2_45_1","volume-title":"AMW","volume":"1912","author":"Greco Sergio","year":"2017","unstructured":"Sergio Greco, Cristian Molinaro, and Irina Trubitsyna. 2017. Computing Approximate Certain Answers over Incomplete Databases. In AMW, Vol. 1912."},{"key":"e_1_2_2_46_1","first-page":"9","article-title":"Provenance in ORCHESTRA","volume":"33","author":"Green Todd J.","year":"2010","unstructured":"Todd J. Green, Grigoris Karvounarakis, Zachary G. Ives, and Val Tannen. 2010. Provenance in ORCHESTRA. IEEE Data Eng. Bull. 33, 3 (2010), 9--16.","journal-title":"IEEE Data Eng. Bull."},{"key":"e_1_2_2_47_1","doi-asserted-by":"crossref","unstructured":"Todd J. Green Gregory Karvounarakis and Val Tannen. 2007. Provenance Semirings. In PODS. 31--40.","DOI":"10.1145\/1265530.1265535"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2014.12.009"},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.14778\/3137765.3137815"},{"key":"e_1_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-017-0486-1"},{"key":"e_1_2_2_51_1","volume-title":"Ilyas and Xu Chu","author":"Ihab","year":"2019","unstructured":"Ihab F. Ilyas and Xu Chu. 2019. 
Data Cleaning. ACM."},{"key":"e_1_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/1634.1886"},{"key":"e_1_2_2_53_1","doi-asserted-by":"crossref","unstructured":"Haim Kaplan Eyal Kushilevitz and Yishay Mansour. 2005. Learning with Attribute Costs. In STOC. 356--365.","DOI":"10.1145\/1060590.1060644"},{"key":"e_1_2_2_54_1","doi-asserted-by":"crossref","unstructured":"Grigoris Karvounarakis Zachary G. Ives and Val Tannen. 2010. Querying Data Provenance. In SIGMOD. 951--962.","DOI":"10.1145\/1807167.1807269"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2007.70811"},{"key":"e_1_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.14778\/1453856.1453894"},{"key":"e_1_2_2_57_1","doi-asserted-by":"crossref","unstructured":"Daphne Koller. 1999. Probabilistic Relational Models. In ILP. 3--13.","DOI":"10.1007\/3-540-48751-4_1"},{"key":"e_1_2_2_58_1","doi-asserted-by":"publisher","DOI":"10.14778\/3489496.3489516"},{"key":"e_1_2_2_59_1","unstructured":"Ksenia Konyushkova Raphael Sznitman and Pascal Fua. 2017. Learning Active Learning from Data. In NIPS. 4225--4235."},{"key":"e_1_2_2_60_1","doi-asserted-by":"publisher","DOI":"10.14778\/2994509.2994514"},{"key":"e_1_2_2_61_1","doi-asserted-by":"crossref","unstructured":"Jasper Kuperus Cor J. Veenman and Maurice van Keulen. 2013. Increasing NER Recall with Minimal Precision Loss. In EISIC. 106--111.","DOI":"10.1109\/EISIC.2013.23"},{"key":"e_1_2_2_62_1","first-page":"1926","article-title":"CDB","volume":"11","author":"Li Guoliang","year":"2018","unstructured":"Guoliang Li, Chengliang Chai, Ju Fan, Xueping Weng, Jian Li, Yudian Zheng, Yuanbing Li, Xiang Yu, Xiaohang Zhang, and Haitao Yuan. 2018. CDB: A Crowd-Powered Database System. PVLDB 11, 12 (2018), 1926--1929.","journal-title":"A Crowd-Powered Database System. PVLDB"},{"key":"e_1_2_2_63_1","doi-asserted-by":"crossref","unstructured":"Xiang Lian Lei Chen and Shaoxu Song. 2010. Consistent Query Answers in Inconsistent Probabilistic Databases. 
In SIGMOD. 303--314.","DOI":"10.1145\/1807167.1807202"},{"key":"e_1_2_2_64_1","doi-asserted-by":"crossref","unstructured":"Xin Lin Yun Peng Jianliang Xu and Byron Choi. 2018. Human-Powered Data Cleaning for Probabilistic Reachability Queries on Uncertain Graphs. In ICDE. 1755--1756.","DOI":"10.1109\/ICDE.2018.00235"},{"key":"e_1_2_2_65_1","first-page":"1517","article-title":"Generative Adversarial Active Learning for Unsupervised Outlier Detection","volume":"32","author":"Liu Ye-Zheng","year":"2020","unstructured":"Ye-Zheng Liu, Zhe Li, Chong Zhou, Yuanchun Jiang, Jianshan Sun, Meng Wang, and Xiangnan He. 2020. Generative Adversarial Active Learning for Unsupervised Outlier Detection. IEEE TKDE 32, 8 (2020), 1517--1528.","journal-title":"IEEE TKDE"},{"key":"e_1_2_2_66_1","doi-asserted-by":"crossref","unstructured":"Ester Livshits Benny Kimelfeld and Sudeepa Roy. 2018. Computing Optimal Repairs for Functional Dependencies. In PODS. 225--237.","DOI":"10.1145\/3196959.3196980"},{"key":"e_1_2_2_67_1","volume-title":"Crowdsourced Databases: Query Processing with People. In CIDR.","author":"Marcus A.","year":"2011","unstructured":"A. Marcus, E. Wu, D.R. Karger, S. Madden, and R.C. Miller. 2011. Crowdsourced Databases: Query Processing with People. In CIDR."},{"key":"e_1_2_2_68_1","unstructured":"Alexandra Meliou Wolfgang Gatterbauer and Dan Suciu. 2011. Bringing Provenance to Its Full Potential Using Causal Reasoning. In TaPP."},{"key":"e_1_2_2_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/3191513"},{"key":"e_1_2_2_70_1","volume-title":"Yang","author":"Mo Luyi","year":"2013","unstructured":"Luyi Mo, Reynold Cheng, Xiang Li, David W. Cheung, and Xuan S. Yang. 2013. Cleaning Uncertain Data for Top-k Queries. In ICDE. 134--145."},{"key":"e_1_2_2_71_1","first-page":"24","article-title":"From Cleaning before ML to Cleaning for ML","volume":"44","author":"Neutatz Felix","year":"2021","unstructured":"Felix Neutatz, Binger Chen, Ziawasch Abedjan, and Eugene Wu. 2021. 
From Cleaning before ML to Cleaning for ML. IEEE Data Eng. Bull. 44, 1 (2021), 24--41.","journal-title":"IEEE Data Eng. Bull."},{"key":"e_1_2_2_72_1","doi-asserted-by":"publisher","DOI":"10.1145\/3156018"},{"key":"e_1_2_2_73_1","doi-asserted-by":"crossref","unstructured":"Natalia Ostapuk Jie Yang and Philippe Cudr\u00e9-Mauroux. 2019. ActiveLink: Deep Active Learning for Link Prediction in Knowledge Graphs. In WWW. 1398--1408.","DOI":"10.1145\/3308558.3313620"},{"key":"e_1_2_2_74_1","doi-asserted-by":"crossref","unstructured":"Harshvardhan J. Pandit Declan O'Sullivan and Dave Lewis. 2018. Queryable Provenance Metadata For GDPR Compliance. In SEMANTICS. 262--268.","DOI":"10.1016\/j.procs.2018.09.026"},{"key":"e_1_2_2_75_1","volume-title":"B\u00f6hlen","author":"Papaioannou Katerina","year":"2018","unstructured":"Katerina Papaioannou, Martin Theobald, and Michael H. B\u00f6hlen. 2018. Supporting Set Operations in Temporal-Probabilistic Databases. In ICDE. 1180--1191."},{"key":"e_1_2_2_76_1","volume-title":"Deco: Declarative Crowdsourcing. In CIKM.","author":"Parameswaran Aditya G.","year":"2012","unstructured":"Aditya G. Parameswaran, Hyunjung Park, Hector Garcia-Molina, Neoklis Polyzotis, and Jennifer Widom. 2012. Deco: Declarative Crowdsourcing. In CIKM."},{"key":"e_1_2_2_77_1","volume-title":"Sandhu","author":"Park Jaehong","year":"2012","unstructured":"Jaehong Park, Dang Nguyen, and Ravi S. Sandhu. 2012. A Provenance-Based Access Control Model. In PST. 137--144."},{"key":"e_1_2_2_78_1","doi-asserted-by":"publisher","DOI":"10.1093\/cybsec\/tyy001"},{"key":"e_1_2_2_79_1","doi-asserted-by":"crossref","unstructured":"Christopher Re Nilesh Dalvi and Dan Suciu. 2007. Efficient Top-k Query Evaluation on Probabilistic Data. In ICDE. 
886--895.","DOI":"10.1109\/ICDE.2007.367934"},{"key":"e_1_2_2_80_1","doi-asserted-by":"publisher","DOI":"10.14778\/3137628.3137631"},{"key":"e_1_2_2_81_1","doi-asserted-by":"publisher","DOI":"10.14778\/3476249.3476301"},{"key":"e_1_2_2_82_1","doi-asserted-by":"crossref","unstructured":"Sudeepa Roy Vittorio Perduca and Val Tannen. 2011. Faster Query Answering in Probabilistic Databases Using Read-Once Functions. In ICDT. 232--243.","DOI":"10.1145\/1938551.1938582"},{"key":"e_1_2_2_83_1","doi-asserted-by":"crossref","unstructured":"Rodrygo L. T. Santos Ben He Craig Macdonald and Iadh Ounis. 2009. Integrating Proximity to Subjective Sentences for Blog Opinion Retrieval. In ECIR. 325--336.","DOI":"10.1007\/978-3-642-00958-7_30"},{"key":"e_1_2_2_84_1","doi-asserted-by":"publisher","DOI":"10.14778\/3229863.3236253"},{"key":"e_1_2_2_85_1","doi-asserted-by":"crossref","unstructured":"Yanyao Shen Hyokun Yun Zachary C. Lipton Yakov Kronrod and Animashree Anandkumar. 2018. Deep Active Learning for Named Entity Recognition. In ICLR.","DOI":"10.18653\/v1\/W17-2630"},{"key":"e_1_2_2_86_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2013.05.007"},{"key":"e_1_2_2_87_1","unstructured":"sourcecode 2022. Source code repository. https:\/\/github.com\/osnatairy\/Active-Probabilistic-Databases.git."},{"key":"e_1_2_2_88_1","doi-asserted-by":"crossref","unstructured":"Dan Suciu Dan Olteanu Christopher R\u00e9 and Christoph Koch. 2011. Probabilistic Databases. Morgan & Claypool.","DOI":"10.1007\/978-3-031-01879-4"},{"key":"e_1_2_2_89_1","first-page":"71","article-title":"A Framework for Conditioning Uncertain Relational Data","volume":"7447","author":"Tang Ruiming","year":"2012","unstructured":"Ruiming Tang, Reynold Cheng, Huayu Wu, and St\u00e9phane Bressan. 2012. A Framework for Conditioning Uncertain Relational Data. In DEXA, Vol. 7447. 71--87.","journal-title":"DEXA"},{"key":"e_1_2_2_90_1","unstructured":"TPC-H 2019. TPC-H Benchmark. 
http:\/\/www.tpc.org\/tpch\/."},{"key":"e_1_2_2_91_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.dam.2002.08.001"},{"key":"e_1_2_2_92_1","first-page":"290","article-title":"Rule-Based Conditioning of Probabilistic Data","volume":"11142","author":"van Keulen Maurice","year":"2018","unstructured":"Maurice van Keulen, Benjamin Lucien Kaminski, Christoph Matheja, and Joost-Pieter Katoen. 2018. Rule-Based Conditioning of Probabilistic Data. In SUM, Vol. 11142. 290--305.","journal-title":"SUM"},{"key":"e_1_2_2_93_1","doi-asserted-by":"crossref","unstructured":"Joann\u00e8s Vermorel and M. Mohri. 2005. Multi-armed Bandit Algorithms and Empirical Evaluation. In ECML.","DOI":"10.1007\/11564096_42"},{"key":"e_1_2_2_94_1","doi-asserted-by":"crossref","unstructured":"Jiannan Wang Sanjay Krishnan Michael J. Franklin Ken Goldberg Tim Kraska and Tova Milo. 2014. A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data. In SIGMOD. 469--480.","DOI":"10.1145\/2588555.2610505"},{"key":"e_1_2_2_95_1","first-page":"713","article-title":"Na\u00efve Bayes","volume":"15","author":"Webb Geoffrey I","year":"2010","unstructured":"Geoffrey I Webb, Eamonn Keogh, and Risto Miikkulainen. 2010. Na\u00efve Bayes. Encyclopedia of Machine Learning 15 (2010), 713--714.","journal-title":"Encyclopedia of Machine Learning"},{"key":"e_1_2_2_96_1","doi-asserted-by":"publisher","DOI":"10.1145\/3377391.3377393"},{"key":"e_1_2_2_97_1","first-page":"207","article-title":"Query Aware Determinization of Uncertain Objects","volume":"27","author":"Xu Jie","year":"2015","unstructured":"Jie Xu, Dmitri V. Kalashnikov, and Sharad Mehrotra. 2015. Query Aware Determinization of Uncertain Objects. IEEE TKDE 27, 1 (2015), 207--221.","journal-title":"IEEE TKDE"},{"key":"e_1_2_2_98_1","volume-title":"Elmagarmid","author":"Yakout Mohamed","year":"2013","unstructured":"Mohamed Yakout, Laure Berti-\u00c9quille, and Ahmed K. Elmagarmid. 2013. 
Don't be SCAREd: use SCalable Automatic REpairing with maximal likelihood and bounded changes. In SIGMOD. 553--564."},{"key":"e_1_2_2_99_1","doi-asserted-by":"crossref","unstructured":"Wei Zhang Clement Yu and Weiyi Meng. 2007. Opinion Retrieval from Blogs. In CIKM. 831.","DOI":"10.1145\/1321440.1321555"},{"key":"e_1_2_2_100_1","doi-asserted-by":"publisher","DOI":"10.14778\/3436905.3436909"},{"key":"e_1_2_2_101_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2015.10.017"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3589325","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3589325","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:46:14Z","timestamp":1750178774000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3589325"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,13]]},"references-count":101,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,6,13]]}},"alternative-id":["10.1145\/3589325"],"URL":"https:\/\/doi.org\/10.1145\/3589325","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,6,13]]}}}