{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:25:31Z","timestamp":1750220731304,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":41,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,5,31]],"date-time":"2020-05-31T00:00:00Z","timestamp":1590883200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,6,11]]},"DOI":"10.1145\/3318464.3380608","type":"proceedings-article","created":{"date-parts":[[2020,5,29]],"date-time":"2020-05-29T17:12:33Z","timestamp":1590772353000},"page":"1967-1978","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["Web Data Extraction using Hybrid Program Synthesis: A Combination of Top-down and Bottom-up Inference"],"prefix":"10.1145","author":[{"given":"Mohammad","family":"Raza","sequence":"first","affiliation":[{"name":"Microsoft Corporation, Redmond, WA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sumit","family":"Gulwani","sequence":"additional","affiliation":[{"name":"Microsoft Corporation, Redmond, WA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,5,31]]},"reference":[{"volume-title":"Syntax-Guided Synthesis","author":"Alur Rajeev","key":"e_1_3_2_2_1_1","unstructured":"Rajeev Alur , Rastislav Bod\u00edk , Eric Dallal , Dana Fisman , Pranav Garg , Garvit Juniwal , Hadas Kress-Gazit , P. Madhusudan , Milo M. K. Martin , Mukund Raghothaman , Shambwaditya Saha , Sanjit A. Seshia , Rishabh Singh , Armando Solar-Lezama , Emina Torlak , and Abhishek Udupa . 2015. Syntax-Guided Synthesis . 
In Dependable Software Systems Engineering, Maximilian Irlbeck, Doron A. Peled, and Alexander Pretschner (Eds.). NATO Science for Peace and Security Series, D: Information and Communication Security, Vol. 40. IOS Press, 1--25. http:\/\/dblp.uni-trier.de\/db\/series\/natosec\/natosec40.html#AlurBDF0JKMMRSSSSTU15"},{"key":"e_1_3_2_2_2_1","unstructured":"Tobias Anton. 2005. XPath-Wrapper Induction by generating tree traversal patterns. In LWA (2005--11--14), Mathias Bauer, Boris Brandherm, Johannes F\u00fcrnkranz, Gunter Grieser, Andreas Hotho, Andreas Jedlitschka, and Alexander Kr\u00f6ner (Eds.). DFKI, 126--133. 
http:\/\/dblp.uni-trier.de\/db\/conf\/lwa\/lwa2005.html#Anton05"},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/872757.872799"},{"key":"e_1_3_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-72667-8_3"},{"key":"e_1_3_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-87479-9_31"},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1287\/moor.4.3.233"},{"key":"e_1_3_2_2_7_1","volume-title":"International Conference on Very Large Data Bases (VLDB) (2001","author":"Crescenzi V","year":"2001","unstructured":"V Crescenzi. 2001. RoadRunner: Towards Automatic Data Extraction from Large Web Sites. International Conference on Very Large Data Bases (VLDB) (2001). http:\/\/www.vldb.org\/conf\/2001\/P109.pdf"},{"key":"e_1_3_2_2_8_1","unstructured":"Jacob Devlin, Jonathan Uesato, Surya Bhupatiraju, Rishabh Singh, Abdel-rahman Mohamed, and Pushmeet Kohli. 2017. RobustFill: Neural Program Learning under Noisy I\/O. In ICML."},{"key":"e_1_3_2_2_9_1","volume-title":"Automated Data Extraction Methodology Categories and Subject Descriptors. The World Wide Web Conference","author":"Furche Tim","year":"2012","unstructured":"
Tim Furche, Georg Gottlob, Giovanni Grasso, \u00d6mer Gunes, Xiaonan Guo, Andrey Kravchenko, Giorgio Orsi, Christian Schallhart, Andrew Sellers, and Cheng Wang. 2012. DIADEM: Domain-centric, Intelligent, Automated Data Extraction Methodology Categories and Subject Descriptors. The World Wide Web Conference (2012)."},{"key":"e_1_3_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2011.5767842"},{"key":"e_1_3_2_2_11_1","doi-asserted-by":"crossref","unstructured":"Sumit Gulwani. 2011. Automating String Processing in Spreadsheets using Input-Output Examples. In Principles of Programming Languages (POPL). 317--330.","DOI":"10.1145\/1925844.1926423"},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2240236.2240260"},{"key":"e_1_3_2_2_13_1","doi-asserted-by":"crossref","unstructured":"M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten. 2009. The WEKA data mining software: an update. ACM SIGKDD Explorations Newsletter 11, 1 (2009).","DOI":"10.1145\/1656274.1656278"},{"key":"e_1_3_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/2009916.2010020"},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.5555\/306766.306775"},{"key":"e_1_3_2_2_16_1","unstructured":"import.io. 2018. import.io. http:\/\/www.import.io"},{"key":"e_1_3_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3158090"},{"key":"e_1_3_2_2_18_1","unstructured":"Mozenda Inc. 2018. Mozenda. http:\/\/www.mozenda.com\/"},{"key":"e_1_3_2_2_19_1","volume-title":"Estimating Continuous Distributions in Bayesian Classifiers. 
11th Conference on Uncertainty in Artificial Intelligence","author":"John G. H.","year":"1995","unstructured":"G. H. John and P. Langley. 1995. Estimating Continuous Distributions in Bayesian Classifiers. 11th Conference on Uncertainty in Artificial Intelligence (1995), 338--345."},{"key":"e_1_3_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2479787.2479798"},{"key":"e_1_3_2_2_21_1","unstructured":"Nicholas Kushmerick, Daniel S. Weld, and Robert Doorenbos. 1997. Wrapper Induction for Information Extraction. In IJCAI-97."},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1025671410623"},{"key":"e_1_3_2_2_23_1","unstructured":"Vu Le and Sumit Gulwani. 2014. FlashExtract: a framework for data extraction by examples. In PLDI, Michael F. P. O'Boyle and Keshav Pingali (Eds.). ACM, 55. http:\/\/dblp.uni-trier.de\/db\/conf\/pldi\/pldi2014.html#LeG14"},{"key":"e_1_3_2_2_24_1","volume-title":"My Command: Programming by Example","author":"Lieberman Henry","year":"2001","unstructured":"Henry Lieberman (Ed.). 2001. Your Wish is My Command: Programming by Example. Morgan Kaufmann Publishers."},{"key":"e_1_3_2_2_25_1","volume-title":"Allen","author":"Manshadi Mehdi Hafezi","year":"2013","unstructured":"Mehdi Hafezi Manshadi, Daniel Gildea, and James F. Allen. 2013. 
Integrating Programming by Example and Natural Language Programming. In AAAI, Marie desJardins and Michael L. Littman (Eds.). AAAI Press. http:\/\/dblp.uni-trier.de\/db\/conf\/aaai\/aaai2013.html#ManshadiGA13"},{"key":"e_1_3_2_2_26_1","volume-title":"Knoblock","author":"Muslea Ion","year":"1999","unstructured":"Ion Muslea, Steven Minton, and Craig A. Knoblock. 1999. A Hierarchical Approach to Wrapper Induction. In Autonomous Agents and Multi-Agent Systems. 190--197. http:\/\/dblp.uni-trier.de\/db\/conf\/agents\/agents99.html#MusleaMK99"},{"key":"e_1_3_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2015.12.040"},{"key":"e_1_3_2_2_29_1","volume-title":"Antoon Bronselaer, and Guy De Tr\u00e9.","author":"Nielandt Joachim","year":"2014","unstructured":"Joachim Nielandt, Robin De Mol, Antoon Bronselaer, and Guy De Tr\u00e9. 2014. Wrapper Induction by XPath Alignment. In KDIR, Ana L. N. Fred and Joaquim Filipe (Eds.). SciTePress, 492--500. 
http:\/\/dblp.uni-trier.de\/db\/conf\/ic3k\/kdir2014.html#NielandtMBT14"},{"key":"e_1_3_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3018661.3018740"},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.14778\/1921071.1921079"},{"key":"e_1_3_2_2_32_1","volume-title":"FlashMeta: a framework for inductive program synthesis","author":"Polozov Oleksandr","year":"2015","unstructured":"Oleksandr Polozov and Sumit Gulwani. 2015. FlashMeta: a framework for inductive program synthesis. In OOPSLA, Jonathan Aldrich and Patrick Eugster (Eds.). ACM, 107--126. http:\/\/dblp.uni-trier.de\/db\/conf\/oopsla\/oopsla2015.html#PolozovG15"},{"volume-title":"5: programs for machine learning","author":"Quinlan Ross","key":"e_1_3_2_2_33_1","unstructured":"Ross Quinlan. 2014. C4.5: programs for machine learning. Elsevier."},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v31i1.10668"},{"key":"e_1_3_2_2_35_1","volume-title":"Disjunctive Program Synthesis: A Robust Approach to Programming by Example","author":"Raza Mohammad","year":"2018","unstructured":"Mohammad Raza and Sumit Gulwani. 2018. Disjunctive Program Synthesis: A Robust Approach to Programming by Example. In AAAI, Sheila A. McIlraith and Kilian Q. Weinberger (Eds.). AAAI Press, 1403--1412. 
http:\/\/dblp.uni-trier.de\/db\/conf\/aaai\/aaai2018.html#RazaG18"},{"key":"e_1_3_2_2_36_1","unstructured":"Mohammad Raza and Sumit Gulwani. 2019. Microsoft Technical Report: Web data extraction using hybrid program synthesis: a combination of top-down and bottom-up inference. https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2019\/11\/websynth-techreport.pdf"},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v28i1.8744"},{"volume-title":"VLDB (2002-01-03), Malcolm P","author":"Sahuguet Arnaud","key":"e_1_3_2_2_38_1","unstructured":"Arnaud Sahuguet and Fabien Azavant. 1999. Building Light-Weight Wrappers for Legacy Web Data-Sources Using W4F. In VLDB (2002-01-03), Malcolm P. Atkinson, Maria E. Orlowska, Patrick Valduriez, Stanley B. Zdonik, and Michael L. Brodie (Eds.). Morgan Kaufmann, 738--741. http:\/\/dblp.uni-trier.de\/db\/conf\/vldb\/vldb99.html#SahuguetA99"},{"key":"e_1_3_2_2_39_1","unstructured":"SelectorGadget. 2018. SelectorGadget. https:\/\/selectorgadget.com\/"},{"key":"e_1_3_2_2_40_1","doi-asserted-by":"crossref","unstructured":"Rishabh Singh. 2016. 
BlinkFill: Semi-supervised Programming by Example for Syntactic String Transformations. In PVLDB. 816--827.","DOI":"10.14778\/2977797.2977807"},{"key":"e_1_3_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/1060745.1060761"},{"key":"e_1_3_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/1645953.1645962"}],"event":{"name":"SIGMOD\/PODS '20: International Conference on Management of Data","sponsor":["SIGMOD ACM Special Interest Group on Management of Data"],"location":"Portland OR USA","acronym":"SIGMOD\/PODS '20"},"container-title":["Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3318464.3380608","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3318464.3380608","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:38:23Z","timestamp":1750199903000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3318464.3380608"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,5,31]]},"references-count":41,"alternative-id":["10.1145\/3318464.3380608","10.1145\/3318464"],"URL":"https:\/\/doi.org\/10.1145\/3318464.3380608","relation":{},"subject":[],"published":{"date-parts":[[2020,5,31]]},"assertion":[{"value":"2020-05-31","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}