{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,12,31]],"date-time":"2024-12-31T06:40:12Z","timestamp":1735627212339,"version":"3.32.0"},"reference-count":9,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2024,8]]},"abstract":"<jats:p>Document spanners have been proposed as a formal framework for declarative Information Extraction (IE) from text, following IE products from the industry and academia. Over the past decade, the framework has been studied thoroughly in terms of expressive power, complexity, and the ability to naturally combine text analysis with relational querying. This demonstration presents SpannerLib, a library for embedding document spanners in Python code. SpannerLib facilitates the development of IE programs by providing an implementation of Spannerlog (Datalog-based document spanners) that interacts with the Python code in two directions: rules can be embedded inside Python, and they can invoke custom Python code (e.g., calls to ML-based NLP models) via user-defined functions. 
The demonstration scenarios showcase IE programs, with increasing levels of complexity, within Jupyter Notebook.<\/jats:p>","DOI":"10.14778\/3685800.3685855","type":"journal-article","created":{"date-parts":[[2024,11,8]],"date-time":"2024-11-08T17:25:21Z","timestamp":1731086721000},"page":"4281-4284","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["SpannerLib: Embedding Declarative Information Extraction in an Imperative Workflow"],"prefix":"10.14778","volume":"17","author":[{"given":"Dean","family":"Light","sequence":"first","affiliation":[{"name":"Technion, Haifa, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ahmad","family":"Aiashy","sequence":"additional","affiliation":[{"name":"Technion, Haifa, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mahmoud","family":"Diab","sequence":"additional","affiliation":[{"name":"Technion, Haifa, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Daniel","family":"Nachmias","sequence":"additional","affiliation":[{"name":"Technion, Haifa, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stijn","family":"Vansummeren","sequence":"additional","affiliation":[{"name":"UHasselt, Data Science Institute, Diepenbeek, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Benny","family":"Kimelfeld","sequence":"additional","affiliation":[{"name":"Technion, Haifa, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,11,8]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-3010"},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the 1st Workshop on NLP for COVID-19 at ACL","author":"Alec","year":"2020","unstructured":"Alec Chapman et al. 2020. A Natural Language Processing System for National COVID-19 Surveillance in the US Department of Veterans Affairs. 
In Proceedings of the 1st Workshop on NLP for COVID-19 at ACL 2020."},{"volume-title":"Proceedings NIPS '20","author":"Tom","key":"e_1_2_1_3_1","unstructured":"Tom B. Brown et al. 2020. Language Models are Few-Shot Learners. In Proceedings NIPS '20. Article 159, 25 pages."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2699442"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1561\/1900000017"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.1212303"},{"key":"e_1_2_1_7_1","volume-title":"Proceedings of NIPS '20","author":"Lewis Patrick","year":"2020","unstructured":"Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich K\u00fcttler, Mike Lewis, Wen tau Yih, Tim Rockt\u00e4schel, Sebastian Riedel, and Douwe Kiela. 2020. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. In Proceedings of NIPS '20. Article 793, 16 pages."},{"volume-title":"Incorporating information extraction in the relational database model (WebDB)","author":"Nahshon Yoav","key":"e_1_2_1_8_1","unstructured":"Yoav Nahshon, Liat Peterfreund, and Stijn Vansummeren. 2016. Incorporating information extraction in the relational database model (WebDB). Association for Computing Machinery, 1--7."},{"volume-title":"Optimizing recursive queries with monotonic aggregates in DeALS","author":"Shkapsky Alexander","key":"e_1_2_1_9_1","unstructured":"Alexander Shkapsky, Mohan Yang, and Carlo Zaniolo. 2015. Optimizing recursive queries with monotonic aggregates in DeALS. In ICDE. 
IEEE Computer Society, 867--878."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3685800.3685855","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,12,31]],"date-time":"2024-12-31T05:28:27Z","timestamp":1735622907000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3685800.3685855"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8]]},"references-count":9,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2024,8]]}},"alternative-id":["10.14778\/3685800.3685855"],"URL":"https:\/\/doi.org\/10.14778\/3685800.3685855","relation":{},"ISSN":["2150-8097"],"issn-type":[{"type":"print","value":"2150-8097"}],"subject":[],"published":{"date-parts":[[2024,8]]},"assertion":[{"value":"2024-11-08","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}