{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,1]],"date-time":"2025-06-01T04:10:38Z","timestamp":1748751038117,"version":"3.41.0"},"publisher-location":"Cham","reference-count":36,"publisher":"Springer Nature Switzerland","isbn-type":[{"value":"9783031939754","type":"print"},{"value":"9783031939761","type":"electronic"}],"license":[{"start":{"date-parts":[[2025,1,1]],"date-time":"2025-01-01T00:00:00Z","timestamp":1735689600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,5,27]],"date-time":"2025-05-27T00:00:00Z","timestamp":1748304000000},"content-version":"vor","delay-in-days":146,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>This paper designs and implements an artifact for converting unstructured or semi-structured open data into outputs conforming to the OGC SensorThings API (STA). Motivated by the growing influx of heterogeneous data in Internet-of-Things environments, the study employs an Action Design Research process to apply formalized grammars to Large Language Models (LLMs) to produce valid, STA-compliant JSON documents. Early prototypes using JSON schemas and Pydantic models highlighted the need for stricter control mechanisms to handle real-world open data complexity. Evaluation across multiple open data sources demonstrates the effectiveness of grammar-driven constraints in reducing malformed or incomplete outputs. Three smaller LLMs\u2014Qwen 2.5 Instruct, Llama 3.1 Instruct, and Phi-4\u2014were tested, showing that grammar length and input context can significantly influence output quality and model throughput. 
The findings underscore the advantages of embedding strict syntax requirements without sacrificing flexibility for diverse use cases. While domain-level validation (e.g., verifying realistic time-series values) remains a future direction, this research confirms the promise of grammar-based generation for streamlining data ingestion in IoT platforms. The approach facilitates more consistent and maintainable pipelines, potentially boosting interoperability and data quality in sensor-driven environments.<\/jats:p>","DOI":"10.1007\/978-3-031-93976-1_12","type":"book-chapter","created":{"date-parts":[[2025,5,31]],"date-time":"2025-05-31T11:36:10Z","timestamp":1748691370000},"page":"178-195","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Designing Grammar-Guided LLM Outputs for\u00a0Open Data Integration \u2013 A DSR Approach to\u00a0IoT Data Platforms"],"prefix":"10.1007","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5071-2589","authenticated-orcid":false,"given":"Dennis M.","family":"Riehle","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9822-0586","authenticated-orcid":false,"given":"Arnold F.","family":"Arz von Straussenburg","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-7929-4836","authenticated-orcid":false,"given":"Timon T.","family":"Aldenhoff","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2025,5,27]]},"reference":[{"key":"12_CR1","unstructured":"Abdin, M., Aneja, J., et al.: Phi-4 Technical report (2024). 
http:\/\/arxiv.org\/abs\/2412.08905"},{"issue":"6","key":"12_CR2","doi-asserted-by":"publisher","first-page":"52","DOI":"10.1109\/MIC.2016.124","volume":"20","author":"B Ahlgren","year":"2016","unstructured":"Ahlgren, B., Hidell, M., Ngai, E.-H.: Internet of things for smart cities: interoperability and open data. IEEE Internet Comput. 20(6), 52\u201356 (2016)","journal-title":"IEEE Internet Comput."},{"key":"12_CR3","unstructured":"Arz von Straussenburg, A.F., Aldenhoff, T.T., Riehle, D.M.: Extending the SensorThings API data model - improving interoperability and use case flexibility in IoT. In: The 43rd International Conference on Conceptual Modeling Forum: Pittsburgh, Pennsylvania, USA (2024)"},{"key":"12_CR4","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1016\/j.ccs.2017.09.006","volume":"12","author":"S Barns","year":"2018","unstructured":"Barns, S.: Smart cities and urban data platforms: designing interfaces for smart governance. City Cult. Soc. 12, 5\u201312 (2018)","journal-title":"City Cult. Soc."},{"key":"12_CR5","doi-asserted-by":"publisher","unstructured":"Blazevic, M., Aldenhoff, T.T., Riehle, D.M.: Towards a smarter tomorrow: a design science perspective on building a smart campus IoT Data Platform. In: Mandviwalla, M., S\u00f6llner, M., Tuunanen, T. (eds) Design Science Research for a Resilient Future. DESRIST 2024. LNCS, vol. 14621, pp. 262\u2013277. Springer, Cham (2024). https:\/\/doi.org\/10.1007\/978-3-031-61175-9_18","DOI":"10.1007\/978-3-031-61175-9_18"},{"key":"12_CR6","unstructured":"T. Brown, B. Mann, et al.: Language Models Are Few-Shot Learners. In: Advances in Neural Information Processing Systems (2020)"},{"key":"12_CR7","doi-asserted-by":"crossref","unstructured":"do Carmo, S.L.O., Geyer, C.F.R., dos Anjos, J.C.S.: Data quantitative and qualitative study in Brazilian open data portals. J. Internet Serv. App. 
15(1), 72\u201382 (2024)","DOI":"10.5753\/jisa.2024.3980"},{"key":"12_CR8","doi-asserted-by":"publisher","unstructured":"de Reuver, M., Ofe, H., et al.: The openness of data platforms: a research agenda. In: Proceedings of the 1st Int. Workshop on Data Economy. DE 2022, pp. 34\u201341. ACM, New York, NY, USA (2022). https:\/\/doi.org\/10.1145\/3565011.3569056","DOI":"10.1145\/3565011.3569056"},{"key":"12_CR9","doi-asserted-by":"publisher","first-page":"299","DOI":"10.1016\/j.future.2021.06.031","volume":"125","author":"M Francia","year":"2021","unstructured":"Francia, M., Gallinucci, E., et al.: Making data platforms smarter with MOSES. Futur. Gener. Comput. Syst. 125, 299\u2013313 (2021)","journal-title":"Futur. Gener. Comput. Syst."},{"key":"12_CR10","doi-asserted-by":"crossref","unstructured":"Francia, M., Golfarelli, M., Pasini, M.: Towards a process-driven design of data platforms. In: DOLAP, pp. 28\u201335 (2024)","DOI":"10.1016\/j.is.2025.102527"},{"issue":"1","key":"12_CR11","doi-asserted-by":"publisher","first-page":"164","DOI":"10.1016\/j.engappai.2010.09.007","volume":"24","author":"T-C Fu","year":"2011","unstructured":"Fu, T.-C.: A review on time series data mining. Eng. Appl. Artif. Intell. 24(1), 164\u2013181 (2011)","journal-title":"Eng. Appl. Artif. Intell."},{"key":"12_CR12","unstructured":"Grattafiori, A., Dubey, A., et al.: The Llama 3 Herd of Models (2024). http:\/\/arxiv.org\/abs\/2407.21783"},{"issue":"7","key":"12_CR13","doi-asserted-by":"publisher","first-page":"1645","DOI":"10.1016\/j.future.2013.01.010","volume":"29","author":"J Gubbi","year":"2013","unstructured":"Gubbi, J., Buyya, R., Marusic, S., Palaniswami, M.: Internet of things (IoT): a vision, architectural elements, and future directions. Futur. Gener. Comput. Syst. 29(7), 1645\u20131660 (2013). https:\/\/doi.org\/10.1016\/j.future.2013.01.010","journal-title":"Futur. Gener. Comput. 
Syst."},{"key":"12_CR14","doi-asserted-by":"publisher","unstructured":"Ho, J., Ooi, B., Westner, M.: Application integration framework for large language models. In: 5th International Conference on AI and Data Sciences, AiDAS 2024 - Proceedings, pp. 398\u2013403 (2024). https:\/\/doi.org\/10.1109\/AiDAS63860.2024.10730541","DOI":"10.1109\/AiDAS63860.2024.10730541"},{"key":"12_CR15","doi-asserted-by":"publisher","unstructured":"Khan, N.A., Ahangar, H.: Emerging trends in open research data. In: 2017 9th International Conference on Information and Knowledge Technology (IKT), pp. 141\u2013146 (2017). https:\/\/doi.org\/10.1109\/IKT.2017.8258631","DOI":"10.1109\/IKT.2017.8258631"},{"issue":"2","key":"12_CR16","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1007\/BF01692511","volume":"2","author":"D Knuth","year":"1968","unstructured":"Knuth, D.: Semantics of context-free languages. Math. Syst. Theory 2(2), 127\u2013145 (1968). https:\/\/doi.org\/10.1007\/BF01692511","journal-title":"Math. Syst. Theory"},{"key":"12_CR17","doi-asserted-by":"crossref","unstructured":"Krasikov, P., Legner, C.: A method to screen, assess, and prepare open data for use. J. Data Inf. Qual. 15(4), 43:1\u201343:25 (2023)","DOI":"10.1145\/3603708"},{"key":"12_CR18","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1471-2105-12-S4-S5","volume":"12","author":"J Laros","year":"2011","unstructured":"Laros, J., Blavier, A., den Dunnen, J., Taschner, P.: A formalized description of the standard human variant nomenclature in extended Backus-Naur form. BMC Bioinform. 12, 1\u20137 (2011)","journal-title":"BMC Bioinform."},{"key":"12_CR19","unstructured":"Lewis, P., Perez, E., et al.: Retrieval-augmented generation for knowledge-intensive NLP tasks. In: Advances in Neural Information Processing Systems (2020)"},{"key":"12_CR20","doi-asserted-by":"publisher","unstructured":"Liang, S., Khalafbeigi, T.: OGC SensorThings API Part 2 \u2013 Tasking Core, Version 1.0. Report (2019). 
https:\/\/doi.org\/10.25607\/OBP-454","DOI":"10.25607\/OBP-454"},{"key":"12_CR21","unstructured":"Liang, S., Khalafbeigi, T., et al.: OGC SensorThings API Part 1: Sensing Version 1.1 (2021)"},{"key":"12_CR22","doi-asserted-by":"publisher","unstructured":"Liu, M., Liu, F., et al.: \u201cWe need structured output\u201d: towards user-centered constraints on large language model output. In: Conference on Human Factors in Computing Systems - Proceedings (2024). https:\/\/doi.org\/10.1145\/3613905.3650756","DOI":"10.1145\/3613905.3650756"},{"key":"12_CR23","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"209","DOI":"10.1007\/978-3-030-02925-8_15","volume-title":"Web Information Systems Engineering \u2013 WISE 2018","author":"F Montori","year":"2018","unstructured":"Montori, F., Liao, K., Jayaraman, P.P., Bononi, L., Sellis, T., Georgakopoulos, D.: Classification and annotation of open internet of things datastreams. In: Hacid, H., Cellary, W., Wang, H., Paik, H.-Y., Zhou, R. (eds.) WISE 2018. LNCS, vol. 11234, pp. 209\u2013224. Springer, Cham (2018). https:\/\/doi.org\/10.1007\/978-3-030-02925-8_15"},{"issue":"1","key":"12_CR24","doi-asserted-by":"publisher","first-page":"6","DOI":"10.1080\/0960085X.2018.1451811","volume":"28","author":"MT Mullarkey","year":"2019","unstructured":"Mullarkey, M.T., Hevner, A.R.: An elaborated action design research process model. Eur. J. Inf. Syst. 28(1), 6\u201320 (2019)","journal-title":"Eur. J. Inf. Syst."},{"issue":"1","key":"12_CR25","doi-asserted-by":"publisher","first-page":"52","DOI":"10.1080\/00987913.2008.10765152","volume":"34","author":"P Murray-Rust","year":"2008","unstructured":"Murray-Rust, P.: Open data in science. Ser. Rev. 34(1), 52\u201364 (2008). https:\/\/doi.org\/10.1080\/00987913.2008.10765152","journal-title":"Ser. 
Rev."},{"issue":"4","key":"12_CR26","doi-asserted-by":"publisher","first-page":"561","DOI":"10.1007\/s12525-019-00362-x","volume":"29","author":"B Otto","year":"2019","unstructured":"Otto, B., Jarke, M.: Designing a multi-sided data platform: findings from the international data spaces case. Electron. Mark. 29(4), 561\u2013580 (2019)","journal-title":"Electron. Mark."},{"key":"12_CR27","unstructured":"Qwen, A., Yang, et al.: Qwen2.5 Technical report (2025). http:\/\/arxiv.org\/abs\/2412.15115"},{"key":"12_CR28","unstructured":"Radev, I.: Context-free grammars from the computing theory perspective. In: 25th World Multi-Conference on Systemics, Cybernetics and Informatics, WMSCI 2021, pp. 51\u201356 (2021)"},{"key":"12_CR29","doi-asserted-by":"publisher","unstructured":"Riehle, D.M., Arz von Straussenburg, A.F., Aldenhoff, T.T.: Supplementary Dataset: Designing Grammar-Guided LLM Outputs for Open Data Integration - A DSR Approach to IoT Data Platforms. Zenodo (2025). https:\/\/doi.org\/10.5281\/zenodo.15100791","DOI":"10.5281\/zenodo.15100791"},{"key":"12_CR30","doi-asserted-by":"publisher","unstructured":"Rudakov, V., Timur, M., Yedilkhan, A.: Comparison of time series databases. In: 17th International Conference on Electronics Computer and Computation (ICECCO), pp. 1\u20134 (2023). https:\/\/doi.org\/10.1109\/ICECCO58239.2023.10147153","DOI":"10.1109\/ICECCO58239.2023.10147153"},{"key":"12_CR31","doi-asserted-by":"publisher","first-page":"20","DOI":"10.3389\/frsc.2020.00020","volume":"2","author":"O Slobodova","year":"2020","unstructured":"Slobodova, O., Becker, S.: Zooming into the ecosystem: agency and politics around open data platforms in Lyon and Berlin. Front. Sustain. Cities 2, 20 (2020)","journal-title":"Front. Sustain. 
Cities"},{"key":"12_CR32","doi-asserted-by":"publisher","first-page":"19","DOI":"10.1016\/j.procs.2019.08.125","volume":"156","author":"A Struckov","year":"2019","unstructured":"Struckov, A., Yufa, S., Visheratin, A.A., Nasonov, D.: Evaluation of modern tools and techniques for storing time-series data. Proc. Comput. Sci. 156, 19\u201328 (2019)","journal-title":"Proc. Comput. Sci."},{"issue":"7","key":"12_CR33","doi-asserted-by":"publisher","first-page":"285","DOI":"10.3390\/a17070287","volume":"17","author":"N Tao","year":"2024","unstructured":"Tao, N., Ventresque, A., Nallur, V., Saber, T.: Enhancing program synthesis with large language models using many-objective grammar-guided genetic programming. Algorithms 17(7), 285 (2024)","journal-title":"Algorithms"},{"key":"12_CR34","unstructured":"Wei, J., Wang, X., et al.: Chain-of-thought prompting elicits reasoning in large language models. In: Advances in Neural Information Processing Systems (2022)"},{"issue":"2","key":"12_CR35","doi-asserted-by":"publisher","first-page":"384","DOI":"10.1002\/gdj3.138","volume":"9","author":"G Wildman","year":"2022","unstructured":"Wildman, G., Lewis, E.: Value of open data: a geoscience perspective. Geosci. Data J. 9(2), 384\u2013392 (2022)","journal-title":"Geosci. Data J."},{"issue":"1","key":"12_CR36","doi-asserted-by":"publisher","DOI":"10.1038\/sdata.2016.18","volume":"3","author":"MD Wilkinson","year":"2016","unstructured":"Wilkinson, M.D., Dumontier, M., et al.: The FAIR guiding principles for scientific data management and stewardship. Sci. Data 3(1), 160018 (2016)","journal-title":"Sci. 
Data"}],"container-title":["Lecture Notes in Computer Science","Local Solutions for Global Challenges"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-031-93976-1_12","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,5,31]],"date-time":"2025-05-31T11:36:14Z","timestamp":1748691374000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-031-93976-1_12"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025]]},"ISBN":["9783031939754","9783031939761"],"references-count":36,"URL":"https:\/\/doi.org\/10.1007\/978-3-031-93976-1_12","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"value":"0302-9743","type":"print"},{"value":"1611-3349","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025]]},"assertion":[{"value":"27 May 2025","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"DESRIST","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"International Conference on Design Science Research in Information Systems and Technology","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Montego Bay","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Jamaica","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2025","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2 June 2025","order":7,"name":"conference_start_date","label":"Conference Start 
Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"4 June 2025","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"20","order":9,"name":"conference_number","label":"Conference Number","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"desrist2025","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"http:\/\/desrist2025.org\/","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}}]}}