{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,4]],"date-time":"2026-05-04T10:20:32Z","timestamp":1777890032883,"version":"3.51.4"},"reference-count":39,"publisher":"SAGE Publications","issue":"2","license":[{"start":{"date-parts":[[2024,4,30]],"date-time":"2024-04-30T00:00:00Z","timestamp":1714435200000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["SW"],"published-print":{"date-parts":[[2024,4,30]]},"abstract":"<jats:p>Industrial standards provide guidelines for data modeling to ensure interoperability between stakeholders of an industry branch (e.g., robotics). Most frequently, such guidelines are provided in an unstructured format (e.g., pdf documents) which hampers the automated validations of information objects (e.g., data models) that rely on such standards in terms of their compliance with the modeling constraints prescribed by the guidelines. This raises the risk of costly interoperability errors induced by the incorrect use of the standards. There is, therefore, an increased interest in automatic semantic validation of information objects based on industrial standards. In this paper we focus on an approach to semantic validation by formally representing the modeling constraints from unstructured documents as explicit, machine-actionable rules (to be then used for semantic validation) and (semi-)automatically extracting such rules from pdf documents. While our approach aims to be generically applicable, we exemplify an adaptation of the approach in the concrete context of the OPC UA industrial standard, given its large-scale adoption among important industrial stakeholders and the OPC UA internal efforts towards semantic validation. 
We conclude that (i)\u00a0it is feasible to represent modeling constraints from the standard specifications as rules, which can be organized in a taxonomy and represented using Semantic Web technologies such as OWL and SPARQL; (ii)\u00a0we could automatically identify modeling constraints in the specification documents by inspecting the tables (P = 87%) and text of these documents (F1 up to 94%); (iii)\u00a0the translation of the modeling constraints into formal rules could be fully automated when constraints were extracted from tables and required a Human-in-the-loop approach for constraints extracted from text.<\/jats:p>","DOI":"10.3233\/sw-233342","type":"journal-article","created":{"date-parts":[[2023,6,20]],"date-time":"2023-06-20T11:31:30Z","timestamp":1687260690000},"page":"517-554","source":"Crossref","is-referenced-by-count":13,"title":["Deriving semantic validation rules from industrial standards: An OPC UA study"],"prefix":"10.1177","volume":"15","author":[{"given":"Yashoda Saisree","family":"Bareedu","sequence":"first","affiliation":[{"name":"Corporate Technology, Siemens AG, Munich, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Thomas","family":"Fr\u00fchwirth","sequence":"additional","affiliation":[{"name":"Institute of Computer Engineering, TU Wien, Vienna, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Christoph","family":"Niedermeier","sequence":"additional","affiliation":[{"name":"Corporate Technology, Siemens AG, Munich, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Marta","family":"Sabou","sequence":"additional","affiliation":[{"name":"Institute of Information Systems Engineering, TU Wien, Vienna, Austria"},{"name":"Institute for Data, Process and Knowledge Management, Vienna University of Economics and Business, 
Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gernot","family":"Steindl","sequence":"additional","affiliation":[{"name":"Institute of Computer Engineering, TU Wien, Vienna, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Aparna Saisree","family":"Thuluva","sequence":"additional","affiliation":[{"name":"Corporate Technology, Siemens AG, Munich, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stefani","family":"Tsaneva","sequence":"additional","affiliation":[{"name":"Institute of Information Systems Engineering, TU Wien, Vienna, Austria"},{"name":"Institute for Data, Process and Knowledge Management, Vienna University of Economics and Business, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nilay","family":"Tufek Ozkaya","sequence":"additional","affiliation":[{"name":"Corporate Technology, Siemens AG, Munich, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","reference":[{"key":"10.3233\/SW-233342_ref1","doi-asserted-by":"crossref","unstructured":"M.\u00a0Ayd\u0131n and H.\u00a0Yaman, Domain knowledge representation languages and methods for building regulations, in: Eurasian BIM Forum, Springer, 2019, pp.\u00a0101\u2013121.","DOI":"10.1007\/978-3-030-42852-5_9"},{"key":"10.3233\/SW-233342_ref2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/978-3-319-41490-4","volume-title":"Semantic Web Technologies for Intelligent Engineering Applications","author":"Biffl","year":"2016"},{"key":"10.3233\/SW-233342_ref3","unstructured":"A.\u00a0Boufrida and Z.\u00a0Boufaida, Rule extraction from scientific texts: Evaluation in the specialty of gynecology, Journal of King Saud University\u00a0\u2013 Computer and Information Sciences 
(2020)."},{"key":"10.3233\/SW-233342_ref4","doi-asserted-by":"publisher","DOI":"10.1109\/IECON.2017.8217514"},{"issue":"6","key":"10.3233\/SW-233342_ref5","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1109\/MIC.2013.117","article-title":"Smart cities [guest editors\u2019 introduction]","volume":"17","author":"Celino","year":"2013","journal-title":"IEEE Internet Computing"},{"issue":"1\u20132","key":"10.3233\/SW-233342_ref6","doi-asserted-by":"publisher","first-page":"121","DOI":"10.3233\/SW-2010-0005","article-title":"Five challenges for the semantic sensor web","volume":"1","author":"Corcho","year":"2010","journal-title":"Semantic Web"},{"key":"10.3233\/SW-233342_ref7","doi-asserted-by":"publisher","DOI":"10.1109\/CIFER.1997.618923"},{"issue":"1","key":"10.3233\/SW-233342_ref8","doi-asserted-by":"publisher","first-page":"80","DOI":"10.1145\/234173.234209","article-title":"Information extraction","volume":"39","author":"Cowie","year":"1996","journal-title":"Commun. ACM"},{"key":"10.3233\/SW-233342_ref9","unstructured":"H.\u00a0Cunningham, Information extraction, automatic, Encyclopedia of Language and Linguistics 3(8) (2005), 10."},{"key":"10.3233\/SW-233342_ref10","doi-asserted-by":"publisher","DOI":"10.1109\/IECON43393.2020.9254274"},{"key":"10.3233\/SW-233342_ref11","unstructured":"P.\u00a0Dolog and W.\u00a0Nejdl, Challenges and benefits of the semantic web for user modelling, in: Proceedings of the Workshop on Adaptive Hypermedia and Adaptive Web-Based Systems (AH2003) at 12th International World Wide Web Conference, Budapest, 2003."},{"key":"10.3233\/SW-233342_ref12","unstructured":"H.\u00a0Dong, S.\u00a0Liu, Z.\u00a0Fu, S.\u00a0Han and D.\u00a0Zhang, Semantic structure extraction for spreadsheet tables with a multi-task learning architecture, in: Workshop on Document Intelligence at NeurIPS 2019, 
2019."},{"issue":"4","key":"10.3233\/SW-233342_ref15","doi-asserted-by":"publisher","first-page":"485","DOI":"10.1007\/s10506-021-09282-8","article-title":"The linked legal data landscape: Linking legal data across different countries","volume":"29","author":"Filtz","year":"2021","journal-title":"Artificial Intelligence and Law"},{"key":"10.3233\/SW-233342_ref16","doi-asserted-by":"publisher","DOI":"10.1109\/ICSC.2013.68"},{"issue":"5","key":"10.3233\/SW-233342_ref17","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1109\/MIS.2015.68","article-title":"Information extraction","volume":"30","author":"Grishman","year":"2015","journal-title":"IEEE Intelligent Systems"},{"key":"10.3233\/SW-233342_ref18","doi-asserted-by":"publisher","first-page":"193","DOI":"10.1016\/j.rser.2017.03.107","article-title":"Towards the next generation of smart grids: Semantic and holonic multi-agent management of distributed energy resources","volume":"77","author":"Howell","year":"2017","journal-title":"Renewable and Sustainable Energy Reviews"},{"issue":"1","key":"10.3233\/SW-233342_ref19","doi-asserted-by":"publisher","first-page":"205","DOI":"10.1016\/j.dss.2005.01.004","article-title":"Rule identification from web pages by the XRML approach","volume":"41","author":"Kang","year":"2005","journal-title":"Decision Support Systems"},{"key":"10.3233\/SW-233342_ref20","unstructured":"E.\u00a0Kumar, Natural Language Processing, IK International Pvt Ltd, 2013."},{"key":"10.3233\/SW-233342_ref22","doi-asserted-by":"crossref","unstructured":"S.\u00a0Le\u00f3n, J.A.\u00a0Rodr\u00edguez-Mond\u00e9jar and C.\u00a0Puente, Inconsistency detection on data communication standards using information extraction techniques: The ABP case, in: International Workshop on Soft Computing Models in Industrial and Environmental Applications, Springer, 2019, 
pp.\u00a0291\u2013300.","DOI":"10.1007\/978-3-030-20055-8_28"},{"issue":"1","key":"10.3233\/SW-233342_ref23","doi-asserted-by":"publisher","first-page":"12434","DOI":"10.1016\/j.ifacol.2017.08.1248","article-title":"The role of interoperability in the fourth industrial revolution era","volume":"50","author":"Liao","year":"2017","journal-title":"IFAC-PapersOnLine"},{"key":"10.3233\/SW-233342_ref24","doi-asserted-by":"crossref","unstructured":"W.\u00a0Mahnke and S.-H.\u00a0Leitner, OPC unified architecture \u2013 the future standard for communication and information modeling in automation, ABB Review 2009 (2009), 3.","DOI":"10.1007\/978-3-540-68899-0"},{"key":"10.3233\/SW-233342_ref25","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-68899-0"},{"issue":"2","key":"10.3233\/SW-233342_ref26","doi-asserted-by":"publisher","first-page":"255","DOI":"10.3233\/SW-180333","article-title":"Information extraction meets the semantic web: A survey","volume":"11","author":"Martinez-Rodriguez","year":"2020","journal-title":"Semantic Web"},{"issue":"6","key":"10.3233\/SW-233342_ref27","doi-asserted-by":"publisher","first-page":"504","DOI":"10.1515\/auto-2018-0138","article-title":"Applying knowledge bases to make factories smarter","volume":"67","author":"Ocker","year":"2019","journal-title":"at \u2013 Automatisierungstechnik"},{"key":"10.3233\/SW-233342_ref28","first-page":"2825","article-title":"Scikit-learn: Machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"The Journal Of Machine Learning Research"},{"key":"10.3233\/SW-233342_ref29","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-28569-1_2"},{"key":"10.3233\/SW-233342_ref30","doi-asserted-by":"publisher","DOI":"10.1109\/SysCon48628.2021.9447144"},{"key":"10.3233\/SW-233342_ref31","doi-asserted-by":"publisher","first-page":"137","DOI":"10.1016\/j.compind.2018.04.007","article-title":"Industrial information extraction through multi-phase classification using ontology for 
unstructured documents","volume":"100","author":"Rajbabu","year":"2018","journal-title":"Computers in Industry"},{"issue":"3","key":"10.3233\/SW-233342_ref32","doi-asserted-by":"publisher","first-page":"461","DOI":"10.14232\/actacyb.21.3.2014.11","article-title":"Hungarian noun phrase extraction using rule-based and hybrid methods","volume":"21","author":"Recski","year":"2014","journal-title":"Acta Cybernetica"},{"issue":"1\u20132","key":"10.3233\/SW-233342_ref33","doi-asserted-by":"publisher","first-page":"127","DOI":"10.3233\/SW-2010-0011","article-title":"Smart objects: Challenges for semantic web research","volume":"1","author":"Sabou","year":"2010","journal-title":"Semantic Web"},{"issue":"3","key":"10.3233\/SW-233342_ref34","doi-asserted-by":"publisher","first-page":"291","DOI":"10.3233\/SW-180292","article-title":"Semantic web and human computation: The status of an emerging field","volume":"9","author":"Sabou","year":"2018","journal-title":"Semantic Web"},{"issue":"1","key":"10.3233\/SW-233342_ref35","doi-asserted-by":"publisher","first-page":"115","DOI":"10.3233\/SW-190381","article-title":"Semantics for cyber-physical systems: A cross-domain perspective","volume":"11","author":"Sabou","year":"2020","journal-title":"Semantic Web"},{"key":"10.3233\/SW-233342_ref36","unstructured":"S.\u00a0Sanyal, S.\u00a0Hazra, S.\u00a0Adhikary and N.\u00a0Ghosh, Resume parser with natural language processing, International Journal of Engineering Science 4484 (2017)."},{"key":"10.3233\/SW-233342_ref37","doi-asserted-by":"publisher","DOI":"10.1109\/INDIN41052.2019.8972102"},{"key":"10.3233\/SW-233342_ref38","unstructured":"S.\u00a0Schoenmackers, J.\u00a0Davis, O.\u00a0Etzioni and D.\u00a0Weld, Learning first-order horn clauses from web text, in: Proceedings of the 2010 Conference on Empirical Methods on Natural Language Processing, 2010, pp.\u00a01088\u20131098."},{"key":"10.3233\/SW-233342_ref39","unstructured":"M.\u00a0Sintek, M.\u00a0Junker, L.\u00a0Van Elst and 
A.\u00a0Abecker, Using information extraction rules for extending domain ontologies, in: Workshop on Ontology Learning, 2001."},{"key":"10.3233\/SW-233342_ref40","doi-asserted-by":"publisher","DOI":"10.15439\/2016F221"},{"key":"10.3233\/SW-233342_ref41","doi-asserted-by":"publisher","first-page":"34","DOI":"10.1016\/j.jbi.2017.11.011","article-title":"Clinical information extraction applications: A literature review","volume":"77","author":"Wang","year":"2018","journal-title":"Journal of Biomedical Informatics"},{"issue":"1","key":"10.3233\/SW-233342_ref42","doi-asserted-by":"publisher","first-page":"128","DOI":"10.1109\/MPE.2015.2481787","article-title":"A brief history: The common information model [in my view]","volume":"14","author":"Wollenberg","year":"2016","journal-title":"IEEE Power and Energy Magazine"}],"container-title":["Semantic Web"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/SW-233342","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T05:26:58Z","timestamp":1777613218000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/SW-233342"}},"subtitle":[],"editor":[{"given":"Bahar","family":"Aameri","sequence":"additional","affiliation":[{"name":"University of Toronto, Canada"}],"role":[{"role":"editor","vocabulary":"crossref"}]},{"given":"Mar\u00eda","family":"Poveda-Villal\u00f3n","sequence":"additional","affiliation":[{"name":"Universidad Polit\u00e9cnica de Madrid, Spain"}],"role":[{"role":"editor","vocabulary":"crossref"}]},{"given":"Emilio M.","family":"Sanfilippo","sequence":"additional","affiliation":[{"name":"ISTC-CNR Laboratory for Applied Ontology, Italy"}],"role":[{"role":"editor","vocabulary":"crossref"}]},{"given":"Walter","family":"Terkaj","sequence":"additional","affiliation":[{"name":"STIIMA-CNR, 
Italy"}],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2024,4,30]]},"references-count":39,"journal-issue":{"issue":"2"},"URL":"https:\/\/doi.org\/10.3233\/sw-233342","relation":{},"ISSN":["2210-4968","1570-0844"],"issn-type":[{"value":"2210-4968","type":"electronic"},{"value":"1570-0844","type":"print"}],"subject":[],"published":{"date-parts":[[2024,4,30]]}}}