{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,14]],"date-time":"2026-05-14T16:41:42Z","timestamp":1778776902932,"version":"3.51.4"},"reference-count":45,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2024,4,12]],"date-time":"2024-04-12T00:00:00Z","timestamp":1712880000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Federal Coordination Body for Geoinformation (GKG)"},{"name":"Swiss Conference of Directors of Construction, Planning and Environment (BPUK)"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IJGI"],"abstract":"<jats:p>The improvement of search engines for geospatial data on the World Wide Web has been a subject of research, particularly concerning the challenges in discovering and utilizing geospatial web services. Despite the establishment of standards by the Open Geospatial Consortium (OGC), the implementation of these services varies significantly among providers, leading to issues in dataset discoverability and usability. This paper presents a proof of concept for a search engine tailored to geospatial services in Switzerland. It addresses challenges such as scraping data from various OGC web service providers, enhancing metadata quality through Natural Language Processing, and optimizing search functionality and ranking methods. Semantic augmentation techniques are applied to enhance metadata completeness and quality, which are stored in a high-performance NoSQL database for efficient data retrieval. The results show improvements in dataset discoverability and search relevance, with NLP-extracted information contributing significantly to ranking accuracy. Overall, the GeoHarvester proof of concept demonstrates the feasibility of improving the discoverability and usability of geospatial web services through advanced search engine techniques.<\/jats:p>","DOI":"10.3390\/ijgi13040128","type":"journal-article","created":{"date-parts":[[2024,4,12]],"date-time":"2024-04-12T03:34:37Z","timestamp":1712892877000},"page":"128","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Search Engine for Open Geospatial Consortium Web Services Improving Discoverability through Natural Language Processing-Based Processing and Ranking"],"prefix":"10.3390","volume":"13","author":[{"given":"Elia","family":"Ferrari","sequence":"first","affiliation":[{"name":"Institute of Geomatics, FHNW University of Applied Sciences and Arts Northwestern Switzerland, 4132 Muttenz, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Friedrich","family":"Striewski","sequence":"additional","affiliation":[{"name":"Institute of Geomatics, FHNW University of Applied Sciences and Arts Northwestern Switzerland, 4132 Muttenz, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fiona","family":"Tiefenbacher","sequence":"additional","affiliation":[{"name":"Institute of Geomatics, FHNW University of Applied Sciences and Arts Northwestern Switzerland, 4132 Muttenz, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8127-2654","authenticated-orcid":false,"given":"Pia","family":"Bereuter","sequence":"additional","affiliation":[{"name":"Institute of Geomatics, FHNW University of Applied Sciences and Arts Northwestern Switzerland, 4132 Muttenz, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Oesch","sequence":"additional","affiliation":[{"name":"Federal Office of Topography Swisstopo, 3084 Wabern, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pasquale","family":"Di Donato","sequence":"additional","affiliation":[{"name":"Federal Office of Topography Swisstopo, 3084 Wabern, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2024,4,12]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Ma, J., Co, J.E., and Quintanilla, A. (2010, January 5\u20137). A Semantic Index Structure for Integrating OGC Services in a Spatial Search Engine. Proceedings of the 2010 IEEE Conference on Open Systems (ICOS 2010), Kuala Lumpur, Malaysia.","DOI":"10.1109\/ICOS.2010.5720072"},{"key":"ref_2","unstructured":"De la Beaujardiere, J. (2023, November 11). OpenGIS\u00ae Web Map Server Implementation Specification 2006. Available online: https:\/\/portal.ogc.org\/files\/?artifact_id=14416."},{"key":"ref_3","unstructured":"Maso, J., Pomakis, K., and Juli\u00e0, N. (2023, November 11). OpenGIS\u00ae Web Map Tile Service Implementation Standard 2010. Available online: https:\/\/portal.ogc.org\/files\/?artifact_id=35326."},{"key":"ref_4","unstructured":"Vretanos, P.A. (2023, November 11). Web Feature Service Implementation Specification 2005. Available online: https:\/\/portal.ogc.org\/files\/?artifact_id=8339."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Yue, P., Di, L., Zhao, P., Yang, W., Yu, G., and Wei, Y. (August, January 31). Semantic Augmentations for Geospatial Catalogue Service. Proceedings of the 2006 IEEE International Symposium on Geoscience and Remote Sensing, Denver, CO, USA.","DOI":"10.1109\/IGARSS.2006.894"},{"key":"ref_6","unstructured":"Oesch, D. (2023, November 29). Resultate Der GeoUnconference\u2014Thema 16\u2014Service-Verzeichnis 2022. Available online: https:\/\/github.com\/GeoUnconference\/discussions\/discussions\/38."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1080\/17538947.2014.966164","article-title":"A Geospatial Search Engine for Discovering Multi-Format Geospatial Data across the Web","volume":"9","author":"Bone","year":"2016","journal-title":"Int. J. Digit. Earth"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Huang, C.-Y., and Chang, H. (2016). GeoWeb Crawler: An Extensible and Scalable Web Crawling Framework for Discovering Geospatial Web Resources. ISPRS Int. J. Geo-Inf., 5.","DOI":"10.3390\/ijgi5080136"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Miao, L., Guo, J., Cheng, W., and Zhou, Y. (2016, January 14\u201320). A Novel Model to Support OGC Web Services Semantic Search Using OWL-S. Proceedings of the 2016 24th International Conference on Geoinformatics, Galway, Ireland.","DOI":"10.1109\/GEOINFORMATICS.2016.7578973"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"505","DOI":"10.1111\/tgis.12863","article-title":"Annotating OGC Web Feature Services Automatically for Generating Geospatial Knowledge Graphs","volume":"26","author":"Saquicela","year":"2022","journal-title":"Trans. GIS"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1515\/geo-2020-0232","article-title":"An OGC Web Service Geospatial Data Semantic Similarity Model for Improving Geospatial Service Discovery","volume":"13","author":"Miao","year":"2021","journal-title":"Open Geosci."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Halilali, M.S., Gouard\u00e8res, E., Gaio, M., and Devin, F. (2022). Geospatial Web Services Discovery through Semantic Annotation of WPS. ISPRS Int. J. Geo-Inf., 11.","DOI":"10.3390\/ijgi11040254"},{"key":"ref_13","unstructured":"Shen, S., Liu, W., Wu, H., and Chen, Y. (2009, January 12\u201314). A Multi-Level Comprehensive Evaluation Method for Quality of WMS Based on Fuzzy Mathematics. Proceedings of the 2009 17th International Conference on Geoinformatics, Fairfax, VA, USA."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"645","DOI":"10.1002\/(SICI)1097-4571(199410)45:9<645::AID-ASI2>3.0.CO;2-8","article-title":"GIPSY: Automated Geographic Indexing of Text Documents","volume":"45","author":"Woodruff","year":"1994","journal-title":"J. Am. Soc. Inf. Sci."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Amitay, E., Har\u2019El, N., Sivan, R., and Soffer, A. (2004, January 25). Web-a-Where: Geotagging Web Content. Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Sheffield, UK.","DOI":"10.1145\/1008992.1009040"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"717","DOI":"10.1080\/13658810601169840","article-title":"The Design and Implementation of SPIRIT: A Spatially Aware Search Engine for Information Retrieval on the Internet","volume":"21","author":"Purves","year":"2007","journal-title":"Int. J. Geogr. Inf. Sci."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Manning, C.D., Raghavan, P., and Sch\u00fctze, H. (2008). Introduction to Information Retrieval, Cambridge University Press.","DOI":"10.1017\/CBO9780511809071"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1080\/13658810701626293","article-title":"A Comparison of Geometric Approaches to Assessing Spatial Similarity for GIR","volume":"22","author":"Frontiera","year":"2008","journal-title":"Int. J. Geogr. Inf. Sci."},{"key":"ref_19","unstructured":"Andrade, L., and Silva, M. (2006, January 10). Relevance Ranking for Geographic IR. Proceedings of the 3rd ACM Workshop on Geographic Information Retrieval, Seattle, WA, USA."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Berry, M.W., and Kogan, J. (2010). Text Mining, Wiley.","DOI":"10.1002\/9780470689646"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Robertson, S., Walker, S., Jones, S., Hancock-Beaulieu, M., and Gatford, M. (1994). Okapi at TREC-3, National Institute of Standards and Technology (NIST).","DOI":"10.6028\/NIST.SP.500-225.routing-city"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1093\/biomet\/30.1-2.81","article-title":"A New Measure of Rank Correlation","volume":"30","author":"Kendall","year":"1938","journal-title":"Biometrika"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1145\/2047296.2047305","article-title":"Ranking Approaches for GIR","volume":"3","author":"Larson","year":"2011","journal-title":"SIGSPATIAL Spec."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"217","DOI":"10.14778\/2535569.2448955","article-title":"Spatial Keyword Query Processing: An Experimental Evaluation","volume":"6","author":"Chen","year":"2013","journal-title":"Proc. VLDB Endow."},{"key":"ref_25","unstructured":"Ji, X., Sungu-Eryilmaz, Y., Momeni, E., and Rawassizadeh, R. (2022). Speeding Up Question Answering Task of Language Models via Inverted Index. arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Park, D., and Ahn, C.W. (2019). Self-Supervised Contextual Data Augmentation for Natural Language Processing. Symmetry, 11.","DOI":"10.3390\/sym11111393"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"100017","DOI":"10.1016\/j.nlp.2023.100017","article-title":"A Survey on Named Entity Recognition\u2014Datasets, Tools, and Methodologies","volume":"3","author":"Jehangir","year":"2023","journal-title":"Nat. Lang. Process. J."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1145\/273035.273069","article-title":"Sorting out Searching: A User-Interface Framework for Text Searches","volume":"41","author":"Shneiderman","year":"1998","journal-title":"Commun. ACM"},{"key":"ref_29","first-page":"164","article-title":"Geographic Information Retrieval: Progress and Challenges in Spatial Search of Text","volume":"12","author":"Purves","year":"2018","journal-title":"FNT Inf. Retr."},{"key":"ref_30","first-page":"40","article-title":"Smart Voice Search Engine","volume":"90","author":"Sarhan","year":"2014","journal-title":"J. Comput. Appl."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Roy, N., Maxwell, D., and Hauff, C. (2022). Users and Contemporary SERPs: A (Re-)Investigation Examining User Interactions and Experiences. arXiv.","DOI":"10.1145\/3477495.3531719"},{"key":"ref_32","unstructured":"Oesch, D. (2023, August 15). Geoservice Harvester POC Open Geo Services Reported by the Swiss Gov Agencies and Third Parties 2023. Available online: https:\/\/github.com\/davidoesch\/geoservice_harvester_poc."},{"key":"ref_33","unstructured":"Honnibal, M., Boyd, A., Van Landeghem, S., and Montani, I. (2023, August 15). spaCy: Industrial-Strength Natural Language Processing in Python. Available online: https:\/\/zenodo.org\/doi\/10.5281\/zenodo.1212303."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Reimers, N., and Gurevych, I. (2019). Sentence-BERT: Sentence Embeddings Using Siamese BERT-Networks. arXiv.","DOI":"10.18653\/v1\/D19-1410"},{"key":"ref_35","unstructured":"Gavrilidou, M., Carayannis, G., Markantonatou, S., Piperidis, S., and Stainhauer, G. (June, January 31). Building a Treebank for Italian: A Data-Driven Annotation Schema. Proceedings of the Second International Conference on Language Resources and Evaluation (LREC\u201900), Athens, Greece."},{"key":"ref_36","unstructured":"Weischedel, R., Palmer, M., Marcus, M., Hovy, E., Pradhan, S., Ramshaw, L., Xue, N., Taylor, A., Kaufman, J., and Franchini, M. (2013). OntoNotes Release 5.0, Linguistic Data Consortium. 2806280 KB."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"597","DOI":"10.1007\/s11168-004-7431-3","article-title":"TIGER: Linguistic Interpretation of a German Corpus","volume":"2","author":"Brants","year":"2004","journal-title":"Res. Lang. Comput."},{"key":"ref_38","unstructured":"Candito, M., and Seddah, D. (2012, January 4\u20138). Le Corpus Sequoia: Annotation Syntaxique et Exploitation Pour l\u2019adaptation d\u2019analyseur Par Pont Lexical. Proceedings of the TALN 2012\u201419e Conf\u00e9rence sur le Traitement Automatique des Langues Naturelles, Grenoble, France."},{"key":"ref_39","unstructured":"Shuyo, N. (2023, November 10). Language Detection Library for Java 2010. Available online: http:\/\/code.google.com\/p\/language-detection\/."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Chen, S., Tang, X., Wang, H., Zhao, H., and Guo, M. (2016, January 23\u201326). Towards Scalable and Reliable In-Memory Storage System: A Case Study with Redis. Proceedings of the 2016 IEEE Trustcom\/BigDataSE\/ISPA, Tianjin, China.","DOI":"10.1109\/TrustCom.2016.0255"},{"key":"ref_41","unstructured":"Card, S.K., Robertson, G.G., and Mackinlay, J.D. (May, January 27). The Information Visualizer, an Information Workspace. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems Reaching through Technology\u2014CHI\u2019 91, New Orleans, LA, USA."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"130","DOI":"10.1108\/eb046814","article-title":"An Algorithm for Suffix Stripping","volume":"14","author":"Porter","year":"1980","journal-title":"Program"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"392","DOI":"10.1007\/978-3-030-77385-4_23","article-title":"Augmenting Ontology Alignment by Semantic Embedding and Distant Supervision","volume":"Volume 12731","author":"Verborgh","year":"2021","journal-title":"The Semantic Web"},{"key":"ref_44","unstructured":"(2023, November 12). Federal Statistical Office Permanent Resident Population by Category of Citizenship and Sex by Canton and City, 1999\u20132022. Available online: https:\/\/www.bfs.admin.ch\/asset\/en\/26565157."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Elnagar, S., Yoon, V., and Thomas, M.A. (2020, January 7\u201310). An Automatic Ontology Generation Framework with An Organizational Perspective. Proceedings of the Hawaii International Conference on System Sciences 2020, Honolulu, HI, USA.","DOI":"10.24251\/HICSS.2020.597"}],"container-title":["ISPRS International Journal of Geo-Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2220-9964\/13\/4\/128\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:26:46Z","timestamp":1760106406000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2220-9964\/13\/4\/128"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,4,12]]},"references-count":45,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2024,4]]}},"alternative-id":["ijgi13040128"],"URL":"https:\/\/doi.org\/10.3390\/ijgi13040128","relation":{},"ISSN":["2220-9964"],"issn-type":[{"value":"2220-9964","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,4,12]]}}}