{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,23]],"date-time":"2025-12-23T10:34:54Z","timestamp":1766486094196,"version":"3.41.2"},"reference-count":17,"publisher":"Emerald","issue":"4\/5","license":[{"start":{"date-parts":[[2010,7,8]],"date-time":"2010-07-08T00:00:00Z","timestamp":1278547200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,7,8]]},"abstract":"<jats:sec><jats:title content-type=\"abstract-heading\">Purpose<\/jats:title><jats:p>This paper sets out to discuss the use of information extraction (IE), a natural language\u2010processing (NLP) technique to assist \u201crich\u201d semantic indexing of diverse archaeological text resources. The focus of the research is to direct a semantic\u2010aware \u201crich\u201d indexing of diverse natural language resources with properties capable of satisfying information retrieval from online publications and datasets associated with the Semantic Technologies for Archaeological Resources (STAR) project.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-heading\">Design\/methodology\/approach<\/jats:title><jats:p>The paper proposes use of the English Heritage extension (CRM\u2010EH) of the standard core ontology in cultural heritage, CIDOC CRM, and exploitation of domain thesauri resources for driving and enhancing an Ontology\u2010Oriented Information Extraction process. 
The process of semantic indexing is based on a rule\u2010based Information Extraction technique, which is facilitated by the General Architecture for Text Engineering (GATE) toolkit and expressed by Java Annotation Pattern Engine (JAPE) rules.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-heading\">Findings<\/jats:title><jats:p>Initial results suggest that the combination of information extraction with knowledge resources and standard conceptual models is capable of supporting semantic\u2010aware term indexing. Additional efforts are required for further exploitation of the technique and adoption of formal evaluation methods for assessing the performance of the method in measurable terms.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-heading\">Originality\/value<\/jats:title><jats:p>The value of the paper lies in the semantic indexing of 535 unpublished online documents often referred to as \u201cGrey Literature\u201d, from the Archaeological Data Service OASIS corpus (Online AccesS to the Index of archaeological investigationS), with respect to the CRM ontological concepts E49.Time Appellation and P19.Physical Object.<\/jats:p><\/jats:sec>","DOI":"10.1108\/00012531011074708","type":"journal-article","created":{"date-parts":[[2010,8,21]],"date-time":"2010-08-21T07:18:21Z","timestamp":1282375101000},"page":"466-475","source":"Crossref","is-referenced-by-count":11,"title":["Excavating grey literature"],"prefix":"10.1108","volume":"62","author":[{"given":"Andreas","family":"Vlachidis","sequence":"first","affiliation":[]},{"given":"Ceri","family":"Binding","sequence":"additional","affiliation":[]},{"given":"Douglas","family":"Tudhope","sequence":"additional","affiliation":[]},{"given":"Keith","family":"May","sequence":"additional","affiliation":[]}],"member":"140","reference":[{"key":"key2022021320430546400_b1","doi-asserted-by":"crossref","unstructured":"Binding, C., Tudhope, D. and May, K. 
(2008), \u201cSemantic interoperability in archaeological datasets: data mapping and extraction via the CIDOC CRM\u201d, Proceedings of the 12th European Conference on Research and Advanced Technology for Digital Libraries (ECDL 2008), Lecture Notes in Computer Science 5173, Springer, Berlin, pp. 280\u201090.","DOI":"10.1007\/978-3-540-87599-4_30"},{"key":"key2022021320430546400_b4","doi-asserted-by":"crossref","unstructured":"Bontcheva, K., Li, Y. and Cunningham, H. (2007), \u201cHierarchical, perceptron\u2010like learning for ontology\u2010based information extraction\u201d, Proceedings of the 16th International World Wide Web Conference, pp. 777\u201086, available at: www2007.org\/proceedings.html. (accessed 1 May 2010).","DOI":"10.1145\/1242572.1242677"},{"key":"key2022021320430546400_b2","doi-asserted-by":"crossref","unstructured":"Bontcheva, K., Cunningham, H., Kiryakov, A. and Tablan, V. (2006a), \u201cSemantic annotation and human language technology\u201d, Semantic Web Technology: Trends and Research in Ontology Based Systems, John Wiley & Sons Ltd, New York, NY.","DOI":"10.1002\/047003033X.ch3"},{"key":"key2022021320430546400_b3","doi-asserted-by":"crossref","unstructured":"Bontcheva, K., Duke, T., Glover, N. and Kings, I. (2006b), \u201cSemantic information access\u201d, Semantic Web Technology: Trends and Research in Ontology Based Systems, John Wiley & Sons Ltd, New York, NY.","DOI":"10.1002\/047003033X.ch8"},{"key":"key2022021320430546400_b5","doi-asserted-by":"crossref","unstructured":"Cunningham, H. (2005), \u201cInformation extraction, automatic\u201d, Encyclopedia of Language and Linguistics, 2nd ed., Elsevier, Oxford.","DOI":"10.1016\/B0-08-044854-2\/00960-3"},{"key":"key2022021320430546400_b6","doi-asserted-by":"crossref","unstructured":"Gaizauskas, R. and Wilks, Y. (1998), \u201cInformation extraction: beyond document retrieval\u201d, Journal of Documentation, Vol. 54 No. 1, pp. 
70\u2010105.","DOI":"10.1108\/EUM0000000007162"},{"key":"key2022021320430546400_b8","doi-asserted-by":"crossref","unstructured":"Kiryakov, A., Popov, B., Terziev, I., Manov, D. and Ognyanoff, D. (2004), \u201cSemantic annotation, indexing, and retrieval\u201d, Web Semantics: Science, Services and Agents on the World Wide Web, Vol. 2 No. 1, pp. 49\u201079.","DOI":"10.1016\/j.websem.2004.07.005"},{"key":"key2022021320430546400_b9","doi-asserted-by":"crossref","unstructured":"Lee, B., Hendler, J. and Lassila, O. (2001), \u201cThe semantic web\u201d, Scientific American, Vol. 284 No. 5, pp. 28\u201037.","DOI":"10.1038\/scientificamerican0501-34"},{"key":"key2022021320430546400_b10","unstructured":"May, K., Binding, C. and Tudhope, D. (2008), \u201cA STAR is born: some emerging semantic technologies for archaeological resources\u201d, Proceedings of Computer Applications and Quantitative Methods in Archaeology (CAA2008)."},{"key":"key2022021320430546400_b11","unstructured":"Moens, M.F. (2006), Information Extraction Algorithms and Prospects in a Retrieval Context, Springer, Dordrecht."},{"key":"key2022021320430546400_b14","doi-asserted-by":"crossref","unstructured":"Smeaton, A.F. (1997), \u201cInformation retrieval: still butting heads with natural language processing?\u201d, in Smeaton, A. (Ed.), Online Publications, available at: www.compapp.dcu.ie\/\u223casmeaton\/pubs\u2010list.html (accessed 1 May 2010).","DOI":"10.1007\/3-540-63438-X_7"},{"key":"key2022021320430546400_b15","unstructured":"Tudhope, D., Binding, C. and May, K. (2008), \u201cSemantic interoperability issues from a case study in archaeology\u201d, in Kollias, S. and Cousins, J. (Eds), Semantic Interoperability in the European Digital Library, Proceedings of the 1st International Workshop (SIEDL 2008), associated with 5th European Semantic Web Conference, Tenerife, pp. 
88\u201099, available at: http:\/\/image.ntua.gr\/swamm2006\/SIEDLproceedings.pdf (accessed 1 May 2010)."},{"key":"key2022021320430546400_b16","doi-asserted-by":"crossref","unstructured":"Tudhope, D., Binding, C., Blocks, D. and Cunliffe, D. (2006), \u201cQuery expansion via conceptual distance in thesaurus\u2010indexed collections\u201d, Journal of Documentation, Vol. 62 No. 4, pp. 509\u201033.","DOI":"10.1108\/00220410610673873"},{"key":"key2022021320430546400_b17","doi-asserted-by":"crossref","unstructured":"Uren, V., Cimiano, P., Iria, J., Handschuh, S., Vargas\u2010Vera, M., Motta, E. and Ciravegna, F. (2006), \u201cSemantic annotation for knowledge management: requirements and a survey of the state\u2010of\u2010the\u2010art\u201d, Web Semantics: Science, Services and Agents on the World Wide Web, Vol. 4 No. 1, pp. 14\u201028.","DOI":"10.1016\/j.websem.2005.10.002"},{"key":"key2022021320430546400_frg1","unstructured":"General Architecture for Text Engineering (GATE) (n.d.), available at: http:\/\/gate.ac.uk\/ (accessed 1 May 2010)."},{"key":"key2022021320430546400_frg2","unstructured":"Online AccesS to the Index of archaeological investigationS (OASIS) (n.d.), available at: www.oasis.ac.uk\/ (accessed 1 May 2010)."},{"key":"key2022021320430546400_frg3","unstructured":"Semantic Technologies for Archaeological Resources (STAR) (n.d.), available at: http:\/\/hypermedia.research.glam.ac.uk\/kos\/STAR\/ (accessed 1 May 2010)."}],"container-title":["Aslib 
Proceedings"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/www.emeraldinsight.com\/doi\/full-xml\/10.1108\/00012531011074708","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/00012531011074708\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/00012531011074708\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T11:37:08Z","timestamp":1753357028000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/ajim\/article\/62\/4-5\/466-475\/122075"}},"subtitle":["A case study on the rich indexing of archaeological documents via natural language\u2010processing techniques and knowledge\u2010based resources"],"editor":[{"given":"Vanda","family":"Broughton","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2010,7,8]]},"references-count":17,"journal-issue":{"issue":"4\/5","published-print":{"date-parts":[[2010,7,8]]}},"alternative-id":["10.1108\/00012531011074708"],"URL":"https:\/\/doi.org\/10.1108\/00012531011074708","relation":{},"ISSN":["0001-253X"],"issn-type":[{"type":"print","value":"0001-253X"}],"subject":[],"published":{"date-parts":[[2010,7,8]]}}}