{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,6]],"date-time":"2025-11-06T20:12:15Z","timestamp":1762459935144,"version":"3.41.2"},"reference-count":35,"publisher":"Emerald","issue":"5\/6","license":[{"start":{"date-parts":[[2020,10,30]],"date-time":"2020-10-30T00:00:00Z","timestamp":1604016000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["EL"],"published-print":{"date-parts":[[2020,10,30]]},"abstract":"<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Purpose<\/jats:title>\n<jats:p>The purpose of this paper is to provide a methodology for automatic annotation of a multimedia collection of intangible cultural heritage mostly in the form of interviews. Assigned annotations provide a way to search the collection.<\/jats:p>\n<\/jats:sec>\n<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Design\/methodology\/approach<\/jats:title>\n<jats:p>Annotation is based on automatic extraction of metadata and is conducted by named entity and topic extraction from textual descriptions with a rule-based approach supported by vocabulary resources, a compiled domain-specific classification scheme and domain-oriented corpus analysis.<\/jats:p>\n<\/jats:sec>\n<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Findings<\/jats:title>\n<jats:p>The proposed methodology for automatic annotation of a collection of intangible cultural heritage, applied on the cultural heritage of the Balkans, has very good results according to F measure, which is 0.87 for the named entity and 0.90 for topic annotation. 
The overall methodology enables encapsulating domain-specific and language-specific knowledge into collections of finite state transducers and allows further improvements.<\/jats:p>\n<\/jats:sec>\n<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Originality\/value<\/jats:title>\n<jats:p>Although cultural heritage has a significant role in the development of identity of a group or an individual, it is one of those specific domains that have not yet been fully explored in case of many languages. A methodology is proposed that can be used for incorporating natural language processing techniques into digital libraries of cultural heritage.<\/jats:p>\n<\/jats:sec>","DOI":"10.1108\/el-03-2020-0052","type":"journal-article","created":{"date-parts":[[2020,10,28]],"date-time":"2020-10-28T12:27:41Z","timestamp":1603888061000},"page":"905-918","source":"Crossref","is-referenced-by-count":5,"title":["HerCulB: content-based information extraction and retrieval for cultural heritage of the Balkans"],"prefix":"10.1108","volume":"38","author":[{"given":"Ivana","family":"Tanasijevi\u0107","sequence":"first","affiliation":[]},{"given":"Gordana","family":"Pavlovi\u0107-La\u017eeti\u0107","sequence":"additional","affiliation":[]}],"member":"140","reference":[{"key":"key2020121210210205600_ref001","first-page":"83","article-title":"Twitie: an open-source information extraction pipeline for microblog text","volume-title":"Recent Advances in Natural Language Processing (RANLP \u201813)","year":"2013"},{"article-title":"Bunjevci: etnodijalektolo\u0161ka istra\u017eivanja, 2009","volume-title":"Nacionalni Savet Bunjeva\u010dke Nacionalne Manjine, Subotica, in cooperation with Matica srpska, Novi Sad","year":"2013","key":"key2020121210210205600_ref002"},{"key":"key2020121210210205600_ref003","first-page":"48","article-title":"Automatic extraction of archaeological events from text","volume-title":"Making History Interactive: Computer Applications and Quantitative Methods in 
Archaeology (CAA)","year":"2010"},{"issue":"1","key":"key2020121210210205600_ref004","first-page":"3","article-title":"Dictionnaires \u00e9lectroniques du fran\u00e7ais","volume":"87","year":"1990","journal-title":"Langue Fran\u00e7aise"},{"issue":"2","key":"key2020121210210205600_ref005","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1023\/A:1014348124664","article-title":"GATE, a general architecture for text engineering","volume":"36","year":"2002","journal-title":"Computers and the Humanities"},{"issue":"5","key":"key2020121210210205600_ref006","doi-asserted-by":"crossref","first-page":"425","DOI":"10.3414\/ME0508","article-title":"Semantic structuring of and information extraction from medical documents using the UMLS","volume":"47","year":"2008","journal-title":"Methods of Information in Medicine"},{"key":"key2020121210210205600_ref007","first-page":"225","article-title":"Web-assisted annotation, semantic indexing and search of television and radio news","volume-title":"The 14th International World Wide Web (WWW \u201805)","year":"2005"},{"issue":"Suppl 1","key":"key2020121210210205600_ref008","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1093\/bioinformatics\/17.suppl_1.S74","article-title":"GENIES: a natural-language processing system for the extraction of molecular pathways from journal articles","volume":"17","year":"2001","journal-title":"Bioinformatics"},{"issue":"1","key":"key2020121210210205600_ref009","first-page":"53a","article-title":"Personal names in information extraction","volume":"11","year":"2010","journal-title":"INFOtheca-Journal of Informatics and Librarianship"},{"issue":"3\/4","key":"key2020121210210205600_ref010","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1108\/NLW-01-2014-0013","article-title":"Exhibiting library collections online: Omeka in context","volume":"115","year":"2014","journal-title":"New Library 
World"},{"issue":"2\/3","key":"key2020121210210205600_ref011","first-page":"236","article-title":"Easy listening: spoken document retrieval in choral","volume":"34","year":"2009","journal-title":"Interdisciplinary Science Reviews"},{"key":"key2020121210210205600_ref012","first-page":"100","article-title":"A comparison of NER tools: W.R.T a domain-specific vocabulary","volume-title":"Conference on Semantic Systems","year":"2014"},{"key":"key2020121210210205600_ref013","unstructured":"Jovanovi\u0107, S. (2003), \u201cGradja za tezaurus u oblasti etnologije\u201d, available at: www.nb.rs (accessed 10 August 2020)."},{"key":"key2020121210210205600_ref014","first-page":"48","article-title":"E-dictionaries and finite-state automata for the recognition of named entities","volume-title":"9th International Workshop on Finite State Methods and Natural Language Processing","year":"2011"},{"issue":"6","key":"key2020121210210205600_ref015","doi-asserted-by":"crossref","first-page":"844","DOI":"10.1016\/j.knosys.2011.03.006","article-title":"Exploiting information extraction techniques for automatic semantic video indexing with an application to Turkish news videos","volume":"24","year":"2011","journal-title":"Knowledge-Based Systems"},{"issue":"4","key":"key2020121210210205600_ref016","first-page":"555","article-title":"Ontea: platform for pattern based automated semantic annotation","volume":"28","year":"2012","journal-title":"Computing and Informatics"},{"key":"key2020121210210205600_ref017","first-page":"65","article-title":"Entity-based opinion mining from text and multimedia","volume-title":"Advances in Social Media Analysis","year":"2015"},{"issue":"2","key":"key2020121210210205600_ref018","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1108\/JCHMSD-02-2016-0010","article-title":"The arches heritage inventory and management system: a platform for the heritage field","volume":"6","year":"2016","journal-title":"Journal of Cultural Heritage Management and 
Sustainable Development"},{"key":"key2020121210210205600_ref019","first-page":"282","article-title":"Information extraction from semi-structured resources: a two-phase finite state transducers approach","volume-title":"International Conference on Implementation and Application of Automata (CIAA \u201811)","year":"2011"},{"issue":"2","key":"key2020121210210205600_ref020","first-page":"36","article-title":"Transducers for annotating weather information in meteorological texts in Serbian","volume":"13","year":"2012","journal-title":"INFOtheca \u2013 Journal of Informatics and Librarianship"},{"issue":"3","key":"key2020121210210205600_ref021","doi-asserted-by":"crossref","first-page":"550","DOI":"10.1108\/EL-06-2017-0128","article-title":"Semi-automatic extraction of multiword terms from domain-specific corpora","volume":"36","year":"2018","journal-title":"The Electronic Library"},{"key":"key2020121210210205600_ref022","unstructured":"Paumier, S. (2011), \u201cUnitex - manuel d\u2019utilisation\u201d, HAL ID l-00639621, available at: https:\/\/hal.archives-ouvertes.fr\/hal-00639621\/document (accessed 9 September 2020)."},{"article-title":"Materialy dlia etnolingvisticheskogo izucheniia balkano-slavianskogo areala [materials for ethnolinguistic investigation of the Balkan-Slavic area]","volume-title":"Institut Slavianovedeniia RAN","year":"2009","key":"key2020121210210205600_ref023"},{"key":"key2020121210210205600_ref024","first-page":"1634","article-title":"Survey of semantic annotation platforms","volume-title":"2005 ACM Symposium on Applied Computing","year":"2005"},{"key":"key2020121210210205600_ref025","first-page":"31","article-title":"The archaeology data service and the archaeotools project: faceted classification and natural language processing","volume-title":"Archaeology 2.0: New Approaches to Communication and Collaboration","year":"2011"},{"key":"key2020121210210205600_ref026","first-page":"1017","article-title":"WissKI: a virtual research environment for 
cultural heritage","volume-title":"20th European Conference on Artificial Intelligence (ECAI \u201812)","year":"2012"},{"issue":"4","key":"key2020121210210205600_ref027","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1016\/j.websem.2008.08.001","article-title":"Semantic annotation and search of cultural-heritage collections: the MultimediaN e-culture demonstrator","volume":"6","year":"2008","journal-title":"Journal of Web Semantics"},{"issue":"9","key":"key2020121210210205600_ref028","doi-asserted-by":"crossref","first-page":"750","DOI":"10.1111\/j.1749-818X.2010.00230.x","article-title":"Natural language processing for cultural heritage domains","volume":"4","year":"2010","journal-title":"Language and Linguistics Compass"},{"key":"key2020121210210205600_ref029","first-page":"112","article-title":"Keyword-based search on bilingual digital libraries","volume-title":"Semantic Keyword-Based Search on Structured Data Sources","year":"2016"},{"key":"key2020121210210205600_ref030","first-page":"127","article-title":"Enriching Serbian WordNet and electronic dictionaries with terms from the culinary domain","volume-title":"Seventh Global WordNet Conference","year":"2014"},{"key":"key2020121210210205600_ref031","first-page":"2874","article-title":"Multimedia database of the cultural heritage of the Balkans","volume-title":"8th International Conference on Language Resources and Evaluation (LREC \u201812)","year":"2012"},{"issue":"1\/2","key":"key2020121210210205600_ref032","first-page":"35a","article-title":"Resources and methods for named entity recognition in Serbian","volume":"9","year":"2008","journal-title":"INFOtheca-Journal of Informatics and Librarianship"},{"key":"key2020121210210205600_ref033","first-page":"97","article-title":"Processing Serbian written texts: an overview of resources and basic tools","volume":"21","year":"2003","journal-title":"Workshop on Balkan Language Resources and 
Tools"},{"issue":"5","key":"key2020121210210205600_ref034","doi-asserted-by":"crossref","first-page":"1138","DOI":"10.1002\/asi.23485","article-title":"A knowledge\u2010based approach to information extraction for semantic interoperability in the archaeology domain","volume":"67","year":"2016","journal-title":"Journal of the Association for Information Science and Technology"},{"key":"key2020121210210205600_ref035","first-page":"301","article-title":"A methodology toward effective and efficient manual document annotation: addressing annotator discrepancy and annotation quality","volume-title":"Knowledge Engineering and Management by the Masses (EKAW \u201810), (Series: Lecture Notes in Computer Science, Vol. 6317)","year":"2010"}],"container-title":["The Electronic Library"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/EL-03-2020-0052\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/EL-03-2020-0052\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,25]],"date-time":"2025-07-25T01:06:57Z","timestamp":1753405617000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/el\/article\/38\/5-6\/905-918\/47307"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,10,30]]},"references-count":35,"journal-issue":{"issue":"5\/6","published-print":{"date-parts":[[2020,10,30]]}},"alternative-id":["10.1108\/EL-03-2020-0052"],"URL":"https:\/\/doi.org\/10.1108\/el-03-2020-0052","relation":{},"ISSN":["0264-0473","0264-0473"],"issn-type":[{"type":"print","value":"0264-0473"},{"type":"print","value":"0264-0473"}],"subject":[],"published":{"date-parts":[[2020,10,30]]}}}