{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,13]],"date-time":"2026-01-13T08:15:23Z","timestamp":1768292123830,"version":"3.49.0"},"reference-count":37,"publisher":"Emerald","issue":"2","license":[{"start":{"date-parts":[[2023,11,14]],"date-time":"2023-11-14T00:00:00Z","timestamp":1699920000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["AJIM"],"published-print":{"date-parts":[[2025,3,4]]},"abstract":"<jats:sec><jats:title content-type=\"abstract-subheading\">Purpose<\/jats:title><jats:p>This paper aims to amplify the retrieval and utilization of historical newspapers through the application of semantic organization, all from the vantage point of a fine-grained knowledge element perspective. This endeavor seeks to unlock the latent value embedded within newspaper contents while simultaneously furnishing invaluable guidance within methodological paradigms for research in the humanities domain.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Design\/methodology\/approach<\/jats:title><jats:p>According to the semantic organization process and knowledge element concept, this study proposes a holistic framework, including four pivotal stages: knowledge element description, extraction, association and application. Initially, a semantic description model dedicated to knowledge elements is devised. Subsequently, harnessing the advanced deep learning techniques, the study delves into the realm of entity recognition and relationship extraction. These techniques are instrumental in identifying entities within the historical newspaper contents and capturing the interdependencies that exist among them. Finally, an online platform based on Flask is developed to enable the recognition of entities and relationships within historical newspapers.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Findings<\/jats:title><jats:p>This article utilized the Shengjing Times\u00b7Changchun Compilation as the datasets for describing, extracting, associating and applying newspapers contents. Regarding knowledge element extraction, the BERT\u00a0+\u00a0BS consistently outperforms Bi-LSTM, CRF++ and even BERT in terms of Recall and F1 scores, making it a favorable choice for entity recognition in this context. Particularly noteworthy is the Bi-LSTM-Pro model, which stands out with the highest scores across all metrics, notably achieving an exceptional F1 score in knowledge element relationship recognition.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Originality\/value<\/jats:title><jats:p>Historical newspapers transcend their status as mere artifacts, evolving into invaluable reservoirs safeguarding the societal and historical memory. Through semantic organization from a fine-grained knowledge element perspective, it can facilitate semantic retrieval, semantic association, information visualization and knowledge discovery services for historical newspapers. In practice, it can empower researchers to unearth profound insights within the historical and cultural context, broadening the landscape of digital humanities research and practical applications.<\/jats:p><\/jats:sec>","DOI":"10.1108\/ajim-05-2023-0180","type":"journal-article","created":{"date-parts":[[2023,11,12]],"date-time":"2023-11-12T06:14:55Z","timestamp":1699769695000},"page":"260-281","source":"Crossref","is-referenced-by-count":4,"title":["Unearthing historical insights: semantic organization and application of historical newspapers from a fine-grained knowledge element perspective"],"prefix":"10.1108","volume":"77","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2927-0971","authenticated-orcid":false,"given":"Shaodan","family":"Sun","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3291-7193","authenticated-orcid":false,"given":"Jun","family":"Deng","sequence":"additional","affiliation":[]},{"given":"Xugong","family":"Qin","sequence":"additional","affiliation":[]}],"member":"140","published-online":{"date-parts":[[2023,11,14]]},"reference":[{"issue":"2","key":"key2025030318500424600_ref001","doi-asserted-by":"crossref","first-page":"270","DOI":"10.1108\/JD-06-2018-0087","article-title":"Metadata categorization for identifying search patterns in a digital library","volume":"75","year":"2019","journal-title":"Journal of Documentation"},{"key":"key2025030318500424600_ref002","first-page":"1","article-title":"Robust named entity recognition and linking on historical multilingual documents","year":"2020"},{"issue":"8","key":"key2025030318500424600_ref003","first-page":"132","article-title":"Extracting knowledge elements of sci-tech literature based on artificial and machine features","volume":"5","year":"2021","journal-title":"Data Analysis and Knowledge Discovery in Chinese"},{"key":"key2025030318500424600_ref004","first-page":"320","article-title":"Person-centric mining of historical newspaper collections","year":"2016"},{"key":"key2025030318500424600_ref005","volume-title":"Research on Ontology-Based Retrieval Model for Digital Libraries","year":"2006"},{"key":"key2025030318500424600_ref006","article-title":"Irisa system for entity detection and linking at clef hipe 2020","year":"2020"},{"key":"key2025030318500424600_ref007","first-page":"155","article-title":"Ranking archived documents for structured queries on semantic layers","year":"2018"},{"issue":"1","key":"key2025030318500424600_ref008","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1080\/15332748.2019.1642701","article-title":"Successful management of an outsourced large-scale digitization newspaper project","volume":"16","year":"2019","journal-title":"Journal of Archival Organization"},{"key":"key2025030318500424600_ref009","first-page":"290","article-title":"Visualizing the first world war using StreamGraphs and information extraction","year":"2016"},{"key":"key2025030318500424600_ref010","first-page":"770","article-title":"Deep residual learning for image recognition","year":"2016"},{"issue":"2","key":"key2025030318500424600_ref011","first-page":"32","article-title":"An initial exploration of constructing the ontological framework for the history of the People's Republic of China","volume":"34","year":"2014","journal-title":"Journal of Modern Information in Chinese"},{"issue":"6","key":"key2025030318500424600_ref012","doi-asserted-by":"crossref","first-page":"1228","DOI":"10.1108\/JD-09-2016-0106","article-title":"Cultural heritage as digital noise: nineteenth century newspapers in the digital archive","volume":"73","year":"2017","journal-title":"Journal of Documentation"},{"key":"key2025030318500424600_ref013","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1162\/tacl_a_00300","article-title":"Spanbert: improving pre-training by representing and predicting spans","volume":"8","year":"2020","journal-title":"Transactions of the Association for Computational Linguistics"},{"issue":"2","key":"key2025030318500424600_ref014","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1108\/DLP-09-2015-0015","article-title":"Digital newspaper preservation through collaboration","volume":"32","year":"2016","journal-title":"Digital Library Perspectives"},{"issue":"6","key":"key2025030318500424600_ref015","first-page":"84","article-title":"ImageNet classification with deep convolutional neural networks","volume":"60","year":"2012","journal-title":"Communications of the ACM"},{"key":"key2025030318500424600_ref016","first-page":"8","article-title":"BERT for named entity recognition in contemporary and historical German","year":"2019"},{"key":"key2025030318500424600_ref017","article-title":"Albert: a lite bert for self-supervised learning of language representations","year":"2019","journal-title":"arXiv Preprint arXiv:1909.11942"},{"key":"key2025030318500424600_ref018","volume-title":"Construction and Reasoning Research on the Ontology of 'Records of the Three Kingdoms' in the Field of History","year":"2011"},{"key":"key2025030318500424600_ref019","article-title":"Roberta: a robustly optimized bert pretraining approach","year":"2019","journal-title":"arXiv Preprint arXiv:1907.11692"},{"issue":"5","key":"key2025030318500424600_ref020","first-page":"338","article-title":"The advancements and deepening of intelligence studies","year":"1996","journal-title":"Journal of the China Society for Scientific and Technical Information in Chinese"},{"key":"key2025030318500424600_ref021","first-page":"4348","article-title":"An open corpus for named entity recognition in historic newspapers","year":"2016"},{"key":"key2025030318500424600_ref022","first-page":"405","article-title":"Making Europe's historical newspapers searchable","year":"2016"},{"issue":"2","key":"key2025030318500424600_ref023","first-page":"33","article-title":"Construction and application research of the ontology framework for \u2018Zizhi Tongjian\u2019 in the field of history","volume":"24","year":"2010","journal-title":"Journal of Chinese Information Processing in Chinese"},{"issue":"4","key":"key2025030318500424600_ref024","first-page":"137","article-title":"Metadata elements design and application for Japanese Newspaper'Chosunsibo'Issued in Colonial Korea","volume":"50","year":"2019","journal-title":"Journal of Korean Library and Information Science Society"},{"key":"key2025030318500424600_ref025","first-page":"120","article-title":"A named entity recognition shootout for German","year":"2018"},{"key":"key2025030318500424600_ref026","first-page":"81","article-title":"Digital preservation of Old Persian periodicals in Iran with special reference to Iranian newspapers: strategies and challenge","year":"2015"},{"key":"key2025030318500424600_ref027","unstructured":"Simon, H. and Bart, K. (2001), \u201cGradient based learning applied to document recognition\u201d, Intelligent Signal Processing, IEEE, pp. 306-351."},{"key":"key2025030318500424600_ref028","article-title":"Very deep convolutional networks for large-scale image recognition","year":"2014","journal-title":"arXiv Preprint arXiv:1409.1556"},{"key":"key2025030318500424600_ref029","article-title":"Ernie: enhanced representation through knowledge integration","year":"2019","journal-title":"arXiv Preprint arXiv:1904.09223"},{"key":"key2025030318500424600_ref030","first-page":"1","article-title":"Going deeper with convolutions","year":"2015"},{"key":"key2025030318500424600_ref031","article-title":"Transfer learning for named entity recognition in historical Corpora","year":"2020","journal-title":"CLEF"},{"key":"key2025030318500424600_ref032","volume-title":"Research on Ontology-Based Construction of Domain Knowledge Elements","year":"2014"},{"key":"key2025030318500424600_ref033","article-title":"Visualizing and understanding convolutional networks","year":"2014"},{"issue":"3","key":"key2025030318500424600_ref034","doi-asserted-by":"crossref","first-page":"589","DOI":"10.1108\/AJIM-03-2022-0130","article-title":"Measuring the interdisciplinary characteristics of Chinese research in library and information science based on knowledge elements","volume":"75","year":"2023","journal-title":"Aslib Journal of Information Management"},{"key":"key2025030318500424600_ref035","first-page":"573","article-title":"Extraction and evaluation of knowledge entities from scientific documents: eeke2020","year":"2020"},{"issue":"9","key":"key2025030318500424600_ref036","first-page":"39","article-title":"Knowledge units and exponential patterns","year":"1984","journal-title":"Science of Science and Management of S.&.T in Chinese"},{"key":"key2025030318500424600_ref037","article-title":"Boundary smoothing for named entity recognition","year":"2022","journal-title":"arXiv Preprint arXiv:2204.12031"}],"container-title":["Aslib Journal of Information Management"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/AJIM-05-2023-0180\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/AJIM-05-2023-0180\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T23:00:50Z","timestamp":1753398050000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/ajim\/article\/77\/2\/260-281\/1243610"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11,14]]},"references-count":37,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2023,11,14]]},"published-print":{"date-parts":[[2025,3,4]]}},"alternative-id":["10.1108\/AJIM-05-2023-0180"],"URL":"https:\/\/doi.org\/10.1108\/ajim-05-2023-0180","relation":{},"ISSN":["2050-3806"],"issn-type":[{"value":"2050-3806","type":"print"}],"subject":[],"published":{"date-parts":[[2023,11,14]]}}}