{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T08:41:36Z","timestamp":1769157696621,"version":"3.49.0"},"reference-count":32,"publisher":"Wiley","license":[{"start":{"date-parts":[[2023,4,30]],"date-time":"2023-04-30T00:00:00Z","timestamp":1682812800000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100015758","name":"Adama Science and Technology University","doi-asserted-by":"publisher","award":["ASTU\/SM-R\/383\/21"],"award-info":[{"award-number":["ASTU\/SM-R\/383\/21"]}],"id":[{"id":"10.13039\/501100015758","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Applied Computational Intelligence and Soft Computing"],"published-print":{"date-parts":[[2023,4,30]]},"abstract":"<jats:p>This study aims to develop a hybridized deep learning model for generating semantically meaningful image captions in Amharic Language. Image captioning is a task that combines both computer vision and natural language processing (NLP) domains. However, existing studies in the English language primarily focus on visual features to generate captions, resulting in a gap between visual and textual features and inadequate semantic representation. To address this challenge, this study proposes a hybridized attention-based deep neural network (DNN) model. The model consists of an Inception-v3 convolutional neural network (CNN) encoder to extract image features, a visual attention mechanism to capture significant features, and a bidirectional gated recurrent unit (Bi-GRU) with attention decoder to generate the image captions. The model was trained on the Flickr8k and BNATURE datasets with English captions, which were translated into Amharic Language with the help of Google Translator and Amharic Language experts. 
The evaluation of the model showed improvement in its performance, with a 1G-BLEU score of 60.6, a 2G-BLEU score of 50.1, a 3G-BLEU score of 43.7, and a 4G-BLEU score of 38.8. Generally, this study highlights the effectiveness of the hybrid approach in generating Amharic Language image captions with better semantic meaning.<\/jats:p>","DOI":"10.1155\/2023\/9397325","type":"journal-article","created":{"date-parts":[[2023,5,1]],"date-time":"2023-05-01T04:01:19Z","timestamp":1682913679000},"page":"1-11","source":"Crossref","is-referenced-by-count":8,"title":["Amharic Language Image Captions Generation Using Hybridized Attention-Based Deep Neural Networks"],"prefix":"10.1155","volume":"2023","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-7433-8406","authenticated-orcid":true,"given":"Rodas","family":"Solomon","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, Adama Science and Technology University, Adama, Ethiopia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8806-9402","authenticated-orcid":true,"given":"Mesfin","family":"Abebe","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Adama Science and Technology University, Adama, Ethiopia"}]}],"member":"311","reference":[{"key":"1","doi-asserted-by":"publisher","DOI":"10.4236\/jcc.2020.88006"},{"key":"2","volume-title":"Effect of Preprocessing on Long Short Term Memory Based Sentiment Analysis for Amharic Language","author":"T. Fikre","year":"2020"},{"key":"3","article-title":"Character recognition of bilingual amharic-latin printed documents","author":"A. A. Kesito","year":"2018"},{"key":"4","doi-asserted-by":"publisher","DOI":"10.4000\/ijcol.538"},{"key":"5","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00583"},{"key":"6","first-page":"1","article-title":"Image captioning: Transforming objects into words","author":"S. 
Herdade","year":"2019"},{"key":"7","doi-asserted-by":"publisher","DOI":"10.1007\/s11063-018-09973-5"},{"key":"8","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2018.05.080"},{"key":"9","doi-asserted-by":"crossref","unstructured":"Agughalam D. M., Bidirectional LSTM approach to image captioning with scene features, MSc. Data Analytics, National College of Ireland, Dublin, Ireland, 2020","DOI":"10.1117\/12.2600465"},{"key":"10","doi-asserted-by":"publisher","DOI":"10.1145\/3115432"},{"key":"11","doi-asserted-by":"publisher","DOI":"10.14569\/IJACSA.2021.0120287"},{"key":"12","doi-asserted-by":"publisher","DOI":"10.1145\/3409388"},{"key":"13","first-page":"359","article-title":"Collective generation of natural image descriptions","author":"P. Kuznetsova"},{"key":"14","first-page":"1","article-title":"Im2Text: describing images using 1 million captioned photographs","author":"V. Ordonez"},{"key":"15","article-title":"A comprehensive survey of deep learning for image captioning","author":"M. D. Zakir Hossain","year":"2018"},{"key":"16","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-15561-1_2"},{"key":"17","first-page":"444","article-title":"Corpus-guided sentence generation of natural images","author":"Y. Yang"},{"key":"18","article-title":"Deep learning for image captioning an encoder-decoder architecture with soft attention","author":"M. G. Mart\u00ednez","year":"2019"},{"key":"19","first-page":"1","article-title":"Unifying visual-semantic embeddings with multimodal Neural Language models","author":"R. 
Kiros","year":"2014"},{"key":"20","doi-asserted-by":"publisher","DOI":"10.1007\/s12652-020-02623-6"},{"key":"21","doi-asserted-by":"publisher","DOI":"10.1007\/s10586-022-03686-0"},{"key":"22","doi-asserted-by":"publisher","DOI":"10.3233\/JIFS-189415"},{"key":"23","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2018.02.112"},{"key":"24","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.140"},{"key":"25","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.667"},{"key":"26","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2019.04.095"},{"key":"27","doi-asserted-by":"publisher","DOI":"10.1016\/j.image.2019.115648"},{"key":"28","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2018.10.054"},{"key":"29","doi-asserted-by":"publisher","DOI":"10.11591\/ijeecs.v21.i2.pp757-767"},{"key":"30","doi-asserted-by":"publisher","DOI":"10.1109\/ICCIT51783.2020.9392697"},{"key":"31","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/p16-2034"},{"key":"32","article-title":"An efficient technique for image captioning using deep neural network","author":"B. B. 
Phukan","year":"2020"}],"container-title":["Applied Computational Intelligence and Soft Computing"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/acisc\/2023\/9397325.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/acisc\/2023\/9397325.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/acisc\/2023\/9397325.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,1]],"date-time":"2023-05-01T04:01:32Z","timestamp":1682913692000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.hindawi.com\/journals\/acisc\/2023\/9397325\/"}},"subtitle":[],"editor":[{"given":"Aniello","family":"Minutolo","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,4,30]]},"references-count":32,"alternative-id":["9397325","9397325"],"URL":"https:\/\/doi.org\/10.1155\/2023\/9397325","relation":{},"ISSN":["1687-9732","1687-9724"],"issn-type":[{"value":"1687-9732","type":"electronic"},{"value":"1687-9724","type":"print"}],"subject":[],"published":{"date-parts":[[2023,4,30]]}}}