{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,27]],"date-time":"2026-01-27T11:42:30Z","timestamp":1769514150906,"version":"3.49.0"},"reference-count":36,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,12,26]],"date-time":"2024-12-26T00:00:00Z","timestamp":1735171200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,12,26]],"date-time":"2024-12-26T00:00:00Z","timestamp":1735171200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Raheem Sarwar"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Supercomput"],"published-print":{"date-parts":[[2025,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Traditional image retrieval methods often face challenges in adapting to varying user preferences and dynamic datasets. To address these limitations, this research introduces a novel image retrieval framework utilizing deep deterministic policy gradients (DDPG) augmented with a self-adaptive reward mechanism (SARM). The DDPG-SARM framework dynamically adjusts rewards based on user feedback and retrieval context, enhancing the learning efficiency and retrieval accuracy of the agent. Key innovations include dynamic reward adjustment based on user feedback, context-aware reward structuring that considers the specific characteristics of each retrieval task, and an adaptive learning rate strategy to ensure robust and efficient model convergence. Extensive experimentation with the three distinct datasets demonstrates that the proposed framework significantly outperforms traditional methods, achieving the highest retrieval accuracy having 3.38%, 5.26%, and 0.21% improvement overall as compared to the mainstream models over DermaMNIST, PneumoniaMNIST, and OrganMNIST datasets, respectively. The findings contribute to the advancement of reinforcement learning applications in image retrieval, providing a user-centric solution adaptable to various dynamic environments. The proposed method also offers a promising direction for future developments in intelligent image retrieval systems.<\/jats:p>","DOI":"10.1007\/s11227-024-06764-9","type":"journal-article","created":{"date-parts":[[2024,12,26]],"date-time":"2024-12-26T11:56:10Z","timestamp":1735214170000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Deep deterministic policy gradients with a self-adaptive reward mechanism for image retrieval"],"prefix":"10.1007","volume":"81","author":[{"given":"Farooq","family":"Ahmad","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xinfeng","family":"Zhang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zifang","family":"Tang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fahad","family":"Sabah","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Muhammad","family":"Azam","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Raheem","family":"Sarwar","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,12,26]]},"reference":[{"key":"6764_CR1","doi-asserted-by":"publisher","first-page":"164","DOI":"10.1016\/j.aej.2024.03.045","volume":"95","author":"A Khamaj","year":"2024","unstructured":"Khamaj A, Ali AM (2024) Adapting user experience with reinforcement learning: personalizing interfaces based on user behavior analysis in real-time. Alex Eng J 95:164\u2013173","journal-title":"Alex Eng J"},{"issue":"2","key":"6764_CR2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3633458","volume":"29","author":"T-C Liang","year":"2024","unstructured":"Liang T-C, Chang Y-C, Zhong Z, Bigdeli Y, Ho T-Y, Chakrabarty K, Fair R (2024) Dynamic adaptation using deep reinforcement learning for digital microfluidic biochips. ACM Trans Design Autom Electron Syst 29(2):1\u201324","journal-title":"ACM Trans Design Autom Electron Syst"},{"key":"6764_CR3","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2019.105596","volume":"83","author":"L Zhu","year":"2019","unstructured":"Zhu L, Zhang C, Zhang C, Zhang Z, Nie X, Zhou X, Liu W, Wang X (2019) Forming a new small sample deep learning model to predict total organic carbon content by combining unsupervised learning with semisupervised learning. Appl Soft Comput 83:105596","journal-title":"Appl Soft Comput"},{"issue":"2","key":"6764_CR4","doi-asserted-by":"publisher","first-page":"13898","DOI":"10.1002\/acm2.13898","volume":"24","author":"M Hu","year":"2023","unstructured":"Hu M, Zhang J, Matkovic L, Liu T, Yang X (2023) Reinforcement learning in medical image analysis: concepts, applications, challenges, and future directions. J Appl Clin Med Phys 24(2):13898","journal-title":"J Appl Clin Med Phys"},{"issue":"4","key":"6764_CR5","doi-asserted-by":"publisher","first-page":"5064","DOI":"10.1109\/TNNLS.2022.3207346","volume":"35","author":"X Wang","year":"2024","unstructured":"Wang X, Wang S, Liang X, Zhao D, Huang J, Xu X, Dai B, Miao Q (2024) Deep reinforcement learning: a survey. IEEE Trans Neural Netw Learn Syst 35(4):5064\u20135078","journal-title":"IEEE Trans Neural Netw Learn Syst"},{"key":"6764_CR6","unstructured":"Lillicrap TP, Hunt JJ, Pritzel A, Heess N, Erez T, Tassa Y, Silver D, Wierstra D (2019) Continuous control with deep reinforcement learning. arXiv preprint"},{"key":"6764_CR7","unstructured":"Zhao H, Tang W, Yao D (2024) Policy optimization for continuous reinforcement learning. Adv Neural Inform Process Syst 36"},{"key":"6764_CR8","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2023.110756","volume":"152","author":"JK Viswanadhapalli","year":"2024","unstructured":"Viswanadhapalli JK, Elumalai VK, Shivram S, Shah S, Mahajan D (2024) Deep reinforcement learning with reward shaping for tracking control and vibration suppression of flexible link manipulator. Appl Soft Comput 152:110756","journal-title":"Appl Soft Comput"},{"issue":"6","key":"6764_CR9","doi-asserted-by":"publisher","first-page":"7686","DOI":"10.1109\/TPAMI.2022.3223407","volume":"45","author":"C Huang","year":"2023","unstructured":"Huang C, Wang G, Zhou Z, Zhang R, Lin L (2023) Reward-adaptive reinforcement learning: dynamic policy gradient optimization for bipedal locomotion. IEEE Trans Pattern Anal Mach Intell 45(6):7686\u20137695","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"6764_CR10","doi-asserted-by":"crossref","unstructured":"Xu M, Chen X, She Y, Jin Y, Wang J (2024) Time-varying weights in multi-reward architecture for deep reinforcement learning. IEEE Trans Emerg Topics Comput Intell","DOI":"10.1109\/TETCI.2024.3359039"},{"key":"6764_CR11","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.121880","volume":"238","author":"Z Tang","year":"2024","unstructured":"Tang Z, Li T, Wu D, Liu J, Yang Z (2024) A systematic literature review of reinforcement learning-based knowledge graph research. Expert Syst Appl 238:121880. https:\/\/doi.org\/10.1016\/j.eswa.2023.121880","journal-title":"Expert Syst Appl"},{"issue":"2","key":"6764_CR12","doi-asserted-by":"publisher","first-page":"885","DOI":"10.1007\/s10845-023-02087-3","volume":"35","author":"S De Blasi","year":"2024","unstructured":"De Blasi S, Bahrami M, Engels E, Gepperth A (2024) Safe contextual Bayesian optimization integrated in industrial control for self-learning machines. J Intell Manuf 35(2):885\u2013903","journal-title":"J Intell Manuf"},{"key":"6764_CR13","doi-asserted-by":"publisher","DOI":"10.1016\/j.compeleceng.2022.108015","volume":"101","author":"J Xu","year":"2022","unstructured":"Xu J, Zhang H, Qiu J (2022) A deep deterministic policy gradient algorithm based on averaged state-action estimation. Comput Electr Eng 101:108015","journal-title":"Comput Electr Eng"},{"key":"6764_CR14","doi-asserted-by":"publisher","DOI":"10.1016\/j.energy.2021.121492","volume":"236","author":"W Zhang","year":"2021","unstructured":"Zhang W, Chen Q, Yan J, Zhang S, Xu J (2021) A novel asynchronous deep reinforcement learning model with adaptive early forecasting method and reward incentive mechanism for short-term load forecasting. Energy 236:121492. https:\/\/doi.org\/10.1016\/j.energy.2021.121492","journal-title":"Energy"},{"key":"6764_CR15","doi-asserted-by":"publisher","DOI":"10.1016\/j.compbiomed.2023.107877","volume":"169","author":"Z Xu","year":"2024","unstructured":"Xu Z, Wang S, Xu G, Liu Y, Yu M, Zhang H, Lukasiewicz T, Gu J (2024) Automatic data augmentation for medical image segmentation using adaptive sequence-length based deep reinforcement learning. Comput Biol Med 169:107877","journal-title":"Comput Biol Med"},{"key":"6764_CR16","unstructured":"Beukman M, Jarvis D, Klein R, James S, Rosman B (2024) Dynamics generalisation in reinforcement learning via adaptive context-aware policies. Adv Neural Inform Process Syst 36"},{"key":"6764_CR17","unstructured":"Yang R, Pan X, Luo F, Qiu S, Zhong H, Yu D, Chen J (2024) Rewards-in-context: multi-objective alignment of foundation models with dynamic preference adjustment. https:\/\/arxiv.org\/abs\/2402.10207"},{"issue":"2","key":"6764_CR18","doi-asserted-by":"publisher","first-page":"1543","DOI":"10.1007\/s10462-022-10205-5","volume":"56","author":"V Uc-Cetina","year":"2023","unstructured":"Uc-Cetina V, Navarro-Guerrero N, Martin-Gonzalez A, Weber C, Wermter S (2023) Survey on reinforcement learning for language processing. Artif Intell Rev 56(2):1543\u20131575","journal-title":"Artif Intell Rev"},{"issue":"4","key":"6764_CR19","doi-asserted-by":"publisher","first-page":"5343","DOI":"10.1007\/s11042-022-12178-7","volume":"82","author":"G Dhiman","year":"2023","unstructured":"Dhiman G, Kumar AV, Nirmalan R, Sujitha S, Srihari K, Yuvaraj N, Arulprakash P, Raja RA (2023) Multi-modal active learning with deep reinforcement learning for target feature extraction in multi-media image processing applications. Multim Tools Appl 82(4):5343\u20135367","journal-title":"Multim Tools Appl"},{"key":"6764_CR20","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1016\/j.patrec.2024.04.019","volume":"182","author":"J Ye","year":"2024","unstructured":"Ye J, Wu Y, Peng D (2024) Low-quality image object detection based on reinforcement learning adaptive enhancement. Pattern Recogn Lett 182:67\u201375","journal-title":"Pattern Recogn Lett"},{"issue":"1","key":"6764_CR21","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/sdata.2018.161","volume":"5","author":"P Tschandl","year":"2018","unstructured":"Tschandl P, Rosendahl C, Kittler H (2018) The ham10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Sci Data 5(1):1\u20139","journal-title":"Sci Data"},{"key":"6764_CR22","unstructured":"Codella N, Rotemberg V, Tschandl P, Celebi ME, Dusza S, Gutman D, Helba B, Kalloo A, Liopyris K, Marchetti M, et al. (2019) Skin lesion analysis toward melanoma detection 2018: a challenge hosted by the international skin imaging collaboration (ISIC). arXiv preprint"},{"issue":"5","key":"6764_CR23","doi-asserted-by":"publisher","first-page":"1122","DOI":"10.1016\/j.cell.2018.02.010","volume":"172","author":"DS Kermany","year":"2018","unstructured":"Kermany DS, Goldbaum M, Cai W, Valentim CC, Liang H, Baxter SL, McKeown A, Yang G, Wu X, Yan F et al (2018) Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell 172(5):1122\u20131131","journal-title":"Cell"},{"key":"6764_CR24","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2022.102680","volume":"84","author":"P Bilic","year":"2023","unstructured":"Bilic P, Christ P, Li HB et al (2023) The liver tumor segmentation benchmark (lits). Med Image Anal 84:102680. https:\/\/doi.org\/10.1016\/j.media.2022.102680","journal-title":"Med Image Anal"},{"key":"6764_CR25","doi-asserted-by":"publisher","unstructured":"Yang J, Shi R, Ni B (2021) Medmnist classification decathlon: a lightweight automl benchmark for medical image analysis. In: 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI), pp 191\u2013195. https:\/\/doi.org\/10.1109\/ISBI48211.2021.9434062","DOI":"10.1109\/ISBI48211.2021.9434062"},{"issue":"8","key":"6764_CR26","doi-asserted-by":"publisher","first-page":"1885","DOI":"10.1109\/TMI.2019.2894854","volume":"38","author":"X Xu","year":"2019","unstructured":"Xu X, Zhou F, Liu B, Fu D, Bai X (2019) Efficient multiple organ localization in CT image using 3D region proposal network. IEEE Trans Med Imaging 38(8):1885\u20131898. https:\/\/doi.org\/10.1109\/TMI.2019.2894854","journal-title":"IEEE Trans Med Imaging"},{"issue":"1","key":"6764_CR27","doi-asserted-by":"publisher","first-page":"321","DOI":"10.1613\/jair.953","volume":"16","author":"NV Chawla","year":"2002","unstructured":"Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) Smote: synthetic minority over-sampling technique. J Artif Intell Res 16(1):321\u2013357","journal-title":"J Artif Intell Res"},{"key":"6764_CR28","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"6764_CR29","unstructured":"Feurer M, Klein A, Eggensperger K, Springenberg J, Blum M, Hutter F (2015) Efficient and robust automated machine learning. Adv Neural Inform Process Syst 28"},{"key":"6764_CR30","doi-asserted-by":"publisher","unstructured":"Jin H, Song Q, Hu X (2019) Auto-keras: an efficient neural architecture search system. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. KDD \u201919, pp 1946\u20131956. Association for Computing Machinery, New York, NY, USA. https:\/\/doi.org\/10.1145\/3292500.3330648","DOI":"10.1145\/3292500.3330648"},{"key":"6764_CR31","doi-asserted-by":"crossref","unstructured":"Liu J, Li Y, Cao G, Liu Y, Cao W (2022) Feature pyramid vision transformer for medmnist classification decathlon. In: 2022 International Joint Conference on Neural Networks (IJCNN), pp 1\u20138. IEEE","DOI":"10.1109\/IJCNN55064.2022.9892282"},{"key":"6764_CR32","doi-asserted-by":"publisher","DOI":"10.1016\/j.compbiomed.2022.106353","volume":"152","author":"Q Han","year":"2023","unstructured":"Han Q, Hou M, Wang H, Wu C, Tian S, Qiu Z, Zhou B (2023) EHDFL: Evolutionary hybrid domain feature learning based on windowed fast Fourier convolution pyramid for medical image classification. Comput Biol Med 152:106353","journal-title":"Comput Biol Med"},{"key":"6764_CR33","unstructured":"Mukhometzianov R, Carrillo J (2018) CapsNet comparative performance evaluation for image classification. CoRR arXiv:abs\/1805.11195"},{"issue":"3","key":"6764_CR34","doi-asserted-by":"publisher","first-page":"1865","DOI":"10.1007\/s40747-021-00347-4","volume":"8","author":"X Ai","year":"2022","unstructured":"Ai X, Zhuang J, Wang Y, Wan P, Fu Y (2022) ResCaps: an improved capsule network and its application in ultrasonic image classification of thyroid papillary carcinoma. Complex Intell Syst 8(3):1865\u20131873","journal-title":"Complex Intell Syst"},{"issue":"4","key":"6764_CR35","doi-asserted-by":"publisher","first-page":"23108","DOI":"10.1002\/ima.23108","volume":"34","author":"SB Sengul","year":"2024","unstructured":"Sengul SB, Ozkan IA (2024) MResCaps: enhancing capsule networks with parallel lanes and residual blocks for high-performance medical image classification. Int J Imaging Syst Technol 34(4):23108","journal-title":"Int J Imaging Syst Technol"},{"key":"6764_CR36","doi-asserted-by":"publisher","unstructured":"Farooq A, Zhang X (2023) Tongue image retrieval based on reinforcement learning. In: Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition. ICCPR \u201922, pp 282\u2013289. Association for Computing Machinery, New York, NY, USA. https:\/\/doi.org\/10.1145\/3581807.3581848","DOI":"10.1145\/3581807.3581848"}],"container-title":["The Journal of Supercomputing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-024-06764-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11227-024-06764-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-024-06764-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,12,26]],"date-time":"2024-12-26T12:03:34Z","timestamp":1735214614000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11227-024-06764-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,26]]},"references-count":36,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,1]]}},"alternative-id":["6764"],"URL":"https:\/\/doi.org\/10.1007\/s11227-024-06764-9","relation":{},"ISSN":["0920-8542","1573-0484"],"issn-type":[{"value":"0920-8542","type":"print"},{"value":"1573-0484","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,26]]},"assertion":[{"value":"22 November 2024","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 December 2024","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.\u00a0For the purpose of open access, the author(s) has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising from this submission.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"336"}}