{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,7]],"date-time":"2026-03-07T18:44:00Z","timestamp":1772909040216,"version":"3.50.1"},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,1,9]],"date-time":"2024-01-09T00:00:00Z","timestamp":1704758400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,1,9]],"date-time":"2024-01-09T00:00:00Z","timestamp":1704758400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Hum-Cent Intell Syst"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The significance of food in human health and well-being cannot be overemphasized. Nowadays, in our dynamic lives, people are increasingly concerned about their health due to the rise in nutritional ailments. For this reason, mobile food-tracking applications, which require a reliable and robust food classification system, are gaining popularity. To address this, we propose a robust food recognition model using deep convolutional neural networks with a self-attention mechanism (FRCNNSAM). We train multiple FRCNNSAM structures with varying parameters and combine their predictions through averaging. To prevent over-fitting and under-fitting, data augmentation was used to generate extra training data, and regularization was applied to avoid excessive model complexity. The FRCNNSAM model is tested on two datasets: Food-101 and MA Food-121. The model achieved an accuracy of 96.40% on the Food-101 dataset and 95.11% on MA Food-121, surpassing baseline transfer learning models by 8.12%. 
Furthermore, the evaluation on random internet images demonstrates the model's strong generalization ability, rendering it suitable for food image recognition and classification tasks.<\/jats:p>","DOI":"10.1007\/s44230-023-00057-9","type":"journal-article","created":{"date-parts":[[2024,1,9]],"date-time":"2024-01-09T11:03:03Z","timestamp":1704798183000},"page":"171-186","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":28,"title":["Automatic Food Recognition Using Deep Convolutional Neural Networks with Self-attention Mechanism"],"prefix":"10.1007","volume":"4","author":[{"given":"Rahib","family":"Abiyev","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0416-223X","authenticated-orcid":false,"given":"Joseph","family":"Adepoju","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,1,9]]},"reference":[{"key":"57_CR1","doi-asserted-by":"publisher","DOI":"10.1111\/exsy.12398","author":"RH Abiyev","year":"2019","unstructured":"Abiyev RH, Arslan M. Head mouse control system for people with disabilities. Expert Syst. 2019. https:\/\/doi.org\/10.1111\/exsy.12398.","journal-title":"Expert Syst"},{"issue":"1\u201314","key":"57_CR2","doi-asserted-by":"publisher","first-page":"3281135","DOI":"10.1155\/2021\/3281135","volume":"2021","author":"RH Abiyev","year":"2021","unstructured":"Abiyev RH, Abdullahi I. COVID-19 and pneumonia diagnosis in X-ray images using convolutional neural networks. Math Probl Eng. 2021;2021(1\u201314):3281135. https:\/\/doi.org\/10.1155\/2021\/3281135.","journal-title":"Math Probl Eng"},{"key":"57_CR3","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-42924-8_2","author":"RH Abiyev","year":"2023","unstructured":"Abiyev RH, Adepoju JA. Deep convolutional network for food image identification. Stud Comput Intell. 2023. 
https:\/\/doi.org\/10.1007\/978-3-031-42924-8_2.","journal-title":"Stud Comput Intell"},{"key":"57_CR4","doi-asserted-by":"publisher","first-page":"360","DOI":"10.1016\/j.jvcir.2019.03.011","volume":"60","author":"E Aguilar","year":"2019","unstructured":"Aguilar E, Bola\u00f1os M, Radeva P. Regularized uncertainty-based multi-task learning model for food analysis. J Vis Commun Image Represent. 2019;60:360\u201370. https:\/\/doi.org\/10.1016\/j.jvcir.2019.03.011.","journal-title":"J Vis Commun Image Represent"},{"key":"57_CR5","doi-asserted-by":"publisher","DOI":"10.1016\/j.compbiomed.2022.105645","volume":"146","author":"E Aguilar","year":"2022","unstructured":"Aguilar E, Nagarajan B, Radeva P. Uncertainty-aware selecting for an ensemble of deep food recognition models. Comput Biol Med. 2022;146: 105645. https:\/\/doi.org\/10.1016\/j.compbiomed.2022.105645.","journal-title":"Comput Biol Med"},{"issue":"1","key":"57_CR6","first-page":"7","volume":"18","author":"AB Akhi","year":"2018","unstructured":"Akhi AB, Akter F, Khatun T, Uddin MS. Recognition and classification of fast food images. Global J Comput Sci Technol. 2018;18(1):7\u201313.","journal-title":"Global J Comput Sci Technol"},{"issue":"3","key":"57_CR7","doi-asserted-by":"publisher","first-page":"1905","DOI":"10.1007\/s00521-021-06488-4","volume":"34","author":"M Asgari-Chenaghlu","year":"2021","unstructured":"Asgari-Chenaghlu M, Feizi-Derakhshi M, Farzinvash L, Balafar MA, Motamed C. CWI: a multimodal deep learning approach for named entity recognition from social media using character, word and image features. Neural Comput Appl. 2021;34(3):1905\u201322. https:\/\/doi.org\/10.1007\/s00521-021-06488-4.","journal-title":"Neural Comput Appl"},{"key":"57_CR8","doi-asserted-by":"publisher","unstructured":"Attokaren DJ, Fernandes IG, Sriram A, Murthy YVS, Koolagudi SG (2017) Food classification from images using convolutional neural networks. TENCON 2017 - 2017 IEEE Region 10 Conference. 
2801\u20132806, https:\/\/doi.org\/10.1109\/tencon.2017.8228338","DOI":"10.1109\/tencon.2017.8228338"},{"key":"57_CR9","doi-asserted-by":"publisher","DOI":"10.1016\/j.mlwa.2021.100106","volume":"6","author":"TR Bishop","year":"2021","unstructured":"Bishop TR, von Hinke S, Hollingsworth B, Lake AA, Brown H, Burgoine T. Automatic classification of takeaway food outlet cuisine type using machine (deep) learning. Mach Learn Appl. 2021;6: 100106. https:\/\/doi.org\/10.1016\/j.mlwa.2021.100106.","journal-title":"Mach Learn Appl"},{"key":"57_CR10","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10599-4_29","volume-title":"Computer vision \u2013 ECCV 2014. Lecture notes in computer science","author":"L Bossard","year":"2014","unstructured":"Bossard L, Guillaumin M, Van Gool L. Food-101 \u2013 mining discriminative components with random forests. In: Fleet D, Pajdla T, Schiele B, Tuytelaars T, editors. Computer vision \u2013 ECCV 2014. Lecture notes in computer science, vol. 8694. Cham: Springer; 2014. https:\/\/doi.org\/10.1007\/978-3-319-10599-4_29."},{"issue":"3","key":"57_CR11","doi-asserted-by":"publisher","first-page":"4241","DOI":"10.3233\/jifs-190353","volume":"37","author":"IJ Bush","year":"2019","unstructured":"Bush IJ, Abiyev R, Arslan M. Impact of machine learning techniques on hand gesture recognition. J Intell Fuzzy Syst. 2019;37(3):4241\u201352. https:\/\/doi.org\/10.3233\/jifs-190353.","journal-title":"J Intell Fuzzy Syst"},{"key":"57_CR12","unstructured":"Csurka G, Dance C, Fan L, Willamowski J, Bray C (2014) Visual categorization with bags of keypoints. In Proc ECCV Workshop on statistical learning in computer vision, 1:59\u201374, Prague"},{"issue":"21","key":"57_CR13","doi-asserted-by":"publisher","first-page":"33011","DOI":"10.1007\/s11042-021-11329-6","volume":"80","author":"A Fakhrou","year":"2021","unstructured":"Fakhrou A, Kunhoth J, Al Maadeed S. Smartphone-based food recognition system using multiple deep cnn models. Multimed Tool Appl. 
2021;80(21):33011\u201332.","journal-title":"Multimed Tool Appl"},{"key":"57_CR14","unstructured":"Feinman R, Lake BM (2019) Learning a smooth kernel regularizer for convolutional neural networks. arXiv preprint arXiv:1903.01882. Accessed 10 Nov 2022."},{"issue":"9","key":"57_CR15","doi-asserted-by":"publisher","first-page":"1627","DOI":"10.1109\/tpami.2009.167","volume":"32","author":"PF Felzenszwalb","year":"2010","unstructured":"Felzenszwalb PF, Girshick RB, McAllester D, Ramanan D. Object detection with discriminatively trained part-based models. IEEE Trans Pattern Anal Mach Intell. 2010;32(9):1627\u201345. https:\/\/doi.org\/10.1109\/tpami.2009.167.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"57_CR16","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2022.105151","volume":"115","author":"MA Ganaie","year":"2022","unstructured":"Ganaie MA, Hu M, Malik AK, Tanveer M, Suganthan PN. Ensemble deep learning: a review. Eng Appl Artif Intell. 2022;115: 105151.","journal-title":"Eng Appl Artif Intell"},{"issue":"1","key":"57_CR17","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s12393-021-09302-y","volume":"14","author":"E Garc\u00eda-Armenta","year":"2022","unstructured":"Garc\u00eda-Armenta E, Guti\u00e9rrez-L\u00f3pez GF. Fractal micro-structure of foods. Food Eng Rev. 2022;14(1):1\u201319. https:\/\/doi.org\/10.1007\/s12393-021-09302-y.","journal-title":"Food Eng Rev"},{"key":"57_CR18","doi-asserted-by":"publisher","unstructured":"Hassannejad H, Matrella G, Ciampolini P, De Munari I, Mordonini M, Cagnoni S (2016) Food image recognition using very deep convolutional networks. Proc. of the 2nd Int. Workshop on Multimedia Assisted Dietary Management, pp 41\u201349. 
https:\/\/doi.org\/10.1145\/2986035.2986042","DOI":"10.1145\/2986035.2986042"},{"issue":"1","key":"57_CR19","doi-asserted-by":"publisher","DOI":"10.1088\/1757-899x\/1131\/1\/012007","volume":"1131","author":"VL Helen Josephine","year":"2021","unstructured":"Helen Josephine VL, Nirmala A, Alluri VL. Impact of hidden dense layers in convolutional neural network to enhance performance of classification model. IOP Conf Ser: Mater Sci Eng. 2021;1131(1): 012007. https:\/\/doi.org\/10.1088\/1757-899x\/1131\/1\/012007.","journal-title":"IOP Conf Ser: Mater Sci Eng"},{"key":"57_CR20","doi-asserted-by":"publisher","first-page":"83","DOI":"10.1007\/978-3-030-49724-84","volume":"18","author":"C Kiourt","year":"2020","unstructured":"Kiourt C, Pavlidis G, Markantonatou S. Deep learning approaches in food recognition. machine learning paradigms. Learn Anal Intell Syst. 2020;18:83\u2013108. https:\/\/doi.org\/10.1007\/978-3-030-49724-84.","journal-title":"Learn Anal Intell Syst"},{"key":"57_CR21","doi-asserted-by":"publisher","DOI":"10.1016\/j.jneumeth.2020.108885","volume":"346","author":"E Lashgari","year":"2020","unstructured":"Lashgari E, Liang D, Maoz U. Data augmentation for deep-learning-based electroencephalography. J Neurosci Methods. 2020;346: 108885. https:\/\/doi.org\/10.1016\/j.jneumeth.2020.108885.","journal-title":"J Neurosci Methods"},{"key":"57_CR22","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-39601-9_4","volume-title":"Inclusive Smart Cities and Digital Health. ICOST 2016. Lecture notes in computer science","author":"C Liu","year":"2016","unstructured":"Liu C, Cao Y, Luo Y, Chen G, Vokkarane V, Ma Y. DeepFood: Deep Learning-Based Food Image Recognition for Computer-Aided Dietary Assessment. In: Chang C, Chiari L, Cao Y, Jin H, Mokhtari M, Aloulou H, editors. Inclusive Smart Cities and Digital Health. ICOST 2016. Lecture notes in computer science, vol. 9677. Cham: Springer; 2016. 
https:\/\/doi.org\/10.1007\/978-3-319-39601-9_4."},{"key":"57_CR23","doi-asserted-by":"publisher","DOI":"10.1016\/j.foodchem.2022.133243","volume":"391","author":"P Ma","year":"2022","unstructured":"Ma P, Zhang Z, Li Y, Yu N, Sheng J, K\u00fc\u00e7\u00fck McGinty H, Wang Q, Ahuja JK. Deep learning accurately predicts food categories and nutrients based on ingredient statements. Food Chem. 2022;391: 133243. https:\/\/doi.org\/10.1016\/j.foodchem.2022.133243.","journal-title":"Food Chem"},{"key":"57_CR24","doi-asserted-by":"publisher","unstructured":"Matsuda Y, Hoashi H, Yanai K (2012) Recognition of multiple-food images by detecting candidate regions. 2012 IEEE International Conference on Multimedia and Expo pp 25\u201330. https:\/\/doi.org\/10.1109\/icme.2012.157","DOI":"10.1109\/icme.2012.157"},{"key":"57_CR25","unstructured":"Mikulski B (2019) Understanding the softmax activation function | Bartosz Mikulski. Mikulskibartosz. https:\/\/www.mikulskibartosz.name\/understanding-the-softmax-activation-function\/. Accessed 1 Dec 2022."},{"key":"57_CR26","doi-asserted-by":"publisher","DOI":"10.3986\/alternator.2021.25","author":"S Mezgec","year":"2021","unstructured":"Mezgec S. The state of the art of automated food recognition. Alternator. 2021. https:\/\/doi.org\/10.3986\/alternator.2021.25.","journal-title":"Alternator"},{"key":"57_CR27","unstructured":"Mishra M (2020) Convolutional neural networks, explained. Towards Data Science. Retrieved November 9, 2022, from https:\/\/towardsdatascience.com\/convolutional-neural-networks-explained-9cc5188c4939. Accessed 9 Nov 2022."},{"key":"57_CR28","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-023-14603-x","author":"H Naseri","year":"2023","unstructured":"Naseri H, Mehrdad V. Novel CNN with investigation on accuracy by modifying stride, padding, kernel size and filter numbers. Multimed Tools Appl. 2023. 
https:\/\/doi.org\/10.1007\/s11042-023-14603-x.","journal-title":"Multimed Tools Appl"},{"issue":"3","key":"57_CR29","doi-asserted-by":"publisher","first-page":"347","DOI":"10.1080\/24751839.2018.1446236","volume":"2","author":"G \u00d6zsert Yi\u011fit","year":"2018","unstructured":"\u00d6zsert Yi\u011fit G, \u00d6zyildirim BM. Comparison of convolutional neural network models for food image classification. J Inf Telecommun. 2018;2(3):347\u201357. https:\/\/doi.org\/10.1080\/24751839.2018.1446236.","journal-title":"J Inf Telecommun"},{"issue":"12","key":"57_CR30","doi-asserted-by":"publisher","first-page":"1758","DOI":"10.1109\/lsp.2017.2758862","volume":"24","author":"P Pandey","year":"2017","unstructured":"Pandey P, Deepthi A, Mandal B, Puhan NB. FoodNet: recognizing foods using ensemble of deep networks. IEEE Signal Process Lett. 2017;24(12):1758\u201362. https:\/\/doi.org\/10.1109\/lsp.2017.2758862.","journal-title":"IEEE Signal Process Lett"},{"key":"57_CR31","doi-asserted-by":"publisher","unstructured":"Perez L, Wang J (2017) The effectiveness of data augmentation in image classification using deep learning. Computer Vision and Pattern Recognition. arXiv:1712.04621v1, https:\/\/doi.org\/10.48550\/arXiv.1712.04621. Accessed 2 Nov 2022.","DOI":"10.48550\/arXiv.1712.04621"},{"key":"57_CR32","unstructured":"Qiu J, Lo FPW, Sun Y, Wang S, Lo B (2022) Mining discriminative food regions for accurate food recognition.\u00a0arXiv preprint arXiv:2207.03692. Accessed 26 Apr 2023."},{"issue":"4","key":"57_CR33","doi-asserted-by":"publisher","first-page":"4201","DOI":"10.1007\/s11227-020-03432-6","volume":"77","author":"C Rane","year":"2020","unstructured":"Rane C, Mehrotra R, Bhattacharyya S, Sharma M, Bhattacharya M. A novel attention fusion network-based framework to ensemble the predictions of CNNs for lymph node metastasis detection. J Supercomput. 2020;77(4):4201\u201320. 
https:\/\/doi.org\/10.1007\/s11227-020-03432-6.","journal-title":"J Supercomput"},{"issue":"3","key":"57_CR34","doi-asserted-by":"publisher","first-page":"222","DOI":"10.1007\/s11263-013-0636-x","volume":"105","author":"J S\u00e1nchez","year":"2013","unstructured":"S\u00e1nchez J, Perronnin F, Mensink T, Verbeek J. Image classification with the fisher vector: theory and practice. Int J Comput Vision. 2013;105(3):222\u201345. https:\/\/doi.org\/10.1007\/s11263-013-0636-x.","journal-title":"Int J Comput Vision"},{"issue":"1","key":"57_CR35","doi-asserted-by":"publisher","first-page":"225","DOI":"10.1016\/j.gltp.2022.03.027","volume":"3","author":"G VijayaKumari","year":"2022","unstructured":"VijayaKumari G, Priyanka V, Vishwanath P. Food classification using transfer learning technique. Global Trans Proc. 2022;3(1):225\u20139.","journal-title":"Global Trans Proc"},{"key":"57_CR36","doi-asserted-by":"publisher","unstructured":"Yadav S, Alpana, Chand S (2021) Automated food image classification using deep learning approach. In: 2021 7th international conference on advanced computing and communication systems (ICACCS), 19\u201320 March 2021, Coimbatore, India. IEEE. https:\/\/doi.org\/10.1109\/icaccs51430.2021.9441889","DOI":"10.1109\/icaccs51430.2021.9441889"},{"issue":"4","key":"57_CR37","doi-asserted-by":"publisher","first-page":"611","DOI":"10.1007\/s13244-018-0639-9","volume":"9","author":"R Yamashita","year":"2018","unstructured":"Yamashita R, Nishio M, Do RKG, Togashi K. Convolutional neural networks: an overview and application in radiology. Insights Imaging. 2018;9(4):611\u201329. https:\/\/doi.org\/10.1007\/s13244-018-0639-9.","journal-title":"Insights Imaging"},{"issue":"11","key":"57_CR38","doi-asserted-by":"publisher","first-page":"3212","DOI":"10.1109\/tnnls.2018.2876865","volume":"30","author":"Z Zhao","year":"2019","unstructured":"Zhao Z, Zheng P, Xu S, Wu X. Object detection with deep learning: a review. IEEE Trans Neural Netw Learn Syst. 
2019;30(11):3212\u201332. https:\/\/doi.org\/10.1109\/tnnls.2018.2876865.","journal-title":"IEEE Trans Neural Netw Learn Syst"},{"key":"57_CR39","doi-asserted-by":"crossref","unstructured":"Zhao H, Jia J, Koltun V (2020) Exploring self-attention for image recognition. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp 10076\u201310085)","DOI":"10.1109\/CVPR42600.2020.01009"}],"container-title":["Human-Centric Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44230-023-00057-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s44230-023-00057-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44230-023-00057-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,4,17]],"date-time":"2024-04-17T10:57:04Z","timestamp":1713351424000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s44230-023-00057-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,1,9]]},"references-count":39,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,3]]}},"alternative-id":["57"],"URL":"https:\/\/doi.org\/10.1007\/s44230-023-00057-9","relation":{},"ISSN":["2667-1336"],"issn-type":[{"value":"2667-1336","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,1,9]]},"assertion":[{"value":"3 August 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 November 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 January 
2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that there are no competing interests for this research.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}