{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,19]],"date-time":"2026-03-19T23:27:55Z","timestamp":1773962875191,"version":"3.50.1"},"reference-count":49,"publisher":"Springer Science and Business Media LLC","issue":"7-8","license":[{"start":{"date-parts":[[2020,9,14]],"date-time":"2020-09-14T00:00:00Z","timestamp":1600041600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,9,14]],"date-time":"2020-09-14T00:00:00Z","timestamp":1600041600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100007065","name":"Universit\u00e0 degli Studi di Salerno","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100007065","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Machine Vision and Applications"],"published-print":{"date-parts":[[2020,11]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Although in recent years we have witnessed an explosion of the scientific research in the recognition of facial soft biometrics such as gender, age and expression with deep neural networks, the recognition of ethnicity has not received the same attention from the scientific community. The growth of this field is hindered by two related factors: on the one hand, the absence of a dataset sufficiently large and representative does not allow an effective training of convolutional neural networks for the recognition of ethnicity; on the other hand, the collection of new ethnicity datasets is far from simple and must be carried out manually by humans trained to recognize the basic ethnicity groups using the somatic facial features. 
To fill this gap in the facial soft biometrics analysis, we propose the VGGFace2 Mivia Ethnicity Recognition (VMER) dataset, composed by more than 3,000,000 face images annotated with 4 ethnicity categories, namely African American, East Asian, Caucasian Latin and Asian Indian. The final annotations are obtained with a protocol which requires the opinion of three people belonging to different ethnicities, in order to avoid the bias introduced by the well-known other race effect. In addition, we carry out a comprehensive performance analysis of popular deep network architectures, namely VGG-16, VGG-Face, ResNet-50 and MobileNet v2. Finally, we perform a cross-dataset evaluation to demonstrate that the deep network architectures trained with VMER generalize on different test sets better than the same models trained on the largest ethnicity dataset available so far. The ethnicity labels of the VMER dataset and the code used for the experiments are available upon request at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/mivia.unisa.it\">https:\/\/mivia.unisa.it<\/jats:ext-link>.<\/jats:p>","DOI":"10.1007\/s00138-020-01123-z","type":"journal-article","created":{"date-parts":[[2020,9,14]],"date-time":"2020-09-14T05:02:38Z","timestamp":1600059758000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":24,"title":["Benchmarking deep network architectures for ethnicity recognition using a new large face 
dataset"],"prefix":"10.1007","volume":"31","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5495-2432","authenticated-orcid":false,"given":"Antonio","family":"Greco","sequence":"first","affiliation":[]},{"given":"Gennaro","family":"Percannella","sequence":"additional","affiliation":[]},{"given":"Mario","family":"Vento","sequence":"additional","affiliation":[]},{"given":"Vincenzo","family":"Vigilante","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,9,14]]},"reference":[{"key":"1123_CR1","doi-asserted-by":"crossref","unstructured":"Ahmed, A., Yu, K., Xu, W., Gong, Y., Xing, E.: Training hierarchical feed-forward visual recognition models using transfer learning from pseudo-tasks. In: European Conference on Computer Vision, pp. 69\u201382. Springer (2008)","DOI":"10.1007\/978-3-540-88690-7_6"},{"issue":"3","key":"1123_CR2","first-page":"152","volume":"17","author":"I Anwar","year":"2017","unstructured":"Anwar, I., Islam, N.U.: Learned features are better for ethnicity classification. Cybern. Inf. Technol. 17(3), 152\u2013164 (2017)","journal-title":"Cybern. Inf. Technol."},{"key":"1123_CR3","doi-asserted-by":"publisher","first-page":"24171","DOI":"10.1109\/ACCESS.2018.2823378","volume":"6","author":"G Azzopardi","year":"2018","unstructured":"Azzopardi, G., Greco, A., Saggese, A., Vento, M.: Fusion of domain-specific and trainable features for gender recognition from face images. IEEE Access 6, 24171\u201324183 (2018)","journal-title":"IEEE Access"},{"key":"1123_CR4","unstructured":"Bastanfard, A., Nik, M.A., Dehshibi, M.M.: Iranian face database with age, pose and expression. Machine Vision pp. 50\u201355 (2007)"},{"key":"1123_CR5","doi-asserted-by":"crossref","unstructured":"Cao, Q., Shen, L., Xie, W., Parkhi, O.M., Zisserman, A.: Vggface2: a dataset for recognising faces across pose and age. In: IEEE International Conference on Automatic Face & Gesture Recognition, pp. 67\u201374. 
IEEE (2018)","DOI":"10.1109\/FG.2018.00020"},{"key":"1123_CR6","doi-asserted-by":"crossref","unstructured":"Carletti, V., Greco, A., Percannella, G., Vento, M.: Age from faces in the deep learning revolution. IEEE Trans. Pattern Anal. Mach. Intell. (2019)","DOI":"10.1109\/TPAMI.2019.2910522"},{"key":"1123_CR7","doi-asserted-by":"crossref","unstructured":"Carletti, V., Greco, A., Saggese, A., Vento, M.: An effective real time gender recognition system for smart cameras. J. Ambient Intell. Human. Comput. 1\u201313 (2019)","DOI":"10.1007\/s12652-019-01267-5"},{"issue":"9","key":"1123_CR8","doi-asserted-by":"publisher","first-page":"1705","DOI":"10.1109\/TPAMI.2009.155","volume":"32","author":"J Chen","year":"2010","unstructured":"Chen, J., Shan, S., He, C., Zhao, G., Pietikainen, M., Chen, X., Gao, W.: Wld: a robust local image descriptor. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1705\u20131720 (2010)","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"issue":"1","key":"1123_CR9","doi-asserted-by":"publisher","first-page":"37","DOI":"10.1177\/001316446002000104","volume":"20","author":"J Cohen","year":"1960","unstructured":"Cohen, J.: A coefficient of agreement for nominal scales. Edu. Psychol. Meas. 20(1), 37\u201346 (1960)","journal-title":"Edu. Psychol. Meas."},{"key":"1123_CR10","doi-asserted-by":"crossref","unstructured":"Dantcheva, A., Elia, P., Ross, A.: What else does your biometric data reveal? A survey on soft biometrics. IEEE Trans. Inf. For. Secur. 441\u2013467 (2016)","DOI":"10.1109\/TIFS.2015.2480381"},{"key":"1123_CR11","doi-asserted-by":"crossref","unstructured":"Demirkus, M., Garg, K., Guler, S.: Automated person categorization for video surveillance using soft biometrics. In: Biometric Technology for Human Identification VII, vol. 7667, p. 76670P. 
International Society for Optics and Photonics (2010)","DOI":"10.1117\/12.851424"},{"key":"1123_CR12","doi-asserted-by":"crossref","unstructured":"Ding, C., Tao, D.: A comprehensive survey on pose-invariant face recognition. ACM Trans. Intell. Syst. Technol. 37 (2016)","DOI":"10.1145\/2845089"},{"issue":"1","key":"1123_CR13","doi-asserted-by":"publisher","first-page":"177","DOI":"10.1007\/s00138-018-0976-1","volume":"30","author":"F Dornaika","year":"2019","unstructured":"Dornaika, F., Arganda-Carreras, I., Belver, C.: Age estimation in facial images through transfer learning. Mach. Vis. Appl. 30(1), 177\u2013187 (2019)","journal-title":"Mach. Vis. Appl."},{"issue":"3","key":"1123_CR14","first-page":"1","volume":"1341","author":"D Erhan","year":"2009","unstructured":"Erhan, D., Bengio, Y., Courville, A., Vincent, P.: Visualizing higher-layer features of a deep network. Univ. Montr. 1341(3), 1 (2009)","journal-title":"Univ. Montr."},{"key":"1123_CR15","doi-asserted-by":"crossref","unstructured":"Foggia, P., Greco, A., Percannella, G., Vento, M., Vigilante, V.: A system for gender recognition on mobile robots. In: International Conference on Applications of Intelligent Systems, p.\u00a09. ACM (2019)","DOI":"10.1145\/3309772.3309781"},{"issue":"12","key":"1123_CR16","doi-asserted-by":"publisher","first-page":"2483","DOI":"10.1109\/TPAMI.2014.2321570","volume":"36","author":"S Fu","year":"2014","unstructured":"Fu, S., He, H., Hou, Z.G.: Learning race from face: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 36(12), 2483\u20132509 (2014)","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"1123_CR17","doi-asserted-by":"crossref","unstructured":"Fu, S.Y., Yang, G.S., Hou, Z.G.: Spiking neural networks based cortex like mechanism: a case study for facial expression recognition. In: International Conference on Neural Networks, pp. 1637\u20131642. 
IEEE (2011)","DOI":"10.1109\/IJCNN.2011.6033421"},{"issue":"1","key":"1123_CR18","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1109\/TSMCA.2007.909557","volume":"38","author":"W Gao","year":"2007","unstructured":"Gao, W., Cao, B., Shan, S., Chen, X., Zhou, D., Zhang, X., Zhao, D.: The cas-peal large-scale chinese face database and baseline evaluations. IEEE Trans. Syst. Man Cybern. A Syst. Humans 38(1), 149\u2013161 (2007)","journal-title":"IEEE Trans. Syst. Man Cybern. A Syst. Humans"},{"issue":"10","key":"1123_CR19","doi-asserted-by":"publisher","first-page":"761","DOI":"10.1016\/j.imavis.2014.04.011","volume":"32","author":"G Guo","year":"2014","unstructured":"Guo, G., Mu, G.: A framework for joint estimation of age, gender and ethnicity on a large database. Image Vis. Comput. 32(10), 761\u2013770 (2014)","journal-title":"Image Vis. Comput."},{"issue":"11","key":"1123_CR20","doi-asserted-by":"publisher","first-page":"2597","DOI":"10.1109\/TPAMI.2017.2738004","volume":"40","author":"H Han","year":"2017","unstructured":"Han, H., Jain, A.K., Wang, F., Shan, S., Chen, X.: Heterogeneous face attribute estimation: a deep multi-task learning approach. IEEE Trans. Pattern Anal. Mach. Intell. 40(11), 2597\u20132609 (2017)","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"1123_CR21","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 770\u2013778 (2016)","DOI":"10.1109\/CVPR.2016.90"},{"key":"1123_CR22","unstructured":"Hosoi, S., Takikawa, E., Kawade, M.: Ethnicity estimation with facial images. In: IEEE International Conference on Automatic Face and Gesture Recognition, pp. 195\u2013200. 
IEEE (2004)"},{"key":"1123_CR23","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., Adam, H.: Mobilenets: efficient convolutional neural networks for mobile vision applications. Preprint arXiv:1704.04861 (2017)"},{"key":"1123_CR24","unstructured":"K\u00e4rkk\u00e4inen, K., Joo, J.: Fairface: Face attribute dataset for balanced race, gender, and age. Preprint arXiv:1908.04913 (2019)"},{"issue":"10","key":"1123_CR25","doi-asserted-by":"publisher","first-page":"1962","DOI":"10.1109\/TPAMI.2011.48","volume":"33","author":"N Kumar","year":"2011","unstructured":"Kumar, N., Berg, A., Belhumeur, P.N., Nayar, S.: Describable visual attributes for face verification and image search. IEEE Trans. Pattern Anal. Mach. Intell. 33(10), 1962\u20131977 (2011)","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"1123_CR26","unstructured":"Li, S., Deng, W.: Deep facial expression recognition: a survey. Preprint arXiv:1804.08348 (2018)"},{"key":"1123_CR27","unstructured":"Lin, H., Lu, H., Zhang, L.: A new automatic recognition system of gender, age and ethnicity. In: Congress on Intelligent Control and Automation, vol.\u00a02, pp. 9988\u20139991. IEEE (2006)"},{"key":"1123_CR28","doi-asserted-by":"crossref","unstructured":"Liu, Z., Luo, P., Wang, X., Tang, X.: Deep learning face attributes in the wild. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3730\u20133738 (2015)","DOI":"10.1109\/ICCV.2015.425"},{"key":"1123_CR29","doi-asserted-by":"publisher","first-page":"1532","DOI":"10.3389\/fpsyg.2014.01532","volume":"5","author":"V LoBue","year":"2015","unstructured":"LoBue, V., Thrasher, C.: The child affective facial expression (cafe) set: validity and reliability from untrained adults. Front. Psychol. 5, 1532 (2015)","journal-title":"Front. 
Psychol."},{"issue":"12","key":"1123_CR30","doi-asserted-by":"publisher","first-page":"1357","DOI":"10.1109\/34.817413","volume":"21","author":"MJ Lyons","year":"1999","unstructured":"Lyons, M.J., Budynek, J., Akamatsu, S.: Automatic classification of single facial images. IEEE Trans. Pattern Anal. Mach. Intell. 21(12), 1357\u20131362 (1999)","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"1123_CR31","unstructured":"Marx, K.: Encyclopedia britannica. Encyclopaedia Britannica Ultimate Reference Suite [M\/CD]. Chicago: Encyclopaedia Britannica (2012)"},{"issue":"05","key":"1123_CR32","doi-asserted-by":"publisher","first-page":"1250019","DOI":"10.1142\/S0218213012500194","volume":"21","author":"G Muhammad","year":"2012","unstructured":"Muhammad, G., Hussain, M., Alenezy, F., Bebis, G., Mirza, A.M., Aboalsamh, H.: Race classification from face images using local descriptors. Int. J. Artif. Intell. Tools 21(05), 1250019 (2012)","journal-title":"Int. J. Artif. Intell. Tools"},{"key":"1123_CR33","doi-asserted-by":"crossref","unstructured":"Parkhi, O.M., Vedaldi, A., Zisserman, A., et al.: Deep face recognition. BMVC 1, 6 (2015)","DOI":"10.5244\/C.29.41"},{"issue":"2","key":"1123_CR34","doi-asserted-by":"publisher","first-page":"263","DOI":"10.1007\/s00138-017-0895-6","volume":"29","author":"Y Peng","year":"2018","unstructured":"Peng, Y., Yin, H.: Facial expression analysis and expression-invariant face recognition by manifold-based synthesis. Mach. Vis. Appl. 29(2), 263\u2013284 (2018)","journal-title":"Mach. Vis. Appl."},{"issue":"5","key":"1123_CR35","doi-asserted-by":"publisher","first-page":"295","DOI":"10.1016\/S0262-8856(97)00070-X","volume":"16","author":"PJ Phillips","year":"1998","unstructured":"Phillips, P.J., Wechsler, H., Huang, J., Rauss, P.J.: The feret database and evaluation procedure for face-recognition algorithms. Image Vis. Comput. 16(5), 295\u2013306 (1998)","journal-title":"Image Vis. 
Comput."},{"key":"1123_CR36","unstructured":"Ranjan, R., Patel, V.M., Chellappa, R., Castillo, C.D.: Deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition (2018). US Patent App. 15\/746,237"},{"key":"1123_CR37","unstructured":"Ricanek, K., Tesafaye, T.: Morph: a longitudinal image database of normal adult age-progression. In: International Conference on Automatic Face and Gesture Recognition, pp. 341\u2013345. IEEE (2006)"},{"key":"1123_CR38","doi-asserted-by":"crossref","unstructured":"Riccio, D., Tortora, G., De\u00a0Marsico, M., Wechsler, H.: Ega-ethnicity, gender and age, a pre-annotated face database. In: IEEE Workshop on BIOMS, pp. 1\u20138. IEEE (2012)","DOI":"10.1109\/BIOMS.2012.6345776"},{"key":"1123_CR39","doi-asserted-by":"crossref","unstructured":"Roomi, S.M.M., Virasundarii, S., Selvamegala, S., Jeevanandham, S., Hariharasudhan, D.: Race classification based on facial features. In: Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics, pp. 54\u201357. IEEE (2011)","DOI":"10.1109\/NCVPRIPG.2011.19"},{"key":"1123_CR40","unstructured":"Salah, S.H., Du, H., Al-Jawad, N.: Fusing local binary patterns with wavelet features for ethnicity identification. In: World Academy of Science, Engineering and Technology, 79, p. 471. World Academy of Science, Engineering and Technology (WASET) (2013)"},{"issue":"2","key":"1123_CR41","doi-asserted-by":"publisher","first-page":"359","DOI":"10.1007\/s00138-018-0991-2","volume":"30","author":"L Seidenari","year":"2019","unstructured":"Seidenari, L., Rozza, A., Del Bimbo, A.: Real-time demographic profiling from face imagery with fisher vectors. Mach. Vis. Appl. 30(2), 359\u2013374 (2019)","journal-title":"Mach. Vis. Appl."},{"key":"1123_CR42","unstructured":"Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. 
Preprint arXiv:1409.1556 (2014)"},{"issue":"6","key":"1123_CR43","doi-asserted-by":"publisher","first-page":"902","DOI":"10.1016\/j.imavis.2009.11.005","volume":"28","author":"CE Thomaz","year":"2010","unstructured":"Thomaz, C.E., Giraldi, G.A.: A new ranking method for principal components analysis and its application to face image analysis. Image Vis. Comput. 28(6), 902\u2013913 (2010)","journal-title":"Image Vis. Comput."},{"key":"1123_CR44","unstructured":"Wu, B., Ai, H., Huang, C.: Facial image retrieval based on demographic classification. In: International Conference on Pattern Recognition, vol.\u00a03, pp. 914\u2013917. IEEE (2004)"},{"key":"1123_CR45","doi-asserted-by":"crossref","unstructured":"Xie, Y., Luu, K., Savvides, M.: A robust approach to facial ethnicity classification on large scale face databases. In: International Conference on Biometrics: Theory, Applications and Systems, pp. 143\u2013149. IEEE (2012)","DOI":"10.1109\/BTAS.2012.6374569"},{"key":"1123_CR46","doi-asserted-by":"crossref","unstructured":"Yi, D., Lei, Z., Li, S.Z.: Age estimation by multi-scale convolutional network. In: Asian Conference on Computer Vision, pp. 144\u2013158. Springer (2014)","DOI":"10.1007\/978-3-319-16811-1_10"},{"key":"1123_CR47","unstructured":"Zawbaa, H., Aly, S.A.: Hajj and umrah event recognition datasets. Preprint arXiv:1205.2345 (2012)"},{"key":"1123_CR48","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Song, Y., Qi, H.: Age progression\/regression by conditional adversarial autoencoder. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5810\u20135818 (2017)","DOI":"10.1109\/CVPR.2017.463"},{"key":"1123_CR49","doi-asserted-by":"crossref","unstructured":"Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Learning deep features for discriminative localization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 
2921\u20132929 (2016)","DOI":"10.1109\/CVPR.2016.319"}],"container-title":["Machine Vision and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00138-020-01123-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00138-020-01123-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00138-020-01123-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,13]],"date-time":"2021-09-13T23:22:54Z","timestamp":1631575374000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00138-020-01123-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9,14]]},"references-count":49,"journal-issue":{"issue":"7-8","published-print":{"date-parts":[[2020,11]]}},"alternative-id":["1123"],"URL":"https:\/\/doi.org\/10.1007\/s00138-020-01123-z","relation":{},"ISSN":["0932-8092","1432-1769"],"issn-type":[{"value":"0932-8092","type":"print"},{"value":"1432-1769","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,9,14]]},"assertion":[{"value":"25 November 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 June 2020","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 September 2020","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 September 2020","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"67"}}