{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T11:22:23Z","timestamp":1775042543204,"version":"3.50.1"},"reference-count":34,"publisher":"Springer Science and Business Media LLC","issue":"26","license":[{"start":{"date-parts":[[2023,3,31]],"date-time":"2023-03-31T00:00:00Z","timestamp":1680220800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,3,31]],"date-time":"2023-03-31T00:00:00Z","timestamp":1680220800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100010661","name":"Horizon 2020 Framework Programme","doi-asserted-by":"publisher","award":["826276"],"award-info":[{"award-number":["826276"]}],"id":[{"id":"10.13039\/100010661","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Agence NAtionale de la Recherche","award":["ANR-19-CE23-0028"],"award-info":[{"award-number":["ANR-19-CE23-0028"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Multimed Tools Appl"],"published-print":{"date-parts":[[2023,11]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Zero-shot learning (ZSL) aims at recognizing classes for which no visual sample is available at training time. To address this issue, one can rely on a semantic description of each class. A typical ZSL model learns a mapping between the visual samples of seen classes and the corresponding semantic descriptions, in order to do the same on unseen classes at test time. State of the art approaches rely on generative models that synthesize visual features from the prototype of a class, such that a classifier can then be learned in a supervised manner. 
However, these approaches are usually biased towards seen classes, whose visual instances are the only ones that can be matched to a given class prototype. We propose a regularization method that can be applied to any conditional generative-based ZSL method, by leveraging only the semantic class prototypes. It learns to synthesize discriminative features for possible semantic descriptions that are not available at training time, that is, the unseen ones. The approach is evaluated for ZSL and GZSL on four datasets commonly used in the literature, in both inductive and transductive settings, with results on par with or above state-of-the-art approaches. The code is available at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/hanouticelina\/lsa-zsl\">https:\/\/github.com\/hanouticelina\/lsa-zsl<\/jats:ext-link>.<\/jats:p>","DOI":"10.1007\/s11042-023-14877-1","type":"journal-article","created":{"date-parts":[[2023,4,4]],"date-time":"2023-04-04T11:41:54Z","timestamp":1680608514000},"page":"40745-40759","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Learning semantic ambiguities for zero-shot learning"],"prefix":"10.1007","volume":"82","author":[{"given":"Celina","family":"Hanouti","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0520-8436","authenticated-orcid":false,"given":"Herv\u00e9","family":"Le Borgne","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,3,31]]},"reference":[{"key":"14877_CR1","doi-asserted-by":"crossref","unstructured":"Adjali O, Besancon R, Ferret O et al (2020) Multimodal entity linking for tweets. In: European conference on information retrieval, Lisbon, Portugal","DOI":"10.1007\/978-3-030-45439-5_31"},{"key":"14877_CR2","unstructured":"Arjovsky M, Chintala S, Bottou L (2017) Wasserstein generative adversarial networks. 
In: Proceedings of the 34th international conference on machine learning. JMLR.org, ICML\u201917, vol 70, pp 214\u2013223"},{"key":"14877_CR3","doi-asserted-by":"crossref","unstructured":"Arora G, Verma VK, Mishra A et al (2017) Generalized zero-shot learning via synthesized examples. CoRR: http:\/\/arxiv.org\/abs\/1712.03878","DOI":"10.1109\/CVPR.2018.00450"},{"key":"14877_CR4","unstructured":"Bucher M, Herbin S, Jurie F (2017) Generating Visual Representations for Zero-Shot Classification. In: International conference on computer vision (ICCV) workshops: TASK-CV: transferring and adapting source knowledge in computer vision, Venice, Italy"},{"key":"14877_CR5","doi-asserted-by":"crossref","unstructured":"Chami I, Tamaazousti Y, Le Borgne H (2017) Amecon: abstract meta-concept features for text-illustration. In: ACM international conference on multimedia retrieval. ICMR, Bucharest","DOI":"10.1145\/3078971.3078993"},{"key":"14877_CR6","unstructured":"Chou YY, Lin HT (2021) Adaptive and generative zero-shot learning. In: International conference on learning representations"},{"key":"14877_CR7","unstructured":"Frome A, Corrado GS, Shlens J et al (2013) Devise: a deep visual-semantic embedding model. In: Advances in neural information processing systems, pp 2121\u20132129"},{"key":"14877_CR8","doi-asserted-by":"crossref","unstructured":"Lampert CH, Nickisch H, Harmeling S (2009) Learning to detect unseen object classes by between-class attribute transfer. In: Computer vision and pattern recognition. IEEE, pp 951\u2013958","DOI":"10.1109\/CVPR.2009.5206594"},{"key":"14877_CR9","doi-asserted-by":"crossref","unstructured":"Le Cacheux Y, Le Borgne H (2019a) From classical to generalized zero-shot learning: a simple adaptation process. In: International conference on multimedia modeling. 
Springer, pp 465\u2013477","DOI":"10.1007\/978-3-030-05716-9_38"},{"key":"14877_CR10","doi-asserted-by":"crossref","unstructured":"Le Cacheux Y, Le Borgne H, Crucianu M (2019b) Modeling inter and intra-class relations in the triplet loss for zero-shot learning. In: International conference on computer vision, pp 10,333\u201310,342","DOI":"10.1109\/ICCV.2019.01043"},{"key":"14877_CR11","doi-asserted-by":"crossref","unstructured":"Le Cacheux Y, Le Borgne H, Crucianu M (2020a) Using sentences as semantic embeddings for large scale zero-shot learning. In: ECCV 2020 workshop: transferring and adapting source knowledge in computer vision, Springer","DOI":"10.1007\/978-3-030-66415-2_42"},{"key":"14877_CR12","unstructured":"Le Cacheux Y, Le Borgne H, Crucianu M (2021) Zero-shot Learning with Deep Neural Networks for Object Recognition. In: Benois Pineau J, Zemmari A (eds) Multi-faceted deep learning. Springer, chap 6, pp 273\u2013288"},{"key":"14877_CR13","doi-asserted-by":"crossref","unstructured":"Le Cacheux Y, Popescu A, Le Borgne H (2020b) Webly supervised semantic embeddings for large scale zero-shot learning. In: Asian conference on computer vision","DOI":"10.1007\/978-3-030-69544-6_31"},{"key":"14877_CR14","doi-asserted-by":"crossref","unstructured":"Li J, Jing M, Lu K et al (2019) Leveraging the invariant side of generative zero-shot learning. In: IEEE computer vision and pattern recognition (CVPR)","DOI":"10.1109\/CVPR.2019.00758"},{"key":"14877_CR15","doi-asserted-by":"crossref","unstructured":"Myoupo D, Popescu A, Le Borgne H et al (2010) Multimodal image retrieval over a large database. In: Peters C, Caputo B, Gonzalo J et al (eds) Proceedings of the 10th international conference on Cross-language evaluation forum: multimedia experiments, Lecture notes in computer science. 
Springer Berlin \/ Heidelberg, pp 177\u2013184, Berlin","DOI":"10.1007\/978-3-642-15751-6_20"},{"key":"14877_CR16","doi-asserted-by":"crossref","unstructured":"Narayan S, Gupta A, Khan FS et al (2020) Latent embedding feedback and discriminative features for zero-shot classification. In: ECCV","DOI":"10.1007\/978-3-030-58542-6_29"},{"key":"14877_CR17","doi-asserted-by":"crossref","unstructured":"Nilsback ME, Zisserman A (2008) Automated flower classification over a large number of classes. In: Indian conference on computer vision graphics and image processing","DOI":"10.1109\/ICVGIP.2008.47"},{"key":"14877_CR18","doi-asserted-by":"crossref","unstructured":"Patterson G, Hays J (2012) Sun attribute database: discovering, annotating and recognizing scene attributes. In: Computer vision and pattern recognition","DOI":"10.1109\/CVPR.2012.6247998"},{"key":"14877_CR19","doi-asserted-by":"crossref","unstructured":"Paul A, Krishnan NC, Munjal P (2019) Semantically aligned bias reducing zero shot learning. In: Proceedings of the ieee conference on computer vision and pattern recognition, pp 7056\u20137065","DOI":"10.1109\/CVPR.2019.00722"},{"key":"14877_CR20","doi-asserted-by":"crossref","unstructured":"Reed S, Akata Z, Schiele B et al (2016) Learning deep representations of fine-grained visual descriptions. In: Computer vision and pattern recognition","DOI":"10.1109\/CVPR.2016.13"},{"key":"14877_CR21","unstructured":"Romera-Paredes B, Torr P (2015) An embarrassingly simple approach to zero-shot learning. In: International conference on machine learning"},{"key":"14877_CR22","doi-asserted-by":"crossref","unstructured":"Schonfeld E, Ebrahimi S, Sinha S et al (2019) Generalized zero-shot learning via aligned variational autoencoders. 
In: Computer vision and pattern recognition (CVPR) workshops","DOI":"10.1109\/CVPR.2019.00844"},{"issue":"9","key":"14877_CR23","doi-asserted-by":"publisher","first-page":"2212","DOI":"10.1109\/TPAMI.2019.2913857","volume":"42","author":"Y Tamaazousti","year":"2020","unstructured":"Tamaazousti Y, Le Borgne H, Hudelot C et al (2020) Learning more universal representations for transfer-learning. IEEE T Pattern Anal Mach Intell 42 (9):2212\u20132224","journal-title":"IEEE T Pattern Anal Mach Intell"},{"key":"14877_CR24","doi-asserted-by":"crossref","unstructured":"Tran TQN, Le Borgne H, Crucianu M (2015) Combining generic and specific information for cross-modal retrieval. In: Proc. ACM international conference on multimedia retrieval (ICMR)","DOI":"10.1145\/2671188.2749348"},{"key":"14877_CR25","doi-asserted-by":"crossref","unstructured":"Tran TQN, Le Borgne H, Crucianu M (2016a) Aggregating image and text quantized correlated components. In: Computer vision and pattern recognition, Las Vegas","DOI":"10.1109\/CVPR.2016.225"},{"key":"14877_CR26","doi-asserted-by":"crossref","unstructured":"Tran TQN, Le Borgne H, Crucianu M (2016b) Cross-modal classification by completing unimodal representations. In: ACM multimedia 2016 workshop: vision and language integration meets multimedia fusion, Amsterdam, The Netherlands","DOI":"10.1145\/2983563.2983570"},{"key":"14877_CR27","doi-asserted-by":"crossref","unstructured":"Verma VK, Rai P (2017) A simple exponential family framework for zero-shot learning. In: Machine learning and knowledge discovery in databases","DOI":"10.1007\/978-3-319-71246-8_48"},{"key":"14877_CR28","unstructured":"Wah C, Branson S, Welinder P et al (2011) The Caltech-UCSD Birds-200-2011 Dataset. Tech Rep, CNS-TR-2011-001, California Institute of Technology"},{"key":"14877_CR29","doi-asserted-by":"crossref","unstructured":"Xian Y, Lorenz T, Schiele B et al (2018) Feature generating networks for zero-shot learning. 
In: 2018 IEEE conference on computer vision and pattern recognition, computer vision foundation \/ IEEE computer society. CVPR 2018, Salt Lake City, pp 5542\u20135551","DOI":"10.1109\/CVPR.2018.00581"},{"key":"14877_CR30","doi-asserted-by":"crossref","unstructured":"Xian Y, Schiele B, Akata Z (2017) Zero-shot learning - the good the bad and the ugly. In: Computer vision and pattern recognition","DOI":"10.1109\/CVPR.2017.328"},{"key":"14877_CR31","doi-asserted-by":"crossref","unstructured":"Xian Y, Sharma S, Schiele B et al (2019) F-vaegan-d2: a feature generating framework for any-shot learning. In: Computer vision and pattern recognition, pp 10,267\u201310,276","DOI":"10.1109\/CVPR.2019.01052"},{"key":"14877_CR32","doi-asserted-by":"crossref","unstructured":"Ye M, Guo Y (2017) Zero-shot classification with discriminative semantic representation learning. In: Computer vision and pattern recognition, pp 5103\u20135111","DOI":"10.1109\/CVPR.2017.542"},{"key":"14877_CR33","unstructured":"Zhang H, Ciss\u00e9 M, Dauphin YN et al (2018) Mixup: beyond empirical risk minimization. In: International conference on learning representations"},{"key":"14877_CR34","doi-asserted-by":"crossref","unstructured":"Znaidia A, Shabou A, Popescu A et al (2012) Multimodal feature generation framework for semantic image classification. 
In: ACM international conference on multimedia retrieval (ICMR 2012)","DOI":"10.1145\/2324796.2324842"}],"container-title":["Multimedia Tools and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-023-14877-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11042-023-14877-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-023-14877-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,17]],"date-time":"2024-10-17T17:57:30Z","timestamp":1729187850000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11042-023-14877-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,31]]},"references-count":34,"journal-issue":{"issue":"26","published-print":{"date-parts":[[2023,11]]}},"alternative-id":["14877"],"URL":"https:\/\/doi.org\/10.1007\/s11042-023-14877-1","relation":{},"ISSN":["1380-7501","1573-7721"],"issn-type":[{"value":"1380-7501","type":"print"},{"value":"1573-7721","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,31]]},"assertion":[{"value":"7 February 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 May 2022","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 February 2023","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"31 March 2023","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article 
History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of Interests"}}]}}