{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,5]],"date-time":"2026-06-05T16:12:08Z","timestamp":1780675928742,"version":"3.54.1"},"reference-count":39,"publisher":"Association for Computing Machinery (ACM)","issue":"2s","license":[{"start":{"date-parts":[[2023,2,17]],"date-time":"2023-02-17T00:00:00Z","timestamp":1676592000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Key Research and Development Program of China","award":["2021YFE0205300"],"award-info":[{"award-number":["2021YFE0205300"]}]},{"name":"Capital\u2019s Funds for Health Improvement and Research","award":["2020-2-4079"],"award-info":[{"award-number":["2020-2-4079"]}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"crossref","award":["2020XD-A02-3"],"award-info":[{"award-number":["2020XD-A02-3"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2023,6,30]]},"abstract":"<jats:p>Generating food images from recipe and ingredient information can be applied to many tasks such as food recommendation, recipe development, and health management. For the characteristics of food images, this paper proposes ML-CookGAN, a novel CGAN. This network enables the generation of food images based on recipe and ingredient labels. The generator of ML-CookGAN, Multi-Label Fusion Generator, converts recipe and ingredient labels into different granularity features and generates corresponding food images. The discriminator of ML-CookGAN, Multi-Branch Discriminator, implements discrimination and classification with a multi-branch structure. In addition, we propose two training strategies, Region-Wise Pooling and Image Style Distillation, to better the network performance. Region-Wise Pooling handles region-wise features with the discriminator. Image Style Distillation aims at extracting image latent features to assist image generation by an unsupervised method. The experiments conducted on VIREO Food-172 databases validate the proposed method to generate high-quality Chinese food images. And Region-Wise Pooling and Image Style Distillation are proven to enhance the diversity and realism of generated food images.<\/jats:p>","DOI":"10.1145\/3554738","type":"journal-article","created":{"date-parts":[[2022,8,13]],"date-time":"2022-08-13T10:55:13Z","timestamp":1660388113000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":15,"title":["ML-CookGAN: Multi-Label Generative Adversarial Network for Food Image Generation"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6133-4632","authenticated-orcid":false,"given":"Zhiming","family":"Liu","sequence":"first","affiliation":[{"name":"Key Laboratory of Universal Wireless Communications, Ministry of Education, Beijing University of Posts and Telecommunications, Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8076-1867","authenticated-orcid":false,"given":"Kai","family":"Niu","sequence":"additional","affiliation":[{"name":"Key Laboratory of Universal Wireless Communications, Ministry of Education, Beijing University of Posts and Telecommunications, Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5010-3118","authenticated-orcid":false,"given":"Zhiqiang","family":"He","sequence":"additional","affiliation":[{"name":"Key Laboratory of Universal Wireless Communications, Ministry of Education, Beijing University of Posts and Telecommunications, Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2023,2,17]]},"reference":[{"key":"e_1_3_1_2_2","first-page":"214","volume-title":"International Conference on Machine Learning","author":"Arjovsky Martin","year":"2017","unstructured":"Martin Arjovsky, Soumith Chintala, and L\u00e9on Bottou. 2017. Wasserstein generative adversarial networks. In International Conference on Machine Learning. PMLR, 214\u2013223."},{"key":"e_1_3_1_3_2","article-title":"A note on the inception score","author":"Barratt Shane","year":"2018","unstructured":"Shane Barratt and Rishi Sharma. 2018. A note on the inception score. arXiv preprint arXiv:1801.01973 (2018).","journal-title":"arXiv preprint arXiv:1801.01973"},{"key":"e_1_3_1_4_2","first-page":"394","volume-title":"International Conference on Image Analysis and Processing","author":"Bola\u00f1os Marc","year":"2017","unstructured":"Marc Bola\u00f1os, Aina Ferr\u00e0, and Petia Radeva. 2017. Food ingredients recognition through multi-label learning. In International Conference on Image Analysis and Processing. Springer, 394\u2013402."},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cviu.2018.10.009"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10599-4_29"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/2964284.2964315"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2020.3045639"},{"key":"e_1_3_1_9_2","first-page":"2217","volume-title":"Proceedings of the 27th ACM International Conference on Multimedia","author":"Cho Jaehyeong","year":"2019","unstructured":"Jaehyeong Cho, Wataru Shimoda, and Keiji Yanai. 2019. Ramen as you like: Sketch-based food image generation and editing. in Proceedings of the 27th ACM International Conference on Multimedia. 2217\u20132218."},{"key":"e_1_3_1_10_2","article-title":"A review of Generative Adversarial Networks (GANs) and its applications in a wide variety of disciplines\u2013from medical to remote sensing","author":"Dash Ankan","year":"2021","unstructured":"Ankan Dash, Junyi Ye, and Guiling Wang. 2021. A review of Generative Adversarial Networks (GANs) and its applications in a wide variety of disciplines\u2013from medical to remote sensing. arXiv preprint arXiv:2110.01442 (2021).","journal-title":"arXiv preprint arXiv:2110.01442"},{"key":"e_1_3_1_11_2","article-title":"Generative adversarial nets","volume":"27","author":"Goodfellow Ian","year":"2014","unstructured":"Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. Advances in Neural Information Processing Systems 27 (2014).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_12_2","article-title":"A review on generative adversarial networks: Algorithms, theory, and applications","author":"Gui Jie","year":"2020","unstructured":"Jie Gui, Zhenan Sun, Yonggang Wen, Dacheng Tao, and Jieping Ye. 2020. A review on generative adversarial networks: Algorithms, theory, and applications. arXiv preprint arXiv:2001.06937 (2020).","journal-title":"arXiv preprint arXiv:2001.06937"},{"key":"e_1_3_1_13_2","article-title":"Improved training of Wasserstein GANs","author":"Gulrajani Ishaan","year":"2017","unstructured":"Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron Courville. 2017. Improved training of Wasserstein GANs. arXiv preprint arXiv:1704.00028 (2017).","journal-title":"arXiv preprint arXiv:1704.00028"},{"key":"e_1_3_1_14_2","first-page":"1450","volume-title":"Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision","author":"Han Fangda","year":"2020","unstructured":"Fangda Han, Ricardo Guerrero, and Vladimir Pavlovic. 2020. CookGAN: Meal image synthesis from ingredients. In Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision. 1450\u20131458."},{"key":"e_1_3_1_15_2","article-title":"MPG: A multi-ingredient pizza image generator with conditional StyleGANs","author":"Han Fangda","year":"2020","unstructured":"Fangda Han, Guoyao Hao, Ricardo Guerrero, and Vladimir Pavlovic. 2020. MPG: A multi-ingredient pizza image generator with conditional StyleGANs. arXiv preprint arXiv:2012.02821 (2020).","journal-title":"arXiv preprint arXiv:2012.02821"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_1_17_2","article-title":"GANs trained by a two time-scale update rule converge to a local Nash equilibrium","volume":"30","author":"Heusel Martin","year":"2017","unstructured":"Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. 2017. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Advances in Neural Information Processing Systems 30 (2017).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00453"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00813"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.19"},{"key":"e_1_3_1_21_2","article-title":"MVANet: Multi-tasks guided multi-view attention network for Chinese food recognition","author":"Liang Haozan","year":"2020","unstructured":"Haozan Liang, Guihua Wen, Yang Hu, Mingnan Luo, Pei Yang, and Yingxue Xu. 2020. MVANet: Multi-tasks guided multi-view attention network for Chinese food recognition. IEEE Transactions on Multimedia (2020).","journal-title":"IEEE Transactions on Multimedia"},{"key":"e_1_3_1_22_2","article-title":"Food and ingredient joint learning for fine-grained recognition","author":"Liu Chengxu","year":"2020","unstructured":"Chengxu Liu, Yuanzhi Liang, Yao Xue, Xueming Qian, and Jianlong Fu. 2020. Food and ingredient joint learning for fine-grained recognition. IEEE Transactions on Circuits and Systems for Video Technology (2020).","journal-title":"IEEE Transactions on Circuits and Systems for Video Technology"},{"key":"e_1_3_1_23_2","first-page":"14286","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Liu Steven","year":"2020","unstructured":"Steven Liu, Tongzhou Wang, David Bau, Jun-Yan Zhu, and Antonio Torralba. 2020. Diverse image generation via self-conditioned GANs. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 14286\u201314295."},{"key":"e_1_3_1_24_2","article-title":"BAGAN: Data augmentation with balancing GAN","author":"Mariani Giovanni","year":"2018","unstructured":"Giovanni Mariani, Florian Scheidegger, Roxana Istrate, Costas Bekas, and Cristiano Malossi. 2018. BAGAN: Data augmentation with balancing GAN. arXiv preprint arXiv:1803.09655 (2018).","journal-title":"arXiv preprint arXiv:1803.09655"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2019.2927476"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/3329168"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3350948"},{"key":"e_1_3_1_28_2","article-title":"Conditional generative adversarial nets","author":"Mirza Mehdi","year":"2014","unstructured":"Mehdi Mirza and Simon Osindero. 2014. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784 (2014).","journal-title":"arXiv preprint arXiv:1411.1784"},{"key":"e_1_3_1_29_2","article-title":"Spectral normalization for generative adversarial networks","author":"Miyato Takeru","year":"2018","unstructured":"Takeru Miyato, Toshiki Kataoka, Masanori Koyama, and Yuichi Yoshida. 2018. Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957 (2018).","journal-title":"arXiv preprint arXiv:1802.05957"},{"key":"e_1_3_1_30_2","first-page":"2642","volume-title":"International Conference on Machine Learning","author":"Odena Augustus","year":"2017","unstructured":"Augustus Odena, Christopher Olah, and Jonathon Shlens. 2017. Conditional image synthesis with auxiliary classifier GANs. In International Conference on Machine Learning. PMLR, 2642\u20132651."},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3413636"},{"key":"e_1_3_1_32_2","first-page":"8002","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Papadopoulos Dim P.","year":"2019","unstructured":"Dim P. Papadopoulos, Youssef Tamaazousti, Ferda Ofli, Ingmar Weber, and Antonio Torralba. 2019. How to make a pizza: Learning a compositional layer-based GAN model. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 8002\u20138011."},{"key":"e_1_3_1_33_2","first-page":"2234","article-title":"Improved techniques for training GANs","volume":"29","author":"Salimans Tim","year":"2016","unstructured":"Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. 2016. Improved techniques for training GANs. Advances in Neural Information Processing Systems 29 (2016), 2234\u20132242.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_34_2","first-page":"10453","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Salvador Amaia","year":"2019","unstructured":"Amaia Salvador, Michal Drozdzal, Xavier Giro-i Nieto, and Adriana Romero. 2019. Inverse cooking: Recipe generation from food images. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 10453\u201310462."},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.308"},{"key":"e_1_3_1_36_2","article-title":"A note on the evaluation of generative models","author":"Theis Lucas","year":"2015","unstructured":"Lucas Theis, A\u00e4ron van den Oord, and Matthias Bethge. 2015. A note on the evaluation of generative models. arXiv preprint arXiv:1511.01844 (2015).","journal-title":"arXiv preprint arXiv:1511.01844"},{"key":"e_1_3_1_37_2","article-title":"Food recommender systems: Important contributions, challenges and future research directions","author":"Trattner Christoph","year":"2017","unstructured":"Christoph Trattner and David Elsweiler. 2017. Food recommender systems: Important contributions, challenges and future research directions. arXiv preprint arXiv:1711.02760 (2017).","journal-title":"arXiv preprint arXiv:1711.02760"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN.2015.7280356"},{"key":"e_1_3_1_39_2","first-page":"5519","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Zhu Bin","year":"2020","unstructured":"Bin Zhu and Chong-Wah Ngo. 2020. CookGAN: Causality based text-to-image synthesis. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 5519\u20135527."},{"key":"e_1_3_1_40_2","first-page":"11477","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Zhu Bin","year":"2019","unstructured":"Bin Zhu, Chong-Wah Ngo, Jingjing Chen, and Yanbin Hao. 2019. R2GAN: Cross-modal recipe retrieval with generative adversarial network. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 11477\u201311486."}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3554738","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3554738","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:49:29Z","timestamp":1750182569000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3554738"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,17]]},"references-count":39,"journal-issue":{"issue":"2s","published-print":{"date-parts":[[2023,6,30]]}},"alternative-id":["10.1145\/3554738"],"URL":"https:\/\/doi.org\/10.1145\/3554738","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,17]]},"assertion":[{"value":"2021-11-05","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-07-23","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-02-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}