{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,15]],"date-time":"2025-11-15T10:30:58Z","timestamp":1763202658211,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":34,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,6,27]],"date-time":"2022-06-27T00:00:00Z","timestamp":1656288000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,6,27]]},"DOI":"10.1145\/3512527.3531389","type":"proceedings-article","created":{"date-parts":[[2022,6,23]],"date-time":"2022-06-23T22:23:32Z","timestamp":1656023012000},"page":"268-276","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Disentangled Representations and Hierarchical Refinement of Multi-Granularity Features for Text-to-Image Synthesis"],"prefix":"10.1145","author":[{"given":"Pei","family":"Dong","sequence":"first","affiliation":[{"name":"Shandong University, JiNan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lei","family":"Wu","sequence":"additional","affiliation":[{"name":"Shandong University, JiNan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lei","family":"Meng","sequence":"additional","affiliation":[{"name":"Shandong University, JiNan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiangxu","family":"Meng","sequence":"additional","affiliation":[{"name":"Shandong University, JiNan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,6,27]]},"reference":[{"key":"e_1_3_2_2_1_1","volume-title":"Asian Conference on Computer Vision. Springer, 100--116","author":"Chen Kevin","year":"2018","unstructured":"Kevin Chen , Christopher B Choy , Manolis Savva , Angel X Chang , Thomas Funkhouser , and Silvio Savarese . 2018 . Text2shape: Generating shapes from natural language by learning joint embeddings . In Asian Conference on Computer Vision. Springer, 100--116 . Kevin Chen, Christopher B Choy, Manolis Savva, Angel X Chang, Thomas Funkhouser, and Silvio Savarese. 2018. Text2shape: Generating shapes from natural language by learning joint embeddings. In Asian Conference on Computer Vision. Springer, 100--116."},{"key":"e_1_3_2_2_2_1","volume-title":"Proceedings of the 30th International Conference on Neural Information Processing Systems. 2180--2188","author":"Chen Xi","year":"2016","unstructured":"Xi Chen , Yan Duan , Rein Houthooft , John Schulman , Ilya Sutskever , and Pieter Abbeel . 2016 . Infogan: Interpretable representation learning by information maximizing generative adversarial nets . In Proceedings of the 30th International Conference on Neural Information Processing Systems. 2180--2188 . Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, and Pieter Abbeel. 2016. Infogan: Interpretable representation learning by information maximizing generative adversarial nets. In Proceedings of the 30th International Conference on Neural Information Processing Systems. 2180--2188."},{"key":"e_1_3_2_2_3_1","volume-title":"Sheraz Ahmed, Marcus Liwicki, and Muhammad Zeshan Afzal.","author":"Dash Ayushman","year":"2017","unstructured":"Ayushman Dash , John Cristian Borges Gamboa , Sheraz Ahmed, Marcus Liwicki, and Muhammad Zeshan Afzal. 2017 . Tac-gan-text conditioned auxiliary classifier generative adversarial network. arXiv preprint arXiv:1703.06412 (2017). Ayushman Dash, John Cristian Borges Gamboa, Sheraz Ahmed, Marcus Liwicki, and Muhammad Zeshan Afzal. 2017. Tac-gan-text conditioned auxiliary classifier generative adversarial network. arXiv preprint arXiv:1703.06412 (2017)."},{"key":"e_1_3_2_2_4_1","volume-title":"Generative adversarial nets. Advances in neural information processing systems 27","author":"Goodfellow Ian","year":"2014","unstructured":"Ian Goodfellow , Jean Pouget-Abadie , Mehdi Mirza , Bing Xu , David Warde-Farley , Sherjil Ozair , Aaron Courville , and Yoshua Bengio . 2014. Generative adversarial nets. Advances in neural information processing systems 27 ( 2014 ). Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. Advances in neural information processing systems 27 (2014)."},{"key":"e_1_3_2_2_5_1","volume-title":"Gans trained by a two time-scale update rule converge to a local nash equilibrium. Advances in neural information processing systems 30","author":"Heusel Martin","year":"2017","unstructured":"Martin Heusel , Hubert Ramsauer , Thomas Unterthiner , Bernhard Nessler , and Sepp Hochreiter . 2017. Gans trained by a two time-scale update rule converge to a local nash equilibrium. Advances in neural information processing systems 30 ( 2017 ). Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. 2017. Gans trained by a two time-scale update rule converge to a local nash equilibrium. Advances in neural information processing systems 30 (2017)."},{"key":"e_1_3_2_2_6_1","volume-title":"International Conference on Learning Representations.","author":"Hinz Tobias","year":"2018","unstructured":"Tobias Hinz , Stefan Heinrich , and Stefan Wermter . 2018 . Generating Multiple Objects at Spatially Distinct Locations . In International Conference on Learning Representations. Tobias Hinz, Stefan Heinrich, and Stefan Wermter. 2018. Generating Multiple Objects at Spatially Distinct Locations. In International Conference on Learning Representations."},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2020.3021209"},{"key":"e_1_3_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00833"},{"key":"e_1_3_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.215"},{"key":"e_1_3_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.01245"},{"key":"e_1_3_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00649"},{"key":"e_1_3_2_2_12_1","first-page":"3948","article-title":"Pastegan: A semi-parametric method to generate image from scene graph","volume":"32","author":"Li Yikang","year":"2019","unstructured":"Yikang Li , Tao Ma , Yeqi Bai , Nan Duan , Sining Wei , and Xiaogang Wang . 2019 . Pastegan: A semi-parametric method to generate image from scene graph . Advances in Neural Information Processing Systems 32 (2019), 3948 -- 3958 . Yikang Li, Tao Ma, Yeqi Bai, Nan Duan, Sining Wei, and Xiaogang Wang. 2019. Pastegan: A semi-parametric method to generate image from scene graph. Advances in Neural Information Processing Systems 32 (2019), 3948--3958.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00898"},{"key":"e_1_3_2_2_14_1","volume-title":"Interactive image generation using scene graphs. arXiv preprint arXiv:1905.03743","author":"Mittal Gaurav","year":"2019","unstructured":"Gaurav Mittal , Shubham Agrawal , Anuva Agarwal , Sushant Mehta , and Tanya Marwah . 2019. Interactive image generation using scene graphs. arXiv preprint arXiv:1905.03743 ( 2019 ). Gaurav Mittal, Shubham Agrawal, Anuva Agarwal, Sushant Mehta, and Tanya Marwah. 2019. Interactive image generation using scene graphs. arXiv preprint arXiv:1905.03743 (2019)."},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3372278.3390684"},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00160"},{"key":"e_1_3_2_2_17_1","volume-title":"International Conference on Machine Learning. PMLR, 1060--1069","author":"Reed Scott","year":"2016","unstructured":"Scott Reed , Zeynep Akata , Xinchen Yan , Lajanugen Logeswaran , Bernt Schiele , and Honglak Lee . 2016 . Generative adversarial text to image synthesis . In International Conference on Machine Learning. PMLR, 1060--1069 . Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, and Honglak Lee. 2016. Generative adversarial text to image synthesis. In International Conference on Machine Learning. PMLR, 1060--1069."},{"key":"e_1_3_2_2_18_1","volume-title":"Learning what and where to draw. Advances in neural information processing systems 29","author":"Reed Scott E","year":"2016","unstructured":"Scott E Reed , Zeynep Akata , Santosh Mohan , Samuel Tenka , Bernt Schiele , and Honglak Lee . 2016. Learning what and where to draw. Advances in neural information processing systems 29 ( 2016 ), 217--225. Scott E Reed, Zeynep Akata, Santosh Mohan, Samuel Tenka, Bernt Schiele, and Honglak Lee. 2016. Learning what and where to draw. Advances in neural information processing systems 29 (2016), 217--225."},{"key":"e_1_3_2_2_19_1","volume-title":"Improved techniques for training gans. Advances in neural information processing systems 29","author":"Salimans Tim","year":"2016","unstructured":"Tim Salimans , Ian Goodfellow , Wojciech Zaremba , Vicki Cheung , Alec Radford , and Xi Chen . 2016. Improved techniques for training gans. Advances in neural information processing systems 29 ( 2016 ), 2234--2242. Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. 2016. Improved techniques for training gans. Advances in neural information processing systems 29 (2016), 2234--2242."},{"key":"e_1_3_2_2_20_1","volume-title":"NIPS Workshop on Bayesian Deep Learning","volume":"8","author":"Shu Rui","year":"2017","unstructured":"Rui Shu , Hung Bui , and Stefano Ermon . 2017 . Ac-gan learns a biased distribution . In NIPS Workshop on Bayesian Deep Learning , Vol. 8 . Rui Shu, Hung Bui, and Stefano Ermon. 2017. Ac-gan learns a biased distribution. In NIPS Workshop on Bayesian Deep Learning, Vol. 8."},{"key":"e_1_3_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00665"},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.308"},{"key":"e_1_3_2_2_23_1","unstructured":"CatherineWah Steve Branson PeterWelinder Pietro Perona and Serge Belongie. 2011. The caltech-ucsd birds-200--2011 dataset. (2011).  CatherineWah Steve Branson PeterWelinder Pietro Perona and Serge Belongie. 2011. The caltech-ucsd birds-200--2011 dataset. (2011)."},{"key":"e_1_3_2_2_24_1","volume-title":"Proceedings of the 26th ACM international conference on Multimedia. 274--282","author":"Yuan Yufeng","year":"2018","unstructured":"GuanshuoWang, Yufeng Yuan , Xiong Chen , Jiwei Li , and Xi Zhou . 2018 . Learning discriminative features with multiple granularities for person re-identification . In Proceedings of the 26th ACM international conference on Multimedia. 274--282 . GuanshuoWang, Yufeng Yuan, Xiong Chen, Jiwei Li, and Xi Zhou. 2018. Learning discriminative features with multiple granularities for person re-identification. In Proceedings of the 26th ACM international conference on Multimedia. 274--282."},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3391709","article-title":"End-to-End Text-to-Image Synthesis with Spatial Constrains","volume":"11","author":"Wang Min","year":"2020","unstructured":"Min Wang , Congyan Lang , Liqian Liang , Songhe Feng , Tao Wang , and Yutong Gao . 2020 . End-to-End Text-to-Image Synthesis with Spatial Constrains . ACM Transactions on Intelligent Systems and Technology (TIST) 11 , 4 (2020), 1 -- 19 . Min Wang, Congyan Lang, Liqian Liang, Songhe Feng, Tao Wang, and Yutong Gao. 2020. End-to-End Text-to-Image Synthesis with Spatial Constrains. ACM Transactions on Intelligent Systems and Technology (TIST) 11, 4 (2020), 1--19.","journal-title":"ACM Transactions on Intelligent Systems and Technology (TIST)"},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICME46284.2020.9102761"},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00143"},{"key":"e_1_3_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.629"},{"key":"e_1_3_2_2_29_1","volume-title":"Realistic image synthesis with stacked generative adversarial networks","author":"Zhang Han","year":"2018","unstructured":"Han Zhang , Tao Xu , Hongsheng Li , Shaoting Zhang , Xiaogang Wang , Xiaolei Huang , and Dimitris N Metaxas . 2018. Stackgan++ : Realistic image synthesis with stacked generative adversarial networks . IEEE transactions on pattern analysis and machine intelligence 41, 8 ( 2018 ), 1947--1962. Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, and Dimitris N Metaxas. 2018. Stackgan++: Realistic image synthesis with stacked generative adversarial networks. IEEE transactions on pattern analysis and machine intelligence 41, 8 (2018), 1947--1962."},{"key":"e_1_3_2_2_30_1","volume-title":"International Conference on Machine Learning. PMLR, 11117--11128","author":"Zhang Jize","year":"2020","unstructured":"Jize Zhang , Bhavya Kailkhura , and T Yong-Jin Han . 2020 . Mix-n-match: Ensemble and compositional methods for uncertainty calibration in deep learning . In International Conference on Machine Learning. PMLR, 11117--11128 . Jize Zhang, Bhavya Kailkhura, and T Yong-Jin Han. 2020. Mix-n-match: Ensemble and compositional methods for uncertainty calibration in deep learning. In International Conference on Machine Learning. PMLR, 11117--11128."},{"key":"e_1_3_2_2_31_1","first-page":"256","article-title":"PixelBrush: Art Generation from text with GANs. In Cl. Proj. Stanford CS231N Convolutional Neural Networks Vis. Recognition","volume":"2017","author":"Zhi Jiale","year":"2017","unstructured":"Jiale Zhi . 2017 . PixelBrush: Art Generation from text with GANs. In Cl. Proj. Stanford CS231N Convolutional Neural Networks Vis. Recognition , Sprint 2017. 256 . Jiale Zhi. 2017. PixelBrush: Art Generation from text with GANs. In Cl. Proj. Stanford CS231N Convolutional Neural Networks Vis. Recognition, Sprint 2017. 256.","journal-title":"Sprint"},{"key":"e_1_3_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00167"},{"key":"e_1_3_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.244"},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00595"}],"event":{"name":"ICMR '22: International Conference on Multimedia Retrieval","sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"location":"Newark NJ USA","acronym":"ICMR '22"},"container-title":["Proceedings of the 2022 International Conference on Multimedia Retrieval"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3512527.3531389","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3512527.3531389","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:30:12Z","timestamp":1750188612000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3512527.3531389"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,27]]},"references-count":34,"alternative-id":["10.1145\/3512527.3531389","10.1145\/3512527"],"URL":"https:\/\/doi.org\/10.1145\/3512527.3531389","relation":{},"subject":[],"published":{"date-parts":[[2022,6,27]]},"assertion":[{"value":"2022-06-27","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}