{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T05:01:46Z","timestamp":1750309306461,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":46,"publisher":"ACM","license":[{"start":{"date-parts":[[2023,3,10]],"date-time":"2023-03-10T00:00:00Z","timestamp":1678406400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,3,10]]},"DOI":"10.1145\/3589572.3589579","type":"proceedings-article","created":{"date-parts":[[2023,6,9]],"date-time":"2023-06-09T16:19:07Z","timestamp":1686327547000},"page":"42-50","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["SkeletonGAN: Fine-Grained Pose Synthesis of Human-Object Interactions"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0005-0146-0724","authenticated-orcid":false,"given":"Qi","family":"Sun","sequence":"first","affiliation":[{"name":"Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, China and University of Chinese Academy of Sciences, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8061-2343","authenticated-orcid":false,"given":"Nanxi","family":"Chen","sequence":"additional","affiliation":[{"name":"Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, China and University of Chinese Academy of Sciences, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-4130-6271","authenticated-orcid":false,"given":"Ruipeng","family":"Zhang","sequence":"additional","affiliation":[{"name":"Shanghai Institute of Microsystem and Information Technology, Chinese Academy of 
Sciences, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7478-4544","authenticated-orcid":false,"given":"Jiamao","family":"Li","sequence":"additional","affiliation":[{"name":"Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3307-9838","authenticated-orcid":false,"given":"Xiaolin","family":"Zhang","sequence":"additional","affiliation":[{"name":"Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,6,9]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"International conference on machine learning. PMLR, 214\u2013223","author":"Arjovsky Martin","year":"2017","unstructured":"Martin Arjovsky, Soumith Chintala, and L\u00e9on Bottou. 2017. Wasserstein generative adversarial networks. In International conference on machine learning. PMLR, 214\u2013223."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00466"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00870"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00242"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.168"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01092"},{"key":"e_1_3_2_1_7_1","volume-title":"Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516","author":"Dinh Laurent","year":"2014","unstructured":"Laurent Dinh, David Krueger, and Yoshua Bengio. 2014. 
Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014)."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.18178\/joig.6.2.152-159"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3413854"},{"key":"e_1_3_2_1_10_1","unstructured":"Ian Goodfellow Jean Pouget-Abadie Mehdi Mirza Bing Xu David Warde-Farley Sherjil Ozair Aaron Courville and Y. Bengio. 2014. Generative Adversarial Nets. In Neural Information Processing Systems."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58574-7_13"},{"key":"e_1_3_2_1_12_1","volume-title":"Nips","author":"Heusel Martin","year":"2017","unstructured":"Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. 2017. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Advances in Neural Information Processing Systems 2017-December, Nips (2017), 6627\u20136638. arxiv:1706.08500"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV51458.2022.00324"},{"key":"#cr-split#-e_1_3_2_1_14_1.1","doi-asserted-by":"crossref","unstructured":"Zhi Hou Xiaojiang Peng Yu Qiao and Dacheng Tao. 2020. Visual Compositional Learning for Human-Object Interaction Detection. 
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 12360 LNCS (2020) 584-600. https:\/\/doi.org\/10.1007\/978-3-030-58555-6_35 arxiv:2007.12407 10.1007\/978-3-030-58555-6_35","DOI":"10.1007\/978-3-030-58555-6_35"},{"key":"#cr-split#-e_1_3_2_1_14_1.2","doi-asserted-by":"crossref","unstructured":"Zhi Hou Xiaojiang Peng Yu Qiao and Dacheng Tao. 2020. Visual Compositional Learning for Human-Object Interaction Detection. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 12360 LNCS (2020) 584-600. https:\/\/doi.org\/10.1007\/978-3-030-58555-6_35 arxiv:2007.12407","DOI":"10.1007\/978-3-030-58555-6_35"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00133"},{"key":"e_1_3_2_1_16_1","volume-title":"Learning to generate images of outdoor scenes from attributes and semantic layouts. arXiv preprint arXiv:1612.00215","author":"Karacan Levent","year":"2016","unstructured":"Levent Karacan, Zeynep Akata, Aykut Erdem, and Erkut Erdem. 2016. Learning to generate images of outdoor scenes from attributes and semantic layouts. arXiv preprint arXiv:1612.00215 (2016)."},{"key":"e_1_3_2_1_17_1","volume-title":"2nd International Conference on Learning Representations, ICLR 2014 - Conference Track Proceedings","author":"P.","year":"2014","unstructured":"Diederik\u00a0P. Kingma and Max Welling. 2014. Auto-encoding variational bayes. 2nd International Conference on Learning Representations, ICLR 2014 - Conference Track Proceedings (2014), 1\u201314. arxiv:1312.6114"},
{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-016-0981-7"},{"key":"e_1_3_2_1_19_1","volume-title":"NeurIPS","author":"Li Yikang","year":"2019","unstructured":"Yikang Li, Tao Ma, Yeqi Bai, Nan Duan, Sining Wei, and Xiaogang Wang. 2019. PasteGAN: A semi-parametric method to generate image from scene graph. Advances in Neural Information Processing Systems 32, NeurIPS (2019), 1\u201311. arxiv:1905.01608"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00046"},{"key":"e_1_3_2_1_21_1","first-page":"740","article-title":"Microsoft COCO: Common objects in context. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8693 LNCS","volume":"5","author":"Lin Tsung\u00a0Yi","year":"2014","unstructured":"Tsung\u00a0Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Doll\u00e1r, and C.\u00a0Lawrence Zitnick. 2014. Microsoft COCO: Common objects in context. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8693 LNCS, PART 5 (2014), 740\u2013755. https:\/\/doi.org\/10.1007\/978-3-319-10602-1_48 arxiv:1405.0312","journal-title":"PART"},
{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.01121"},{"key":"e_1_3_2_1_23_1","volume-title":"Conditional Generative Adversarial Nets. CoRR abs\/1411.1784","author":"Mirza Mehdi","year":"2014","unstructured":"Mehdi Mirza and Simon Osindero. 2014. Conditional Generative Adversarial Nets. CoRR abs\/1411.1784 (2014). arXiv:1411.1784 http:\/\/arxiv.org\/abs\/1411.1784"},{"key":"e_1_3_2_1_24_1","volume-title":"Stacked Hourglass Networks for Human Pose Estimation. CoRR abs\/1603.06937","author":"Newell Alejandro","year":"2016","unstructured":"Alejandro Newell, Kaiyu Yang, and Jia Deng. 2016. Stacked Hourglass Networks for Human Pose Estimation. CoRR abs\/1603.06937 (2016). arXiv:1603.06937 http:\/\/arxiv.org\/abs\/1603.06937"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.18178\/joig.9.2.50-54"},{"key":"e_1_3_2_1_26_1","volume-title":"Real-time 2d multi-person pose estimation on cpu: Lightweight openpose. arXiv preprint arXiv:1811.12004","author":"Osokin Daniil","year":"2018","unstructured":"Daniil Osokin. 2018. Real-time 2d multi-person pose estimation on cpu: Lightweight openpose. arXiv preprint arXiv:1811.12004 (2018)."},
{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00244"},{"key":"e_1_3_2_1_28_1","volume-title":"6th International Conference on Learning Representations, ICLR 2018 - Conference Track Proceedings","author":"Petzka Henning","year":"2018","unstructured":"Henning Petzka, Asja Fischer, and Denis Lukovnikov. 2018. On the regularization of Wasserstein GANs. 6th International Conference on Learning Representations, ICLR 2018 - Conference Track Proceedings (2018), 1\u201324. arxiv:1709.08894"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00899"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00918"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00160"},{"key":"e_1_3_2_1_32_1","volume-title":"Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434","author":"Radford Alec","year":"2015","unstructured":"Alec Radford, Luke Metz, and Soumith Chintala. 2015. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434 (2015)."},{"key":"e_1_3_2_1_33_1","volume-title":"High-Resolution Image Synthesis with Latent Diffusion Models. CoRR abs\/2112.10752","author":"Rombach Robin","year":"2021","unstructured":"Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Bj\u00f6rn Ommer. 2021. 
High-Resolution Image Synthesis with Latent Diffusion Models. CoRR abs\/2112.10752 (2021). arXiv:2112.10752 https:\/\/arxiv.org\/abs\/2112.10752"},{"key":"e_1_3_2_1_34_1","volume-title":"Laion-5b: An open large-scale dataset for training next generation image-text models. arXiv preprint arXiv:2210.08402","author":"Schuhmann Christoph","year":"2022","unstructured":"Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, 2022. Laion-5b: An open large-scale dataset for training next generation image-text models. arXiv preprint arXiv:2210.08402 (2022)."},{"key":"e_1_3_2_1_35_1","unstructured":"Subarna Tripathi Anahita Bhiwandiwalla Alexei Bastidas and Hanlin Tang. 2019. Using Scene Graph Context to Improve Image Generation. (2019). arxiv:1901.03762 http:\/\/arxiv.org\/abs\/1901.03762"},{"key":"e_1_3_2_1_36_1","volume-title":"Conditional Image Generation with PixelCNN Decoders. CoRR abs\/1606.05328","author":"van\u00a0den Oord A\u00e4ron","year":"2016","unstructured":"A\u00e4ron van\u00a0den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, and Koray Kavukcuoglu. 2016. Conditional Image Generation with PixelCNN Decoders. 
CoRR abs\/1606.05328 (2016). arXiv:1606.05328 http:\/\/arxiv.org\/abs\/1606.05328"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00143"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2018.2856256"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46487-9_40"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-71278-5_11"},{"key":"e_1_3_2_1_41_1","volume-title":"Pose-Guided Person Image Synthesis for Data Augmentation in Pedestrian Detection. In 2021 IEEE Intelligent Vehicles Symposium (IV). IEEE, 1493\u20131500","author":"Zhi Rong","year":"2021","unstructured":"Rong Zhi, Zijie Guo, Wuqiang Zhang, Baofeng Wang, Vitali Kaiser, Julian Wiederer, and Fabian\u00a0B Flohr. 2021. Pose-Guided Person Image Synthesis for Data Augmentation in Pedestrian Detection. In 2021 IEEE Intelligent Vehicles Symposium (IV). IEEE, 1493\u20131500."},{"key":"e_1_3_2_1_42_1","volume-title":"Toward multimodal image-to-image translation. Advances in neural information processing systems 30","author":"Zhu Jun-Yan","year":"2017","unstructured":"Jun-Yan Zhu, Richard Zhang, Deepak Pathak, Trevor Darrell, Alexei\u00a0A Efros, Oliver Wang, and Eli Shechtman. 2017. Toward multimodal image-to-image translation. Advances in neural information processing systems 30 (2017)."},
{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00245"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2021.3068236"},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00551"}],"event":{"name":"ICMVA 2023: 2023 The 6th International Conference on Machine Vision and Applications","acronym":"ICMVA 2023","location":"Singapore Singapore"},"container-title":["Proceedings of the 2023 6th International Conference on Machine Vision and Applications"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3589572.3589579","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3589572.3589579","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:03:45Z","timestamp":1750291425000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3589572.3589579"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,10]]},"references-count":46,"alternative-id":["10.1145\/3589572.3589579","10.1145\/3589572"],"URL":"https:\/\/doi.org\/10.1145\/3589572.3589579","relation":{},"subject":[],"published":{"date-parts":[[2023,3,10]]},"assertion":[{"value":"2023-06-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}