{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,27]],"date-time":"2026-05-27T15:39:59Z","timestamp":1779896399071,"version":"3.53.1"},"publisher-location":"New York, NY, USA","reference-count":41,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,10,15]],"date-time":"2019-10-15T00:00:00Z","timestamp":1571097600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Program for Support of Top-notch Young Professionals"},{"DOI":"10.13039\/501100012659","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61733007"],"award-info":[{"award-number":["61733007"]}],"id":[{"id":"10.13039\/501100012659","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Program for HUST Academic Frontier Youth Team","award":["2017QYTD08"],"award-info":[{"award-number":["2017QYTD08"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,10,15]]},"DOI":"10.1145\/3343031.3350929","type":"proceedings-article","created":{"date-parts":[[2019,10,21]],"date-time":"2019-10-21T16:32:26Z","timestamp":1571675546000},"page":"1500-1508","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":122,"title":["Editing Text in the Wild"],"prefix":"10.1145","author":[{"given":"Liang","family":"Wu","sequence":"first","affiliation":[{"name":"Huazhong University of Science and Technology, Wuhan, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Chengquan","family":"Zhang","sequence":"additional","affiliation":[{"name":"Baidu Inc., Shenzhen, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jiaming","family":"Liu","sequence":"additional","affiliation":[{"name":"Baidu Inc., Shanghai, 
China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Junyu","family":"Han","sequence":"additional","affiliation":[{"name":"Baidu Inc., Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jingtuo","family":"Liu","sequence":"additional","affiliation":[{"name":"Baidu Inc., Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Errui","family":"Ding","sequence":"additional","affiliation":[{"name":"Baidu Inc., Beijing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xiang","family":"Bai","sequence":"additional","affiliation":[{"name":"Huazhong University of Science and Technology, Wuhan, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2019,10,15]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"crossref","unstructured":"Jacob Andreas Marcus Rohrbach Trevor Darrell and Dan Klein. 2016. Learning to Compose Neural Networks for Question Answering. In NAACL-HLT . 1545--1554.  Jacob Andreas Marcus Rohrbach Trevor Darrell and Dan Klein. 2016. Learning to Compose Neural Networks for Question Answering. In NAACL-HLT . 1545--1554.","DOI":"10.18653\/v1\/N16-1181"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"crossref","unstructured":"Samaneh Azadi Matthew Fisher Vladimir G Kim Zhaowen Wang Eli Shechtman and Trevor Darrell. 2018. Multi-content gan for few-shot font style transfer. In CVPR. 7564--7573.  Samaneh Azadi Matthew Fisher Vladimir G Kim Zhaowen Wang Eli Shechtman and Trevor Darrell. 2018. Multi-content gan for few-shot font style transfer. In CVPR. 7564--7573.","DOI":"10.1109\/CVPR.2018.00789"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"crossref","unstructured":"Guha Balakrishnan Amy Zhao Adrian V Dalca Fredo Durand and John Guttag. 2018. Synthesizing images of humans in unseen poses. In CVPR. 8340--8348.  Guha Balakrishnan Amy Zhao Adrian V Dalca Fredo Durand and John Guttag. 2018. 
Synthesizing images of humans in unseen poses. In CVPR. 8340--8348.","DOI":"10.1109\/CVPR.2018.00870"},{"key":"e_1_3_2_1_4_1","unstructured":"Shancheng Fang Hongtao Xie Zheng-Jun Zha Nannan Sun Jianlong Tan and Yongdong Zhang. 2018. Attention and Language Ensemble for Scene Text Recognition with Convolutional Sequence Modeling. In ACM Multimedia. ACM 248--256.  Shancheng Fang Hongtao Xie Zheng-Jun Zha Nannan Sun Jianlong Tan and Yongdong Zhang. 2018. Attention and Language Ensemble for Scene Text Recognition with Convolutional Sequence Modeling. In ACM Multimedia. ACM 248--256."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"crossref","unstructured":"Victor Fragoso Steffen Gauglitz Shane Zamora Jim Kleban and Matthew Turk. 2011. TranslatAR: A mobile augmented reality translator. In WACV. IEEE 497--502.  Victor Fragoso Steffen Gauglitz Shane Zamora Jim Kleban and Matthew Turk. 2011. TranslatAR: A mobile augmented reality translator. In WACV. IEEE 497--502.","DOI":"10.1109\/WACV.2011.5711545"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"crossref","unstructured":"Leon A Gatys Alexander S Ecker and Matthias Bethge. 2016. Image style transfer using convolutional neural networks. In CVPR. 2414--2423.  Leon A Gatys Alexander S Ecker and Matthias Bethge. 2016. Image style transfer using convolutional neural networks. In CVPR. 2414--2423.","DOI":"10.1109\/CVPR.2016.265"},{"key":"e_1_3_2_1_7_1","unstructured":"Ian Goodfellow Jean Pouget-Abadie Mehdi Mirza Bing Xu David Warde-Farley Sherjil Ozair Aaron Courville and Yoshua Bengio. 2014. Generative adversarial nets. In NeurIPS. 2672--2680.  Ian Goodfellow Jean Pouget-Abadie Mehdi Mirza Bing Xu David Warde-Farley Sherjil Ozair Aaron Courville and Yoshua Bengio. 2014. Generative adversarial nets. In NeurIPS. 2672--2680."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"crossref","unstructured":"Ankush Gupta Andrea Vedaldi and Andrew Zisserman. 2016. Synthetic data for text localisation in natural images. In CVPR. 2315--2324.  
Ankush Gupta Andrea Vedaldi and Andrew Zisserman. 2016. Synthetic data for text localisation in natural images. In CVPR. 2315--2324.","DOI":"10.1109\/CVPR.2016.254"},{"key":"e_1_3_2_1_9_1","unstructured":"Kaiming He Xiangyu Zhang Shaoqing Ren and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR. 770--778.  Kaiming He Xiangyu Zhang Shaoqing Ren and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR. 770--778."},{"key":"e_1_3_2_1_10_1","volume-title":"Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In ICML. 448--456.","author":"Ioffe Sergey","year":"2015","unstructured":"Sergey Ioffe and Christian Szegedy . 2015 . Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In ICML. 448--456. Sergey Ioffe and Christian Szegedy. 2015. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In ICML. 448--456."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"crossref","unstructured":"Phillip Isola Jun-Yan Zhu Tinghui Zhou and Alexei A Efros. 2017. Image-to-image translation with conditional adversarial networks. In CVPR . 1125--1134.  Phillip Isola Jun-Yan Zhu Tinghui Zhou and Alexei A Efros. 2017. Image-to-image translation with conditional adversarial networks. In CVPR . 1125--1134.","DOI":"10.1109\/CVPR.2017.632"},{"key":"e_1_3_2_1_12_1","unstructured":"M. Jaderberg K. Simonyan A. Vedaldi and A. Zisserman. 2014. Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition. arXiv preprint arXiv:1406.2227 (2014).  M. Jaderberg K. Simonyan A. Vedaldi and A. Zisserman. 2014. Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition. arXiv preprint arXiv:1406.2227 (2014)."},{"key":"e_1_3_2_1_13_1","volume-title":"Perceptual losses for real-time style transfer and super-resolution","author":"Johnson Justin","unstructured":"Justin Johnson , Alexandre Alahi , and Li Fei-Fei . 2016. 
Perceptual losses for real-time style transfer and super-resolution . In ECCV. Springer , 694--711. Justin Johnson, Alexandre Alahi, and Li Fei-Fei. 2016. Perceptual losses for real-time style transfer and super-resolution. In ECCV. Springer, 694--711."},{"key":"e_1_3_2_1_14_1","volume-title":"ICDAR 2013 robust reading competition. In ICDAR. IEEE, 1484--1493","author":"Karatzas Dimosthenis","year":"2013","unstructured":"Dimosthenis Karatzas , Faisal Shafait , Seiichi Uchida , Masakazu Iwamura , Lluis Gomez i Bigorda , Sergi Robles Mestre , Joan Mas , David Fernandez Mota , Jon Almazan Almazan , and Lluis Pere De Las Heras . 2013 . ICDAR 2013 robust reading competition. In ICDAR. IEEE, 1484--1493 . Dimosthenis Karatzas, Faisal Shafait, Seiichi Uchida, Masakazu Iwamura, Lluis Gomez i Bigorda, Sergi Robles Mestre, Joan Mas, David Fernandez Mota, Jon Almazan Almazan, and Lluis Pere De Las Heras. 2013. ICDAR 2013 robust reading competition. In ICDAR. IEEE, 1484--1493."},{"key":"e_1_3_2_1_15_1","volume-title":"Adam: A Method for Stochastic Optimization. In ICLR. 13.","author":"Kingma Diederik P","year":"2015","unstructured":"Diederik P Kingma and Jimmy Ba . 2015 . Adam: A Method for Stochastic Optimization. In ICLR. 13. Diederik P Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In ICLR. 13."},{"key":"e_1_3_2_1_16_1","volume-title":"Scene Text Detection and Recognition: The Deep Learning Era. arXiv preprint arXiv:1811.04256","author":"Long Shangbang","year":"2018","unstructured":"Shangbang Long , Xin He , and Cong Yao . 2018. Scene Text Detection and Recognition: The Deep Learning Era. arXiv preprint arXiv:1811.04256 ( 2018 ). Shangbang Long, Xin He, and Cong Yao. 2018. Scene Text Detection and Recognition: The Deep Learning Era. 
arXiv preprint arXiv:1811.04256 (2018)."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDAR.2017.181"},{"key":"e_1_3_2_1_18_1","volume-title":"V-net: Fully convolutional neural networks for volumetric medical image segmentation. In IC3DV","author":"Milletari Fausto","year":"2016","unstructured":"Fausto Milletari , Nassir Navab , and Seyed-Ahmad Ahmadi . 2016 . V-net: Fully convolutional neural networks for volumetric medical image segmentation. In IC3DV . IEEE , 565--571. Fausto Milletari, Nassir Navab, and Seyed-Ahmad Ahmadi. 2016. V-net: Fully convolutional neural networks for volumetric medical image segmentation. In IC3DV . IEEE, 565--571."},{"key":"e_1_3_2_1_19_1","volume-title":"Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784","author":"Mirza Mehdi","year":"2014","unstructured":"Mehdi Mirza and Simon Osindero . 2014. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784 ( 2014 ). Mehdi Mirza and Simon Osindero. 2014. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784 (2014)."},{"key":"e_1_3_2_1_20_1","volume-title":"Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957","author":"Miyato Takeru","year":"2018","unstructured":"Takeru Miyato , Toshiki Kataoka , Masanori Koyama , and Yuichi Yoshida . 2018. Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957 ( 2018 ). Takeru Miyato, Toshiki Kataoka, Masanori Koyama, and Yuichi Yoshida. 2018. Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957 (2018)."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDAR.2017.141"},{"key":"e_1_3_2_1_22_1","unstructured":"Alec Radford Luke Metz and Soumith Chintala. 2016. Unsupervised representation learning with deep convolutional generative adversarial networks. In ICLR .  Alec Radford Luke Metz and Soumith Chintala. 2016. 
Unsupervised representation learning with deep convolutional generative adversarial networks. In ICLR ."},{"key":"e_1_3_2_1_23_1","volume-title":"U-net: Convolutional networks for biomedical image segmentation","author":"Ronneberger Olaf","year":"2015","unstructured":"Olaf Ronneberger , Philipp Fischer , and Thomas Brox . 2015 . U-net: Convolutional networks for biomedical image segmentation . In MICCAI. Springer , 234--241. Olaf Ronneberger, Philipp Fischer, and Thomas Brox. 2015. U-net: Convolutional networks for biomedical image segmentation. In MICCAI. Springer, 234--241."},{"key":"e_1_3_2_1_24_1","volume-title":"STEFANN: Scene Text Editor","author":"Roy Prasun","year":"2019","unstructured":"Prasun Roy , Saumik Bhattacharya , Subhankar Ghosh , and Umapada Pal . 2019 . STEFANN: Scene Text Editor using Font Adaptive Neural Network . arXiv preprint arXiv:1903.01192 (2019). Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, and Umapada Pal. 2019. STEFANN: Scene Text Editor using Font Adaptive Neural Network. arXiv preprint arXiv:1903.01192 (2019)."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2016.2646371"},{"key":"e_1_3_2_1_27_1","volume-title":"Aster: An attentional scene text recognizer with flexible rectification","author":"Shi Baoguang","year":"2018","unstructured":"Baoguang Shi , Mingkun Yang , Xinggang Wang , Pengyuan Lyu , Cong Yao , and Xiang Bai . 2018 . Aster: An attentional scene text recognizer with flexible rectification . IEEE TPAMI ( 2018). Baoguang Shi, Mingkun Yang, Xinggang Wang, Pengyuan Lyu, Cong Yao, and Xiang Bai. 2018. Aster: An attentional scene text recognizer with flexible rectification. IEEE TPAMI (2018)."},{"key":"e_1_3_2_1_28_1","unstructured":"Karen Simonyan and Andrew Zisserman. 2015. Very deep convolutional networks for large-scale image recognition. In ICLR .  Karen Simonyan and Andrew Zisserman. 2015. 
Very deep convolutional networks for large-scale image recognition. In ICLR ."},{"key":"e_1_3_2_1_29_1","volume-title":"Learning to write stylized chinese characters by reading a handful of examples. IJCAI","author":"Sun Danyang","year":"2017","unstructured":"Danyang Sun , Tongzheng Ren , Chongxun Li , Hang Su , and Jun Zhu . 2017. Learning to write stylized chinese characters by reading a handful of examples. IJCAI ( 2017 ). Danyang Sun, Tongzheng Ren, Chongxun Li, Hang Su, and Jun Zhu. 2017. Learning to write stylized chinese characters by reading a handful of examples. IJCAI (2017)."},{"key":"e_1_3_2_1_30_1","volume-title":"Image quality assessment: from error visibility to structural similarity","author":"Wang Zhou","year":"2004","unstructured":"Zhou Wang , Alan C Bovik , Hamid R Sheikh , Eero P Simoncelli , et al. 2004 . Image quality assessment: from error visibility to structural similarity. IEEE transactions on image processing , Vol. 13 , 4 (2004), 600--612. Zhou Wang, Alan C Bovik, Hamid R Sheikh, Eero P Simoncelli, et al. 2004. Image quality assessment: from error visibility to structural similarity. IEEE transactions on image processing , Vol. 13, 4 (2004), 600--612."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"crossref","unstructured":"Shuai Yang Jiaying Liu Zhouhui Lian and Zongming Guo. 2017. Awesome typography: Statistics-based text effects transfer. In CVPR . 7464--7473.  Shuai Yang Jiaying Liu Zhouhui Lian and Zongming Guo. 2017. Awesome typography: Statistics-based text effects transfer. In CVPR . 7464--7473.","DOI":"10.1109\/CVPR.2017.308"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33011238"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"crossref","unstructured":"Shuai Yang Jiaying Liu Wenhan Yang and Zongming Guo. 2018. Context-Aware Unsupervised Text Stylization. In ACM Multimedia. ACM 1688--1696.  Shuai Yang Jiaying Liu Wenhan Yang and Zongming Guo. 2018. Context-Aware Unsupervised Text Stylization. In ACM Multimedia. 
ACM 1688--1696.","DOI":"10.1145\/3240508.3240580"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"crossref","unstructured":"Chengquan Zhang Borong Liang Zuming Huang Mengyi En Junyu Han Errui Ding and Xinghao Ding. 2019 a. Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes. In CVPR . 10552--10561.  Chengquan Zhang Borong Liang Zuming Huang Mengyi En Junyu Han Errui Ding and Xinghao Ding. 2019 a. Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes. In CVPR . 10552--10561.","DOI":"10.1109\/CVPR.2019.01080"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.3301801"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/357994.358023"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"crossref","unstructured":"Yexun Zhang Ya Zhang and Wenbin Cai. 2018. Separating style and content for generalized style transfer. In CVPR. 8447--8455.  Yexun Zhang Ya Zhang and Wenbin Cai. 2018. Separating style and content for generalized style transfer. In CVPR. 8447--8455.","DOI":"10.1109\/CVPR.2018.00881"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"crossref","unstructured":"Zheng Zhang Chengquan Zhang Wei Shen Cong Yao Wenyu Liu and Xiang Bai. 2016. Multi-oriented text detection with fully convolutional networks. In CVPR . 4159--4167.  Zheng Zhang Chengquan Zhang Wei Shen Cong Yao Wenyu Liu and Xiang Bai. 2016. Multi-oriented text detection with fully convolutional networks. In CVPR . 4159--4167.","DOI":"10.1109\/CVPR.2016.451"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"crossref","unstructured":"Xinyu Zhou Cong Yao He Wen Yuzhi Wang Shuchang Zhou Weiran He and Jiajun Liang. 2017. EAST: an efficient and accurate scene text detector. In CVPR. 5551--5560.  Xinyu Zhou Cong Yao He Wen Yuzhi Wang Shuchang Zhou Weiran He and Jiajun Liang. 2017. EAST: an efficient and accurate scene text detector. In CVPR. 
5551--5560.","DOI":"10.1109\/CVPR.2017.283"},{"key":"e_1_3_2_1_40_1","unstructured":"Jun-Yan Zhu Taesung Park Phillip Isola and Alexei A Efros. 2017. Unpaired image-to-image translation using cycle-consistent adversarial networks. In ICCV . 2223--2232.  Jun-Yan Zhu Taesung Park Phillip Isola and Alexei A Efros. 2017. Unpaired image-to-image translation using cycle-consistent adversarial networks. In ICCV . 2223--2232."},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"crossref","unstructured":"Zhen Zhu Tengteng Huang Baoguang Shi Miao Yu Bofei Wang and Xiang Bai. 2019. Progressive Pose Attention Transfer for Person Image Generation. In CVPR . 2347--2356.  Zhen Zhu Tengteng Huang Baoguang Shi Miao Yu Bofei Wang and Xiang Bai. 2019. Progressive Pose Attention Transfer for Person Image Generation. In CVPR . 2347--2356.","DOI":"10.1109\/CVPR.2019.00245"}],"event":{"name":"MM '19: The 27th ACM International Conference on Multimedia","location":"Nice France","acronym":"MM '19","sponsor":["SIGMM ACM Special Interest Group on Multimedia"]},"container-title":["Proceedings of the 27th ACM International Conference on 
Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3343031.3350929","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3343031.3350929","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:13:17Z","timestamp":1750201997000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3343031.3350929"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,10,15]]},"references-count":41,"alternative-id":["10.1145\/3343031.3350929","10.1145\/3343031"],"URL":"https:\/\/doi.org\/10.1145\/3343031.3350929","relation":{},"subject":[],"published":{"date-parts":[[2019,10,15]]},"assertion":[{"value":"2019-10-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}