{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,30]],"date-time":"2025-06-30T04:04:12Z","timestamp":1751256252596,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":15,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,9,9]],"date-time":"2019-09-09T00:00:00Z","timestamp":1567987200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,9,9]]},"DOI":"10.1145\/3341162.3345601","type":"proceedings-article","created":{"date-parts":[[2019,9,11]],"date-time":"2019-09-11T16:16:21Z","timestamp":1568218581000},"page":"482-485","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":11,"title":["Neural caption generation over figures"],"prefix":"10.1145","author":[{"given":"Charles","family":"Chen","sequence":"first","affiliation":[{"name":"Ohio University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ruiyi","family":"Zhang","sequence":"additional","affiliation":[{"name":"Duke University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sungchul","family":"Kim","sequence":"additional","affiliation":[{"name":"Adobe Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Scott","family":"Cohen","sequence":"additional","affiliation":[{"name":"Adobe Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tong","family":"Yu","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ryan","family":"Rossi","sequence":"additional","affiliation":[{"name":"Adobe Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Razvan","family":"Bunescu","sequence":"additional","affiliation":[{"name":"Ohio University"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2019,9,9]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Proceedings of the ACL workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and\/or Summarization. 65--72","author":"Banerjee Satanjeev","year":"2005","unstructured":"Satanjeev Banerjee and Alon Lavie . 2005 . METEOR: An automatic metric for MT evaluation with improved correlation with human judgments . In Proceedings of the ACL workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and\/or Summarization. 65--72 . Satanjeev Banerjee and Alon Lavie. 2005. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In Proceedings of the ACL workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and\/or Summarization. 65--72."},{"key":"e_1_3_2_1_2_1","volume-title":"Imagenet: A large-scale hierarchical image database. In CVPR. Ieee, 248--255.","author":"Deng Jia","year":"2009","unstructured":"Jia Deng , Wei Dong , Richard Socher , Li-Jia Li , Kai Li , and Li Fei-Fei . 2009 . Imagenet: A large-scale hierarchical image database. In CVPR. Ieee, 248--255. Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. Imagenet: A large-scale hierarchical image database. In CVPR. Ieee, 248--255."},{"key":"e_1_3_2_1_3_1","unstructured":"Kaiming He Xiangyu Zhang Shaoqing Ren and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR. 770--778.  Kaiming He Xiangyu Zhang Shaoqing Ren and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR . 770--778."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_3_2_1_5_1","volume-title":"DVQA: Understanding Data Visualizations via Question Answering. In CVPR. 5648--5656.","author":"Kafle Kushal","year":"2018","unstructured":"Kushal Kafle , Brian Price , Scott Cohen , and Christopher Kanan . 2018 . DVQA: Understanding Data Visualizations via Question Answering. In CVPR. 5648--5656. Kushal Kafle, Brian Price, Scott Cohen, and Christopher Kanan. 2018. DVQA: Understanding Data Visualizations via Question Answering. In CVPR. 5648--5656."},{"key":"e_1_3_2_1_6_1","volume-title":"Figureqa: An annotated figure dataset for visual reasoning. arXiv preprint arXiv:1710.07300","author":"Kahou Samira Ebrahimi","year":"2017","unstructured":"Samira Ebrahimi Kahou , Adam Atkinson , Vincent Michalski , \u00c1kos K\u00e1d\u00e1r , Adam Trischler , and Yoshua Bengio . 2017 . Figureqa: An annotated figure dataset for visual reasoning. arXiv preprint arXiv:1710.07300 (2017). Samira Ebrahimi Kahou, Adam Atkinson, Vincent Michalski, \u00c1kos K\u00e1d\u00e1r, Adam Trischler, and Yoshua Bengio. 2017. Figureqa: An annotated figure dataset for visual reasoning. arXiv preprint arXiv:1710.07300 (2017)."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"crossref","unstructured":"Andrej Karpathy and Li Fei-Fei. 2015. Deep visual-semantic alignments for generating image descriptions. In CVPR. 3128--3137.  Andrej Karpathy and Li Fei-Fei. 2015. Deep visual-semantic alignments for generating image descriptions. In CVPR . 3128--3137.","DOI":"10.1109\/CVPR.2015.7298932"},{"key":"e_1_3_2_1_8_1","volume-title":"Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980","author":"Kingma Diederik P","year":"2014","unstructured":"Diederik P Kingma and Jimmy Ba . 2014 . Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014). Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)."},{"key":"e_1_3_2_1_9_1","volume-title":"Rouge: A package for automatic evaluation of summaries. Text Summarization Branches Out","author":"Lin Chin-Yew","year":"2004","unstructured":"Chin-Yew Lin . 2004 . Rouge: A package for automatic evaluation of summaries. Text Summarization Branches Out (2004). Chin-Yew Lin. 2004. Rouge: A package for automatic evaluation of summaries. Text Summarization Branches Out (2004)."},{"volume-title":"Microsoft coco: Common objects in context","author":"Lin Tsung-Yi","key":"e_1_3_2_1_10_1","unstructured":"Tsung-Yi Lin , Michael Maire , Serge Belongie , James Hays , Pietro Perona , Deva Ramanan , Piotr Doll\u00e1r , and C Lawrence Zitnick . 2014. Microsoft coco: Common objects in context . In ECCV. Springer , 740--755. Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Doll\u00e1r, and C Lawrence Zitnick. 2014. Microsoft coco: Common objects in context. In ECCV. Springer, 740--755."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073083.1073135"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"crossref","unstructured":"Steven J Rennie Etienne Marcheret Youssef Mroueh Jarret Ross and Vaibhava Goel. 2016. Self-critical sequence training for image captioning. In CVPR.  Steven J Rennie Etienne Marcheret Youssef Mroueh Jarret Ross and Vaibhava Goel. 2016. Self-critical sequence training for image captioning. In CVPR .","DOI":"10.1109\/CVPR.2017.131"},{"volume-title":"FigureSeer: Parsing result-figures in research papers","author":"Siegel Noah","key":"e_1_3_2_1_13_1","unstructured":"Noah Siegel , Zachary Horvitz , Roie Levin , Santosh Divvala , and Ali Farhadi . 2016. FigureSeer: Parsing result-figures in research papers . In ECCV. Springer , 664--680. Noah Siegel, Zachary Horvitz, Roie Levin, Santosh Divvala, and Ali Farhadi. 2016. FigureSeer: Parsing result-figures in research papers. In ECCV. Springer, 664--680."},{"key":"e_1_3_2_1_14_1","volume-title":"Cider: Consensus-based image description evaluation. In CVPR. 4566--4575.","author":"Vedantam Ramakrishna","year":"2015","unstructured":"Ramakrishna Vedantam , C Lawrence Zitnick , and Devi Parikh . 2015 . Cider: Consensus-based image description evaluation. In CVPR. 4566--4575. Ramakrishna Vedantam, C Lawrence Zitnick, and Devi Parikh. 2015. Cider: Consensus-based image description evaluation. In CVPR. 4566--4575."},{"key":"e_1_3_2_1_15_1","unstructured":"Kelvin Xu Jimmy Ba Ryan Kiros Kyunghyun Cho Aaron Courville Ruslan Salakhudinov Rich Zemel and Yoshua Bengio. 2015. Show attend and tell: Neural image caption generation with visual attention. In ICML. 2048--2057.   Kelvin Xu Jimmy Ba Ryan Kiros Kyunghyun Cho Aaron Courville Ruslan Salakhudinov Rich Zemel and Yoshua Bengio. 2015. Show attend and tell: Neural image caption generation with visual attention. In ICML . 2048--2057."}],"event":{"name":"UbiComp '19: The 2019 ACM International Joint Conference on Pervasive and Ubiquitous Computing","sponsor":["SIGMOBILE ACM Special Interest Group on Mobility of Systems, Users, Data and Computing","SIGCHI ACM Special Interest Group on Computer-Human Interaction"],"location":"London United Kingdom","acronym":"UbiComp '19"},"container-title":["Adjunct Proceedings of the 2019 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2019 ACM International Symposium on Wearable Computers"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3341162.3345601","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3341162.3345601","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:13:09Z","timestamp":1750201989000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3341162.3345601"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,9,9]]},"references-count":15,"alternative-id":["10.1145\/3341162.3345601","10.1145\/3341162"],"URL":"https:\/\/doi.org\/10.1145\/3341162.3345601","relation":{},"subject":[],"published":{"date-parts":[[2019,9,9]]},"assertion":[{"value":"2019-09-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}