{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,4]],"date-time":"2026-03-04T16:36:45Z","timestamp":1772642205102,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":51,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,10,10]],"date-time":"2022-10-10T00:00:00Z","timestamp":1665360000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,10,10]]},"DOI":"10.1145\/3503161.3548422","type":"proceedings-article","created":{"date-parts":[[2022,10,10]],"date-time":"2022-10-10T15:42:46Z","timestamp":1665416566000},"page":"4857-4866","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":26,"title":["Towards Complex Document Understanding By Discrete Reasoning"],"prefix":"10.1145","author":[{"given":"Fengbin","family":"Zhu","sequence":"first","affiliation":[{"name":"National University of Singapore & 6Estates Pte Ltd, Singapore, Singapore"}]},{"given":"Wenqiang","family":"Lei","sequence":"additional","affiliation":[{"name":"Sichuan University, Chengdu, China"}]},{"given":"Fuli","family":"Feng","sequence":"additional","affiliation":[{"name":"University of Science and Technology of China, Hefei, China"}]},{"given":"Chao","family":"Wang","sequence":"additional","affiliation":[{"name":"6Estates Pte Ltd, Singapore, Singapore"}]},{"given":"Haozhou","family":"Zhang","sequence":"additional","affiliation":[{"name":"Sichuan University, Chengdu, China"}]},{"given":"Tat-Seng","family":"Chua","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore, 
Singapore"}]}],"member":"320","published-online":{"date-parts":[[2022,10,10]]},"reference":[{"key":"e_1_3_2_2_1_1","doi-asserted-by":"crossref","unstructured":"Daniel Andor Luheng He Kenton Lee and Emily Pitler. 2019. Giving BERT a Calculator: Finding Operations and Arguments with Reading Comprehension. In EMNLP-IJCNLP. ACL 5947--5952. Daniel Andor Luheng He Kenton Lee and Emily Pitler. 2019. Giving BERT a Calculator: Finding Operations and Arguments with Reading Comprehension. In EMNLP-IJCNLP. ACL 5947--5952.","DOI":"10.18653\/v1\/D19-1609"},{"key":"e_1_3_2_2_2_1","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV). 993--1003","author":"Appalaraju Srikar","unstructured":"Srikar Appalaraju , Bhavan Jasani , Bhargava Urala Kota , Yusheng Xie , and R. Manmatha . 2021. DocFormer: End-to-End Transformer for Document Understanding . In Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV). 993--1003 . Srikar Appalaraju, Bhavan Jasani, Bhargava Urala Kota, Yusheng Xie, and R. Manmatha. 2021. DocFormer: End-to-End Transformer for Document Understanding. In Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV). 993--1003."},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00439"},{"key":"e_1_3_2_2_4_1","unstructured":"Daniel G Bobrow. 1964. Natural language input for a computer problem solving system. (1964). Daniel G Bobrow. 1964. Natural language input for a computer problem solving system. (1964)."},{"key":"e_1_3_2_2_5_1","doi-asserted-by":"crossref","unstructured":"Kunlong Chen Weidi Xu Xingyi Cheng Zou Xiaochuan Yuyu Zhang Le Song Taifeng Wang Yuan Qi and Wei Chu. 2020. Question Directed Graph Attention Network for Numerical Reasoning over Text. In EMNLP-IJCNLP. ACL 6759--6768. Kunlong Chen Weidi Xu Xingyi Cheng Zou Xiaochuan Yuyu Zhang Le Song Taifeng Wang Yuan Qi and Wei Chu. 2020. 
Question Directed Graph Attention Network for Numerical Reasoning over Text. In EMNLP-IJCNLP. ACL 6759--6768.","DOI":"10.18653\/v1\/2020.emnlp-main.549"},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.343"},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.300"},{"key":"e_1_3_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N19-1272"},{"key":"e_1_3_2_2_9_1","volume-title":"Document AI: Benchmarks, Models and Applications. CoRR abs\/2111.08609","author":"Cui Lei","year":"2021","unstructured":"Lei Cui , Yiheng Xu , Tengchao Lv , and Furu Wei . 2021 . Document AI: Benchmarks, Models and Applications. CoRR abs\/2111.08609 (2021). Lei Cui, Yiheng Xu, Tengchao Lv, and Furu Wei. 2021. Document AI: Benchmarks, Models and Applications. CoRR abs\/2111.08609 (2021)."},{"key":"e_1_3_2_2_10_1","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","volume":"1","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin , Ming-Wei Chang , Kenton Lee , and Kristina Toutanova . 2019 . BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding . In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , Volume 1 (Long and Short Papers). 4171--4186. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 4171--4186."},{"key":"e_1_3_2_2_11_1","volume-title":"Proc. 
of NAACL.","author":"Dua Dheeru","year":"2019","unstructured":"Dheeru Dua , Yizhong Wang , Pradeep Dasigi , Gabriel Stanovsky , Sameer Singh , and Matt Gardner . 2019 . DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs . In Proc. of NAACL. Dheeru Dua, Yizhong Wang, Pradeep Dasigi, Gabriel Stanovsky, Sameer Singh, and Matt Gardner. 2019. DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs. In Proc. of NAACL."},{"key":"e_1_3_2_2_12_1","volume-title":"LAMBERT: Layout-Aware language Modeling using BERT for information extraction. CoRR abs\/2002.08087","author":"Garncarek Lukasz","year":"2020","unstructured":"Lukasz Garncarek , Rafal Powalski , Tomasz Stanislawek , Bartosz Topolski , Piotr Halama , and Filip Gralinski . 2020 . LAMBERT: Layout-Aware language Modeling using BERT for information extraction. CoRR abs\/2002.08087 (2020). arXiv:2002.08087 Lukasz Garncarek, Rafal Powalski, Tomasz Stanislawek, Bartosz Topolski, Piotr Halama, and Filip Gralinski. 2020. LAMBERT: Layout-Aware language Modeling using BERT for information extraction. CoRR abs\/2002.08087 (2020). arXiv:2002.08087"},{"key":"e_1_3_2_2_13_1","volume-title":"Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering. In Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Goyal Yash","year":"2017","unstructured":"Yash Goyal , Tejas Khot , Douglas Summers-Stay , Dhruv Batra , and Devi Parikh . 2017 . Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering. In Conference on Computer Vision and Pattern Recognition (CVPR). Yash Goyal, Tejas Khot, Douglas Summers-Stay, Dhruv Batra, and Devi Parikh. 2017. Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering. 
In Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_3_2_2_14_1","volume-title":"Kleister: A novel task for Information Extraction involving Long Documents with Complex Layout. CoRR abs\/2003.02356","author":"Gralinski Filip","year":"2020","unstructured":"Filip Gralinski , Tomasz Stanislawek , Anna Wr\u00f3blewska , Dawid Lipinski , Agnieszka Kaliska , Paulina Rosalska , Bartosz Topolski , and Przemyslaw Biecek . 2020 . Kleister: A novel task for Information Extraction involving Long Documents with Complex Layout. CoRR abs\/2003.02356 (2020). arXiv:2003.02356 Filip Gralinski, Tomasz Stanislawek, Anna Wr\u00f3blewska, Dawid Lipinski, Agnieszka Kaliska, Paulina Rosalska, Bartosz Topolski, and Przemyslaw Biecek. 2020. Kleister: A novel task for Information Extraction involving Long Documents with Complex Layout. CoRR abs\/2003.02356 (2020). arXiv:2003.02356"},{"key":"e_1_3_2_2_15_1","volume-title":"Bridging Nonlinearities and Stochastic Regularizers with Gaussian Error Linear Units. CoRR abs\/1606.08415","author":"Hendrycks Dan","year":"2016","unstructured":"Dan Hendrycks and Kevin Gimpel . 2016. Bridging Nonlinearities and Stochastic Regularizers with Gaussian Error Linear Units. CoRR abs\/1606.08415 ( 2016 ). arXiv:1606.08415 Dan Hendrycks and Kevin Gimpel. 2016. Bridging Nonlinearities and Stochastic Regularizers with Gaussian Error Linear Units. CoRR abs\/1606.08415 (2016). arXiv:1606.08415"},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.398"},{"key":"e_1_3_2_2_17_1","unstructured":"Teakgyu Hong DongHyun Kim Mingi Ji Wonseok Hwang Daehyun Nam and Sungrae Park. 2021. {BROS}: A Pre-trained Language Model for Understanding Texts in Document. https:\/\/openreview.net\/forum?id=punMXQEsPr0 Teakgyu Hong DongHyun Kim Mingi Ji Wonseok Hwang Daehyun Nam and Sungrae Park. 2021. {BROS}: A Pre-trained Language Model for Understanding Texts in Document. 
https:\/\/openreview.net\/forum?id=punMXQEsPr0"},{"key":"e_1_3_2_2_18_1","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Hu Minghao","unstructured":"Minghao Hu , Yuxing Peng , Zhen Huang , and Dongsheng Li. 2019. A Multi-Type Multi-Span Network for Reading Comprehension that Requires Discrete Reasoning . In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) . Association for Computational Linguistics , 1596--1606. Minghao Hu, Yuxing Peng, Zhen Huang, and Dongsheng Li. 2019. A Multi-Type Multi-Span Network for Reading Comprehension that Requires Discrete Reasoning. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, 1596--1606."},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D17-1084"},{"key":"e_1_3_2_2_20_1","volume-title":"ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction. In 2019 International Conference on Document Analysis and Recognition (ICDAR). 1516--1520","author":"Huang Zheng","unstructured":"Zheng Huang , Kai Chen , Jianhua He , Xiang Bai , Dimosthenis Karatzas , Shijian Lu , and C. V. Jawahar . 2019 . ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction. In 2019 International Conference on Document Analysis and Recognition (ICDAR). 1516--1520 . Zheng Huang, Kai Chen, Jianhua He, Xiang Bai, Dimosthenis Karatzas, Shijian Lu, and C. V. Jawahar. 2019. ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction. In 2019 International Conference on Document Analysis and Recognition (ICDAR). 
1516--1520."},{"key":"e_1_3_2_2_21_1","volume-title":"Hazim Kemal Ekenel, and Jean-Philippe Thiran","author":"Jaume Guillaume","year":"2019","unstructured":"Guillaume Jaume , Hazim Kemal Ekenel, and Jean-Philippe Thiran . 2019 . FUNSD : A Dataset for Form Understanding in Noisy Scanned Documents. CoRR abs\/1905.13538 (2019). arXiv:1905.13538 Guillaume Jaume, Hazim Kemal Ekenel, and Jean-Philippe Thiran. 2019. FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents. CoRR abs\/1905.13538 (2019). arXiv:1905.13538"},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P14-1026"},{"key":"e_1_3_2_2_23_1","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Li Chenliang","unstructured":"Chenliang Li , Bin Bi , Ming Yan , Wei Wang , Songfang Huang , Fei Huang , and Luo Si. 2021. StructuralLM: Structural Pre-training for Form Understanding . In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) . Association for Computational Linguistics , 6309--6318. Chenliang Li, Bin Bi, Ming Yan, Wei Wang, Songfang Huang, Fei Huang, and Luo Si. 2021. StructuralLM: Structural Pre-training for Form Understanding. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, 6309--6318."},{"key":"e_1_3_2_2_24_1","volume-title":"Proceedings of the 12th Language Resources and Evaluation Conference. European Language Resources Association","author":"Li Minghao","year":"2020","unstructured":"Minghao Li , Lei Cui , Shaohan Huang , FuruWei, Ming Zhou , and Zhoujun Li . 2020 . 
TableBank: Table Benchmark for Image-based Table Detection and Recognition . In Proceedings of the 12th Language Resources and Evaluation Conference. European Language Resources Association , 1918--1925. Minghao Li, Lei Cui, Shaohan Huang, FuruWei, Ming Zhou, and Zhoujun Li. 2020. TableBank: Table Benchmark for Image-based Table Detection and Recognition. In Proceedings of the 12th Language Resources and Evaluation Conference. European Language Resources Association, 1918--1925."},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.5"},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.coling-main.82"},{"key":"e_1_3_2_2_27_1","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Liu Qianying","unstructured":"Qianying Liu , Wenyv Guan , Sujian Li , and Daisuke Kawahara . 2019. Treestructured Decoding for Solving Math Word Problems . In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) . Association for Computational Linguistics , 2370--2379. Qianying Liu, Wenyv Guan, Sujian Li, and Daisuke Kawahara. 2019. Treestructured Decoding for Solving Math Word Problems. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, 2370--2379."},{"key":"e_1_3_2_2_28_1","volume-title":"Dimosthenis Karatzas, Ernest Valveny, and C. V Jawahar.","author":"Mathew Minesh","year":"2021","unstructured":"Minesh Mathew , Viraj Bagal , Rub\u00e8n P\u00e9rez Tito , Dimosthenis Karatzas, Ernest Valveny, and C. V Jawahar. 2021 . InfographicVQA. 
arXiv:2104.12756 [cs.CV] Minesh Mathew, Viraj Bagal, Rub\u00e8n P\u00e9rez Tito, Dimosthenis Karatzas, Ernest Valveny, and C. V Jawahar. 2021. InfographicVQA. arXiv:2104.12756 [cs.CV]"},{"key":"e_1_3_2_2_29_1","unstructured":"Minesh Mathew Dimosthenis Karatzas R. Manmatha and C. V. Jawahar. 2020. DocVQA: A Dataset for VQA on Document Images. CoRR abs\/2007.00398 (2020). arXiv:2007.00398 Minesh Mathew Dimosthenis Karatzas R. Manmatha and C. V. Jawahar. 2020. DocVQA: A Dataset for VQA on Document Images. CoRR abs\/2007.00398 (2020). arXiv:2007.00398"},{"key":"e_1_3_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P15-1142"},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1264"},{"key":"e_1_3_2_2_32_1","doi-asserted-by":"crossref","unstructured":"Qiu Ran Yankai Lin Peng Li Jie Zhou and Zhiyuan Liu. 2019. NumNet: Machine Reading Comprehension with Numerical Reasoning. In EMNLP-IJCNLP. 2474--2484. Qiu Ran Yankai Lin Peng Li Jie Zhou and Zhiyuan Liu. 2019. NumNet: Machine Reading Comprehension with Numerical Reasoning. In EMNLP-IJCNLP. 2474--2484.","DOI":"10.18653\/v1\/D19-1251"},{"key":"e_1_3_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.248"},{"key":"e_1_3_2_2_34_1","volume-title":"Meet Shah, Yu Jiang, Xinlei Chen, Dhruv Batra, Devi Parikh, and Marcus Rohrbach. 2019","author":"Singh Amanpreet","year":"2019","unstructured":"Amanpreet Singh , Vivek Natarajan , Meet Shah, Yu Jiang, Xinlei Chen, Dhruv Batra, Devi Parikh, and Marcus Rohrbach. 2019 . Towards VQA Models that can Read. CoRR abs\/1904.08920 (2019). arXiv:1904.08920 Amanpreet Singh, Vivek Natarajan, Meet Shah, Yu Jiang, Xinlei Chen, Dhruv Batra, Devi Parikh, and Marcus Rohrbach. 2019. Towards VQA Models that can Read. CoRR abs\/1904.08920 (2019). arXiv:1904.08920"},{"key":"e_1_3_2_2_35_1","volume-title":"CVPR","author":"Smock Brandon","year":"2021","unstructured":"Brandon Smock , Rohith Pesala , and Robin Abraham . 2021 . 
PubTables-1M: Towards comprehensive table extraction from unstructured documents . In CVPR 2022. Brandon Smock, Rohith Pesala, and Robin Abraham. 2021. PubTables-1M: Towards comprehensive table extraction from unstructured documents. In CVPR 2022."},{"key":"e_1_3_2_2_36_1","volume-title":"A survey of deep learning approaches for ocr and document understanding. arXiv preprint arXiv:2011.13534","author":"Subramani Nishant","year":"2020","unstructured":"Nishant Subramani , Alexandre Matton , Malcolm Greaves , and Adrian Lam . 2020. A survey of deep learning approaches for ocr and document understanding. arXiv preprint arXiv:2011.13534 ( 2020 ). Nishant Subramani, Alexandre Matton, Malcolm Greaves, and Adrian Lam. 2020. A survey of deep learning approaches for ocr and document understanding. arXiv preprint arXiv:2011.13534 (2020)."},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"crossref","unstructured":"Ryota Tanaka Kyosuke Nishida and Sen Yoshida. 2021. VisualMRC: Machine Reading Comprehension on Document Images. In AAAI. Ryota Tanaka Kyosuke Nishida and Sen Yoshida. 2021. VisualMRC: Machine Reading Comprehension on Document Images. In AAAI.","DOI":"10.1609\/aaai.v35i15.17635"},{"key":"e_1_3_2_2_38_1","volume-title":"Advances in Neural Information Processing Systems","author":"Vaswani Ashish","unstructured":"Ashish Vaswani , Noam Shazeer , Niki Parmar , Jakob Uszkoreit , Llion Jones , Aidan N Gomez , Lukasz Kaiser , and Illia Polosukhin . 2017. Attention is All you Need . In Advances in Neural Information Processing Systems , I Guyon, U Von Luxburg, S Bengio, H Wallach, R Fergus, S Vishwanathan, and R Garnett (Eds.), Vol. 30 . Curran Associates, Inc. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In Advances in Neural Information Processing Systems, I Guyon, U Von Luxburg, S Bengio, H Wallach, R Fergus, S Vishwanathan, and R Garnett (Eds.), Vol. 30. 
Curran Associates, Inc."},{"key":"e_1_3_2_2_39_1","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 1064--1069","author":"Wang Lei","year":"2018","unstructured":"Lei Wang , Yan Wang , Deng Cai , Dongxiang Zhang , and Xiaojiang Liu . 2018 . Translating a MathWord Problem to a Expression Tree . In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 1064--1069 . Lei Wang, Yan Wang, Deng Cai, Dongxiang Zhang, and Xiaojiang Liu. 2018. Translating a MathWord Problem to a Expression Tree. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 1064--1069."},{"key":"e_1_3_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D17-1088"},{"key":"e_1_3_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3447548.3467434"},{"key":"e_1_3_2_2_42_1","volume-title":"Spurthi Amba Hombaiah, and Michael Bendersky","author":"Li Cheng","year":"2021","unstructured":"Te-LinWu, Cheng Li , Mingyang Zhang , Tao Chen , Spurthi Amba Hombaiah, and Michael Bendersky . 2021 . LAMPRET : Layout-Aware Multimodal PreTraining for Document Understanding. CoRR abs\/2104.08405 (2021). arXiv:2104.08405 Te-LinWu, Cheng Li, Mingyang Zhang, Tao Chen, Spurthi Amba Hombaiah, and Michael Bendersky. 2021. LAMPRET: Layout-Aware Multimodal PreTraining for Document Understanding. CoRR abs\/2104.08405 (2021). arXiv:2104.08405"},{"key":"e_1_3_2_2_43_1","doi-asserted-by":"crossref","unstructured":"Zhipeng Xie and Shichao Sun. 2019. A Goal-Driven Tree-Structured Neural Model for Math Word Problems.. In IJCAI. 5299--5305. Zhipeng Xie and Shichao Sun. 2019. A Goal-Driven Tree-Structured Neural Model for Math Word Problems.. In IJCAI. 
5299--5305.","DOI":"10.24963\/ijcai.2019\/736"},{"key":"e_1_3_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403172"},{"key":"e_1_3_2_2_45_1","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Xu Yang","unstructured":"Yang Xu , Yiheng Xu , Tengchao Lv , Lei Cui , Furu Wei , Guoxin Wang , Yijuan Lu , Dinei Florencio , Cha Zhang , Wanxiang Che , Min Zhang , and Lidong Zhou . 2021. LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding . In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) . Association for Computational Linguistics , 2579--2591. Yang Xu, Yiheng Xu, Tengchao Lv, Lei Cui, Furu Wei, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Wanxiang Che, Min Zhang, and Lidong Zhou. 2021. LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, 2579--2591."},{"key":"e_1_3_2_2_46_1","unstructured":"Pengcheng Yin Graham Neubig Wen-tau Yih and Sebastian Riedel. 2020. TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data. In ACL. ACL 8413--8426. Pengcheng Yin Graham Neubig Wen-tau Yih and Sebastian Riedel. 2020. TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data. In ACL. ACL 8413--8426."},{"key":"e_1_3_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.362"},{"key":"e_1_3_2_2_48_1","volume-title":"UK","author":"Zhong Xu","year":"2020","unstructured":"Xu Zhong , Elaheh Shafiei Bavani , and Antonio Jimeno Yepes . 2020 . 
Image-Based Table Recognition: Data, Model, and Evaluation. In Computer Vision -- ECCV 2020: 16th European Conference, Glasgow , UK , August 23-28, 2020, Proceedings, Part XXI. Springer-Verlag, 564--580. https:\/\/doi.org\/10.1007\/978-3-030-58589-1_34 10.1007\/978-3-030-58589-1_34 Xu Zhong, Elaheh Shafiei Bavani, and Antonio Jimeno Yepes. 2020. Image-Based Table Recognition: Data, Model, and Evaluation. In Computer Vision -- ECCV 2020: 16th European Conference, Glasgow, UK, August 23-28, 2020, Proceedings, Part XXI. Springer-Verlag, 564--580. https:\/\/doi.org\/10.1007\/978-3-030-58589-1_34"},{"key":"e_1_3_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDAR.2019.00166"},{"key":"e_1_3_2_2_50_1","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Zhu Fengbin","unstructured":"Fengbin Zhu , Wenqiang Lei , Youcheng Huang , Chao Wang , Shuo Zhang , Jiancheng Lv , Fuli Feng , and Tat-Seng Chua . 2021. TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance . In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) . Association for Computational Linguistics , 3277--3287. Fengbin Zhu, Wenqiang Lei, Youcheng Huang, Chao Wang, Shuo Zhang, Jiancheng Lv, Fuli Feng, and Tat-Seng Chua. 2021. TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 
Association for Computational Linguistics, 3277--3287."},{"key":"e_1_3_2_2_51_1","volume-title":"Retrieving and Reading: A Comprehensive Survey on Open-domain Question Answering. CoRR abs\/2101.00774","author":"Zhu Fengbin","year":"2021","unstructured":"Fengbin Zhu , Wenqiang Lei , Chao Wang , Jianming Zheng , Soujanya Poria , and Tat-Seng Chua . 2021. Retrieving and Reading: A Comprehensive Survey on Open-domain Question Answering. CoRR abs\/2101.00774 ( 2021 ). Fengbin Zhu, Wenqiang Lei, Chao Wang, Jianming Zheng, Soujanya Poria, and Tat-Seng Chua. 2021. Retrieving and Reading: A Comprehensive Survey on Open-domain Question Answering. CoRR abs\/2101.00774 (2021)."}],"event":{"name":"MM '22: The 30th ACM International Conference on Multimedia","location":"Lisboa Portugal","acronym":"MM '22","sponsor":["SIGMM ACM Special Interest Group on Multimedia"]},"container-title":["Proceedings of the 30th ACM International Conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3503161.3548422","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3503161.3548422","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:49:17Z","timestamp":1750182557000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3503161.3548422"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,10]]},"references-count":51,"alternative-id":["10.1145\/3503161.3548422","10.1145\/3503161"],"URL":"https:\/\/doi.org\/10.1145\/3503161.3548422","relation":{},"subject":[],"published":{"date-parts":[[2022,10,10]]},"assertion":[{"value":"2022-10-10","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}