{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,8]],"date-time":"2026-06-08T23:06:44Z","timestamp":1780960004760,"version":"3.54.1"},"reference-count":42,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2022,7,30]],"date-time":"2022-07-30T00:00:00Z","timestamp":1659139200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program","doi-asserted-by":"crossref","award":["2020AAA0108800"],"award-info":[{"award-number":["2020AAA0108800"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62137002, 61937001,62176209, 62176207, 62172324, 62192781, 62106190, and 62050194"],"award-info":[{"award-number":["62137002, 61937001,62176209, 62176207, 62172324, 62192781, 62106190, and 62050194"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Innovative Research Group of the National Natural Science Foundation of China","award":["61721002"],"award-info":[{"award-number":["61721002"]}]},{"name":"Innovation Research Team of Ministry of Education","award":["IRT 17R86"],"award-info":[{"award-number":["IRT 17R86"]}]},{"name":"Xi\u2019an Jiaotong University, China Postdoctoral Science Foundation","award":["2020M683493"],"award-info":[{"award-number":["2020M683493"]}]},{"name":"China Knowledge Centre for Engineering Science and Technology, the Fundamental Research Funds for the Central Universities","award":["xhj032021013-02"],"award-info":[{"award-number":["xhj032021013-02"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2022,12,31]]},"abstract":"<jats:p>Diagram is a special form of visual expression for representing complex concepts, logic, and knowledge, which widely appears in educational scenes such as textbooks, blogs, and encyclopedias. Current research on diagrams preliminarily focuses on natural disciplines such as Biology and Geography, whose expressions are still similar to natural images. In this article, we construct the first novel geometric type of diagrams dataset in Computer Science field, which has more abstract expressions and complex logical relations. The dataset has exhaustive annotations of objects and relations for about 1,300 diagrams and 3,500 question-answer pairs. We introduce the tasks of diagram classification (DC) and diagram question answering (DQA) based on the new dataset, and propose the Diagram Paring Net (DPN) that focuses on analyzing the topological structure and text information of diagrams. We use DPN-based models to solve DC and DQA tasks, and compare the performances to well-known natural images classification models and visual question answering models. Our experiments show the effectiveness of the proposed DPN-based models on diagram understanding tasks, also indicate that our dataset is more complex compared to previous natural image understanding datasets. The presented dataset opens new challenges for research in diagram understanding, and the DPN method provides a novel perspective for studying such data. Our dataset can be available from https:\/\/github.com\/WayneWong97\/CSDia.<\/jats:p>","DOI":"10.1145\/3522689","type":"journal-article","created":{"date-parts":[[2022,3,18]],"date-time":"2022-03-18T18:13:12Z","timestamp":1647627192000},"page":"1-20","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Computer Science Diagram Understanding with Topology Parsing"],"prefix":"10.1145","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5536-6515","authenticated-orcid":false,"given":"Shaowei","family":"Wang","sequence":"first","affiliation":[{"name":"SPKLSTN Lab, Xi\u2019an, Shaanxi, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3074-8565","authenticated-orcid":false,"given":"Lingling","family":"Zhang","sequence":"additional","affiliation":[{"name":"SPKLSTN Lab, Xi\u2019an, Shaanxi, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9434-6990","authenticated-orcid":false,"given":"Xuan","family":"Luo","sequence":"additional","affiliation":[{"name":"SPKLSTN Lab, Xi\u2019an, Shaanxi, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7708-9718","authenticated-orcid":false,"given":"Yi","family":"Yang","sequence":"additional","affiliation":[{"name":"SPKLSTN Lab, Xi\u2019an, Shaanxi, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7574-3931","authenticated-orcid":false,"given":"Xin","family":"Hu","sequence":"additional","affiliation":[{"name":"SPKLSTN Lab, Xi\u2019an, Shaanxi, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4874-2567","authenticated-orcid":false,"given":"Tao","family":"Qin","sequence":"additional","affiliation":[{"name":"SPKLSTN Lab, Xi\u2019an, Shaanxi, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6004-0675","authenticated-orcid":false,"given":"Jun","family":"Liu","sequence":"additional","affiliation":[{"name":"SPKLSTN Lab, Xi\u2019an, Shaanxi, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2022,7,30]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"Jaided AI. 2020. EasyOCR. Retrieved October 9 2020 from https:\/\/github.com\/JaidedAI\/EasyOCR."},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.279"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00530-010-0182-0"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.285"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"e_1_3_2_7_2","article-title":"Telling juxtapositions: Using repetition and alignable difference in diagram understanding","author":"Ferguson Ronald W.","year":"1998","unstructured":"Ronald W. Ferguson and Kenneth D. Forbus. 1998. Telling juxtapositions: Using repetition and alignable difference in diagram understanding. Advances in Analogy Research 5, 1 (1998), 109\u2013117.","journal-title":"Advances in Analogy Research"},{"key":"e_1_3_2_8_2","first-page":"510","volume-title":"Proceedings of the AAAI\/IAAI","author":"Ferguson Ronald W.","year":"2000","unstructured":"Ronald W. Ferguson and Kenneth D. Forbus. 2000. GeoRep: A flexible tool for spatial representation of line drawings. In Proceedings of the AAAI\/IAAI. 510\u2013516."},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDAR.2003.1227811"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.441"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3360901.3364420"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_13_2","article-title":"SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and  \\( \\lt \\) 0.5 MB model size","author":"Iandola Forrest N.","year":"2017","unstructured":"Forrest N. Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, and Kurt Keutzer. 2017. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and \\( \\lt \\) 0.5 MB model size. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017).","journal-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.215"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46493-0_15"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.571"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1347"},{"key":"e_1_3_2_18_2","unstructured":"Jin-Hwa Kim Jaehyun Jun and Byoung-Tak Zhang. 2018. Bilinear attention networks. In Proceedings of the 32nd International Conference on Neural Information Processing Systems ."},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-016-0981-7"},{"key":"e_1_3_2_20_2","doi-asserted-by":"crossref","unstructured":"Jayant Krishnamurthy Oyvind Tafjord and Aniruddha Kembhavi. 2016. Semantic parsing to probabilistic programs for situated question answering. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing .","DOI":"10.18653\/v1\/D16-1016"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"e_1_3_2_23_2","volume-title":"Digital Logic Circuit","author":"Liu Changshu","year":"2002","unstructured":"Changshu Liu. 2002. Digital Logic Circuit. National Defense Industry Press."},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1080\/01431160600746456"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.5555\/1404505"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-45442-5_36"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1162"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/S17-1029"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00474"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v28i1.9146"},{"key":"e_1_3_2_32_2","unstructured":"Clifford A. Shaffer. 2012. Data structures and algorithm analysis. Prentice Hall Upper Saddle River NJ."},{"key":"e_1_3_2_33_2","volume-title":"High Score Notes of Data Structure","author":"Shuai Hui","year":"2018","unstructured":"Hui Shuai. 2018. High Score Notes of Data Structure. China Machine Press."},{"key":"e_1_3_2_34_2","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition."},{"key":"e_1_3_2_35_2","volume-title":"Principles of Computer Organization","author":"Tang Shuofei","year":"2000","unstructured":"Shuofei Tang, Xudong Liu, and Cheng Wang. 2000. Principles of Computer Organization. Higher Education Press."},{"key":"e_1_3_2_36_2","volume-title":"Computer Operating System","author":"Tang Xiaodan","year":"2007","unstructured":"Xiaodan Tang, Hongbing Liang, Fengping Zhe, and Ziying Tang. 2007. Computer Operating System. Xidian University Press."},{"key":"e_1_3_2_37_2","volume-title":"Proceedings of the COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics","author":"Watanabe Yasuhiko","year":"1998","unstructured":"Yasuhiko Watanabe and Makoto Nagao. 1998. Diagram understanding using integration of layout information and textual information. In Proceedings of the COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics."},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.634"},{"key":"e_1_3_2_39_2","volume-title":"Data Structure C version","author":"Yan Weimin","year":"2002","unstructured":"Weimin Yan and Minwei Wu. 2002. Data Structure C version. TsingHua University Press."},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00166"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00644"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.202"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.683"}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3522689","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3522689","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:30:15Z","timestamp":1750188615000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3522689"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,30]]},"references-count":42,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2022,12,31]]}},"alternative-id":["10.1145\/3522689"],"URL":"https:\/\/doi.org\/10.1145\/3522689","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,7,30]]},"assertion":[{"value":"2021-03-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-02-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-07-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}