{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,10]],"date-time":"2026-07-10T13:22:38Z","timestamp":1783689758284,"version":"3.55.0"},"reference-count":41,"publisher":"Springer Science and Business Media LLC","issue":"1-2","license":[{"start":{"date-parts":[[2021,2,2]],"date-time":"2021-02-02T00:00:00Z","timestamp":1612224000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,2,2]],"date-time":"2021-02-02T00:00:00Z","timestamp":1612224000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Projekt DEAL"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["IJDAR"],"published-print":{"date-parts":[[2021,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>We address the problem of offline handwritten diagram recognition. Recently, it has been shown that diagram symbols can be directly recognized with deep learning object detectors. However, object detectors are not able to recognize the diagram structure. We propose Arrow R-CNN, the first deep learning system for joint symbol and structure recognition in handwritten diagrams. Arrow R-CNN extends the Faster R-CNN object detector with an arrow head and tail keypoint predictor and a diagram-aware postprocessing method. We propose a network architecture and data augmentation methods targeted at small diagram datasets. Our diagram-aware postprocessing method addresses the insufficiencies of standard Faster R-CNN postprocessing. It reconstructs a diagram from a set of symbol detections and arrow keypoints. Arrow R-CNN improves state-of-the-art substantially: on a scanned flowchart dataset, we increase the rate of recognized diagrams from 37.7 to 78.6%.<\/jats:p>","DOI":"10.1007\/s10032-020-00361-1","type":"journal-article","created":{"date-parts":[[2021,2,2]],"date-time":"2021-02-02T07:04:26Z","timestamp":1612249466000},"page":"3-17","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":29,"title":["Arrow R-CNN for handwritten diagram recognition"],"prefix":"10.1007","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4364-0086","authenticated-orcid":false,"given":"Bernhard","family":"Sch\u00e4fer","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8437-7993","authenticated-orcid":false,"given":"Margret","family":"Keuper","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0209-3859","authenticated-orcid":false,"given":"Heiner","family":"Stuckenschmidt","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2021,2,2]]},"reference":[{"key":"361_CR1","doi-asserted-by":"crossref","unstructured":"Awal, A.M., Feng, G., Mouch\u00e8re, H., Viard-Gaudin, C.: First experiments on a new online handwritten flowchart database. In: Document Recognition and Retrieval XVIII, p. 78740A (2011)","DOI":"10.1117\/12.876624"},{"key":"361_CR2","doi-asserted-by":"crossref","unstructured":"Bresler, M., Phan, T.V., Prusa, D., Nakagawa, M., Hlav\u00e1c, V.: Recognition System for On-Line Sketched Diagrams. In: 2014 14th International Conference on Frontiers in Handwriting Recognition, pp. 563\u2013568 (2014)","DOI":"10.1109\/ICFHR.2014.100"},{"key":"361_CR3","doi-asserted-by":"crossref","unstructured":"Bresler, M., Pr\u016f\u0161a, D., Hlav\u00e1\u010d, V.: Modeling flowchart structure recognition as a max-sum problem. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 1215\u20131219 (2013)","DOI":"10.1109\/ICDAR.2013.246"},{"key":"361_CR4","unstructured":"Bresler, M., Pr\u016f\u0161a, D., Hlav\u00e1\u010d, V.: Simultaneous segmentation and recognition of graphical symbols using a composite descriptor. In: 18th Computer Vision Winter Workshop, vol. 13, pp. 16\u201323 (2013)"},{"key":"361_CR5","doi-asserted-by":"crossref","unstructured":"Bresler, M., Pr\u016f\u0161a, D., Hlav\u00e1\u010d, V.: Detection of arrows in on-line sketched diagrams using relative stroke positioning. In: 2015 IEEE Winter Conference on Applications of Computer Vision, pp. 610\u2013617 (2015)","DOI":"10.1109\/WACV.2015.87"},{"issue":"3","key":"361_CR6","doi-asserted-by":"publisher","first-page":"253","DOI":"10.1007\/s10032-016-0269-z","volume":"19","author":"M Bresler","year":"2016","unstructured":"Bresler, M., Pr\u016f\u0161a, D., Hlav\u00e1\u010d, V.: Online recognition of sketched arrow-connected diagrams. Int. J. Doc. Anal. Recognit. (IJDAR) 19(3), 253\u2013267 (2016)","journal-title":"Int. J. Doc. Anal. Recognit. (IJDAR)"},{"key":"361_CR7","doi-asserted-by":"crossref","unstructured":"Bresler, M., Pr\u016f\u0161a, D., Hlav\u00e1\u010d, V.: Recognizing off-line flowcharts by reconstructing strokes and using on-line recognition techniques. In: 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 48\u201353 (2016)","DOI":"10.1109\/ICFHR.2016.0022"},{"key":"361_CR8","unstructured":"Buslaev, A., Parinov, A., Khvedchenya, E., Iglovikov, V.I., Kalinin, A.A.: Albumentations: Fast and flexible image augmentations. arXiv:1809.06839 [cs] (2018)"},{"key":"361_CR9","doi-asserted-by":"crossref","unstructured":"Carton, C., Lemaitre, A., Co\u00fcasnon, B.: Fusion of statistical and structural information for flowchart recognition. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 1210\u20131214 (2013)","DOI":"10.1109\/ICDAR.2013.245"},{"key":"361_CR10","doi-asserted-by":"crossref","unstructured":"Cherubini, M., Venolia, G., DeLine, R., Ko, A.J.: Let\u2019s go to the whiteboard: how and why software developers use drawings. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI \u201907, pp. 557\u2013566 (2007)","DOI":"10.1145\/1240624.1240714"},{"key":"361_CR11","unstructured":"Gervais, P., Deselaers, T., Aksan, E., Hilliges, O.: The DIDI dataset: digital ink diagram data. arXiv:2002.09303 [cs] (2020)"},{"key":"361_CR12","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 580\u2013587 (2014)","DOI":"10.1109\/CVPR.2014.81"},{"key":"361_CR13","unstructured":"Goyal, P., Doll\u00e1r, P., Girshick, R., Noordhuis, P., Wesolowski, L., Kyrola, A., Tulloch, A., Jia, Y., He, K.: Accurate, large minibatch SGD: training ImageNet in 1 Hour. arXiv:1706.02677 [cs] (2017)"},{"key":"361_CR14","doi-asserted-by":"publisher","unstructured":"Julca-Aguilar, F., Mouch\u00e8re, H., Viard-Gaudin, C., Hirata, N.S.T.: A general framework for the recognition of online handwritten graphics. Int. J. Doc. Anal. Recognit. (IJDAR) 23, 143\u2013160 (2020). https:\/\/doi.org\/10.1007\/s10032-019-00349-6","DOI":"10.1007\/s10032-019-00349-6"},{"key":"361_CR15","doi-asserted-by":"crossref","unstructured":"Julca-Aguilar, F.D., Hirata, N.S.T.: Symbol detection in online handwritten graphics using faster R-CNN. In: 2018 13th IAPR International Workshop on Document Analysis Systems (DAS), pp. 151\u2013156 (2018)","DOI":"10.1109\/DAS.2018.79"},{"issue":"1","key":"361_CR16","doi-asserted-by":"publisher","first-page":"32","DOI":"10.1007\/s11263-016-0981-7","volume":"123","author":"R Krishna","year":"2017","unstructured":"Krishna, R., Zhu, Y., Groth, O., Johnson, J., Hata, K., Kravitz, J., Chen, S., Kalantidis, Y., Li, L.J., Shamma, D.A., Bernstein, M.S., Fei-Fei, L.: Visual genome: connecting language and vision using crowdsourced dense image annotations. Int. J. Comput. Vis. 123(1), 32\u201373 (2017)","journal-title":"Int. J. Comput. Vis."},{"issue":"4","key":"361_CR17","doi-asserted-by":"publisher","first-page":"387","DOI":"10.1007\/s10032-019-00336-x","volume":"22","author":"P Krishnan","year":"2019","unstructured":"Krishnan, P., Jawahar, C.V.: HWNet v2: an efficient word image representation for handwritten documents. Int. J. Doc. Anal. Recognit. (IJDAR) 22(4), 387\u2013405 (2019)","journal-title":"Int. J. Doc. Anal. Recognit. (IJDAR)"},{"key":"361_CR18","doi-asserted-by":"crossref","unstructured":"Lemaitre, A., Mouch\u00e8re, H., Camillerapp, J., Co\u00fcasnon, B.: Interest of syntactic knowledge for on-line flowchart recognition. In: Graphics Recognition. New Trends and Challenges, Lecture Notes in Computer Science, pp. 89\u201398 (2013)","DOI":"10.1007\/978-3-642-36824-0_9"},{"key":"361_CR19","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Dollar, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117\u20132125 (2017)","DOI":"10.1109\/CVPR.2017.106"},{"key":"361_CR20","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., Zitnick, C.L.: Microsoft COCO: common objects in context. In: Computer Vision\u2014ECCV 2014, Lecture Notes in Computer Science, pp. 740\u2013755 (2014)","DOI":"10.1007\/978-3-319-10602-1_48"},{"issue":"1","key":"361_CR21","doi-asserted-by":"publisher","first-page":"39","DOI":"10.1007\/s100320200071","volume":"5","author":"UV Marti","year":"2002","unstructured":"Marti, U.V., Bunke, H.: The IAM-database: an english sentence database for offline handwriting recognition. Int. J. Doc. Anal. Recognit. 5(1), 39\u201346 (2002)","journal-title":"Int. J. Doc. Anal. Recognit."},{"key":"361_CR22","unstructured":"Massa, F., Girshick, R.: Maskrcnn-benchmark: fast, modular reference implementation of instance segmentation and object detection algorithms in PyTorch. https:\/\/github.com\/facebookresearch\/maskrcnn-benchmark (2018)"},{"key":"361_CR23","doi-asserted-by":"crossref","unstructured":"Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. In: Computer Vision\u2014ECCV 2016, Lecture Notes in Computer Science, pp. 483\u2013499 (2016)","DOI":"10.1007\/978-3-319-46484-8_29"},{"key":"361_CR24","unstructured":"Notowidigdo, M., Miller, R.C.: Off-line sketch interpretation. In: AAAI Fall Symposium, pp. 120\u2013126. Arlington, VA (2004)"},{"key":"361_CR25","first-page":"8024","volume":"32","author":"A Paszke","year":"2019","unstructured":"Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., Chintala, S.: PyTorch: an imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. 32, 8024\u20138035 (2019)","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"361_CR26","first-page":"91","volume":"28","author":"S Ren","year":"2015","unstructured":"Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 28, 91\u201399 (2015)","journal-title":"Adv. Neural Inf. Process. Syst."},{"issue":"3","key":"361_CR27","doi-asserted-by":"publisher","first-page":"66","DOI":"10.1109\/MIS.2016.24","volume":"31","author":"K Santosh","year":"2016","unstructured":"Santosh, K., Wendling, L., Antani, S., Thoma, G.R.: Overlaid arrow detection for labeling regions of interest in biomedical images. IEEE Intell. Syst. 31(3), 66\u201375 (2016)","journal-title":"IEEE Intell. Syst."},{"key":"361_CR28","doi-asserted-by":"publisher","DOI":"10.1007\/978-981-13-2339-3","volume-title":"Document Image Analysis: Current Trends and Challenges in Graphics Recognition","author":"KC Santosh","year":"2018","unstructured":"Santosh, K.C.: Document Image Analysis: Current Trends and Challenges in Graphics Recognition. Springer, Berlin (2018)"},{"issue":"05","key":"361_CR29","doi-asserted-by":"publisher","first-page":"1657002","DOI":"10.1142\/S0218001416570020","volume":"30","author":"KC Santosh","year":"2016","unstructured":"Santosh, K.C., Alam, N., Roy, P.P., Wendling, L., Antani, S., Thoma, G.R.: A simple and efficient arrowhead detection technique in biomedical images. Int. J. Pattern Recognit. Artif. Intell. 30(05), 1657002 (2016)","journal-title":"Int. J. Pattern Recognit. Artif. Intell."},{"issue":"3","key":"361_CR30","doi-asserted-by":"publisher","first-page":"331","DOI":"10.1016\/j.patrec.2011.09.040","volume":"33","author":"KC Santosh","year":"2012","unstructured":"Santosh, K.C., Lamiroy, B., Wendling, L.: Symbol recognition using spatial relations. Pattern Recognit. Lett. 33(3), 331\u2013341 (2012)","journal-title":"Pattern Recognit. Lett."},{"issue":"1","key":"361_CR31","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1007\/s10032-013-0205-4","volume":"17","author":"KC Santosh","year":"2014","unstructured":"Santosh, K.C., Lamiroy, B., Wendling, L.: Integrating vocabulary clustering with spatial relations for symbol recognition. Int. J. Doc. Anal. Recognit. (IJDAR) 17(1), 61\u201378 (2014)","journal-title":"Int. J. Doc. Anal. Recognit. (IJDAR)"},{"key":"361_CR32","doi-asserted-by":"crossref","unstructured":"Sch\u00e4fer, B., Stuckenschmidt, H.: Arrow R-CNN for flowchart recognition. In: 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW), p. 7 (2019)","DOI":"10.1109\/ICDARW.2019.00007"},{"key":"361_CR33","doi-asserted-by":"crossref","unstructured":"Simard, P.Y., Steinkraus, D., Platt, J.C.: Best practices for convolutional neural networks applied to visual document analysis. In: Proceedings of the 7th International Conference on Document Analysis and Recognition - Volume 2, ICDAR \u201903, p. 958 (2003)","DOI":"10.1109\/ICDAR.2003.1227801"},{"key":"361_CR34","doi-asserted-by":"crossref","unstructured":"Sun, X., Xiao, B., Wei, F., Liang, S., Wei, Y.: Integral human pose regression. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 529\u2013545 (2018)","DOI":"10.1007\/978-3-030-01231-1_33"},{"key":"361_CR35","doi-asserted-by":"crossref","unstructured":"Toshev, A., Szegedy, C.: DeepPose: Human pose estimation via deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1653\u20131660 (2014)","DOI":"10.1109\/CVPR.2014.214"},{"issue":"2","key":"361_CR36","doi-asserted-by":"publisher","first-page":"123","DOI":"10.1007\/s10032-017-0284-8","volume":"20","author":"C Wang","year":"2017","unstructured":"Wang, C., Mouch\u00e8re, H., Lemaitre, A., Viard-Gaudin, C.: Online flowchart understanding by combining max-margin Markov random field with grammatical analysis. Int. J. Doc. Anal. Recognit. (IJDAR) 20(2), 123\u2013136 (2017)","journal-title":"Int. J. Doc. Anal. Recognit. (IJDAR)"},{"key":"361_CR37","doi-asserted-by":"crossref","unstructured":"Wang, C., Mouch\u00e8re, H., Viard-Gaudin, C., Jin, L.: Combined segmentation and recognition of online handwritten diagrams with high order markov random field. In: 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 252\u2013257 (2016)","DOI":"10.1109\/ICFHR.2016.0056"},{"key":"361_CR38","unstructured":"Wu, J., Wang, C., Zhang, L., Rui, Y.: Offline sketch parsing via shapeness estimation. In: Twenty-Fourth International Joint Conference on Artificial Intelligence (2015)"},{"key":"361_CR39","doi-asserted-by":"crossref","unstructured":"Xiao, B., Wu, H., Wei, Y.: Simple baselines for human pose estimation and tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 466\u2013481 (2018)","DOI":"10.1007\/978-3-030-01231-1_29"},{"key":"361_CR40","doi-asserted-by":"crossref","unstructured":"Yun, X.L., Zhang, Y.M., Ye, J.Y., Liu, C.L.: Online handwritten diagram recognition with graph attention networks. In: Image and Graphics, Lecture Notes in Computer Science, pp. 232\u2013244 (2019)","DOI":"10.1007\/978-3-030-34120-6_19"},{"issue":"3","key":"361_CR41","doi-asserted-by":"publisher","first-page":"315","DOI":"10.1007\/s10032-019-00335-y","volume":"22","author":"Z Zhong","year":"2019","unstructured":"Zhong, Z., Sun, L., Huo, Q.: An anchor-free region proposal network for faster R-CNN-based text detection approaches. Int. J. Doc. Anal. Recognit. (IJDAR) 22(3), 315\u2013327 (2019)","journal-title":"Int. J. Doc. Anal. Recognit. (IJDAR)"}],"container-title":["International Journal on Document Analysis and Recognition (IJDAR)"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10032-020-00361-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10032-020-00361-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10032-020-00361-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,23]],"date-time":"2024-08-23T11:21:35Z","timestamp":1724412095000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10032-020-00361-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,2]]},"references-count":41,"journal-issue":{"issue":"1-2","published-print":{"date-parts":[[2021,6]]}},"alternative-id":["361"],"URL":"https:\/\/doi.org\/10.1007\/s10032-020-00361-1","relation":{},"ISSN":["1433-2833","1433-2825"],"issn-type":[{"value":"1433-2833","type":"print"},{"value":"1433-2825","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,2,2]]},"assertion":[{"value":"6 February 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 October 2020","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 December 2020","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 February 2021","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Compliance with ethical standards"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}