{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,10]],"date-time":"2025-12-10T09:03:29Z","timestamp":1765357409830,"version":"3.37.3"},"reference-count":61,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,7,13]],"date-time":"2023-07-13T00:00:00Z","timestamp":1689206400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,7,13]],"date-time":"2023-07-13T00:00:00Z","timestamp":1689206400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62171135"],"award-info":[{"award-number":["62171135"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100003392","name":"Natural Science Foundation of Fujian Province","doi-asserted-by":"publisher","award":["2022J06010"],"award-info":[{"award-number":["2022J06010"]}],"id":[{"id":"10.13039\/501100003392","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Industry-University Research Project of Education Department 2020"},{"name":"Industry Software Project of Industry Department of Fujian Province 2023"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2024,2]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Recently, segmentation-based approaches have been proposed to tackle arbitrary-shaped text detection. The trade-off between speed and accuracy is still a challenge that hinders its deployment in practical applications. Previous methods adopt complex pipelines to improve accuracy while ignoring inference speed. 
Moreover, the performance of most efficient scene text detectors often suffers from weak feature extraction when equipping lightweight networks. In this paper, we propose a novel distillation method for efficient and accurate arbitrary-shaped text detection, termed kernel-mask knowledge distillation. Our approach equips a low computational-cost visual transformer module (VTM) and a feature adaptation layer to make full use of feature-based and response-based knowledge in distillation. More specifically, first, the text features are obtained by aggregating the multi-level information extracted in the respective backbones of the teacher and student networks. Second, the text features are respectively sent to the VTM to enhance the feature representation ability. Then, we distill the feature-based and response-based kernel knowledge of the teacher network to obtain an efficient and accurate arbitrary-shaped text detection model. Extensive experiments on publicly available datasets demonstrate the state-of-the-art performance of our method. It is worth noting that our method can achieve a competitive <jats:italic>F<\/jats:italic>-measure of 86.92% at 34.5 FPS on Total-text. 
Code is available at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/giganticpower\/KKDnet\">https:\/\/github.com\/giganticpower\/KKDnet<\/jats:ext-link>.<\/jats:p>","DOI":"10.1007\/s40747-023-01134-z","type":"journal-article","created":{"date-parts":[[2023,7,13]],"date-time":"2023-07-13T02:02:21Z","timestamp":1689213741000},"page":"75-86","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Kernel-mask knowledge distillation for efficient and accurate arbitrary-shaped text detection"],"prefix":"10.1007","volume":"10","author":[{"given":"Honghui","family":"Chen","sequence":"first","affiliation":[]},{"given":"Yuhang","family":"Qiu","sequence":"additional","affiliation":[]},{"given":"Mengxi","family":"Jiang","sequence":"additional","affiliation":[]},{"given":"Jianhui","family":"Lin","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2876-653X","authenticated-orcid":false,"given":"Pingping","family":"Chen","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,7,13]]},"reference":[{"key":"1134_CR1","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, IEEE, Las Vegas, pp 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"1134_CR2","doi-asserted-by":"crossref","unstructured":"Huang G, Liu Z, Van Der\u00a0Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. 
In: Proceedings of the IEEE conference on computer vision and pattern recognition, IEEE, Honolulu, pp 4700\u20134708","DOI":"10.1109\/CVPR.2017.243"},{"key":"1134_CR3","first-page":"91","volume":"28","author":"S Ren","year":"2015","unstructured":"Ren S, He K, Girshick R, Sun J (2015) Faster R-CNN: towards real-time object detection with region proposal networks. Adv Neural Inf Process Syst 28:91\u201399","journal-title":"Adv Neural Inf Process Syst"},{"key":"1134_CR4","doi-asserted-by":"crossref","unstructured":"Hu H, Zhang C, Luo Y, Wang Y, Han J, Ding E (2017) Wordsup: exploiting word annotations for character based text detection. In: Proceedings of the IEEE international conference on computer vision, IEEE, Venice, pp 4940\u20134949","DOI":"10.1109\/ICCV.2017.529"},{"key":"1134_CR5","doi-asserted-by":"crossref","unstructured":"Liu X, Liang D, Yan S, Chen D, Qiao Y, Yan J (2018) Fots: fast oriented text spotting with a unified network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, IEEE, Salt Lake City, pp 5676\u20135685","DOI":"10.1109\/CVPR.2018.00595"},{"key":"1134_CR6","doi-asserted-by":"crossref","unstructured":"Lyu P, Yao C, Wu W, Yan S, Bai X (2018) Multi-oriented scene text detection via corner localization and region segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, IEEE, Salt Lake City, pp 7553\u20137563","DOI":"10.1109\/CVPR.2018.00788"},{"key":"1134_CR7","doi-asserted-by":"crossref","unstructured":"He K, Gkioxari G, Doll\u00e1r P, Girshick R (2017) Mask R-cnn. In: Proceedings of the IEEE international conference on computer vision, IEEE, Seoul, pp 2961\u20132969","DOI":"10.1109\/ICCV.2017.322"},{"key":"1134_CR8","doi-asserted-by":"crossref","unstructured":"Liu S, Qi L, Qin H, Shi J, Jia J (2018) Path aggregation network for instance segmentation. 
In: Proceedings of the IEEE conference on computer vision and pattern recognition, IEEE, Salt Lake City, pp 8759\u20138768","DOI":"10.1109\/CVPR.2018.00913"},{"key":"1134_CR9","doi-asserted-by":"crossref","unstructured":"Deng D, Liu H, Li X, Cai D (2018) Pixellink: detecting scene text via instance segmentation. In: Proceedings of the AAAI conference on artificial intelligence, AAAI, Louisiana, 32","DOI":"10.1609\/aaai.v32i1.12269"},{"key":"1134_CR10","doi-asserted-by":"crossref","unstructured":"Wang W, Xie E, Song X, Zang Y, Wang W, Lu T, Yu G, Shen C (2019) Efficient and accurate arbitrary-shaped text detection with pixel aggregation network. In: Proceedings of the IEEE\/CVF international conference on computer vision, IEEE, Seoul, vol 5, pp 8440\u20138449","DOI":"10.1109\/ICCV.2019.00853"},{"key":"1134_CR11","doi-asserted-by":"publisher","first-page":"11474","DOI":"10.1609\/aaai.v34i07.6812","volume":"34","author":"M Liao","year":"2020","unstructured":"Liao M, Wan Z, Yao C, Chen K, Bai X (2020) Real-time scene text detection with differentiable binarization. Proceedings of the AAAI conference on artificial intelligence, AAAI, New York 34:11474\u201311481","journal-title":"Proceedings of the AAAI conference on artificial intelligence, AAAI, New York"},{"key":"1134_CR12","doi-asserted-by":"crossref","unstructured":"Zhang S-X, Zhu X, Yang C, Wang H, Yin X-C (2021) Adaptive boundary proposal network for arbitrary shape text detection. In: Proceedings of the IEEE\/CVF international conference on computer vision, IEEE, Montreal, pp 1305\u20131314","DOI":"10.1109\/ICCV48922.2021.00134"},{"key":"1134_CR13","doi-asserted-by":"crossref","unstructured":"Zhu Y, Chen J, Liang L, Kuang Z, Jin L, Zhang W (2021) Fourier contour embedding for arbitrary-shaped text detection. 
In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, IEEE, Online, pp 3123\u20133131","DOI":"10.1109\/CVPR46437.2021.00314"},{"key":"1134_CR14","doi-asserted-by":"crossref","unstructured":"Long S, Ruan J, Zhang W, He X, Wu W, Yao C (2018) Textsnake: a flexible representation for detecting text of arbitrary shapes. In: Proceedings of the European conference on computer vision (ECCV), Springer, Munich, pp 20\u201336","DOI":"10.1007\/978-3-030-01216-8_2"},{"key":"1134_CR15","doi-asserted-by":"crossref","unstructured":"Baek Y, Lee B, Han D, Yun S, Lee H (2019) Character region awareness for text detection. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, IEEE, Long beach, pp 9365\u20139374","DOI":"10.1109\/CVPR.2019.00959"},{"key":"1134_CR16","doi-asserted-by":"publisher","first-page":"465","DOI":"10.1016\/j.neucom.2020.10.099","volume":"453","author":"G Deng","year":"2021","unstructured":"Deng G, Ming Y, Xue J-H (2021) RFRN: a recurrent feature refinement network for accurate and efficient scene text detection. Neurocomputing 453:465\u2013481","journal-title":"Neurocomputing"},{"key":"1134_CR17","unstructured":"Yuliang L, Lianwen J, Shuaitao Z, Sheng Z (2017) Detecting curve text in the wild: new dataset and new solution. arXiv preprint. arXiv:1712.02170"},{"key":"1134_CR18","doi-asserted-by":"crossref","unstructured":"Ch\u2019ng CK, Chan CS (2017) Total-text: a comprehensive dataset for scene text detection and recognition. In: 2017 14th IAPR international conference on document analysis and recognition (ICDAR), IEEE, Kyoto, vol 1. pp 935\u2013942","DOI":"10.1109\/ICDAR.2017.157"},{"key":"1134_CR19","unstructured":"Yao C, Bai X, Liu W, Ma Y, Tu Z (2012) Detecting texts of arbitrary orientations in natural images. In: 2012 IEEE conference on computer vision and pattern recognition. 
IEEE, Providence, pp 1083\u20131090"},{"key":"1134_CR20","doi-asserted-by":"crossref","unstructured":"Liao M, Shi B, Bai X, Wang X, Liu W (2017) Textboxes: a fast text detector with a single deep neural network. In: Thirty-first AAAI conference on artificial intelligence AAAI, San Francisco, vol 31","DOI":"10.1609\/aaai.v31i1.11196"},{"issue":"8","key":"1134_CR21","doi-asserted-by":"publisher","first-page":"3676","DOI":"10.1109\/TIP.2018.2825107","volume":"27","author":"M Liao","year":"2018","unstructured":"Liao M, Shi B, Bai X (2018) Textboxes++: a single-shot oriented scene text detector. IEEE Trans Image Process 27(8):3676\u20133690","journal-title":"IEEE Trans Image Process"},{"key":"1134_CR22","doi-asserted-by":"crossref","unstructured":"Liu Y, Jin L (2017) Deep matching prior network: toward tighter multi-oriented text detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, IEEE, Hawaii, pp 1962\u20131969","DOI":"10.1109\/CVPR.2017.368"},{"issue":"11","key":"1134_CR23","doi-asserted-by":"publisher","first-page":"3111","DOI":"10.1109\/TMM.2018.2818020","volume":"20","author":"J Ma","year":"2018","unstructured":"Ma J, Shao W, Ye H, Wang L, Wang H, Zheng Y, Xue X (2018) Arbitrary-oriented scene text detection via rotation proposals. IEEE Trans Multimedia 20(11):3111\u20133122","journal-title":"IEEE Trans Multimedia"},{"key":"1134_CR24","doi-asserted-by":"crossref","unstructured":"Huang M, Liu Y, Peng Z, Liu C, Lin D, Zhu S, Yuan N, Ding K, Jin L (2022) Swintextspotter: scene text spotting via better synergy between text detection and text recognition. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, IEEE, New Orleans, vol 5, pp 4593\u20134603","DOI":"10.1109\/CVPR52688.2022.00455"},{"key":"1134_CR25","doi-asserted-by":"crossref","unstructured":"Zhang Z, Zhang C, Shen W, Yao C, Liu W, Bai X (2016) Multi-oriented text detection with fully convolutional networks. 
In: Proceedings of the IEEE conference on computer vision and pattern recognition, IEEE, Las Vegas, pp 4159\u20134167","DOI":"10.1109\/CVPR.2016.451"},{"key":"1134_CR26","unstructured":"He T, Huang W, Qiao Y, Yao J (2016) Accurate text localization in natural image with cascaded convolutional text network. arXiv preprint. arXiv:1603.09423"},{"key":"1134_CR27","doi-asserted-by":"crossref","unstructured":"He D, Yang X, Liang C, Zhou Z, Ororbi AG, Kifer D, Lee Giles C (2017) Multi-scale FCN with cascaded instance aware segmentation for arbitrary oriented word spotting in the wild. In: Proceedings of the IEEE conference on computer vision and pattern recognition, IEEE, Hawaii, pp 3519\u20133528","DOI":"10.1109\/CVPR.2017.58"},{"key":"1134_CR28","doi-asserted-by":"crossref","unstructured":"Wu Y, Natarajan P (2017) Self-organized text detection with minimal post-processing via border learning. In: Proceedings of the IEEE international conference on computer vision, IEEE, Venice, pp 5000\u20135009","DOI":"10.1109\/ICCV.2017.535"},{"key":"1134_CR29","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2020.107336","volume":"110","author":"Y Zhu","year":"2021","unstructured":"Zhu Y, Du J (2021) Textmountain: accurate scene text detection via instance segmentation. Pattern Recognit 110:107336","journal-title":"Pattern Recognit"},{"key":"1134_CR30","first-page":"79","volume-title":"International conference on document analysis and recognition","author":"W Zhang","year":"2021","unstructured":"Zhang W, Qiu Y, Liao M, Zhang R, Wei X, Bai X (2021) Scene text detection with scribble line. International conference on document analysis and recognition, vol 5. 
Springer, Berlin, pp 79\u201394"},{"key":"1134_CR31","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2022.117564","volume":"205","author":"D Zhong","year":"2022","unstructured":"Zhong D, Lyu S, Shivakumara P, Pal U, Lu Y (2022) Text proposals with location-awareness-attention network for arbitrarily shaped scene text detection and recognition. Expert Syst Appl 205:117564","journal-title":"Expert Syst Appl"},{"issue":"9","key":"1134_CR32","doi-asserted-by":"publisher","first-page":"3337","DOI":"10.3390\/s22093337","volume":"22","author":"J Kang","year":"2022","unstructured":"Kang J, Ibrayim M, Hamdulla A (2022) MR-FPN: multi-level residual feature pyramid text detection network based on self-attention environment. Sensors 22(9):3337","journal-title":"Sensors"},{"issue":"16","key":"1134_CR33","doi-asserted-by":"publisher","first-page":"6262","DOI":"10.3390\/s22166262","volume":"22","author":"M Ibrayim","year":"2022","unstructured":"Ibrayim M, Li Y, Hamdulla A (2022) Scene text detection based on two-branch feature extraction. Sensors 22(16):6262","journal-title":"Sensors"},{"key":"1134_CR34","doi-asserted-by":"crossref","unstructured":"Zhou X, Yao C, Wen H, Wang Y, Zhou S, He W, Liang J (2017) East: an efficient and accurate scene text detector. In: Proceedings of the IEEE conference on computer vision and pattern recognition, IEEE, Hawaii, pp 5551\u20135560","DOI":"10.1109\/CVPR.2017.283"},{"key":"1134_CR35","doi-asserted-by":"crossref","unstructured":"Liu Z, Lin G, Yang S, Feng J, Lin W, Goh WL (2018) Learning Markov clustering networks for scene text detection. arXiv preprint. arXiv:1805.08365","DOI":"10.1109\/CVPR.2018.00725"},{"key":"1134_CR36","doi-asserted-by":"crossref","unstructured":"Wang P, Zhang C, Qi F, Liu S, Zhang X, Lyu P, Han J, Liu J, Ding E, Shi G (2021) Pgnet: real-time arbitrarily-shaped text spotting with point gathering network. arXiv preprint. 
arXiv:2104.05458","DOI":"10.1609\/aaai.v35i4.16383"},{"key":"1134_CR37","doi-asserted-by":"crossref","unstructured":"Bucilu\u01ce C, Caruana R, Niculescu-Mizil A (2006) Model compression. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, Association for Computing Machinery, Philadelphia, pp 535\u2013541","DOI":"10.1145\/1150402.1150464"},{"key":"1134_CR38","unstructured":"Hinton G, Vinyals O, Dean J (2015) Distilling the knowledge in a neural network. arXiv preprint. arXiv:1503.02531"},{"key":"1134_CR39","unstructured":"Romero A, Ballas N, Kahou SE, Chassang A, Gatta C, Bengio Y (2014) Fitnets: hints for thin deep nets. arXiv preprint. arXiv:1412.6550"},{"key":"1134_CR40","unstructured":"Huang Z, Wang N (2017) Like what you like: knowledge distill via neuron selectivity transfer. arXiv preprint. arXiv:1707.01219"},{"key":"1134_CR41","doi-asserted-by":"crossref","unstructured":"Ahn S, Hu SX, Damianou A, Lawrence ND, Dai Z (2019) Variational information distillation for knowledge transfer. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, IEEE, Long Beach, pp 9163\u20139171","DOI":"10.1109\/CVPR.2019.00938"},{"key":"1134_CR42","doi-asserted-by":"crossref","unstructured":"Heo B, Lee M, Yun S, Choi JY (2019) Knowledge transfer via distillation of activation boundaries formed by hidden neurons. In: Proceedings of the AAAI conference on artificial intelligence, AAAI, Hawaii, vol 33, pp 3779\u20133787","DOI":"10.1609\/aaai.v33i01.33013779"},{"key":"1134_CR43","unstructured":"Chen G, Choi W, Yu X, Han T, Chandraker M (2017) Learning efficient object detection models with knowledge distillation. Adv Neural Inf Process Syst, vol 30, pp 742\u2013751"},{"key":"1134_CR44","doi-asserted-by":"crossref","unstructured":"Li Q, Jin S, Yan J (2017) Mimicking very efficient network for object detection. 
In: Proceedings of the IEEE conference on computer vision and pattern recognition, IEEE, Hawaii, pp 6356\u20136364","DOI":"10.1109\/CVPR.2017.776"},{"key":"1134_CR45","doi-asserted-by":"crossref","unstructured":"Wang T, Yuan L, Zhang X, Feng J (2019) Distilling object detectors with fine-grained feature imitation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, IEEE, Long Beach, pp 4933\u20134942","DOI":"10.1109\/CVPR.2019.00507"},{"key":"1134_CR46","doi-asserted-by":"crossref","unstructured":"Dai X, Jiang Z, Wu Z, Bao Y, Wang Z, Liu S, Zhou E (2021) General instance distillation for object detection. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, IEEE, Online, pp 7842\u20137851","DOI":"10.1109\/CVPR46437.2021.00775"},{"key":"1134_CR47","doi-asserted-by":"crossref","unstructured":"Guo J, Han K, Wang Y, Wu H, Chen X, Xu C, Xu C (2021) Distilling object detectors via decoupled features. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, IEEE, Online, pp 2154\u20132164","DOI":"10.1109\/CVPR46437.2021.00219"},{"key":"1134_CR48","unstructured":"Wu B, Xu C, Dai X, Wan A, Zhang P, Yan Z, Tomizuka M, Gonzalez J, Keutzer K, Vajda P (2020) Visual transformers: Token-based image representation and processing for computer vision. arXiv preprint. arXiv:2006.03677"},{"key":"1134_CR49","doi-asserted-by":"crossref","unstructured":"Zheng S, Lu J, Zhao H, Zhu X, Luo Z, Wang Y, Fu Y, Feng J, Xiang T, Torr PH et al (2021) Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. 
In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, IEEE, Online, pp 6881\u20136890","DOI":"10.1109\/CVPR46437.2021.00681"},{"key":"1134_CR50","unstructured":"Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S et al (2020) An image is worth 16x16 words: transformers for image recognition at scale. arXiv preprint. arXiv:2010.11929"},{"key":"1134_CR51","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser \u0141, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, vol 30, pp 5998\u20136008"},{"key":"1134_CR52","doi-asserted-by":"crossref","unstructured":"Lin T-Y, Doll\u00e1r P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, IEEE, Hawaii, pp 2117\u20132125","DOI":"10.1109\/CVPR.2017.106"},{"key":"1134_CR53","doi-asserted-by":"crossref","unstructured":"Ghiasi G, Lin T-Y, Le QV (2019) NAS-FPN: learning scalable feature pyramid architecture for object detection. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, IEEE, Long Beach, pp 7036\u20137045","DOI":"10.1109\/CVPR.2019.00720"},{"key":"1134_CR54","doi-asserted-by":"crossref","unstructured":"Tan M, Pang R, Le QV (2020) Efficientdet: scalable and efficient object detection. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, IEEE, Seattle, pp 10781\u201310790","DOI":"10.1109\/CVPR42600.2020.01079"},{"key":"1134_CR55","unstructured":"Zhang S, He X, Yan S (2019) Latentgnn: learning efficient non-local relations for visual recognition. In: International conference on machine learning. 
PMLR, pp 7374\u20137383"},{"key":"1134_CR56","doi-asserted-by":"crossref","unstructured":"Yuan L, Tay FE, Li G, Wang T, Feng J (2020) Revisiting knowledge distillation via label smoothing regularization. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, IEEE, Seattle, pp 3903\u20133911","DOI":"10.1109\/CVPR42600.2020.00396"},{"key":"1134_CR57","doi-asserted-by":"crossref","unstructured":"Milletari F, Navab N, Ahmadi S (2016) V-net: fully convolutional neural networks for volumetric medical image segmentation. In: Proceedings of the 2016 fourth international conference on 3D vision (3DV), IEEE, California, pp 565\u2013571","DOI":"10.1109\/3DV.2016.79"},{"key":"1134_CR58","doi-asserted-by":"crossref","unstructured":"Gupta A, Vedaldi A, Zisserman A (2016) Synthetic data for text localisation in natural images. In: Proceedings of the IEEE conference on computer vision and pattern recognition, IEEE, Las Vegas, pp 2315\u20132324","DOI":"10.1109\/CVPR.2016.254"},{"issue":"11","key":"1134_CR59","doi-asserted-by":"publisher","first-page":"4737","DOI":"10.1109\/TIP.2014.2353813","volume":"23","author":"C Yao","year":"2014","unstructured":"Yao C, Bai X, Liu W (2014) A unified framework for multioriented text detection and recognition. IEEE Trans Image Process 23(11):4737\u20134749","journal-title":"IEEE Trans Image Process"},{"key":"1134_CR60","doi-asserted-by":"crossref","unstructured":"Wang W, Xie E, Li X, Hou W, Lu T, Yu G, Shao S (2019) Shape robust text detection with progressive scale expansion network. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, IEEE, Long Beach, pp 9336\u20139345","DOI":"10.1109\/CVPR.2019.00956"},{"key":"1134_CR61","doi-asserted-by":"crossref","unstructured":"Wang W, Xie E, Li X, Liu X, Liang D, Zhibo Y, Lu T, Shen C (2021) Pan++: towards efficient and accurate end-to-end spotting of arbitrarily-shaped text. 
IEEE Trans Pattern Anal Mach Intell 44(9):5349\u20135367","DOI":"10.1109\/TPAMI.2021.3077555"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-023-01134-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-023-01134-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-023-01134-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,10]],"date-time":"2024-02-10T22:09:53Z","timestamp":1707602993000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-023-01134-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,13]]},"references-count":61,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,2]]}},"alternative-id":["1134"],"URL":"https:\/\/doi.org\/10.1007\/s40747-023-01134-z","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"type":"print","value":"2199-4536"},{"type":"electronic","value":"2198-6053"}],"subject":[],"published":{"date-parts":[[2023,7,13]]},"assertion":[{"value":"3 July 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 May 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 July 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no financial or proprietary interests in any material discussed in 
this article.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}