{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T01:52:18Z","timestamp":1760233938521,"version":"build-2065373602"},"reference-count":43,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2021,3,10]],"date-time":"2021-03-10T00:00:00Z","timestamp":1615334400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"This work was funded by the National Natural Science Foundation of China (NSFC)","award":["61671376 , 61771386"],"award-info":[{"award-number":["61671376 , 61771386"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>A challenging aspect of scene text detection is to handle curved texts. In order to avoid the tedious manual annotations for training curve text detector, and to overcome the limitation of regression-based text detectors to irregular text, we introduce straightforward and efficient instance-aware curved scene text detector, namely, look more than twice (LOMT), which makes the regression-based text detection results gradually change from loosely bounded box to compact polygon. LOMT mainly composes of curve text shape approximation module and component merging network. The shape approximation module uses a particle swarm optimization-based text shape approximation method (called PSO-TSA) to fine-tune the quadrilateral text detection results to fit the curved text. The component merging network merges incomplete text sub-parts of text instances into more complete polygon through instance awareness, called ICMN. Experiments on five text datasets demonstrate that our method not only achieves excellent performance but also has relatively high speed. 
Ablation experiments show that PSO-TSA can solve the text\u2019s shape optimization problem efficiently, and ICMN has a satisfactory merger effect.<\/jats:p>","DOI":"10.3390\/s21061945","type":"journal-article","created":{"date-parts":[[2021,3,10]],"date-time":"2021-03-10T20:51:42Z","timestamp":1615409502000},"page":"1945","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["A Straightforward and Efficient Instance-Aware Curved Text Detector"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6672-7948","authenticated-orcid":false,"given":"Fan","family":"Zhao","sequence":"first","affiliation":[{"name":"Department of Information Science, Xi\u2019an University of Technology, Xi\u2019an 710054, China"}]},{"given":"Sidi","family":"Shao","sequence":"additional","affiliation":[{"name":"Department of Information Science, Xi\u2019an University of Technology, Xi\u2019an 710054, China"}]},{"given":"Lin","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Information Science, Xi\u2019an University of Technology, Xi\u2019an 710054, China"}]},{"given":"Zhiquan","family":"Wen","sequence":"additional","affiliation":[{"name":"Department of Information Science, Xi\u2019an University of Technology, Xi\u2019an 710054, China"}]}],"member":"1968","published-online":{"date-parts":[[2021,3,10]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Epshtein, B., Ofek, E., and Wexler, Y. (2010, January 13\u201318). Detecting Text in Natural Scenes with Stroke Width Transform. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, USA.","DOI":"10.1109\/CVPR.2010.5540041"},{"key":"ref_2","unstructured":"Neumann, L., and Matas, J. (2010, January 8\u201312). A method for text localization and recognition in real-world images. 
Proceedings of the Asian Conference on Computer Vision, Queenstown, New Zealand."},{"key":"ref_3","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015, January 7\u201310). Faster R-CNN: Towards real-time object detection with region proposal networks. Proceedings of the Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"3111","DOI":"10.1109\/TMM.2018.2818020","article-title":"Arbitrary-oriented scene text detection via rotation proposals","volume":"20","author":"Ma","year":"2018","journal-title":"IEEE Trans. Multimed."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1109\/TIP.2018.2825107","article-title":"Textboxes++: A single-shot oriented scene text detector","volume":"27","author":"Liao","year":"2018","journal-title":"IEEE Trans. Image Process."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Wang, F., Zhao, L., Li, X., and Wang, X. (2018, January 18\u201323). Geometry-aware scene text detection with instance transformation network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00150"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"44219","DOI":"10.1109\/ACCESS.2019.2908933","article-title":"Ftpn: Scene text detection with feature pyramid based text proposal network","volume":"7","author":"Liu","year":"2019","journal-title":"IEEE Access"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1016\/j.patcog.2019.02.002","article-title":"Curved scene text detection via transverse and longitudinal sequence connection","volume":"90","author":"Liu","year":"2019","journal-title":"Pattern Recognit."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Zhou, X., Yao, C., Wen, H., Wang, Y., Zhou, S., He, W., and Liang, J. (2017, January 21\u201326). East: An efficient and accurate scene text detector. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.283"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Ch\u2019ng, C.K., and Chan, C.S. (2017, January 9\u201315). Total-text: A comprehensive dataset for scene text detection and recognition. Proceedings of the 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan.","DOI":"10.1109\/ICDAR.2017.157"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Bu\u0161ta, M., Patel, Y., and Matas, J. (2018, January 2\u20136). E2e-mlt-an unconstrained end-to-end method for multi-language scene text. Proceedings of the Asian Conference on Computer Vision, Perth, WA, Australia.","DOI":"10.1007\/978-3-030-21074-8_11"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Zhang, C., Liang, B., Huang, Z., En, M., Han, J., Ding, E., and Ding, X. (2019, January 16\u201320). Look more than once: An accurate detector for text of arbitrary shapes. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.01080"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"106954","DOI":"10.1016\/j.patcog.2019.06.020","article-title":"Seglink++: Detecting dense and arbitrary-shaped scene text by instance-aware component grouping","volume":"96","author":"Tang","year":"2019","journal-title":"Pattern Recognit."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"118","DOI":"10.1016\/j.patcog.2018.10.012","article-title":"A pooling based scene text proposal technique for scene text reading in the wild","volume":"87","author":"NguyenVan","year":"2019","journal-title":"Pattern Recognit."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"107026","DOI":"10.1016\/j.patcog.2019.107026","article-title":"Realtime multi-scale scene text detection with scale-based region proposal network","volume":"98","author":"He","year":"2020","journal-title":"Pattern Recognit."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"106964","DOI":"10.1016\/j.patcog.2019.106964","article-title":"Rotated cascade R-CNN: A shape robust detector with coordinate regression","volume":"96","author":"Zhu","year":"2019","journal-title":"Pattern Recognit."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Wang, W., Xie, E., Li, X., Hou, W., Lu, T., Yu, G., and Shao, S. (2019, January 16\u201320). Shape robust text detection with progressive scale expansion network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00956"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Liu, X., Liang, D., Yan, S., Chen, D., Qiao, Y., and Yan, J. (2018, January 18\u201323). Fots: Fast oriented text spotting with a unified network. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00595"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Wang, X., Jiang, Y., Luo, Z., Liu, C.L., Choi, H., and Kim, S. (2019, January 16\u201320). Arbitrary shape scene text detection with adaptive text region representation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00661"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"He, W., Zhang, X.-Y., Yin, F., and Liu, C.-L. (2017, January 22\u201329). Deep direct regression for multi-oriented scene text detection. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.87"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Wu, Y., and Natarajan, P. (2017, January 21\u201326). Self-organized text detection with minimal post-processing via border learning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/ICCV.2017.535"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"He, D., Yang, X., Liang, C., Zhou, Z., Ororbi, A.G., Kifer, D., and Giles, C.L. (2017, January 21\u201326). Multi-scale fcn with cascaded instance aware segmentation for arbitrary oriented word spotting in the wild. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.58"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Deng, D., Liu, H., Li, X., and Cai, D. (2017, January 2\u20137). Pixellink: Detecting scene text via instance segmentation. Proceedings of the AAAI-18 AAAI Conference on Artificial Intelligence, New Orleans, LA, USA.","DOI":"10.1609\/aaai.v32i1.12269"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Xue, C., Lu, S., and Zhan, F. 
(2018, January 8\u201314). Accurate scene text detection through border semantics awareness and bootstrapping. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01270-0_22"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Zhang, C., Shen, W., Yao, C., Liu, W., and Bai, X. (2016, January 27\u201330). Multi-oriented text detection with fully convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.451"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"112","DOI":"10.1016\/j.patrec.2017.08.030","article-title":"Fast: Facilitated and accurate scene text proposals through fcn guided pruning","volume":"119","author":"Bazazian","year":"2019","journal-title":"Pattern Recognit. Lett."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"5566","DOI":"10.1109\/TIP.2019.2900589","article-title":"Textfield: Learning a deep direction field for irregular scene text detection","volume":"28","author":"Xu","year":"2019","journal-title":"IEEE Trans. Image Process."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Li, J., Zhang, C., Sun, Y., Han, J., and Ding, E. (2018, January 2\u20136). Detecting text in the wild with deep character embedding network. Proceedings of the Asian Conference on Computer Vision, Perth, WA, Australia.","DOI":"10.1007\/978-3-030-20870-7_31"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"2918","DOI":"10.1109\/TIP.2019.2954218","article-title":"Arbitrarily Shaped Scene Text Detection with a Mask Tightness Text Detector","volume":"29","author":"Liu","year":"2020","journal-title":"IEEE Trans. Image Process."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_31","unstructured":"Xie, E., Zang, Y., Shao, S., Yu, G., Yao, C., and Li, G. (February, January 27). Scene text detection with supervised pyramid context network. Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Lyu, P., Yao, C., Wu, W., Yan, S., and Bai, X. (2018, January 18\u201323). Multi-oriented scene text detection via corner localization and region segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00788"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Long, S., Ruan, J., Zhang, W., He, X., Wu, W., and Yao, C. (2018, January 8\u201314). Textsnake: A flexible representation for detecting text of arbitrary shapes. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01216-8_2"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Baek, Y., Lee, B., Han, D., Yun, S., and Lee, H. (2019, January 16\u201320). Character region awareness for text detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00959"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Karatzas, D., Shafait, F., Uchida, S., Iwamura, M., Bigorda, L.G.i., Mestre, S.R., Mas, J., Mota, D.F., Almazan, J.A., and de las Heras, L.P. (2013, January 16\u201320). ICDAR 2013 robust reading competition. Document Analysis and Recognition (ICDAR). 
Proceedings of the 2013 12th International Conference on IEEE Computer Society, Niigata, Japan.","DOI":"10.1109\/ICDAR.2013.221"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Karatzas, D., Gomez-Bigorda, L., Nicolaou, A., Ghosh, S., Bagdanov, A., Iwamura, M., J Matas, L.N., Chandrasekhar, V.R., and Lu, S. (2015, January 23\u201326). ICDAR 2015 competition on robust reading. Proceedings of the 2015 13th International Conference on Document Analysis and Recognition (ICDAR), Tunis, Tunisia.","DOI":"10.1109\/ICDAR.2015.7333942"},{"key":"ref_37","unstructured":"Yao, C., Bai, X., Liu, W., Ma, Y., and Tu, Z. (2012, January 18\u201320). Detecting texts of arbitrary orientations in natural images. Proceedings of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_40","unstructured":"Kennedy, J., and Eberhart, R. (December, January 27). Particle swarm optimization. Proceedings of the ICNN\u201995-International Conference on Neural Networks, Perth, WA, Australia."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Nayef, N., Yin, F., Bizid, I., Choi, H., Feng, Y., Karatzas, D., Luo, Z.B., Pal, U., Rigaud, C., and Chazalon, J. (2017, January 9\u201315). 
ICDAR2017 robust reading challenge on multi-lingual scene text detection and script identification-rrc-mlt. Proceedings of the 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) IEEE, Kyoto, Japan.","DOI":"10.1109\/ICDAR.2017.237"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The pascal visual object classes (voc) challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Zhang, Z. (2018, January 4\u20136). Improved adam optimizer for deep neural networks. Proceedings of the 2018 IEEE\/ACM 26th International Symposium on Quality of Service (IWQoS), Banff, AB, Canada.","DOI":"10.1109\/IWQoS.2018.8624183"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/6\/1945\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:33:34Z","timestamp":1760160814000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/6\/1945"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,10]]},"references-count":43,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2021,3]]}},"alternative-id":["s21061945"],"URL":"https:\/\/doi.org\/10.3390\/s21061945","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2021,3,10]]}}}