{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,19]],"date-time":"2026-05-19T09:26:29Z","timestamp":1779182789763,"version":"3.51.4"},"reference-count":35,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2023,2,16]],"date-time":"2023-02-16T00:00:00Z","timestamp":1676505600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Deep learning-based computer vision algorithms, especially image segmentation, have been successfully applied to pixel-level crack detection. The prediction accuracy relies heavily on detecting the performance of fine-grained cracks and removing crack-like noise. We propose a fast encoder-decoder network with scaling attention. We focus on a low-level feature map by minimizing encoder-decoder pairs and adopting an Atrous Spatial Pyramid Pooling (ASPP) layer to improve the detection accuracy of tiny cracks. Another challenge is the reduction in crack-like noise. This introduces a novel scaling attention, AG+, to suppress irrelevant regions. However, removing crack-like noise, such as grooving, is difficult by using only improved segmentation networks. In this study, a crack dataset is generated. It contains 11,226 sets of images and masks, which are effective for detecting detailed tiny cracks and removing non-semantic objects. Our model is evaluated on the generated dataset and compared with state-of-the-art segmentation networks. We use the mean Dice coefficient (mDice) and mean Intersection over union (mIoU) to compare the performance and FLOPs for computational complexity. The experimental results show that our model improves the detection accuracy of fine-grained cracks and reduces the computational cost dramatically. The mDice score of the proposed model is close to the best score, with only a 1.2% difference but two times fewer FLOPs.<\/jats:p>","DOI":"10.3390\/s23042244","type":"journal-article","created":{"date-parts":[[2023,2,17]],"date-time":"2023-02-17T01:32:56Z","timestamp":1676597576000},"page":"2244","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Fast Attention CNN for Fine-Grained Crack Segmentation"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0369-8742","authenticated-orcid":false,"given":"Hyunnam","family":"Lee","sequence":"first","affiliation":[{"name":"Incheon International Airport Corporation, Incheon 22382, Republic of Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2329-7375","authenticated-orcid":false,"given":"Juhan","family":"Yoo","sequence":"additional","affiliation":[{"name":"Department of Computer, Semyung University, Jecheon 02468, Republic of Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,2,16]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"Segnet: A deep convolutional encoder-decoder architecture for image segmentation","volume":"39","author":"Badrinarayanan","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Jha, D., Riegler, M.A., Johansen, D., Halvorsen, P., and Johansen, H.D. (2020, January 28\u201330). Doubleu-net: A deep convolutional neural network for medical image segmentation. Proceedings of the 2020 IEEE 33rd International Symposium on Computer-Based Medical Systems (CBMS), Rochester, MN, USA.","DOI":"10.1109\/CBMS49503.2020.00111"},{"key":"ref_4","unstructured":"Chen, L.C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking atrous convolution for semantic image segmentation. arXiv."},{"key":"ref_5","unstructured":"Ho, J., Kalchbrenner, N., Weissenborn, D., and Salimans, T. (2019). Axial attention in multidimensional transformers. arXiv."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Liu, H., Miao, X., Mertz, C., Xu, C., and Kong, H. (2021, January 11\u201317). CrackFormer: Transformer network for fine-grained crack detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Virtual.","DOI":"10.1109\/ICCV48922.2021.00376"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1525","DOI":"10.1109\/TITS.2019.2910595","article-title":"Feature pyramid and hierarchical boosting network for pavement crack detection","volume":"21","author":"Yang","year":"2019","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1498","DOI":"10.1109\/TIP.2018.2878966","article-title":"Deepcrack: Learning hierarchical convolutional features for crack detection","volume":"28","author":"Zou","year":"2018","journal-title":"IEEE Trans. Image Process."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). Microsoft coco: Common objects in context. Proceedings of the European Conference on Computer Vision, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Chun, P.j., Yamane, T., and Tsuzuki, Y. (2021). Automatic detection of cracks in asphalt pavement using deep learning to overcome weaknesses in images and GIS visualization. Appl. Sci., 11.","DOI":"10.3390\/app11030892"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"5325","DOI":"10.1109\/TIM.2019.2959292","article-title":"NB-FCN: Real-time accurate crack detection in inspection videos using deep fully convolutional network and parametric data fusion","volume":"69","author":"Chen","year":"2019","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"110157","DOI":"10.1016\/j.engstruct.2019.110157","article-title":"Increasing the robustness of material-specific deep learning models for crack detection across different materials","volume":"206","author":"Alipour","year":"2020","journal-title":"Eng. Struct."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Alfarrarjeh, A., Trivedi, D., Kim, S.H., and Shahabi, C. (2018, January 10\u201313). A deep learning approach for road damage detection from smartphone images. Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA.","DOI":"10.1109\/BigData.2018.8621899"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"805","DOI":"10.1111\/mice.12297","article-title":"Automated pixel-level pavement crack detection on 3D asphalt surfaces using a deep-learning network","volume":"32","author":"Zhang","year":"2017","journal-title":"Comput.-Aided Civ. Infrastruct. Eng."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1109\/TITS.2019.2891167","article-title":"Pixel-level cracking detection on 3D asphalt pavement images through deep-learning-based CrackNet-V","volume":"21","author":"Fei","year":"2019","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1016\/j.autcon.2018.11.028","article-title":"Autonomous concrete crack detection using deep fully convolutional neural network","volume":"99","author":"Dung","year":"2019","journal-title":"Autom. Constr."},{"key":"ref_17","first-page":"1","article-title":"An effective hybrid atrous convolutional network for pixel-level crack detection","volume":"70","author":"Chen","year":"2021","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Fan, D.P., Ji, G.P., Zhou, T., Chen, G., Fu, H., Shen, J., and Shao, L. (2020, January 4\u20138). Pranet: Parallel reverse attention network for polyp segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Lima, Peru.","DOI":"10.1007\/978-3-030-59725-2_26"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Lou, A., Guan, S., Ko, H., and Loew, M.H. (2022, January 20\u201314). CaraNet: Context axial reverse attention network for segmentation of small medical objects. Proceedings of the Medical Imaging 2022: Image Processing, San Diego, CA, USA.","DOI":"10.1117\/12.2611802"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Kim, T., Lee, H., and Kim, D. (2021, January 20\u201324). UACANet: Uncertainty augmented context attention for polyp segmentation. Proceedings of the 29th ACM International Conference on Multimedia, Virtual.","DOI":"10.1145\/3474085.3475375"},{"key":"ref_21","unstructured":"Oktay, O., Schlemper, J., Folgoc, L.L., Lee, M., Heinrich, M., Misawa, K., Mori, K., McDonagh, S., Hammerla, N.Y., and Kainz, B. (2018). Attention u-net: Learning where to look for the pancreas. arXiv."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201322). Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1016\/j.neucom.2019.01.036","article-title":"DeepCrack: A deep hierarchical feature learning architecture for crack segmentation","volume":"338","author":"Liu","year":"2019","journal-title":"Neurocomputing"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"8016","DOI":"10.1109\/TIE.2019.2945265","article-title":"SDDNet: Real-time crack segmentation","volume":"67","author":"Choi","year":"2020","journal-title":"IEEE Trans. Ind. Electron."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"114892","DOI":"10.1109\/ACCESS.2020.3003638","article-title":"Automated pavement crack segmentation using U-Net-based convolutional neural network","volume":"8","author":"Lau","year":"2020","journal-title":"IEEE Access"},{"key":"ref_26","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv."},{"key":"ref_27","first-page":"12116","article-title":"Do vision transformers see like convolutional neural networks?","volume":"34","author":"Raghu","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Wang, J., Huang, Q., Tang, F., Meng, J., Su, J., and Song, S. (2022). Stepwise Feature Fusion: Local Guides Global. arXiv.","DOI":"10.1007\/978-3-031-16437-8_11"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Wu, H., Xiao, B., Codella, N., Liu, M., Dai, X., Yuan, L., and Zhang, L. (2021, January 11\u201317). Cvt: Introducing convolutions to vision transformers. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Virtual.","DOI":"10.1109\/ICCV48922.2021.00009"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Wang, W., Xie, E., Li, X., Fan, D.P., Song, K., Liang, D., Lu, T., Luo, P., and Shao, L. (2021, January 11\u201317). Pyramid vision transformer: A versatile backbone for dense prediction without convolutions. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Virtual.","DOI":"10.1109\/ICCV48922.2021.00061"},{"key":"ref_31","first-page":"12077","article-title":"SegFormer: Simple and efficient design for semantic segmentation with transformers","volume":"34","author":"Xie","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Sanderson, E., and Matuszewski, B.J. (2022, January 27\u201329). FCN-transformer feature fusion for polyp segmentation. Proceedings of the Medical Image Understanding and Analysis: 26th Annual Conference, MIUA 2022, Cambridge, UK.","DOI":"10.1007\/978-3-031-12053-4_65"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1664","DOI":"10.1016\/j.dib.2018.11.015","article-title":"SDNET2018: An annotated image dataset for non-contact concrete crack detection using deep convolutional neural networks","volume":"21","author":"Dorafshan","year":"2018","journal-title":"Data Brief"},{"key":"ref_34","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_35","unstructured":"Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., and Isard, M. (2016, January 2\u20134). \u201cTensorFlow\u201d: A system for Large-Scale machine learning. Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), Savannah, GA, USA."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/4\/2244\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:38:31Z","timestamp":1760121511000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/4\/2244"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,16]]},"references-count":35,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2023,2]]}},"alternative-id":["s23042244"],"URL":"https:\/\/doi.org\/10.3390\/s23042244","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,16]]}}}