{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T22:09:01Z","timestamp":1740175741645,"version":"3.37.3"},"reference-count":45,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2022,12,10]],"date-time":"2022-12-10T00:00:00Z","timestamp":1670630400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,12,10]],"date-time":"2022-12-10T00:00:00Z","timestamp":1670630400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Major Program of National Natural Science Foundation of China","award":["71790614"],"award-info":[{"award-number":["71790614"]}]},{"name":"the Fund for the National Natural Science Foundation of China","award":["62073067"],"award-info":[{"award-number":["62073067"]}]},{"name":"the Fundamental Research Funds for the Central Universities","award":["N2128001"],"award-info":[{"award-number":["N2128001"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2023,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>As the unique recognition of each slab, the accurate recognition of slab number is especially critical for the hot rolling production process. However, the collected data are often of low quality due to poor production environment conditions, making traditional deep learning algorithms face more significant challenges in slab numbers recognition. In this paper, a two-stage hybrid algorithm based on convolutional neural network and Transformer is proposed to identify industrial slab numbers. In the first stage, an improved CycleGAN (HybridCy) is developed to enhance the quality of real-world unpaired data. 
In the second stage, a multi-scale hybrid vision transformer model (MSHy-Vit) is proposed to recognize slab numbers from the quality-improved data produced in the first stage. Experimental results on industrial slab data show that HybridCy exhibits stable and efficient performance. Even for low-quality data with severe geometric distortion, HybridCy can still improve data quality, which in turn helps recognition accuracy. In addition, MSHy-Vit achieves superior accuracy in slab number recognition compared with existing methods in the literature.<\/jats:p>","DOI":"10.1007\/s40747-022-00933-0","type":"journal-article","created":{"date-parts":[[2022,12,10]],"date-time":"2022-12-10T09:02:38Z","timestamp":1670662958000},"page":"3367-3384","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Two-stage hybrid algorithm for recognition of industrial slab numbers with data quality improvement"],"prefix":"10.1007","volume":"9","author":[{"given":"Qingqing","family":"Liu","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8132-9446","authenticated-orcid":false,"given":"Xianpeng","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Xiangman","family":"Song","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,12,10]]},"reference":[{"issue":"2","key":"933_CR1","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1007\/s42524-020-0126-0","volume":"8","author":"L Tang","year":"2021","unstructured":"Tang L, Meng Y (2021) Data analytics and optimization for smart industry. 
Front Eng Manag 8(2):157\u2013171","journal-title":"Front Eng Manag"},{"issue":"2","key":"933_CR2","doi-asserted-by":"publisher","first-page":"1525","DOI":"10.1007\/s40747-021-00600-w","volume":"8","author":"Q Zhang","year":"2022","unstructured":"Zhang Q, Zhang M, Gamanayake C, Yuen C, Geng Z, Jayasekara H, Woo C-W, Low J, Liu X, Guan YL (2022) Deep learning based solder joint defect detection on industrial printed circuit board X-ray images. Complex Intell Syst 8(2):1525\u20131537. https:\/\/doi.org\/10.1007\/s40747-021-00600-w","journal-title":"Complex Intell Syst"},{"key":"933_CR3","doi-asserted-by":"publisher","DOI":"10.1007\/s40747-022-00733-6","author":"HN Monday","year":"2022","unstructured":"Monday HN, Li J, Nneji GU, Nahar S, Hossin MA, Jackson J, Oluwasanmi A (2022) A wavelet convolutional capsule network with modified super resolution generative adversarial network for fault diagnosis and classification. Complex Intell Syst. https:\/\/doi.org\/10.1007\/s40747-022-00733-6","journal-title":"Complex Intell Syst"},{"issue":"3","key":"933_CR4","doi-asserted-by":"publisher","first-page":"1173","DOI":"10.1007\/s40747-020-00205-9","volume":"7","author":"Y Li","year":"2021","unstructured":"Li Y, Wang C, Gao L, Song Y, Li X (2021) An improved simulated annealing algorithm based on residual network for permutation flow shop scheduling. Complex Intell Syst 7(3):1173\u20131183. https:\/\/doi.org\/10.1007\/s40747-020-00205-9","journal-title":"Complex Intell Syst"},{"key":"933_CR5","doi-asserted-by":"publisher","unstructured":"Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR), pp 1\u20139. 
https:\/\/doi.org\/10.1109\/cvpr.2015.7298594","DOI":"10.1109\/cvpr.2015.7298594"},{"key":"933_CR6","doi-asserted-by":"publisher","unstructured":"Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint (2014). https:\/\/doi.org\/10.48550\/arXiv.1409.1556","DOI":"10.48550\/arXiv.1409.1556"},{"key":"933_CR7","doi-asserted-by":"publisher","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition, pp 770\u2013778. https:\/\/doi.org\/10.1109\/cvpr.2016.90","DOI":"10.1109\/cvpr.2016.90"},{"key":"933_CR8","doi-asserted-by":"publisher","unstructured":"Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J, Houlsby N (2021) An image is worth 16x16 words: transformers for image recognition at scale. In: International conference on learning representations. https:\/\/doi.org\/10.48550\/arXiv.2010.11929","DOI":"10.48550\/arXiv.2010.11929"},{"key":"933_CR9","unstructured":"Touvron H, Cord M, Douze M, Massa F, Sablayrolles A, Jegou H (2021) Training data-efficient image transformers and distillation through attention. In: International conference on machine learning, vol 139, pp 10347\u201310357"},{"key":"933_CR10","doi-asserted-by":"publisher","unstructured":"Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, Lin S, Guo B (2021) Swin transformer: hierarchical vision transformer using shifted windows. In: 2021 IEEE\/CVF international conference on computer vision (ICCV), pp 9992\u201310002. https:\/\/doi.org\/10.1109\/ICCV48922.2021.00986","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"933_CR11","doi-asserted-by":"publisher","unstructured":"Shi B, Wang X, Lyu P, Yao C, Bai X (2016) Robust scene text recognition with automatic rectification. In: 2016 IEEE conference on computer vision and pattern recognition, pp 4168\u20134176. 
https:\/\/doi.org\/10.1109\/cvpr.2016.452","DOI":"10.1109\/cvpr.2016.452"},{"issue":"11","key":"933_CR12","doi-asserted-by":"publisher","first-page":"2298","DOI":"10.1109\/tpami.2016.2646371","volume":"39","author":"B Shi","year":"2017","unstructured":"Shi B, Bai X, Yao C (2017) An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Trans Pattern Anal Mach Intell 39(11):2298\u20132304. https:\/\/doi.org\/10.1109\/tpami.2016.2646371","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"933_CR13","doi-asserted-by":"publisher","unstructured":"Liao M, Zhu Z, Shi B, Xia G-S, Bai X (2018) Rotation-sensitive regression for oriented scene text detection. In: IEEE conference on computer vision and pattern recognition, pp 5909\u20135918. https:\/\/doi.org\/10.1109\/cvpr.2018.00619","DOI":"10.1109\/cvpr.2018.00619"},{"key":"933_CR14","doi-asserted-by":"crossref","unstructured":"Wang T, Zhu Y, Jin L, Luo C, Chen X, Wu Y, Wang Q, Cai M (2020) Decoupled attention network for text recognition. In: AAAI conference on artificial intelligence","DOI":"10.1609\/aaai.v34i07.6903"},{"key":"933_CR15","doi-asserted-by":"publisher","unstructured":"Yu D, Li X, Zhang C, Liu T, Han J, Liu J, Ding E (2020) Towards accurate scene text recognition with semantic reasoning networks. In: 2020 IEEE\/CVF conference on computer vision and pattern recognition (CVPR), pp 12110\u201312119. https:\/\/doi.org\/10.1109\/CVPR42600.2020.01213","DOI":"10.1109\/CVPR42600.2020.01213"},{"key":"933_CR16","doi-asserted-by":"publisher","unstructured":"Fang S, Xie H, Wang Y, Mao Z, Zhang Y (2021) Read like humans: autonomous, bidirectional and iterative language modeling for scene text recognition. In: 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 7094\u20137103. 
https:\/\/doi.org\/10.1109\/CVPR46437.2021.00702","DOI":"10.1109\/CVPR46437.2021.00702"},{"key":"933_CR17","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.knosys.2017.06.017","volume":"132","author":"SJ Lee","year":"2017","unstructured":"Lee SJ, Yun JP, Koo G, Kim SW (2017) End-to-end recognition of slab identification numbers using a deep convolutional neural network. Knowl Based Syst 132:1\u201310. https:\/\/doi.org\/10.1016\/j.knosys.2017.06.017","journal-title":"Knowl Based Syst"},{"issue":"4","key":"933_CR18","doi-asserted-by":"publisher","first-page":"696","DOI":"10.2355\/isijinternational.isijint-2017-695","volume":"58","author":"SJ Lee","year":"2018","unstructured":"Lee SJ, Kwon W, Koo G, Choi H, Kim SW (2018) Recognition of slab identification numbers using a fully convolutional network. ISIJ Int 58(4):696\u2013703. https:\/\/doi.org\/10.2355\/isijinternational.isijint-2017-695","journal-title":"ISIJ Int"},{"key":"933_CR19","doi-asserted-by":"publisher","first-page":"23177","DOI":"10.1109\/access.2019.2899109","volume":"7","author":"SJ Lee","year":"2019","unstructured":"Lee SJ, Kim SW, Kwon W, Koo G, Yun JP (2019) Selective distillation of weakly annotated gtd for vision-based slab identification system. IEEE Access 7:23177\u201323186. https:\/\/doi.org\/10.1109\/access.2019.2899109","journal-title":"IEEE Access"},{"key":"933_CR20","doi-asserted-by":"publisher","unstructured":"Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: Ghahramani Z, Welling M, Cortes C, Lawrence N, Weinberger KQ (eds) Advances in neural information processing systems, vol 27. Curran Associates, Inc. 
https:\/\/doi.org\/10.5555\/2969033.2969125","DOI":"10.5555\/2969033.2969125"},{"issue":"1","key":"933_CR21","doi-asserted-by":"publisher","first-page":"145","DOI":"10.1109\/18.61115","volume":"37","author":"J Lin","year":"1991","unstructured":"Lin J (1991) Divergence measures based on the Shannon entropy. IEEE Trans Inf Theory 37(1):145\u2013151. https:\/\/doi.org\/10.1109\/18.61115","journal-title":"IEEE Trans Inf Theory"},{"key":"933_CR22","doi-asserted-by":"publisher","unstructured":"Arjovsky M, Chintala S, Bottou L (2017) Wasserstein GAN. https:\/\/doi.org\/10.1109\/iccvw54120.2021.00217","DOI":"10.1109\/iccvw54120.2021.00217"},{"issue":"2","key":"933_CR23","doi-asserted-by":"publisher","first-page":"933","DOI":"10.1007\/s40747-021-00558-9","volume":"8","author":"M Hassan","year":"2022","unstructured":"Hassan M, Wang Y, Pang W, Wang D, Li D, Zhou Y, Xu D (2022) GUV-Net for high fidelity shoeprint generation. Complex Intell Syst 8(2):933\u2013947. https:\/\/doi.org\/10.1007\/s40747-021-00558-9","journal-title":"Complex Intell Syst"},{"key":"933_CR24","doi-asserted-by":"publisher","first-page":"75","DOI":"10.1016\/j.neucom.2021.02.054","volume":"443","author":"X Nie","year":"2021","unstructured":"Nie X, Ding H, Qi M, Wang Y, Wong EK (2021) Urca-gan: upsample residual channel-wise attention generative adversarial network for image-to-image translation. Neurocomputing 443:75\u201384. https:\/\/doi.org\/10.1016\/j.neucom.2021.02.054","journal-title":"Neurocomputing"},{"key":"933_CR25","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-021-05975-y","author":"V Bharti","year":"2021","unstructured":"Bharti V, Biswas B, Shukla KK (2021) EMOCGAN: a novel evolutionary multiobjective cyclic generative adversarial network and its application to unpaired image translation. Neural Comput Appl. 
https:\/\/doi.org\/10.1007\/s00521-021-05975-y","journal-title":"Neural Comput Appl"},{"key":"933_CR26","doi-asserted-by":"publisher","DOI":"10.1109\/iccv48922.2021.00471","author":"X Chen","year":"2021","unstructured":"Chen X, Pan J, Jiang K, Huang Y, Kong C, Dai L, Li Y (2021) Unpaired adversarial learning for single image deraining with rain-space contrastive. Constraints. https:\/\/doi.org\/10.1109\/iccv48922.2021.00471","journal-title":"Constraints"},{"key":"933_CR27","doi-asserted-by":"publisher","unstructured":"Wang X, Xie L, Dong C, Shan Y (2021) Real-ESRGAN: training real-world blind super-resolution with pure synthetic data. https:\/\/doi.org\/10.1109\/iccvw54120.2021.00217","DOI":"10.1109\/iccvw54120.2021.00217"},{"key":"933_CR28","doi-asserted-by":"publisher","unstructured":"Ledig C, Theis L, Husz\u00e1r F, Caballero J, Cunningham A, Acosta A, Aitken A, Tejani A, Totz J, Wang Z, Shi W (2017) Photo-realistic single image super-resolution using a generative adversarial network. In: IEEE conference on computer vision and pattern recognition, pp 105\u2013114. https:\/\/doi.org\/10.1109\/cvpr.2017.19","DOI":"10.1109\/cvpr.2017.19"},{"key":"933_CR29","doi-asserted-by":"publisher","unstructured":"Zhu J-Y, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In: IEEE international conference on computer vision, pp 2242\u20132251. https:\/\/doi.org\/10.1109\/iccv.2017.244","DOI":"10.1109\/iccv.2017.244"},{"key":"933_CR30","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-021-06296-w","author":"H Sun","year":"2021","unstructured":"Sun H, Zhang Y, Chen P, Dan Z, Sun S, Wan J, Li W (2021) Scale-free heterogeneous cycleGAN for defogging from a single image for autonomous driving in fog. Neural Comput Appl. 
https:\/\/doi.org\/10.1007\/s00521-021-06296-w","journal-title":"Neural Comput Appl"},{"key":"933_CR31","unstructured":"Jiang Z, Hou Q, Yuan L, Zhou D, Shi Y, Jin X, Wang A, Feng J (2021) All tokens matter: token labeling for training better vision transformers. arXiv preprint arXiv:2104.10858"},{"key":"933_CR32","unstructured":"Xiao T, Singh M, Mintun E, Darrell T, Doll\u00e1r P, Girshick RB (2021) Early convolutions help transformers see better. In: NeurIPS"},{"key":"933_CR33","first-page":"28522","volume":"34","author":"Y Xu","year":"2021","unstructured":"Xu Y, Zhang Q, Zhang J, Tao D (2021) Vitae: vision transformer advanced by exploring intrinsic inductive bias. Adv Neural Inf Process Syst 34:28522\u201328535","journal-title":"Adv Neural Inf Process Syst"},{"key":"933_CR34","unstructured":"Dai Z, Liu H, Le QV, Tan M (2021) Coatnet: marrying convolution and attention for all data sizes. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds) Advances in neural information processing systems, vol 34. Curran Associates, Inc., pp 3965\u20133977. https:\/\/proceedings.neurips.cc\/paper\/2021\/file\/20568692db622456cc42a2e853ca21f8-Paper.pdf"},{"key":"933_CR35","unstructured":"Wang Y, Huang R, Song S, Huang Z, Huang G (2021) Not all images are worth 16x16 words: Dynamic transformers for efficient image recognition. In: Advances in neural information processing systems (NeurIPS)"},{"key":"933_CR36","doi-asserted-by":"publisher","unstructured":"Shi W, Caballero J, Husz\u00e1r F, Totz J, Aitken AP, Bishop R, Rueckert D, Wang Z (2016) Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 1874\u20131883. 
https:\/\/doi.org\/10.1109\/CVPR.2016.207","DOI":"10.1109\/CVPR.2016.207"},{"key":"933_CR37","doi-asserted-by":"publisher","unstructured":"Liu Y, Shao Z, Teng Y, Hoffmann N (2021) Nam: normalization-based attention module. arXiv preprint https:\/\/doi.org\/10.48550\/arXiv.2111.12419","DOI":"10.48550\/arXiv.2111.12419"},{"key":"933_CR38","doi-asserted-by":"publisher","unstructured":"Chen C-F, Fan Q, Panda R (2021) Crossvit: cross-attention multi-scale vision transformer for image classification. https:\/\/doi.org\/10.1109\/iccv48922.2021.00041. arXiv preprint arXiv:2103.14899","DOI":"10.1109\/iccv48922.2021.00041"},{"key":"933_CR39","doi-asserted-by":"publisher","unstructured":"He K, Zhang X, Ren S, Sun J (2015) Delving deep into rectifiers: surpassing human-level performance on imagenet classification. In: Proceedings of the 2015 IEEE international conference on computer vision (ICCV). ICCV \u201915, pp 1026\u20131034. IEEE Computer Society, USA. https:\/\/doi.org\/10.1109\/ICCV.2015.123","DOI":"10.1109\/ICCV.2015.123"},{"key":"933_CR40","doi-asserted-by":"publisher","unstructured":"Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: IEEE conference on computer vision and pattern recognition, pp 2261\u20132269. https:\/\/doi.org\/10.1109\/csci46756.2018.00084","DOI":"10.1109\/csci46756.2018.00084"},{"key":"933_CR41","doi-asserted-by":"crossref","unstructured":"Zagoruyko S, Komodakis N (2016) Wide residual networks. In: BMVC","DOI":"10.5244\/C.30.87"},{"key":"933_CR42","doi-asserted-by":"publisher","unstructured":"Chen X, Hsieh C, Gong B (2021) When vision transformers outperform ResNets without pretraining or strong data augmentations. arXiv preprint. https:\/\/doi.org\/10.48550\/arXiv.2106.01548","DOI":"10.48550\/arXiv.2106.01548"},{"key":"933_CR43","doi-asserted-by":"publisher","unstructured":"Yuan K, Guo S, Liu Z, Zhou A, Yu F, Wu W (2021) Incorporating convolution designs into visual transformers. 
In: 2021 IEEE\/CVF international conference on computer vision (ICCV), pp 559\u2013568. https:\/\/doi.org\/10.1109\/ICCV48922.2021.00062","DOI":"10.1109\/ICCV48922.2021.00062"},{"key":"933_CR44","doi-asserted-by":"crossref","unstructured":"Yuan L, Chen Y, Wang T, Yu W, Shi Y, Jiang Z-H, Tay FEH, Feng J, Yan S (2021) Tokens-to-token vit: training vision transformers from scratch on imagenet. In: Proceedings of the IEEE\/CVF international conference on computer vision (ICCV), pp 558\u2013567","DOI":"10.1109\/ICCV48922.2021.00060"},{"key":"933_CR45","doi-asserted-by":"crossref","unstructured":"Kolesnikov A, Beyer L, Zhai X, Puigcerver J, Yung J, Gelly S, Houlsby N (2020) Big transfer (bit): General visual representation learning. In: Vedaldi A, Bischof H, Brox T, Frahm J-M (eds) Computer vision\u2014ECCV 2020. Springer, Cham, pp 491\u2013507","DOI":"10.1007\/978-3-030-58558-7_29"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-022-00933-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-022-00933-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-022-00933-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,9]],"date-time":"2023-06-09T17:12:43Z","timestamp":1686330763000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-022-00933-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,12,10]]},"references-count":45,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2023,6]]}},"alternative-id":["933"],"URL":"https:\/\/doi.org\/10.1007\/s40747-022-00933-0","relation":{
},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"type":"print","value":"2199-4536"},{"type":"electronic","value":"2198-6053"}],"subject":[],"published":{"date-parts":[[2022,12,10]]},"assertion":[{"value":"7 July 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 November 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 December 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they do not have any commercial or associative interest that represents a conflict of interest in connection with the submitted work.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}