{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T17:15:34Z","timestamp":1760116534991,"version":"build-2065373602"},"reference-count":36,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2024,11,7]],"date-time":"2024-11-07T00:00:00Z","timestamp":1730937600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Leshan Special Robot Engineering Technology Research Center"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Text CAPTCHAs are crucial security measures deployed on global websites to deter unauthorized intrusions. The presence of anti-attack features incorporated into text CAPTCHAs limits the effectiveness of evaluating them, despite CAPTCHA recognition being an effective method for assessing their security. This study introduces a novel color augmentation technique called Variational Color Shift (VCS) to boost the recognition accuracy of different networks. VCS generates a color shift of every input image and then resamples the image within that range to generate a new image, thus expanding the number of samples of the original dataset to improve training effectiveness. In contrast to Random Color Shift (RCS), which treats the color offsets as hyperparameters, VCS estimates color shifts by reparametrizing the points sampled from the uniform distribution using predicted offsets according to every image, which makes the color shifts learnable. To better balance the computation and performance, we also propose two variants of VCS: Sim-VCS and Dilated-VCS. In addition, to solve the overfitting problem caused by disturbances in text CAPTCHAs, we propose an Auto-Encoder (AE) based on Large Separable Kernel Attention (AE-LSKA) to replace the convolutional module with large kernels in the text CAPTCHA recognizer. 
This new module employs an AE to compress the interference while expanding the receptive field using Large Separable Kernel Attention (LSKA), reducing the impact of local interference on the model training and improving the overall perception of characters. The experimental results show that the recognition accuracy of the model after integrating the AE-LSKA module is improved by at least 15 percentage points on both M-CAPTCHA and P-CAPTCHA datasets. In addition, experimental results demonstrate that color augmentation using VCS is more effective in enhancing recognition, which has higher accuracy compared to RCS and PCA Color Shift (PCA-CS).<\/jats:p>","DOI":"10.3390\/info15110717","type":"journal-article","created":{"date-parts":[[2024,11,7]],"date-time":"2024-11-07T12:12:13Z","timestamp":1730981533000},"page":"717","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Variational Color Shift and Auto-Encoder Based on Large Separable Kernel Attention for Enhanced Text CAPTCHA Vulnerability Assessment"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0009-0004-0825-8193","authenticated-orcid":false,"given":"Xing","family":"Wan","sequence":"first","affiliation":[{"name":"School of Intelligent Manufacturing, Leshan Vocational and Technical College, Leshan 614000, China"},{"name":"School of Electrical Engineering, Universiti Teknologi MARA (UiTM), Shah Alam 40450, Malaysia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0917-7318","authenticated-orcid":false,"given":"Juliana","family":"Johari","sequence":"additional","affiliation":[{"name":"School of Electrical Engineering, Universiti Teknologi MARA (UiTM), Shah Alam 40450, Malaysia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fazlina Ahmat","family":"Ruslan","sequence":"additional","affiliation":[{"name":"School of Electrical Engineering, Universiti 
Teknologi MARA (UiTM), Shah Alam 40450, Malaysia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2024,11,7]]},"reference":[{"doi-asserted-by":"crossref","unstructured":"Setiawan, A.B., and Sastrosubroto, A.S. (2016, January 3\u20135). Strengthening the Security of Critical Data in Cyberspace, a Policy Review. Proceedings of the 2016 International Conference on Computer, Control, Informatics and its Applications (IC3INA), Tangerang, Indonesia.","key":"ref_1","DOI":"10.1109\/IC3INA.2016.7863047"},{"doi-asserted-by":"crossref","unstructured":"Biham, E. (2003). CAPTCHA: Using Hard AI Problems for Security. Advances in Cryptology\u2014EUROCRYPT 2003, Springer.","key":"ref_2","DOI":"10.1007\/3-540-39200-9"},{"doi-asserted-by":"crossref","unstructured":"Yan, J., and El Ahmad, A.S. (2008, January 23\u201325). Usability of CAPTCHAs or Usability Issues in CAPTCHA Design. Proceedings of the 4th Symposium on Usable Privacy and Security\u2014SOUPS \u201908, Pittsburgh, PA, USA.","key":"ref_3","DOI":"10.1145\/1408664.1408671"},{"key":"ref_4","first-page":"164","article-title":"Evaluating the Usability of Optimizing Text-Based CAPTCHA Generation","volume":"7","author":"Alsuhibany","year":"2016","journal-title":"Int. J. Adv. Comput. Sci. Appl. IJACSA"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"5851","DOI":"10.3934\/mbe.2019292","article-title":"CAPTCHA Recognition Based on Deep Convolutional Neural Network","volume":"16","author":"Wang","year":"2019","journal-title":"Math. Biosci. Eng."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3477142","article-title":"Gotta CAPTCHA \u2019Em All: A Survey of 20 Years of the Human-or-Computer Dilemma","volume":"54","author":"Guerar","year":"2022","journal-title":"ACM Comput. Surv."},{"doi-asserted-by":"crossref","unstructured":"Baird, H.S., and Lopresti, D.P. (2005). 
Building Segmentation Based Human-Friendly Human Interaction Proofs (HIPs). Human Interactive Proofs, Springer.","key":"ref_7","DOI":"10.1007\/b136509"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2575","DOI":"10.1109\/TMM.2020.3013376","article-title":"Robust CAPTCHAs Towards Malicious OCR","volume":"23","author":"Zhang","year":"2021","journal-title":"IEEE Trans. Multimed."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"196:1","DOI":"10.1145\/3559754","article-title":"An Experimental Investigation of Text-Based CAPTCHA Attacks and Their Robustness","volume":"55","author":"Wang","year":"2023","journal-title":"ACM Comput. Surv."},{"doi-asserted-by":"crossref","unstructured":"Xing, W., Mohd, M.R.S., Johari, J., and Ruslan, F.A. (June, January 29). A Review on Text-Based CAPTCHA Breaking Based on Deep Learning Methods. Proceedings of the 2023 International Conference on Computer Engineering and Distance Learning (CEDL), Shanghai, China.","key":"ref_10","DOI":"10.1109\/CEDL60560.2023.00040"},{"unstructured":"Ge, Z., Liu, S., Wang, F., Li, Z., and Sun, J. (2021). YOLOX: Exceeding YOLO Series in 2021. arXiv.","key":"ref_11"},{"unstructured":"Li, C., Li, L., Jiang, H., Weng, K., Geng, Y., Li, L., Ke, Z., Li, Q., Cheng, M., and Nie, W. (2022). YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications. arXiv.","key":"ref_12"},{"doi-asserted-by":"crossref","unstructured":"Wang, C.-Y., Bochkovskiy, A., and Liao, H.-Y.M. (2023, January 18\u201322). YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","key":"ref_13","DOI":"10.1109\/CVPR52729.2023.00721"},{"doi-asserted-by":"crossref","unstructured":"Walia, J.S., and Odugoudar, A. (2023, January 8). Vulnerability Analysis of Captcha Using Deep Learning. 
Proceedings of the 2023 IEEE International Conference on ICT in Business Industry & Government (ICTBIG), online.","key":"ref_14","DOI":"10.1109\/ICTBIG59752.2023.10456218"},{"unstructured":"Wang, Z., Wang, P., Liu, K., Wang, P., Fu, Y., Lu, C.-T., Aggarwal, C.C., Pei, J., and Zhou, Y. (2024). A Comprehensive Survey on Data Augmentation. arXiv.","key":"ref_15"},{"doi-asserted-by":"crossref","unstructured":"Bursztein, E., Martin, M., and Mitchell, J.C. (2011, January 17\u201321). Text-Based CAPTCHA Strengths and Weaknesses. Proceedings of the 18th ACM Conference on Computer & Communications Security (CCS 11), Chicago, IL, USA.","key":"ref_16","DOI":"10.1145\/2046707.2046724"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1016\/j.neunet.2022.06.041","article-title":"Breaking CAPTCHA with Capsule Networks","volume":"154","author":"Mocanu","year":"2022","journal-title":"Neural Netw."},{"doi-asserted-by":"crossref","unstructured":"Shi, Y., Liu, X., Han, S., Lu, Y., and Zhang, X. (2021, January 28\u201330). A Transformer Network for CAPTCHA Recognition. Proceedings of the 2021 2nd International Conference on Artificial Intelligence and Information Systems, Chongqing, China.","key":"ref_18","DOI":"10.1145\/3469213.3470366"},{"doi-asserted-by":"crossref","unstructured":"Qing, K., and Zhang, R. (2022, January 22\u201325). An Efficient ConvNet for Text-Based CAPTCHA Recognition. Proceedings of the 2022 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), Penang, Malaysia.","key":"ref_19","DOI":"10.1109\/ISPACS57703.2022.10082852"},{"doi-asserted-by":"crossref","unstructured":"Noury, Z., and Rezaei, M. (2020). Deep-CAPTCHA: A Deep Learning Based CAPTCHA Solver for Vulnerability Assessment. arXiv.","key":"ref_20","DOI":"10.31219\/osf.io\/km35b"},{"doi-asserted-by":"crossref","unstructured":"Wan, X., Johari, J., and Ruslan, F.A. (2024). 
Adaptive CAPTCHA: A CRNN-Based Text CAPTCHA Solver with Adaptive Fusion Filter Networks. Appl. Sci., 14.","key":"ref_21","DOI":"10.3390\/app14125016"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"ImageNet Classification with Deep Convolutional Neural Networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun. ACM"},{"doi-asserted-by":"crossref","unstructured":"Wang, X., and Yu, J. (2020, January 14\u201319). Learning to Cartoonize Using White-Box Cartoon Representations. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","key":"ref_23","DOI":"10.1109\/CVPR42600.2020.00811"},{"doi-asserted-by":"crossref","unstructured":"Ishkov, D.O., and Terekhov, V.I. (2022, January 17\u201319). Text CAPTCHA Traversal with ConvNets: Impact of Color Channels. Proceedings of the 2022 4th International Youth Conference on Radio Electronics, Electrical and Power Engineering (REEPE), Moscow, Russia.","key":"ref_24","DOI":"10.1109\/REEPE53907.2022.9731423"},{"doi-asserted-by":"crossref","unstructured":"Ferrari, V., Hebert, M., Sminchisescu, C., and Weiss, Y. (2018, January 8\u201314). CBAM: Convolutional Block Attention Module. Proceedings of the Computer Vision\u2014ECCV, Munich, Germany.","key":"ref_25","DOI":"10.1007\/978-3-030-01249-6"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"2011","DOI":"10.1109\/TPAMI.2019.2913372","article-title":"Squeeze-and-Excitation Networks","volume":"42","author":"Hu","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"doi-asserted-by":"crossref","unstructured":"Zhang, Q.-L., and Yang, Y.-B. (2021, January 6\u201311). SA-Net: Shuffle Attention for Deep Convolutional Neural Networks. 
Proceedings of the ICASSP 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada.","key":"ref_27","DOI":"10.1109\/ICASSP39728.2021.9414568"},{"doi-asserted-by":"crossref","unstructured":"Wang, Q., Wu, B., Zhu, P., Li, P., Zuo, W., and Hu, Q. (2020, January 14\u201319). ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","key":"ref_28","DOI":"10.1109\/CVPR42600.2020.01155"},{"key":"ref_29","first-page":"6789","article-title":"Non-Deep Networks","volume":"35","author":"Goyal","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst."},{"doi-asserted-by":"crossref","unstructured":"Cao, Y., Xu, J., Lin, S., Wei, F., and Hu, H. (2019, January 27\u201328). GCNet: Non-Local Networks Meet Squeeze-Excitation Networks and Beyond. Proceedings of the IEEE\/CVF International Conference on Computer Vision Workshops, Seoul, Republic of Korea.","key":"ref_30","DOI":"10.1109\/ICCVW.2019.00246"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"121352","DOI":"10.1016\/j.eswa.2023.121352","article-title":"Large Separable Kernel Attention: Rethinking the Large Kernel Attention Design in CNN","volume":"236","author":"Lau","year":"2024","journal-title":"Expert. Syst. Appl."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"733","DOI":"10.1007\/s41095-023-0364-2","article-title":"Visual Attention Network","volume":"9","author":"Guo","year":"2023","journal-title":"Comput. Vis. Media"},{"unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv.","key":"ref_33"},{"doi-asserted-by":"crossref","unstructured":"Liu, Z., Mao, H., Wu, C.-Y., Feichtenhofer, C., Darrell, T., and Xie, S. 
(2022, January 18\u201324). A ConvNet for the 2020s. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","key":"ref_34","DOI":"10.1109\/CVPR52688.2022.01167"},{"doi-asserted-by":"crossref","unstructured":"Chen, S., and Guo, W. (2023). Auto-Encoders in Deep Learning\u2014A Review with New Perspectives. Mathematics, 11.","key":"ref_35","DOI":"10.3390\/math11081777"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"307","DOI":"10.1561\/2200000056","article-title":"An Introduction to Variational Autoencoders","volume":"12","author":"Kingma","year":"2019","journal-title":"Found. Trends\u00ae Mach. Learn."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/11\/717\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T16:28:22Z","timestamp":1760113702000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/11\/717"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,7]]},"references-count":36,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2024,11]]}},"alternative-id":["info15110717"],"URL":"https:\/\/doi.org\/10.3390\/info15110717","relation":{},"ISSN":["2078-2489"],"issn-type":[{"type":"electronic","value":"2078-2489"}],"subject":[],"published":{"date-parts":[[2024,11,7]]}}}