{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,28]],"date-time":"2025-12-28T02:20:42Z","timestamp":1766888442039,"version":"build-2065373602"},"reference-count":31,"publisher":"MDPI AG","issue":"24","license":[{"start":{"date-parts":[[2020,12,11]],"date-time":"2020-12-11T00:00:00Z","timestamp":1607644800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Key R&D Program of China","award":["Grant No. 2018YFB1700500"],"award-info":[{"award-number":["Grant No. 2018YFB1700500"]}]},{"DOI":"10.13039\/501100012245","name":"Science and Technology Planning Project of Guangdong Province","doi-asserted-by":"publisher","award":["Grant No. 2017B090914002"],"award-info":[{"award-number":["Grant No. 2017B090914002"]}],"id":[{"id":"10.13039\/501100012245","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Robot control based on visual information perception is a hot topic in the industrial robot domain and makes robots capable of doing more things in a complex environment. However, complex visual background in an industrial environment brings great difficulties in recognizing the target image, especially when a target is small or far from the sensor. Therefore, target recognition is the first problem that should be addressed in a visual servo system. This paper considers common complex constraints in industrial environments and proposes a You Only Look Once Version 2 Region of Interest (YOLO-v2-ROI) neural network image processing algorithm based on machine learning. The proposed algorithm combines the advantages of YOLO (You Only Look Once) rapid detection with effective identification of ROI (Region of Interest) pooling structure, which can quickly locate and identify different objects in different fields of view. 
This method can also lead the robot vision system to recognize and classify a target object automatically, improve robot vision system efficiency, avoid blind movement, and reduce the calculation load. The proposed algorithm is verified by experiments. The experimental result shows that the learning algorithm constructed in this paper has real-time image-detection speed and demonstrates strong adaptability and recognition ability when processing images with complex backgrounds, such as different backgrounds, lighting, or perspectives. In addition, this algorithm can also effectively identify and locate visual targets, which improves the environmental adaptability of a visual servo system.<\/jats:p>","DOI":"10.3390\/s20247121","type":"journal-article","created":{"date-parts":[[2020,12,13]],"date-time":"2020-12-13T23:39:36Z","timestamp":1607902776000},"page":"7121","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":18,"title":["Intelligent Perception System of Robot Visual Servo for Complex Industrial Environment"],"prefix":"10.3390","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0627-6206","authenticated-orcid":false,"given":"Yongchao","family":"Luo","sequence":"first","affiliation":[{"name":"School of Electrical Engineering, Guangzhou College, South China University of Technology, Guangzhou 510006, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1507-8674","authenticated-orcid":false,"given":"Shipeng","family":"Li","sequence":"additional","affiliation":[{"name":"School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510641, China"}]},{"given":"Di","family":"Li","sequence":"additional","affiliation":[{"name":"School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510641, 
China"}]}],"member":"1968","published-online":{"date-parts":[[2020,12,11]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Li, S., Li, D., Zhang, C., Wan, J., and Xie, M. (2020). RGB-D Image Processing Algorithm for Target Recognition and Pose Estimation of Visual Servo System. Sensors, 20.","DOI":"10.3390\/s20020430"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1324","DOI":"10.1007\/s12555-018-0753-y","article-title":"Adaptive Switch Image-based Visual Servoing for Industrial Robots","volume":"18","author":"Ghasemi","year":"2019","journal-title":"Int. J. Control Autom. Syst."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1016\/j.oceaneng.2012.04.006","article-title":"Vision-based object detection and tracking for autonomous navigation of underwater robots","volume":"48","author":"Lee","year":"2012","journal-title":"Ocean Eng."},{"key":"ref_4","first-page":"767","article-title":"Research progress of robot calibration-free visual servo control","volume":"48","author":"Bo","year":"2016","journal-title":"Chin. J. Theor. Appl. Mech."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1109\/TITS.2010.2040177","article-title":"A General Active-Learning Framework for On-Road Vehicle Recognition and Tracking","volume":"11","author":"Sivaraman","year":"2010","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1016\/j.cviu.2007.09.014","article-title":"Speeded-Up Robust Features (SURF)","volume":"110","author":"Bay","year":"2008","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1023\/A:1009715923555","article-title":"A Tutorial on Support Vector Machines for Pattern Recognition","volume":"2","author":"Burges","year":"1998","journal-title":"Data Min. Knowl. 
Discov."},{"key":"ref_8","unstructured":"Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., and LeCun, Y. (2013). Overfeat: Integrated recognition, localization and detection using convolutional networks. arXiv."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Van De Sande, K.E.A., Uijlings, J.R.R., Gevers, T., and Smeulders, A.W.M. (2011, January 20\u201325). Segmentation as selective search for object recognition. Proceedings of the 2011 International Conference on Computer Vision (CVPR), Colorado Springs, CO, USA.","DOI":"10.1109\/ICCV.2011.6126456"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_12","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015, January 7\u201312). Faster R-CNN: Towards real-time object detection with region proposal networks. Proceedings of the Advances in Neural Information Processing Systems (NIPS), Montr\u00e9al, QC, Canada."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"386","DOI":"10.1109\/TPAMI.2018.2844175","article-title":"Mask R-CNN","volume":"42","author":"He","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. 
Proceedings of the IEEE conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"616","DOI":"10.1109\/TNN.2003.810605","article-title":"An efficient fully unsupervised video object segmentation scheme using an adaptive neural-network classifier architecture","volume":"14","author":"Doulamis","year":"2003","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1109\/MRA.2016.2615329","article-title":"Object Detection and Recognition for Assistive Robots: Experimentation and Implementation","volume":"24","year":"2017","journal-title":"IEEE Robot. Autom. Mag."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. Proceedings of the 2015 IEEE conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Ullah, S., and Kim, D.-H. (2020). Lightweight Driver Behavior Identification Model with Sparse Learning on In-Vehicle CAN-BUS Sensor Data. Sensors, 20.","DOI":"10.3390\/s20185030"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation","volume":"39","author":"Badrinarayanan","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-Net: Convolutional Networks for Biomedical Image Segmentation. 
Proceedings of the Medical Image Computing and Computer Assisted Interventions (MICCAI), Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs","volume":"40","author":"Chen","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1856","DOI":"10.1109\/TIP.2019.2941265","article-title":"Mumford\u2013Shah Loss Functional for Image Segmentation with Deep Learning","volume":"29","author":"Kim","year":"2019","journal-title":"IEEE Trans. Image Process."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Li, T., Zhang, K., Li, W., and Huang, Q. (2019, January 16\u201318). Research on ROI Algorithm of Ship Image Based on Improved YOLO. Proceedings of the 2019 International Conference on Artificial Intelligence and Advanced Manufacturing (AIAM), Dublin, Ireland.","DOI":"10.1109\/AIAM48774.2019.00033"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Li, S., Tao, F., Shi, T., and Kuang, J. (2019, January 20\u201322). Improvement of YOLOv3 network based on ROI. Proceedings of the 2019 IEEE 4th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chengdu, China.","DOI":"10.1109\/IAEAC47372.2019.8997986"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Morera, \u00c1., S\u00e1nchez, \u00c1., Moreno, A., Sappa, A.D., and V\u00e9lez, J.F. (2020). SSD vs. YOLO for Detection of Outdoor Urban Advertising Panels under Multiple Variabilities. Sensors, 20.","DOI":"10.3390\/s20164587"},{"key":"ref_26","unstructured":"Chollet, F. (2020, September 09). Keras. Available online: https:\/\/keras.io."},{"key":"ref_27","unstructured":"Tzutalin (2020, September 14). LabelImg. 
Available online: https:\/\/github.com\/tzutalin\/labelImg."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Singh, S.P., Wang, L., Gupta, S., Goli, H., Padmanabhan, P., and Guly\u00e1s, B. (2020). 3D Deep Learning on Medical Images: A Review. Sensors, 20.","DOI":"10.3390\/s20185097"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Tien, K.-Y., Samani, H., and Lui, J.H. (2017, January 12\u201315). A survey on image processing in noisy environment by fuzzy logic, image fusion, neural network, and non-local means. Proceedings of the 2017 International Automatic Control Conference (CACS), Pingtung, Taiwan.","DOI":"10.1109\/CACS.2017.8284240"},{"key":"ref_30","unstructured":"Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., and Devin, M. (2020, September 14). TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Available online: Tensorflow.org."},{"key":"ref_31","unstructured":"Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., and Darrell, T. (2020, September 16). Caffe. Available online: https:\/\/github.com\/BVLC\/caffe."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/24\/7121\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:44:04Z","timestamp":1760179444000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/24\/7121"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,12,11]]},"references-count":31,"journal-issue":{"issue":"24","published-online":{"date-parts":[[2020,12]]}},"alternative-id":["s20247121"],"URL":"https:\/\/doi.org\/10.3390\/s20247121","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2020,12,11]]}}}