{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T02:32:14Z","timestamp":1760149934045,"version":"build-2065373602"},"reference-count":25,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2023,9,25]],"date-time":"2023-09-25T00:00:00Z","timestamp":1695600000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>A visual servoing system is a type of control system used in robotics that employs visual feedback to guide the movement of a robot or a camera to achieve a desired task. This problem is addressed using deep models that receive a visual representation of the current and desired scene, to compute the control input. The focus is on early fusion, which consists of using additional information integrated into the neural input array. In this context, we discuss how ready-to-use information can be directly obtained from the current and desired scenes, to facilitate the learning process. Inspired by some of the most effective traditional visual servoing techniques, we introduce early fusion based on image moments and provide an extensive analysis of approaches based on image moments, region-based segmentation, and feature points. These techniques are applied stand-alone or in combination, to allow obtaining maps with different levels of detail. The role of the extra maps is experimentally investigated for scenes with different layouts. The results show that early fusion facilitates a more accurate approximation of the linear and angular camera velocities, in order to control the movement of a 6-degree-of-freedom robot from a current configuration to a desired one. The best results were obtained for the extra maps providing details of low and medium levels.<\/jats:p>","DOI":"10.3390\/e25101378","type":"journal-article","created":{"date-parts":[[2023,9,26]],"date-time":"2023-09-26T03:53:00Z","timestamp":1695700380000},"page":"1378","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Enhancing Visual Feedback Control through Early Fusion Deep Learning"],"prefix":"10.3390","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-1964-5218","authenticated-orcid":false,"given":"Adrian-Paul","family":"Botezatu","sequence":"first","affiliation":[{"name":"Faculty of Automatic Control and Computer Engineering, \u201cGheorghe Asachi\u201d Technical University of Iasi, D. Mangeron 27, 700050 Iasi, Romania"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5758-6696","authenticated-orcid":false,"given":"Lavinia-Eugenia","family":"Ferariu","sequence":"additional","affiliation":[{"name":"Faculty of Automatic Control and Computer Engineering, \u201cGheorghe Asachi\u201d Technical University of Iasi, D. Mangeron 27, 700050 Iasi, Romania"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3844-1808","authenticated-orcid":false,"given":"Adrian","family":"Burlacu","sequence":"additional","affiliation":[{"name":"Faculty of Automatic Control and Computer Engineering, \u201cGheorghe Asachi\u201d Technical University of Iasi, D. Mangeron 27, 700050 Iasi, Romania"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,9,25]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1109\/MRA.2006.250573","article-title":"Visual servo control Part I: Basic approaches","volume":"13","author":"Chaumette","year":"2006","journal-title":"IEEE Robot. Autom. Mag."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1109\/MRA.2007.339609","article-title":"Visual servo control Part II: Advanced approaches","volume":"14","author":"Chaumette","year":"2007","journal-title":"IEEE Robot. Autom. Mag."},{"key":"ref_3","unstructured":"Chaumette, F., Hutchinson, S., and Corke, P. (2016). Handbook of Robotics, Springer."},{"key":"ref_4","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012). Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_5","unstructured":"Simonyan, K., and Zisserman, A. (2015, January 7\u20139). Very deep convolutional networks for large-scale image recognition. Proceedings of the International Conference on Learning Representations, San Diego, CA, USA."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Dosovitsky, A., Fischery, P., Ilg, E., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D., and Brox, T. (2015, January 7\u201313). FlowNet: Learning optical flow with convolutional networks. Proceedings of the International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.316"},{"key":"ref_7","unstructured":"Saxena, A., Pandya, H., Kumar, G., Gaud, A., and Krishna, K. (June, January 29). Exploring convolutional networks for end-to-end visual servoing. Proceedings of the IEEE International Conference on Robotics and Automation, Singapore."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Bateux, Q., Marchand, E., Leitner, J., Chaumette, F., and Corke, P. (2018, January 21\u201325). Training deep neural networks for visual servoing. Proceedings of the IEEE International Conference on Robotics and Automation, Brisbane, Australia.","DOI":"10.1109\/ICRA.2018.8461068"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"91820","DOI":"10.1109\/ACCESS.2021.3091737","article-title":"Convolutional neural network based visual servoing for eye-to-hand manipulator","volume":"9","author":"Tokuda","year":"2021","journal-title":"IEEE Access"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"103757","DOI":"10.1016\/j.robot.2021.103757","article-title":"Real-time deep learning approach to visual servo control and grasp detection for autonomous robotic manipulation","volume":"139","author":"Ribeiro","year":"2021","journal-title":"Elsevier\u2019s Robot. Auton. Syst."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"104042","DOI":"10.1016\/j.imavis.2020.104042","article-title":"Deep multimodal fusion for semantic image segmentation: A survey","volume":"105","author":"Zhang","year":"2021","journal-title":"Image Vis. Comput."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Ferariu, L., and Neculau, E.D. (2021, January 20\u201323). Fusing convolutional neural networks with segmentation for brain tumor classification. Proceedings of the 2021 25th International Conference on System Theory, Control and Computing (ICSTCC), Iasi, Romania.","DOI":"10.1109\/ICSTCC52150.2021.9607260"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Botezatu, P., Ferariu, L., Burlacu, A., and Sauciuc, T. (2022, January 22\u201325). Early Fusion Based CNN Architecture for Visual Servoing Systems. Proceedings of the 2022 26th International Conference on Methods and Models in Automation and Robotics (MMAR), Mi\u0119dzyzdroje, Poland.","DOI":"10.1109\/MMAR55195.2022.9874328"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Botezatu, P., Ferariu, L., Burlacu, A., and Sauciuc, T. (2022, January 19\u201321). Visual Feedback Control using CNN Based Architecture with Input Data Fusion. Proceedings of the 2022 26th International Conference on System Theory, Control and Computing (ICSTCC), Sinaia, Romania.","DOI":"10.1109\/ICSTCC55426.2022.9931843"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Gao, Y., Hendricks, L.A., Kuchenbecker, K.J., and Darrell, T. (2016, January 16\u201321). Deep learning for tactile understanding from visual and haptic data. Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden.","DOI":"10.1109\/ICRA.2016.7487176"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Collewet, C., Marchand, E., and Chaumette, F. (2008, January 19\u201323). Visual servoing set free from image processing. Proceedings of the 2008 IEEE International Conference on Robotics and Automation, Pasadena, CA, USA.","DOI":"10.1109\/ROBOT.2008.4543190"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1117\/12.580714","article-title":"Using scale-invariant feature points in visual servoing","volume":"Volume 5603","author":"Shademan","year":"2004","journal-title":"Machine Vision and its Optomechatronic Applications"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1007\/s12541-012-0049-8","article-title":"Robotic grasping based on efficient tracking and visual servoing using local feature descriptors","volume":"13","author":"Song","year":"2012","journal-title":"Int. J. Precis. Eng. Manuf."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1016\/j.cviu.2007.09.014","article-title":"SURF: Speeded Up Robust Features","volume":"110","author":"Bay","year":"2008","journal-title":"Comput. Vis. Image Underst. J."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Leutenegger, S., Chli, M., and Siegwart, R. (2011, January 6\u201313). BRISK: Binary Robust Invariant Scalable Keypoints. Proceedings of the IEEE International Conference ICCV, Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126542"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"713","DOI":"10.1109\/TRO.2004.829463","article-title":"Image moments: A general and useful set of features for visual servoing","volume":"20","author":"Chaumette","year":"2004","journal-title":"IEEE Trans. Robot."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1116","DOI":"10.1109\/TRO.2005.853500","article-title":"Point-based and region-based image moments for visual servoing of planar objects","volume":"21","author":"Tahri","year":"2005","journal-title":"IEEE Trans. Robot."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1109\/TIT.1962.1057692","article-title":"Visual pattern recognition by moment invariants","volume":"8","author":"Hu","year":"1962","journal-title":"IRE Trans. Inf. Theory"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1106","DOI":"10.1109\/34.473239","article-title":"Complete sets of complex Zernike moments invariants and the role of the pseudo-invariants","volume":"17","author":"Walin","year":"1995","journal-title":"IEEE Trans. PAMI"},{"key":"ref_25","unstructured":"Glorot, X., and Bengio, Y. (2010, January 13\u201315). Understanding the difficulty of training deep feedforward neural networks. Proceedings of the JMLR Workshop and Conference Proceedings\u2014Thirteenth International Conference on Artificial Intelligence and Statistics, Sardinia, Italy."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/25\/10\/1378\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:57:36Z","timestamp":1760129856000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/25\/10\/1378"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,25]]},"references-count":25,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2023,10]]}},"alternative-id":["e25101378"],"URL":"https:\/\/doi.org\/10.3390\/e25101378","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2023,9,25]]}}}