{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,14]],"date-time":"2026-05-14T02:19:37Z","timestamp":1778725177043,"version":"3.51.4"},"reference-count":39,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2020,6,7]],"date-time":"2020-06-07T00:00:00Z","timestamp":1591488000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Graduate Innovation Practice Fund","award":["YCSJ-01-201913"],"award-info":[{"award-number":["YCSJ-01-201913"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>What makes unmanned aerial vehicles (UAVs) intelligent is their capability of sensing and understanding new unknown environments. Some studies utilize computer vision algorithms like Visual Simultaneous Localization and Mapping (VSLAM) and Visual Odometry (VO) to sense the environment for pose estimation, obstacles avoidance and visual servoing. However, understanding the new environment (i.e., make the UAV recognize generic objects) is still an essential scientific problem that lacks a solution. Therefore, this paper takes a step to understand the items in an unknown environment. The aim of this research is to enable the UAV with basic understanding capability for a high-level UAV flock application in the future. 
Specially, firstly, the proposed understanding method combines machine learning and traditional algorithm to understand the unknown environment through RGB images; secondly, the You Only Look Once (YOLO) object detection system is integrated (based on TensorFlow) in a smartphone to perceive the position and category of 80 classes of objects in the images; thirdly, the method makes the UAV more intelligent and liberates the operator from labor; fourthly, detection accuracy and latency in working condition are quantitatively evaluated, and properties of generality (can be used in various platforms), transportability (easily deployed from one platform to another) and scalability (easily updated and maintained) for UAV flocks are qualitatively discussed. The experiments suggest that the method has enough accuracy to recognize various objects with high computational speed, and excellent properties of generality, transportability and scalability.<\/jats:p>","DOI":"10.3390\/s20113245","type":"journal-article","created":{"date-parts":[[2020,6,9]],"date-time":"2020-06-09T05:16:14Z","timestamp":1591679774000},"page":"3245","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":21,"title":["A Machine Learning Method for Vision-Based Unmanned Aerial Vehicle Systems to Understand Unknown Environments"],"prefix":"10.3390","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5182-5393","authenticated-orcid":false,"given":"Tianyao","family":"Zhang","sequence":"first","affiliation":[{"name":"School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China"},{"name":"ShenYuan Honors College of Beihang University, Beihang University, Beijing 100191, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaoguang","family":"Hu","sequence":"additional","affiliation":[{"name":"School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jin","family":"Xiao","sequence":"additional","affiliation":[{"name":"School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Guofeng","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,6,7]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s40327-015-0029-z","article-title":"Visual monitoring of civil infrastructure systems via camera-equipped Unmanned Aerial Vehicles (UAVs): A review of related works","volume":"4","author":"Ham","year":"2016","journal-title":"Vis. Eng."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"460","DOI":"10.1038\/nature14542","article-title":"Science, Technology and the Future of Small Autonomous Drones","volume":"521","author":"Floreano","year":"2015","journal-title":"Nature"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1109\/THMS.2015.2480801","article-title":"Human Interaction with Robot Swarms: A Survey","volume":"46","author":"Kolling","year":"2016","journal-title":"IEEE Trans. Hum.-Mach. Syst."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1016\/j.apergo.2016.05.011","article-title":"A Meta-Analysis of Human-System Interfaces in Unmanned Aerial Vehicle (UAV) Swarm Management","volume":"58","author":"Hocraffer","year":"2017","journal-title":"Appl. Ergonom."},{"key":"ref_5","first-page":"273","article-title":"Human-robot interaction in UVs swarming: A survey","volume":"10","author":"Mi","year":"2013","journal-title":"Int. J. Comput. Sci. 
Issues"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"141","DOI":"10.5772\/58898","article-title":"Bio-Inspired Trajectory Generation for UAV Perching Movement Based on Tau Theory","volume":"11","author":"Zhang","year":"2014","journal-title":"Int. J. Adv. Robot. Syst."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Opromolla, R., Inchingolo, G., and Fasano, G. (2019). Airborne Visual Detection and Tracking of Cooperative UAVs Exploiting Deep Learning. Sensors, 19.","DOI":"10.3390\/s19194332"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Shen, W., Xu, D., Zhu, Y., Fei-Fei, L., Guibas, L., and Savarese, S. (November, January 27). Situational Fusion of Visual Representation for Visual Navigation. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00297"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1007\/978-3-319-46448-0_2","article-title":"SSD: Single Shot MultiBox Detector","volume":"Volume 9905","author":"Leibe","year":"2016","journal-title":"Computer Vision\u2014ECCV 2016"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (July, January 26). You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_11","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Samaras, S., Diamantidou, E., Ataloglou, D., Sakellariou, N., Vafeiadis, A., Magoulianitis, V., Lalas, A., Dimou, A., Zarpalas, D., and Votis, K. (2019). 
Deep Learning on Multi Sensor Data for Counter UAV Applications\u2014A Systematic Review. Sensors, 19.","DOI":"10.3390\/s19224837"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"447","DOI":"10.1016\/j.eswa.2017.09.033","article-title":"Survey of Computer Vision Algorithms and Applications for Unmanned Aerial Vehicles","volume":"92","author":"Escalera","year":"2018","journal-title":"Expert Syst. Appl."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Zhou, J., Tian, Y., Yuan, C., Yin, K., Yang, G., and Wen, M. (2019). Improved UAV Opium Poppy Detection Using an Updated YOLOv3 Model. Sensors, 19.","DOI":"10.3390\/s19224851"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Falanga, D., Mueggler, E., Faessler, M., and Scaramuzza, D. (June, January 29). Aggressive Quadrotor Flight through Narrow Gaps with Onboard Sensing and Computing Using Active Vision. Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore.","DOI":"10.1109\/ICRA.2017.7989679"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Markiewicz, J., Abratkiewicz, K., Gromek, A., Samczy\u0144ski, W.O.P., and Gromek, D. (2019). Geometrical Matching of SAR and Optical Images Utilizing ASIFT Features for SAR-Based Navigation Aided Systems. Sensors, 19.","DOI":"10.3390\/s19245500"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"eaaz9712","DOI":"10.1126\/scirobotics.aaz9712","article-title":"Dynamic Obstacle Avoidance for Quadrotors with Event Cameras","volume":"5","author":"Falanga","year":"2020","journal-title":"Sci. Robot."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Bl\u00f6sch, M., Weiss, S., Scaramuzza, D., and Siegwart, R. (2010, January 3\u20138). Vision Based MAV Navigation in Unknown and Unstructured Environments. 
Proceedings of the 2010 IEEE International Conference on Robotics and Automation, Anchorage, AK, USA.","DOI":"10.1109\/ROBOT.2010.5509920"},{"key":"ref_19","unstructured":"Land, M.F., and Dan-Eric, N. (2012). Animal Eyes, Oxford University Press. [2nd ed.]."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"3976","DOI":"10.1109\/TIE.2017.2764849","article-title":"A Vision-Aided Approach to Perching a Bioinspired Unmanned Aerial Vehicle","volume":"65","author":"Luo","year":"2018","journal-title":"IEEE Trans. Ind. Electron."},{"key":"ref_21","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Gao, P., Zhang, Y., Zhang, L., Noguchi, R., and Ahamed, T. (2019). Development of a Recognition System for Spraying Areas from Unmanned Aerial Vehicles Using a Machine Learning Approach. Sensors, 19.","DOI":"10.3390\/s19020313"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Opromolla, R., Fasano, G., and Accardo, D. (2018). A Vision-Based Approach to UAV Detection and Tracking in Cooperative Applications. Sensors, 18.","DOI":"10.3390\/s18103391"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Hummel, K., Pollak, M., and Krahofer, J. (2019). A Distributed Architecture for Human-Drone Teaming: Timing Challenges and Interaction Opportunities. Sensors, 19.","DOI":"10.3390\/s19061379"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Martinez-Alpiste, I., Casaseca-de-la-Higuera, P., Alcaraz-Calero, J., Grecos, C., and Wang, Q. (2019, January 15\u201319). Benchmarking Machine-Learning-Based Object Detection on a UAV and Mobile Platform. Proceedings of the 2019 IEEE Wireless Communications and Networking Conference (WCNC), Marrakesh, Morocco.","DOI":"10.1109\/WCNC.2019.8885504"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Li, M., Zhao, L., Tan, D., and Tong, X. (2019). 
BLE Fingerprint Indoor Localization Algorithm Based on Eight-Neighborhood Template Matching. Sensors, 19.","DOI":"10.3390\/s19224859"},{"key":"ref_27","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). ImageNet Classification with Deep Convolutional Neural Networks. Proceedings of the International Conference on the Neural Information Processing Systems Conference, Lake Tahoe, NV, USA."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going Deeper with Convolutions. Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_29","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_31","unstructured":"Hinton, G.E., Srivastava, N., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R.R. (2012). Improving Neural Networks by Preventing Co-Adaptation of Feature Detectors. arXiv."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Wang, Q., Zhang, L., Bertinetto, L., Hu, W., and Torr, P.H.S. (2019, January 16\u201320). Fast Online Object Tracking and Segmentation: A Unifying Approach. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00142"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zhang, T., Hu, X., Xiao, J., Zhang, G., and Fu, L. (2019, January 16\u201319). 
An Implementation of Non-Electronic Human-Swarm Interface for Multi-Agent System in Cooperative Searching. Proceedings of the 2019 IEEE 15th International Conference on Control and Automation (ICCA), Edinburgh, UK.","DOI":"10.1109\/ICCA.2019.8899992"},{"key":"ref_34","unstructured":"(2018, July 14). Running on Mobile with TensorFlow Lite. Available online: https:\/\/github.com\/tensorflow\/models\/blob\/master\/research\/object_detection\/g3doc\/running_on_mobile_tensorflowlite.md."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1016\/0734-189X(85)90016-7","article-title":"Topological structural analysis of digitized binary images by border following","volume":"30","author":"Suzuki","year":"1985","journal-title":"Comput. Vis. Graph. Image Process."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Qi, Y., Yao, H., Sun, X., Sun, X., Zhang, Y., and Huang, Q. (2014, January 27\u201330). Structure-Aware Multi-Object Discovery for Weakly Supervised Tracking. Proceedings of the 2014 IEEE International Conference on Image Processing (ICIP), Paris, France.","DOI":"10.1109\/ICIP.2014.7025093"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1016\/j.neucom.2018.10.035","article-title":"Robust Visual Tracking via Scale-and-State-Awareness","volume":"329","author":"Qi","year":"2019","journal-title":"Neurocomputing"},{"key":"ref_38","unstructured":"Qi, Y., Zhang, S., Zhang, W., Su, L., Huang, Q., and Yang, M.-H. (February, January 27). Learning Attribute-Specific Representations for Visual Tracking. Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1141","DOI":"10.1007\/s11263-019-01266-1","article-title":"The Unmanned Aerial Vehicle Benchmark: Object Detection, Tracking and Baseline","volume":"128","author":"Yu","year":"2020","journal-title":"Int. J. Comput. 
Vis."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/11\/3245\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T09:36:26Z","timestamp":1760175386000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/11\/3245"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6,7]]},"references-count":39,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2020,6]]}},"alternative-id":["s20113245"],"URL":"https:\/\/doi.org\/10.3390\/s20113245","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,6,7]]}}}