{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T16:38:24Z","timestamp":1776443904251,"version":"3.51.2"},"reference-count":96,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2021,2,5]],"date-time":"2021-02-05T00:00:00Z","timestamp":1612483200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100012290","name":"Innovation and Networks Executive Agency","doi-asserted-by":"publisher","award":["769129"],"award-info":[{"award-number":["769129"]}],"id":[{"id":"10.13039\/501100012290","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>We present MultEYE, a traffic monitoring system that can detect, track, and estimate the velocity of vehicles in a sequence of aerial images. The presented solution has been optimized to execute these tasks in real-time on an embedded computer installed on an Unmanned Aerial Vehicle (UAV). In order to overcome the limitation of existing object detection architectures related to accuracy and computational overhead, a multi-task learning methodology was employed by adding a segmentation head to an object detector backbone resulting in the MultEYE object detection architecture. On a custom dataset, it achieved 4.8% higher mean Average Precision (mAP) score, while being 91.4% faster than the state-of-the-art model and while being able to generalize to different real-world traffic scenes. Dedicated object tracking and speed estimation algorithms have been then optimized to track reliably objects from an UAV with limited computational effort. Different strategies to combine object detection, tracking, and speed estimation are discussed, too. 
From our experiments, the optimized detector runs at an average frame-rate of up to 29 frames per second (FPS) on frame resolution 512 \u00d7 320 on a Nvidia Xavier NX board, while the optimally combined detector, tracker and speed estimator pipeline achieves speeds of up to 33 FPS on an image of resolution 3072 \u00d7 1728. To our knowledge, the MultEYE system is one of the first traffic monitoring systems that was specifically designed and optimized for an UAV platform under real-world constraints.<\/jats:p>","DOI":"10.3390\/rs13040573","type":"journal-article","created":{"date-parts":[[2021,2,7]],"date-time":"2021-02-07T14:04:13Z","timestamp":1612706653000},"page":"573","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":88,"title":["MultEYE: Monitoring System for Real-Time Vehicle Detection, Tracking and Speed Estimation from UAV Imagery on Edge-Computing Platforms"],"prefix":"10.3390","volume":"13","author":[{"given":"Navaneeth","family":"Balamuralidhar","sequence":"first","affiliation":[{"name":"Faculty of Geo-Information Science and Earth Observation (ITC), University of Twente, 7514 AE Enschede, The Netherlands"},{"name":"XO Sight B.V, 2614 AC Delft, The Netherlands"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6944-3374","authenticated-orcid":false,"given":"Sofia","family":"Tilon","sequence":"additional","affiliation":[{"name":"Faculty of Geo-Information Science and Earth Observation (ITC), University of Twente, 7514 AE Enschede, The Netherlands"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5712-6902","authenticated-orcid":false,"given":"Francesco","family":"Nex","sequence":"additional","affiliation":[{"name":"Faculty of Geo-Information Science and Earth Observation (ITC), University of Twente, 7514 AE Enschede, The Netherlands"}]}],"member":"1968","published-online":{"date-parts":[[2021,2,5]]},"reference":[{"key":"ref_1","unstructured":"Gardner, M.P. (2000). 
Highway Traffic Monitoring, South Dakota Department of Transportation. Technical Report; A2B08."},{"key":"ref_2","unstructured":"Frank, H. (2020, July 27). Expanded Traffic-Cam System in Monroe County Will Cost PennDOT 4.3M. Available online: http:\/\/www.poconorecord.com\/apps\/pbcs.dll\/articlAID=\/20130401\/NEWS\/1010402\/-1\/NEWS."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Maimaitijiang, M., Sagan, V., Sidike, P., Daloye, A.M., Erkbol, H., and Fritschi, F.B. (2020). Crop Monitoring Using Satellite\/UAV Data Fusion and Machine Learning. Remote Sens., 12.","DOI":"10.3390\/rs12091357"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"192","DOI":"10.1080\/22797254.2018.1527661","article-title":"Monitoring of crop fields using multispectral and thermal imagery from UAV","volume":"52","author":"Raeva","year":"2019","journal-title":"Eur. J. Remote Sens."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Feng, X., and Li, P. (2019). A Tree Species Mapping Method from UAV Images over Urban Area Using Similarity in Tree-Crown Object Histograms. Remote Sens., 11.","DOI":"10.3390\/rs11171982"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Wu, X., Shen, X., Cao, L., Wang, G., and Cao, F. (2019). Assessment of individual tree detection and canopy cover estimation using unmanned aerial vehicle based light detection and ranging (UAV-LiDAR) data in planted forests. Remote Sens., 11.","DOI":"10.3390\/rs11080908"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"012003","DOI":"10.1088\/1755-1315\/169\/1\/012003","article-title":"Remote sensing UAV\/drones and its applications for urban areas: A review","volume":"169","author":"Noor","year":"2018","journal-title":"IOP Conf. Ser. Earth Environ. Sci."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Nex, F., Duarte, D., Tonolo, F.G., and Kerle, N. (2019). 
Structural Building Damage Detection with Deep Learning: Assessment of a State-of-the-Art CNN in Operational Conditions. Remote Sens., 11.","DOI":"10.3390\/rs11232765"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"118","DOI":"10.1111\/j.1467-8667.2011.00727.x","article-title":"An Unmanned Aerial Vehicle-Based Imaging System for 3D Measurement of Unpaved Road Surface Distresses","volume":"27","author":"Zhang","year":"2012","journal-title":"Comput. Aided Civ. Infrastruct. Eng."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Tan, Y., and Li, Y. (2019). UAV Photogrammetry-Based 3D Road Distress Detection. ISPRS Int. J. GeoInf., 8.","DOI":"10.3390\/ijgi8090409"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"05019001","DOI":"10.1061\/(ASCE)BE.1943-5592.0001343","article-title":"UAV Bridge Inspection through Evaluated 3D Reconstructions","volume":"24","author":"Chen","year":"2019","journal-title":"J. Bridge Eng."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Elloumi, M., Dhaou, R., Escrig, B., Idoudi, H., Saidane, L.A., and Fer, A. (2019, January 1\u20133). Traffic Monitoring on City Roads Using UAVs. Proceedings of the 18th International Conference on Ad-Hoc Networks and Wireless, ADHOC-NOW, Luxembourg.","DOI":"10.1007\/978-3-030-31831-4_42"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"St\u00f6cker, C., Bennett, R., Nex, F., Gerke, M., and Zevenbergen, J. (2017). Review of the Current State of UAV Regulations. Remote Sens., 9.","DOI":"10.3390\/rs9050459"},{"key":"ref_14","unstructured":"Press (2020, July 27). Dutch Government Successfully Uses Aerialtronics Drones to Control Traffic. Available online: https:\/\/www.suasnews.com\/2015\/07\/dutch-government-successfully-uses-aerialtronics-drones-to-control-traffic\/."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Elloumi, M., Dhaou, R., Escrig, B., Idoudi, H., and Saidane, L.A. (2018, January 15\u201318). 
Monitoring road traffic with a UAV-based system. Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC), Barcelona, Spain.","DOI":"10.1109\/WCNC.2018.8377077"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"541","DOI":"10.1016\/j.trpro.2017.03.043","article-title":"UAV-based traffic analysis: A universal guiding framework based on literature survey","volume":"22","author":"Khan","year":"2017","journal-title":"Transp. Res. Procedia"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Niu, H., Gonzalez-Prelcic, N., and Heath, R.W. (2018, January 3\u20136). A UAV-based traffic monitoring system\u2014Invited paper. Proceedings of the IEEE 87th Vehicular Technology Conference (VTC Spring), Porto, Portugal.","DOI":"10.1109\/VTCSpring.2018.8417546"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1007\/s10115-016-1004-2","article-title":"The (black) art of runtime evaluation: Are we comparing algorithms or implementations?","volume":"52","author":"Kriegel","year":"2017","journal-title":"Knowl. Inf. Syst."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Kwan, C., Chou, B., Yang, J., Rangamani, A., Tran, T., Zhang, J., and Etienne-Cummings, R. (2019). Deep Learning-Based Target Tracking and Classification for Low Quality Videos Using Coded Aperture Camera. Sensors, 19.","DOI":"10.3390\/s19173702"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Li, J., Dai, Y., Li, C., Shu, J., Li, D., Yang, T., and Lu, Z. (2018). Visual Detail Augmented Mapping for Small Aerial Target Detection. Remote Sens., 11.","DOI":"10.3390\/rs11010014"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1023\/A:1007379606734","article-title":"Multitask learning","volume":"28","author":"Caruana","year":"1997","journal-title":"Mach. Learn."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Hashimoto, K., Xiong, C., Tsuruoka, Y., and Socher, R. (2016). A joint many-task model: Growing a neural network for multiple nlp tasks. arXiv.","DOI":"10.18653\/v1\/D17-1206"},{"key":"ref_26","unstructured":"McCann, B., Keskar, N.S., Xiong, C., and Socher, R. (2018). The natural language decathlon: Multitask learning as question answering. arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Teichmann, M., Weber, M., Z\u00f6llner, M., Cipolla, R., and Urtasun, R. (2018, January 26\u201330). MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving. Proceedings of the IEEE Intelligent Vehicles Symposium (IV), Changshu, China.","DOI":"10.1109\/IVS.2018.8500504"},{"key":"ref_28","unstructured":"Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M. (2020). YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv."},{"key":"ref_29","unstructured":"Paszke, A., Chaurasia, A., Kim, S., and Culurciello, E. (2016). Enet: A deep neural network architecture for real-time semantic segmentation. 
arXiv."},{"key":"ref_30","unstructured":"Forsyth, D.A., and Ponce, J. (2003). Computer Vision: A Modern Approach, Prentice Hall."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Milletari, F., Navab, N., and Ahmadi, S.A. (2016, January 25\u201328). V-net: Fully convolutional neural networks for volumetric medical image segmentation. Proceedings of the 2016 Fourth International Conference on 3D Vision (3DV), Stanford, CA, USA.","DOI":"10.1109\/3DV.2016.79"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Kim, J., and Park, C. (2017, January 21\u201326). End-to-end ego lane estimation based on sequential transfer learning for self-driving cars. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.158"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Ullah, M., Mohammed, A., and Alaya Cheikh, F. (2018). Pednet: A spatio-temporal deep convolutional neural network for pedestrian segmentation. J. Imaging, 4.","DOI":"10.3390\/jimaging4090107"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Ammar, S., Bouwmans, T., Zaghden, N., and Neji, M. (2019, January 7\u20139). Moving objects segmentation based on deepsphere in video surveillance. Proceedings of the 14th International Symposium on Visual Computing, ISVC 2019, Lake Tahoe, NV, USA. Part II.","DOI":"10.1007\/978-3-030-33723-0_25"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The pascal visual object classes (voc) challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Silberman, N., Hoiem, D., Kohli, P., and Fergus, R. (2012, January 7\u201313). Indoor segmentation and support inference from rgbd images. Proceedings of the 12th European Conference on Computer Vision, Florence, Italy. Part V.","DOI":"10.1007\/978-3-642-33715-4_54"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"Segnet: A deep convolutional encoder-decoder architecture for image segmentation","volume":"39","author":"Badrinarayanan","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the 18th International Conference, Munich, Germany. Part III.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Lin, G., Milan, A., Shen, C., and Reid, I. (2017, January 21\u201326). Refinenet: Multi-path refinement networks for high-resolution semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.549"},{"key":"ref_41","unstructured":"Chen, L.C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking atrous convolution for semantic image segmentation. 
arXiv."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs","volume":"40","author":"Chen","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_43","unstructured":"Yu, F., and Koltun, V. (2015). Multi-scale context aggregation by dilated convolutions. arXiv."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Wang, P., Chen, P., Yuan, Y., Liu, D., Huang, Z., Hou, X., and Cottrell, G. (2018, January 12\u201315). Understanding convolution for semantic segmentation. Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA.","DOI":"10.1109\/WACV.2018.00163"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Yang, M., Yu, K., Zhang, C., Li, Z., and Yang, K. (2018, January 18\u201323). Denseaspp for semantic segmentation in street scenes. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00388"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201313). Fast r-cnn. Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Tan, M., Pang, R., and Le, Q.V. (2020, January 13\u201319). Efficientdet: Scalable and efficient object detection. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01079"},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"297","DOI":"10.14358\/PERS.85.4.297","article-title":"Vehicle detection in aerial images","volume":"85","author":"Yang","year":"2019","journal-title":"Photogramm. Eng. Remote Sens."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Sommer, L.W., Schuchert, T., and Beyerer, J. (2017, January 24\u201331). Fast deep vehicle detection in aerial images. Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA.","DOI":"10.1109\/WACV.2017.41"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"3652","DOI":"10.1109\/JSTARS.2017.2694890","article-title":"Toward fast and accurate vehicle detection in aerial images using coupled region-based convolutional neural networks","volume":"10","author":"Deng","year":"2017","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016, January 11\u201314). SSD: Single shot multibox detector. Proceedings of the 14th European Conference, Amsterdam, The Netherlands. Part I.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"607","DOI":"10.1038\/381607a0","article-title":"Emergence of simple-cell receptive field properties by learning a sparse code for natural images","volume":"381","author":"Olshausen","year":"1996","journal-title":"Nature"},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"3327","DOI":"10.1016\/S0042-6989(97)00121-1","article-title":"The \u201cindependent components\u201d of natural scenes are edge filters","volume":"37","author":"Bell","year":"1997","journal-title":"Vis. 
Res."},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Gidaris, S., and Komodakis, N. (2015, January 7\u201313). Object detection via a multi-region and semantic segmentation-aware cnn model. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.135"},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Brahmbhatt, S., Christensen, H.I., and Hays, J. (2017, January 24\u201331). StuffNet: Using \u2018Stuff\u2019to improve object detection. Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Los Alamitos, CA, USA.","DOI":"10.1109\/WACV.2017.109"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Shrivastava, A., and Gupta, A. (2016, January 11\u201314). Contextual priming and feedback for faster r-cnn. Proceedings of the 14th European Conference, Amsterdam, The Netherlands. Part I.","DOI":"10.1007\/978-3-319-46448-0_20"},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask r-cnn. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_59","first-page":"61","article-title":"Visual object tracking: A survey","volume":"31","author":"Lu","year":"2018","journal-title":"Pattern Recognit. Artif. Intell."},{"key":"ref_60","unstructured":"Cuevas, E., Zaldivar, D., and Rojas, R. (2005). Kalman Filter for Vision Tracking, Freie Universitat Berlin. Technical Report August."},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Okuma, K., Taleghani, A., De Freitas, N., Little, J.J., and Lowe, D.G. (2004, January 11\u201314). A boosted particle filter: Multitarget detection and tracking. Proceedings of the 8th European Conference on Computer Vision, Prague, Czech Republic. 
Part I.","DOI":"10.1007\/978-3-540-24670-1_3"},{"key":"ref_62","unstructured":"Bochinski, E., Eiselein, V., and Sikora, T. (September, January 29). High-speed tracking-by-detection without using image information. Proceedings of the 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Lecce, Italy."},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"Bewley, A., Ge, Z., Ott, L., Ramos, F., and Upcroft, B. (2016, January 25\u201328). Simple online and realtime tracking. Proceedings of the IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA.","DOI":"10.1109\/ICIP.2016.7533003"},{"key":"ref_64","doi-asserted-by":"crossref","unstructured":"Wojke, N., Bewley, A., and Paulus, D. (2017, January 17\u201320). Simple online and realtime tracking with a deep association metric. Proceedings of the IEEE International Conference on Image Processing (ICIP), Beijing, China.","DOI":"10.1109\/ICIP.2017.8296962"},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Sadeghian, A., Alahi, A., and Savarese, S. (2017, January 22\u201329). Tracking the untrackable: Learning to track multiple cues with long-term dependencies. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.41"},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"Kart, U., Lukezic, A., Kristan, M., Kamarainen, J.K., and Matas, J. (2019, January 15\u201320). Object tracking by reconstruction with view-specific discriminative correlation filters. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00143"},{"key":"ref_67","doi-asserted-by":"crossref","unstructured":"Nam, H., and Han, B. (2016, January 27\u201330). Learning Multi-Domain Convolutional Neural Networks for Visual Tracking. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.465"},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Held, D., Thrun, S., and Savarese, S. (2016, January 11\u201314). Learning to track at 100 fps with deep regression networks. Proceedings of the 14th European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_45"},{"key":"ref_69","doi-asserted-by":"crossref","unstructured":"Bolme, D.S., Beveridge, J.R., Draper, B.A., and Lui, Y.M. (2010, January 13\u201318). Visual object tracking using adaptive correlation filters. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA.","DOI":"10.1109\/CVPR.2010.5539960"},{"key":"ref_70","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1109\/TITS.2003.821213","article-title":"Dynamic camera calibration of roadside traffic management cameras for vehicle speed estimation","volume":"4","author":"Schoepflin","year":"2003","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_71","doi-asserted-by":"crossref","unstructured":"Zhiwei, H., Yuanyuan, L., and Xueyi, Y. (2007, January 15\u201319). Models of vehicle speeds measurement with a single camera. Proceedings of the International Conference on Computational Intelligence and Security Workshops (CISW 2007), Harbin, China.","DOI":"10.1109\/CISW.2007.4425492"},{"key":"ref_72","doi-asserted-by":"crossref","unstructured":"Li, J., Chen, S., Zhang, F., Li, E., Yang, T., and Lu, Z. (2019). An adaptive framework for multi-vehicle ground speed estimation in airborne videos. Remote Sens., 11.","DOI":"10.3390\/rs11101241"},{"key":"ref_73","doi-asserted-by":"crossref","unstructured":"Wang, C.Y., Mark Liao, H.Y., Wu, Y.H., Chen, P.Y., Hsieh, J.W., and Yeh, I.H. (2020, January 14\u201319). CSPNet: A new backbone that can enhance learning capability of CNN. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA.","DOI":"10.1109\/CVPRW50498.2020.00203"},{"key":"ref_74","doi-asserted-by":"crossref","unstructured":"Ridnik, T., Lawen, H., Noy, A., and Friedman, I. (2020). TResNet: High Performance GPU-Dedicated Architecture. arXiv.","DOI":"10.1109\/WACV48630.2021.00144"},{"key":"ref_75","doi-asserted-by":"crossref","unstructured":"Liu, S., Qi, L., Qin, H., Shi, J., and Jia, J. (2018, January 18\u201323). Path aggregation network for instance segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00913"},{"key":"ref_76","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_77","unstructured":"Schultz van Haegen, M. (2020, July 22). Model Flying Scheme. Available online: https:\/\/wetten.overheid.nl\/BWBR0019147\/2019-04-01."},{"key":"ref_78","doi-asserted-by":"crossref","unstructured":"Nigam, I., Huang, C., and Ramanan, D. (2018, January 12\u201315). Ensemble knowledge transfer for semantic segmentation. Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA.","DOI":"10.1109\/WACV.2018.00168"},{"key":"ref_79","unstructured":"Schmidt, F. (2020, July 22). Data Set for Tracking Vehicles in Aerial Image Sequences. Available online: http:\/\/www.ipf.kit.edu\/downloads_data_set_AIS_vehicle_tracking.php."},{"key":"ref_80","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20138). Imagenet classification with deep convolutional neural networks. 
Proceedings of the Advances in Neural Information Processing Systems 25 (NIPS 2012), Lake Tahoe, NV, USA."},{"key":"ref_81","doi-asserted-by":"crossref","unstructured":"Salehi, S.S.M., Erdogmus, D., and Gholipour, A. (2017, January 10). Tversky loss function for image segmentation using 3D fully convolutional deep networks. Proceedings of the 8th International Workshop Machine Learning in Medical Imaging, Quebec City, QC, Canada.","DOI":"10.1007\/978-3-319-67389-9_44"},{"key":"ref_82","doi-asserted-by":"crossref","unstructured":"Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., and Savarese, S. (2019, January 15\u201320). Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00075"},{"key":"ref_83","unstructured":"Bernardin, K., Elbs, A., and Stiefelhagen, R. (2006, January 13). Multiple object tracking performance metrics and evaluation in a smart room environment. Proceedings of the The Sixth IEEE International Workshop on Visual Surveillance (in Conjunction with ECCV), Graz, Austria."},{"key":"ref_84","unstructured":"Tan, M., and Le, Q.V. (2019). Efficientnet: Rethinking model scaling for convolutional neural networks. arXiv."},{"key":"ref_85","unstructured":"Howard, A., Sandler, M., Chen, B., Wang, W., Chen, L., Tan, M., Chu, G., Vasudevan, V., Zhu, Y., and Pang, R. (November, January 27). Searching for mobilenetv3. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Korea."},{"key":"ref_86","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_87","doi-asserted-by":"crossref","unstructured":"Chollet, F. (2017, January 21\u201326). Xception: Deep learning with depthwise separable convolutions. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.195"},{"key":"ref_88","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_89","doi-asserted-by":"crossref","unstructured":"Grabner, H., Grabner, M., and Bischof, H. (2006, January 4\u20137). Real-time tracking via on-line boosting. Proceedings of the The British Machine Vision Conference, Edinburgh, Scotland.","DOI":"10.5244\/C.20.6"},{"key":"ref_90","doi-asserted-by":"crossref","unstructured":"Babenko, B., Yang, M.H., and Belongie, S. (2009, January 20\u201325). Visual tracking with online multiple instance learning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPRW.2009.5206737"},{"key":"ref_91","doi-asserted-by":"crossref","first-page":"1409","DOI":"10.1109\/TPAMI.2011.239","article-title":"Tracking-learning-detection","volume":"34","author":"Kalal","year":"2011","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_92","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1109\/TPAMI.2014.2345390","article-title":"High-speed tracking with kernelized correlation filters","volume":"37","author":"Henriques","year":"2014","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_93","doi-asserted-by":"crossref","unstructured":"Kalal, Z., Mikolajczyk, K., and Matas, J. (2010, January 23\u201326). Forward-backward error: Automatic detection of tracking failures. Proceedings of the 20th International Conference on Pattern Recognition, Istanbul, Turkey.","DOI":"10.1109\/ICPR.2010.675"},{"key":"ref_94","doi-asserted-by":"crossref","first-page":"671","DOI":"10.1007\/s11263-017-1061-3","article-title":"Discriminative correlation filter with channel and spatial reliability","volume":"126","author":"Lukezic","year":"2018","journal-title":"Int. J. 
Comput. Vis."},{"key":"ref_95","doi-asserted-by":"crossref","unstructured":"Yu, F., Li, W., Li, Q., Liu, Y., Shi, X., and Yan, J. (15\u201316, January 8\u201310). POI: Multiple object tracking with high performance detection and appearance feature. Proceedings of the European Conference on Computer Vision 2016 Workshops, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-48881-3_3"},{"key":"ref_96","unstructured":"Gibbs, J. (2020, July 27). Drivers Risk fines as Speed Camera Tolerances Revealed. Available online: https:\/\/www.confused.com\/on-the-road\/driving-law\/speed-camera-tolerances."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/4\/573\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:20:18Z","timestamp":1760160018000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/4\/573"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,5]]},"references-count":96,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2021,2]]}},"alternative-id":["rs13040573"],"URL":"https:\/\/doi.org\/10.3390\/rs13040573","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,2,5]]}}}