{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T09:54:24Z","timestamp":1776419664480,"version":"3.51.2"},"reference-count":65,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2021,3,4]],"date-time":"2021-03-04T00:00:00Z","timestamp":1614816000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Public littering and discarded trash are, despite the effort being put to limit it, still a serious ecological, aesthetic, and social problem. The problematic waste is usually localised and picked up by designated personnel, which is a tiresome, time-consuming task. This paper proposes a low-cost solution enabling the localisation of trash and litter objects in low altitude imagery collected by an unmanned aerial vehicle (UAV) during an autonomous patrol mission. The objects of interest are detected in the acquired images and put on the global map using a set of onboard sensors commonly found in typical UAV autopilots. The core object detection algorithm is based on deep, convolutional neural networks. Since the task is domain-specific, a dedicated dataset of images containing objects of interest was collected and annotated. The dataset is made publicly available, and its description is contained in the paper. The dataset was used to test a range of embedded devices enabling the deployment of deep neural networks for inference onboard the UAV. The results of measurements in terms of detection accuracy and processing speed are enclosed, and recommendations for the neural network model and hardware platform are given based on the obtained values. 
The complete system can be put together using inexpensive, off-the-shelf components, and perform autonomous localisation of discarded trash, relieving human personnel of this burdensome task, and enabling automated pickup planning.<\/jats:p>","DOI":"10.3390\/rs13050965","type":"journal-article","created":{"date-parts":[[2021,3,5]],"date-time":"2021-03-05T00:39:07Z","timestamp":1614904747000},"page":"965","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":116,"title":["Autonomous, Onboard Vision-Based Trash and Litter Detection in Low Altitude Aerial Images Collected by an Unmanned Aerial Vehicle"],"prefix":"10.3390","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6483-2357","authenticated-orcid":false,"given":"Marek","family":"Kraft","sequence":"first","affiliation":[{"name":"Faculty of Control, Robotics and Electrical Engineering, Institute of Robotics and Machine Intelligence, Pozna\u0144 University of Technology, 60-965 Pozna\u0144, Poland"}]},{"given":"Mateusz","family":"Piechocki","sequence":"additional","affiliation":[{"name":"Faculty of Control, Robotics and Electrical Engineering, Institute of Robotics and Machine Intelligence, Pozna\u0144 University of Technology, 60-965 Pozna\u0144, Poland"}]},{"given":"Bartosz","family":"Ptak","sequence":"additional","affiliation":[{"name":"Faculty of Control, Robotics and Electrical Engineering, Institute of Robotics and Machine Intelligence, Pozna\u0144 University of Technology, 60-965 Pozna\u0144, Poland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2800-2716","authenticated-orcid":false,"given":"Krzysztof","family":"Walas","sequence":"additional","affiliation":[{"name":"Faculty of Control, Robotics and Electrical Engineering, Institute of Robotics and Machine Intelligence, Pozna\u0144 University of Technology, 60-965 Pozna\u0144, 
Poland"}]}],"member":"1968","published-online":{"date-parts":[[2021,3,4]]},"reference":[{"key":"ref_1","unstructured":"Campbell, F. (2007). People Who Litter, ENCAMS."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1287\/inte.14.2.1","article-title":"Management Science in New York\u2019s Department of Sanitation","volume":"14","author":"Riccio","year":"1984","journal-title":"Interfaces"},{"key":"ref_3","first-page":"405","article-title":"Unpleasant or tedious jobs in the industrialised countries","volume":"117","author":"Dufour","year":"1978","journal-title":"Int. Labour Rev."},{"key":"ref_4","unstructured":"Proen\u00e7a, P.F., and Sim\u00f5es, P. (2020). TACO: Trash Annotations in Context for Litter Detection. arXiv."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Lowe, D.G. (1999, January 20\u201327). Object recognition from local scale-invariant features. Proceedings of the Seventh IEEE International Conference on Computer Vision, Kerkyra, Greece.","DOI":"10.1109\/ICCV.1999.790410"},{"key":"ref_6","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201325). Histograms of oriented gradients for human detection. Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201905), San Diego, CA, USA."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1627","DOI":"10.1109\/TPAMI.2009.167","article-title":"Object detection with discriminatively trained part-based models","volume":"32","author":"Felzenszwalb","year":"2009","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Malisiewicz, T., Gupta, A., and Efros, A.A. (2011, January 6\u201313). Ensemble of exemplar-SVMs for object detection and beyond. 
Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126229"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 24\u201327). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201313). Fast R-CNN. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_11","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. arXiv."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016, January 11\u201314). Ssd: Single shot multibox detector. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Elsken, T., Metzen, J.H., and Hutter, F. (2018). Neural architecture search: A survey. arXiv.","DOI":"10.1007\/978-3-030-05318-5_3"},{"key":"ref_17","unstructured":"Real, E., Aggarwal, A., Huang, Y., and Le, Q.V. (February, January 21). Regularized evolution for image classifier architecture search. Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Zoph, B., Cubuk, E.D., Ghiasi, G., Lin, T.Y., Shlens, J., and Le, Q.V. (2019). Learning data augmentation strategies for object detection. arXiv.","DOI":"10.1109\/CVPR.2019.00020"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Tan, M., Pang, R., and Le, Q.V. (2020, January 13\u201319). Efficientdet: Scalable and efficient object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01079"},{"key":"ref_20","unstructured":"Tan, M., and Le, Q.V. (2019). Efficientnet: Rethinking model scaling for convolutional neural networks. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). Microsoft coco: Common objects in context. 
Proceedings of the European Conference on Computer Vision, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_22","unstructured":"Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M. (2020). YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv."},{"key":"ref_23","unstructured":"Misra, D. (2019). Mish: A self regularized non-monotonic neural activation function. arXiv."},{"key":"ref_24","unstructured":"Ghiasi, G., Lin, T.Y., and Le, Q.V. (2018). Dropblock: A regularization method for convolutional networks. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Yang, Z., Wang, Z., Xu, W., He, X., Wang, Z., and Yin, Z. (2019, January 16\u201319). Region-aware Random Erasing. Proceedings of the 2019 IEEE 19th International Conference on Communication Technology (ICCT), Xi\u2019an, China.","DOI":"10.1109\/ICCT46805.2019.8947189"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., and Savarese, S. (2019, January 16\u201320). Generalized intersection over union: A metric and a loss for bounding box regression. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00075"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Wang, C.Y., Bochkovskiy, A., and Liao, H.Y.M. (2020). Scaled-YOLOv4: Scaling Cross Stage Partial Network. arXiv.","DOI":"10.1109\/CVPR46437.2021.01283"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"498","DOI":"10.1109\/72.129422","article-title":"Application of the ANNA neural network chip to high-speed character recognition","volume":"3","author":"Boser","year":"1992","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Jouppi, N.P., Young, C., Patil, N., Patterson, D., Agrawal, G., Bajwa, R., Bates, S., Bhatia, S., Boden, N., and Borchers, A. (2017, January 24\u201328). 
In-datacenter performance analysis of a tensor processing unit. Proceedings of the 2017 ACM\/IEEE 44th Annual International Symposium on Computer Architecture (ISCA), Toronto, ON, Canada.","DOI":"10.1145\/3079856.3080246"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1109\/MSPEC.2017.7802746","article-title":"Deeper and cheaper machine learning [top tech 2017]","volume":"54","author":"Schneider","year":"2017","journal-title":"IEEE Spectr."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Sugiarto, I., Liu, G., Davidson, S., Plana, L.A., and Furber, S.B. (2016, January 9\u201311). High performance computing on spinnaker neuromorphic platform: A case study for energy efficient image processing. Proceedings of the 2016 IEEE 35th International Performance Computing and Communications Conference (IPCCC), Las Vegas, NV, USA.","DOI":"10.1109\/PCCC.2016.7820645"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1109\/MSSC.2017.2745818","article-title":"Embedded deep neural network processing: Algorithmic and processor techniques bring deep learning to IoT and edge devices","volume":"9","author":"Verhelst","year":"2017","journal-title":"IEEE Solid-State Circuits Mag."},{"key":"ref_33","unstructured":"Lin, D., Talathi, S., and Annapureddy, S. (2016, January 19\u201324). Fixed point quantization of deep convolutional networks. Proceedings of the International Conference on Machine Learning, New York, NY, USA."},{"key":"ref_34","unstructured":"Hubara, I., Courbariaux, M., Soudry, D., El-Yaniv, R., and Bengio, Y. (2016, January 5\u201310). Binarized neural networks. 
Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1631\/FITEE.1700789","article-title":"Recent advances in efficient computation of deep convolutional neural networks","volume":"19","author":"Cheng","year":"2018","journal-title":"Front. Inf. Technol. Electron. Eng."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"126","DOI":"10.1109\/MSP.2017.2765695","article-title":"Model compression and acceleration for deep neural networks: The principles, progress, and challenges","volume":"35","author":"Cheng","year":"2018","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1007\/s10514-012-9281-4","article-title":"PIXHAWK: A micro aerial vehicle design for autonomous flight using onboard computer vision","volume":"33","author":"Meier","year":"2012","journal-title":"Auton. Robot."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Ebeid, E., Skriver, M., and Jin, J. (October, January 30). A survey on open-source flight control platforms of unmanned aerial vehicle. Proceedings of the 2017 Euromicro Conference on Digital System Design (DSD), Vienna, Austria.","DOI":"10.1109\/DSD.2017.30"},{"key":"ref_39","unstructured":"Franklin, D., Hariharapura, S.S., and Todd, S. (2012, December 15). Bringing Cloud-Native Agility to Edge AI Devices with the NVIDIA Jetson Xavier NX Developer Kit. Available online: https:\/\/developer.nvidia.com\/blog\/bringing-cloud-native-agility-to-edge-ai-with-jetson-xavier-nx\/."},{"key":"ref_40","unstructured":"Upton, E., and Halfacree, G. (2014). Raspberry Pi User Guide, John Wiley & Sons."},{"key":"ref_41","unstructured":"Libutti, L.A., Igual, F.D., Pinuel, L., De Giusti, L., and Naiouf, M. (2020, January 31). Benchmarking performance and power of USB accelerators for inference with MLPerf. 
Proceedings of the 2nd Workshop on Accelerated Machine Learning (AccML), Valencia, Spain."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"104046","DOI":"10.1016\/j.imavis.2020.104046","article-title":"Deep learning-based object detection in low-altitude UAV datasets: A survey","volume":"104","author":"Mittal","year":"2020","journal-title":"Image Vis. Comput."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Ammour, N., Alhichri, H., Bazi, Y., Benjdira, B., Alajlan, N., and Zuair, M. (2017). Deep learning approach for car detection in UAV imagery. Remote. Sens., 9.","DOI":"10.3390\/rs9040312"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Wang, X., Cheng, P., Liu, X., and Uzochukwu, B. (2018, January 21\u201323). Fast and accurate, convolutional neural network based approach for object detection from UAV. Proceedings of the IECON 2018-44th Annual Conference of the IEEE Industrial Electronics Society, Washington, DC, USA.","DOI":"10.1109\/IECON.2018.8592805"},{"key":"ref_45","unstructured":"Zhang, X., Izquierdo, E., and Chandramouli, K. (November, January 27). Dense and small object detection in uav vision based on cascade network. Proceedings of the IEEE International Conference on Computer Vision Workshops, Seoul, Korea."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"103910","DOI":"10.1016\/j.imavis.2020.103910","article-title":"Recent advances in small object detection based on deep learning: A review","volume":"97","author":"Tong","year":"2020","journal-title":"Image Vis. Comput."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Robicquet, A., Sadeghian, A., Alahi, A., and Savarese, S. (2016, January 8\u201316). Learning social etiquette: Human trajectory understanding in crowded scenes. 
Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46484-8_33"},{"key":"ref_48","unstructured":"Zhu, P., Wen, L., Du, D., Bian, X., Hu, Q., and Ling, H. (2020). Vision Meets Drones: Past, Present and Future. arXiv."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"1256","DOI":"10.1007\/s11263-019-01177-1","article-title":"Deep learning approach in aerial imagery for supporting land search and rescue missions","volume":"127","author":"Gotovac","year":"2019","journal-title":"Int. J. Comput. Vis."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"110823","DOI":"10.1016\/j.marpolbul.2019.110823","article-title":"Field test of beach litter assessment by commercial aerial drone","volume":"151","author":"Lo","year":"2020","journal-title":"Mar. Pollut. Bull."},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Merlino, S., Paterni, M., Berton, A., and Massetti, L. (2020). Unmanned Aerial Vehicles for Debris Survey in Coastal Areas: Long-Term Monitoring Programme to Study Spatial and Temporal Accumulation of the Dynamics of Beached Marine Litter. Remote. Sens., 12.","DOI":"10.3390\/rs12081260"},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"105478","DOI":"10.1016\/j.ocecoaman.2020.105478","article-title":"Autonomous litter surveying and human activity monitoring for governance intelligence in coastal eco-cyber-physical systems","volume":"200","author":"Nazerdeylami","year":"2021","journal-title":"Ocean. Coast. Manag."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Hong, J., Fulton, M., and Sattar, J. (August, January 31). A Generative Approach Towards Improved Robotic Detection of Marine Litter. 
Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France.","DOI":"10.1109\/ICRA40945.2020.9197575"},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"100026","DOI":"10.1016\/j.cscee.2020.100026","article-title":"AquaVision: Automating the detection of waste in water bodies using deep transfer learning","volume":"2","author":"Panwar","year":"2020","journal-title":"Case Stud. Chem. Environ. Eng."},{"key":"ref_55","unstructured":"Gorbachev, Y., Fedorov, M., Slavutin, I., Tugarev, A., Fatekhov, M., and Tarkan, Y. (2019, January 27\u201328). OpenVINO deep learning workbench: Comprehensive analysis and tuning of neural networks inference. Proceedings of the IEEE International Conference on Computer Vision Workshops, Seoul, Korea."},{"key":"ref_56","unstructured":"Lee, J., Chirkov, N., Ignasheva, E., Pisarchyk, Y., Shieh, M., Riccardi, F., Sarokin, R., Kulik, A., and Grundmann, M. (2019). On-device neural net inference with mobile gpus. arXiv."},{"key":"ref_57","unstructured":"Gray, A., Gottbrath, C., Olson, R., and Prasanna, S. (2012, December 15). Deploying Deep Neural Networks with NVIDIA TensorRT. Available online: https:\/\/developer.nvidia.com\/blog\/deploying-deep-learning-nvidia-tensorrt\/."},{"key":"ref_58","unstructured":"Kaehler, A., and Bradski, G. (2016). Learning OpenCV 3: Computer Vision in C++ with the OpenCV Library, O\u2019Reilly Media, Inc."},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Rehder, J., Nikolic, J., Schneider, T., Hinzmann, T., and Siegwart, R. (2016, January 16\u201321). Extending kalibr: Calibrating the extrinsics of multiple IMUs and of individual axes. 
Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden.","DOI":"10.1109\/ICRA.2016.7487628"},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"231","DOI":"10.1016\/j.procs.2017.01.090","article-title":"Applications of multi-height sensors data fusion and fault-tolerant Kalman filter in integrated navigation system of UAV","volume":"103","author":"Geng","year":"2017","journal-title":"Procedia Comput. Sci."},{"key":"ref_61","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1007\/s10846-017-0483-z","article-title":"Survey on computer vision for UAVs: Current developments and trends","volume":"87","author":"Kanellakis","year":"2017","journal-title":"J. Intell. Robot. Syst."},{"key":"ref_62","doi-asserted-by":"crossref","first-page":"447","DOI":"10.1016\/j.eswa.2017.09.033","article-title":"Survey of computer vision algorithms and applications for unmanned aerial vehicles","volume":"92","author":"Martin","year":"2018","journal-title":"Expert Syst. Appl."},{"key":"ref_63","unstructured":"Krishnamoorthi, R. (2018). Quantizing deep convolutional networks for efficient inference: A whitepaper. arXiv."},{"key":"ref_64","first-page":"7","article-title":"Security, privacy, and safety aspects of civilian drones: A survey","volume":"1","author":"Altawy","year":"2016","journal-title":"ACM Trans. Cyber-Phys. Syst."},{"key":"ref_65","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1186\/s40965-018-0050-y","article-title":"OpenLitterMap. com\u2013open data on plastic pollution with blockchain rewards (littercoin)","volume":"3","author":"Lynch","year":"2018","journal-title":"Open Geospat. Data Softw. 
Stand."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/5\/965\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:32:47Z","timestamp":1760160767000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/5\/965"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,4]]},"references-count":65,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2021,3]]}},"alternative-id":["rs13050965"],"URL":"https:\/\/doi.org\/10.3390\/rs13050965","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,3,4]]}}}