{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,1]],"date-time":"2026-02-01T03:54:00Z","timestamp":1769918040184,"version":"3.49.0"},"reference-count":102,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2024,5,23]],"date-time":"2024-05-23T00:00:00Z","timestamp":1716422400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Kookmin University Industry-Academic Cooperation Foundation and Incheon National University Research","award":["2021-0184"],"award-info":[{"award-number":["2021-0184"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>With the introduction of deep learning, a significant amount of research has been conducted in the field of computer vision in the past decade. In particular, research on object detection (OD) continues to progress rapidly. However, despite these advances, some limitations need to be overcome to enable real-world applications of deep learning-based OD models. One such limitation is inaccurate OD when image quality is poor or a target object is small. The performance degradation phenomenon for small objects is similar to the fundamental limitations of an OD model, such as the constraint of the receptive field, which is a difficult problem to solve using only an OD model. Therefore, OD performance can be hindered by low image quality or small target objects. To address this issue, this study investigates the compatibility of super-resolution (SR) and OD techniques to improve detection, particularly for small objects. We analyze the combination of SR and OD models, classifying them based on architectural characteristics. The experimental results show a substantial improvement when integrating OD detectors with SR models. 
Overall, the results show that when the SR models score high on the evaluation metrics (PSNR, SSIM), OD performance is correspondingly high. In particular, evaluations on the MS COCO dataset reveal that the enhancement rate for small objects is 9.4% higher than that for all objects. This work provides an analysis of SR and OD model compatibility, demonstrating the potential benefits of their synergistic combination. The experimental code can be found on our GitHub repository.<\/jats:p>","DOI":"10.3390\/s24113335","type":"journal-article","created":{"date-parts":[[2024,5,23]],"date-time":"2024-05-23T09:04:25Z","timestamp":1716455065000},"page":"3335","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Compatibility Review for Object Detection Enhancement through Super-Resolution"],"prefix":"10.3390","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9676-9604","authenticated-orcid":false,"given":"Daehee","family":"Kim","sequence":"first","affiliation":[{"name":"NAVER Cloud Corp., Seongnam 13529, Republic of Korea"},{"name":"College of Computer Science, Kookmin University, Seoul 02707, Republic of Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1567-6591","authenticated-orcid":false,"given":"Sungmin","family":"Lee","sequence":"additional","affiliation":[{"name":"SK Telecom, Seoul 04539, Republic of Korea"}]},{"given":"Junghyeon","family":"Seo","sequence":"additional","affiliation":[{"name":"College of Computer Science, Kookmin University, Seoul 02707, Republic of Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0106-7106","authenticated-orcid":false,"given":"Song","family":"Noh","sequence":"additional","affiliation":[{"name":"Department of Information and Telecommunication Engineering, Incheon National University, Incheon 22012, Republic of 
Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5947-5487","authenticated-orcid":false,"given":"Jaekoo","family":"Lee","sequence":"additional","affiliation":[{"name":"College of Computer Science, Kookmin University, Seoul 02707, Republic of Korea"}]}],"member":"1968","published-online":{"date-parts":[[2024,5,23]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Fei-Fei, L. (2009, January 20\u201325). ImageNet: A Large-Scale Hierarchical Image Database. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The pascal visual object classes (voc) challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_3","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014). Computer Vision\u2014ECCV 2014, Proceedings of the 13th European Conference, Zurich, Switzerland, 6\u201312 September 2014, Springer."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Tan, M., Pang, R., and Le, Q.V. (2020, January 13\u201319). Efficientdet: Scalable and efficient object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01079"},{"key":"ref_5","unstructured":"Anwar, A., and Raychowdhury, A. (2020). Masked Face Recognition for Secure Authentication. arXiv."},{"key":"ref_6","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015). Advances in Neural Information Processing Systems, MIT Press."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1007\/s11263-019-01247-4","article-title":"Deep learning for generic object detection: A survey","volume":"128","author":"Liu","year":"2020","journal-title":"Int. J. Comput. Vis."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"128837","DOI":"10.1109\/ACCESS.2019.2939201","article-title":"A survey of deep learning-based object detection","volume":"7","author":"Jiao","year":"2019","journal-title":"IEEE Access"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Bai, Y., Zhang, Y., Ding, M., and Ghanem, B. (2018, January 8\u201314). Sod-mtgan: Small object detection via multi-task generative adversarial network. Proceedings of the 15th European Conference, Munich, Germany.","DOI":"10.1007\/978-3-030-01261-8_13"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., and Hidayanto, A.N. (2021, January 8\u201312). Task-Driven Super Resolution: Object Detection in Low-Resolution Images. Neural Information Processing, Proceedings of the 28th International Conference, ICONIP 2021, Sanur, Bali, Indonesia.","DOI":"10.1007\/978-3-030-92307-5"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"600","DOI":"10.1109\/TIP.2003.819861","article-title":"Image quality assessment: From error visibility to structural similarity","volume":"13","author":"Wang","year":"2004","journal-title":"IEEE Trans. Image Process."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Yang, S., Luo, P., Loy, C.C., and Tang, X. (2016, January 27\u201330). WIDER FACE: A Face Detection Benchmark. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.596"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"3322","DOI":"10.1109\/TIFS.2019.2916592","article-title":"JCS-Net: Joint Classification and Super-Resolution Network for Small-Scale Pedestrian Detection in Surveillance Images","volume":"14","author":"Pang","year":"2019","journal-title":"IEEE Trans. Inf. Forensics Secur."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Wang, L., Li, D., Zhu, Y., Tian, L., and Shan, Y. (2020, January 13\u201319). Dual super-resolution learning for semantic segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00383"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"8972","DOI":"10.1109\/TMM.2023.3243615","article-title":"Online video super-resolution with convolutional kernel bypass grafts","volume":"25","author":"Xiao","year":"2023","journal-title":"IEEE Trans. Multimed."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"2512","DOI":"10.1109\/TCSVT.2023.3301930","article-title":"Estimating high-resolution surface normals via low-resolution photometric stereo images","volume":"34","author":"Ju","year":"2023","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Wang, B., Lu, T., and Zhang, Y. (2020, January 16\u201318). Feature-Driven Super-Resolution for Object Detection. Proceedings of the 2020 5th International Conference on Control, Robotics and Cybernetics (CRC), Wuhan, China.","DOI":"10.1109\/CRC51253.2020.9253468"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Zheng, S., Wu, Y., Jiang, S., Lu, C., and Gupta, G. (2021, January 18\u201322). Deblur-YOLO: Real-Time Object Detection with Efficient Blind Motion Deblurring. 
Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China.","DOI":"10.1109\/IJCNN52387.2021.9534352"},{"key":"ref_20","unstructured":"Noh, J., Bae, W., Lee, W., Seo, J., and Kim, G. (2019, October 27\u2013November 2). Better to follow, follow to be better: Towards precise supervision of feature super-resolution for small object detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"2861","DOI":"10.1109\/TIP.2010.2050625","article-title":"Image super-resolution via sparse representation","volume":"19","author":"Yang","year":"2010","journal-title":"IEEE Trans. Image Process."},{"key":"ref_22","unstructured":"Timofte, R., De Smet, V., and Van Gool, L. (2014). Computer Vision\u2014ACCV 2014, Proceedings of the 12th Asian Conference on Computer Vision, Singapore, 1\u20135 November 2014, Springer."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Schulter, S., Leistner, C., and Bischof, H. (2015, January 7\u201312). Fast and accurate image upscaling with super-resolution forests. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299003"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Tian, Y., Kong, Y., Zhong, B., and Fu, Y. (2018, January 18\u201323). Residual dense network for image super-resolution. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00262"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Wang, X., Xie, L., Dong, C., and Shan, Y. (2021, January 11\u201317). Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data. 
Proceedings of the International Conference on Computer Vision Workshops (ICCVW), Montreal, QC, Canada.","DOI":"10.1109\/ICCVW54120.2021.00217"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Zhang, K., Liang, J., Van Gool, L., and Timofte, R. (2021, January 10\u201317). Designing a practical degradation model for deep blind image super-resolution. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00475"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1109\/TPAMI.2015.2439281","article-title":"Image super-resolution using deep convolutional networks","volume":"38","author":"Dong","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Kim, J., Kwon Lee, J., and Mu Lee, K. (2016, January 27\u201330). Accurate image super-resolution using very deep convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.182"},{"key":"ref_29","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_30","unstructured":"Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014, January 8\u201313). Generative adversarial nets. Proceedings of the 27th International Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Wang, Z., Liu, D., Yang, J., Han, W., and Huang, T. (2015, January 7\u201313). Deep Networks for Image Super-Resolution with Sparse Prior. 
Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.50"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Zheng, H., Ji, M., Wang, H., Liu, Y., and Fang, L. (2018, January 8\u201314). CrossNet: An End-to-end Reference-based Super Resolution Network using Cross-scale Warping. Proceedings of the 15th European Conference, Munich, Germany.","DOI":"10.1007\/978-3-030-01231-1_6"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Dong, C., Loy, C.C., and Tang, X. (2016, January 11\u201314). Accelerating the super-resolution convolutional neural network. Proceedings of the 14th European Conference, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46475-6_25"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Shi, W., Caballero, J., Husz\u00e1r, F., Totz, J., Aitken, A.P., Bishop, R., Rueckert, D., and Wang, Z. (2016, January 27\u201330). Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.207"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Lai, W.S., Huang, J.B., Ahuja, N., and Yang, M.H. (2017, January 21\u201326). Deep laplacian pyramid networks for fast and accurate super-resolution. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.618"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Haris, M., Shakhnarovich, G., and Ukita, N. (2018, January 18\u201323). Deep back-projection networks for super-resolution. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00179"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2015, January 7\u201313). 
Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.123"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Zhang, K., Zuo, W., and Zhang, L. (2018, January 18\u201323). Learning a single convolutional super-resolution network for multiple degradations. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00344"},{"key":"ref_39","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Ledig, C., Theis, L., Husz\u00e1r, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A., Tejani, A., Totz, J., and Wang, Z. (2017, January 21\u201326). Photo-realistic single image super-resolution using a generative adversarial network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.19"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Lim, B., Son, S., Kim, H., Nah, S., and Mu Lee, K. (2017, January 21\u201326). Enhanced deep residual networks for single image super-resolution. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.151"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Timofte, R., Agustsson, E., Van Gool, L., Yang, M.H., and Zhang, L. (2017, January 21\u201326). 
Ntire 2017 challenge on single image super-resolution: Methods and results. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.150"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Han, W., Chang, S., Liu, D., Yu, M., Witbrock, M., and Huang, T.S. (2018, January 18\u201323). Image super-resolution via dual-state recurrent networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00178"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"3365","DOI":"10.1109\/TPAMI.2020.2982166","article-title":"Deep learning for image super-resolution: A survey","volume":"43","author":"Wang","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Kim, J., Kwon Lee, J., and Mu Lee, K. (2016, January 27\u201330). Deeply-recursive convolutional network for image super-resolution. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.181"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Tai, Y., Yang, J., Liu, X., and Xu, C. (2017, January 22\u201329). Memnet: A persistent memory network for image restoration. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.486"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Li, Z., Yang, J., Liu, Z., Yang, X., Jeon, G., and Wu, W. (2019, January 15\u201320). Feedback network for image super-resolution. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00399"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Tai, Y., Yang, J., and Liu, X. (2017, January 21\u201326). Image super-resolution via deep recursive residual network. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.298"},{"key":"ref_50","unstructured":"Liu, D., Wen, B., Fan, Y., Loy, C.C., and Huang, T.S. (2018, January 3\u20138). Non-local recurrent network for image restoration. Proceedings of the 32nd Conference on Neural Information Processing Systems (NIPS 2018), Montreal, QC, Canada."},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K.Q. (2017, January 21\u201326). Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Tong, T., Li, G., Liu, X., and Gao, Q. (2017, January 22\u201329). Image super-resolution using dense skip connections. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.514"},{"key":"ref_53","unstructured":"Timofte, R., Gu, S., Wu, J., and Van Gool, L. (2018, January 18\u201322). Ntire 2018 challenge on single image super-resolution: Methods and results. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Wang, X., Yu, K., Wu, S., Gu, J., Liu, Y., Dong, C., Qiao, Y., and Change Loy, C. (2018, January 8\u201314). Esrgan: Enhanced super-resolution generative adversarial networks. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-11021-5_5"},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Ahn, N., Kang, B., and Sohn, K.A. (2018, January 8\u201314). Fast, accurate, and lightweight super-resolution with cascading residual network. 
Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01249-6_16"},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Sajjadi, M.S., Scholkopf, B., and Hirsch, M. (2017, January 22\u201329). Enhancenet: Single image super-resolution through automated texture synthesis. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.481"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Park, S.J., Son, H., Cho, S., Hong, K.S., and Lee, S. (2018, January 8\u201314). Srfeat: Single image super-resolution with feature discrimination. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01270-0_27"},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Wang, Z., Lin, Z., and Qi, H. (2019, January 15\u201320). Image super-resolution by neural texture transfer. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00817"},{"key":"ref_59","unstructured":"Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., and Courville, A.C. (2017, January 4\u20139). Improved training of wasserstein gans. Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., Van Der Smagt, P., Cremers, D., and Brox, T. (2015, January 7\u201313). Flownet: Learning optical flow with convolutional networks. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.316"},{"key":"ref_61","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015). 
Medical Image Computing and Computer-Assisted Intervention\u2014MICCAI 2015, Proceedings of the 18th International Conference, Munich, Germany, 5\u20139 October 2015, Springer."},{"key":"ref_62","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1023\/B:VISI.0000045324.43199.43","article-title":"Lucas\/Kanade meets Horn\/Schunck: Combining local and global optic flow methods","volume":"61","author":"Bruhn","year":"2005","journal-title":"Int. J. Comput. Vis."},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"Yang, F., Yang, H., Fu, J., Lu, H., and Guo, B. (2020, January 13\u201319). Learning texture transformer network for image super-resolution. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00583"},{"key":"ref_64","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017, January 4\u20139). Attention is all you need. Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_65","unstructured":"Gregor, K., and LeCun, Y. (2010, January 21\u201324). Learning fast approximations of sparse coding. Proceedings of the 27th International Conference on International Conference on Machine Learning, Haifa, Israel."},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"Li, J., Fang, F., Mei, K., and Zhang, G. (2018, January 8\u201314). Multi-scale residual network for image super-resolution. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01237-3_32"},{"key":"ref_67","unstructured":"Hu, Y., Gao, X., Li, J., Huang, Y., and Wang, H. (2018). Single image super-resolution via cascaded multi-scale cross network. arXiv."},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Ren, H., El-Khamy, M., and Lee, J. (2017, January 21\u201326). 
Image super resolution based on fusing multiple convolution neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.142"},{"key":"ref_69","doi-asserted-by":"crossref","unstructured":"Hui, Z., Wang, X., and Gao, X. (2018, January 18\u201323). Fast and accurate single image super-resolution via information distillation network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00082"},{"key":"ref_70","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201323). Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_71","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Li, K., Li, K., Wang, L., Zhong, B., and Fu, Y. (2018, January 8\u201314). Image super-resolution using very deep residual channel attention networks. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_18"},{"key":"ref_72","doi-asserted-by":"crossref","unstructured":"Dai, T., Cai, J., Zhang, Y., Xia, S.T., and Zhang, L. (2019, January 15\u201320). Second-order attention network for single image super-resolution. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.01132"},{"key":"ref_73","doi-asserted-by":"crossref","first-page":"622","DOI":"10.1016\/j.neunet.2023.11.049","article-title":"CVANet: Cascaded visual attention network for single image super-resolution","volume":"170","author":"Zhang","year":"2024","journal-title":"Neural Netw."},{"key":"ref_74","doi-asserted-by":"crossref","unstructured":"Chen, X., Wang, X., Zhou, J., Qiao, Y., and Dong, C. (2023, January 17\u201324). 
Activating more pixels in image super-resolution transformer. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.02142"},{"key":"ref_75","doi-asserted-by":"crossref","unstructured":"Chen, Z., Zhang, Y., Gu, J., Kong, L., Yang, X., and Yu, F. (2023, January 1\u20136). Dual aggregation transformer for image super-resolution. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.01131"},{"key":"ref_76","doi-asserted-by":"crossref","unstructured":"Shocher, A., Cohen, N., and Irani, M. (2018, January 18\u201323). \u201cZero-shot\u201d super-resolution using deep internal learning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00329"},{"key":"ref_77","doi-asserted-by":"crossref","unstructured":"Maeda, S. (2020, January 13\u201319). Unpaired Image Super-Resolution using Pseudo-Supervision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00037"},{"key":"ref_78","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_79","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask r-cnn. 
Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_80","doi-asserted-by":"crossref","first-page":"154","DOI":"10.1007\/s11263-013-0620-5","article-title":"Selective search for object recognition","volume":"104","author":"Uijlings","year":"2013","journal-title":"Int. J. Comput. Vis."},{"key":"ref_81","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_82","first-page":"379","article-title":"R-fcn: Object detection via region-based fully convolutional networks","volume":"29","author":"Dai","year":"2016","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_83","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_84","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_85","doi-asserted-by":"crossref","unstructured":"Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., and Savarese, S. (2019, January 15\u201320). Generalized intersection over union: A metric and a loss for bounding box regression. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00075"},{"key":"ref_86","doi-asserted-by":"crossref","unstructured":"Zhang, S., Wen, L., Bian, X., Lei, Z., and Li, S.Z. (2018, January 18\u201323). Single-shot refinement neural network for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00442"},{"key":"ref_87","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 21\u201326). YOLO9000: Better, faster, stronger. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_88","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_89","doi-asserted-by":"crossref","unstructured":"Wang, C.Y., Liao, H.Y.M., Wu, Y.H., Chen, P.Y., Hsieh, J.W., and Yeh, I.H. (2020, January 14\u201319). CSPNet: A new backbone that can enhance learning capability of CNN. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA.","DOI":"10.1109\/CVPRW50498.2020.00203"},{"key":"ref_90","unstructured":"Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M. (2020). Yolov4: Optimal speed and accuracy of object detection. arXiv."},{"key":"ref_91","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016). Computer Vision\u2014ECCV 2016, Proceedings of the 14th European Conference, Amsterdam, The Netherlands, 11\u201314 October 2016, Springer."},{"key":"ref_92","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. 
arXiv."},{"key":"ref_93","unstructured":"Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., and J\u00e9gou, H. (2021, January 18\u201324). Training data-efficient image transformers & distillation through attention. Proceedings of the International Conference on Machine Learning, Virtual."},{"key":"ref_94","doi-asserted-by":"crossref","unstructured":"Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. (2020). End-to-End Object Detection with Transformers. arXiv.","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"ref_95","unstructured":"Zhu, X., Su, W., Lu, L., Li, B., Wang, X., and Dai, J. (2020). Deformable detr: Deformable transformers for end-to-end object detection. arXiv."},{"key":"ref_96","unstructured":"Zhang, H., Li, F., Liu, S., Zhang, L., Su, H., Zhu, J., Ni, L.M., and Shum, H.Y. (2022). Dino: Detr with improved denoising anchor boxes for end-to-end object detection. arXiv."},{"key":"ref_97","doi-asserted-by":"crossref","unstructured":"Zong, Z., Song, G., and Liu, Y. (2023, January 1\u20136). Detrs with collaborative hybrid assignments training. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.00621"},{"key":"ref_98","unstructured":"Liu, S., Li, F., Zhang, H., Yang, X., Qi, X., Su, H., Zhu, J., and Zhang, L. (2022). Dab-detr: Dynamic anchor boxes are better queries for detr. arXiv."},{"key":"ref_99","doi-asserted-by":"crossref","unstructured":"Agustsson, E., and Timofte, R. (2017, January 21\u201326). Ntire 2017 challenge on single image super-resolution: Dataset and study. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.150"},{"key":"ref_100","doi-asserted-by":"crossref","first-page":"1127","DOI":"10.1109\/TPAMI.2010.25","article-title":"Single-image super-resolution using sparse regression and natural image prior","volume":"32","author":"Kim","year":"2010","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_101","doi-asserted-by":"crossref","unstructured":"Bevilacqua, M., Roumy, A., Guillemot, C., and Alberi-Morel, M.L. (2012, January 3\u20137). Low-complexity single-image super-resolution based on nonnegative neighbor embedding. Proceedings of the British Machine Vision Conference, Surrey, UK.","DOI":"10.5244\/C.26.135"},{"key":"ref_102","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/11\/3335\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:47:13Z","timestamp":1760107633000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/11\/3335"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,23]]},"references-count":102,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2024,6]]}},"alternative-id":["s24113335"],"URL":"https:\/\/doi.org\/10.3390\/s24113335","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,5,23]]}}}