{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T20:00:43Z","timestamp":1775073643131,"version":"3.50.1"},"reference-count":70,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2025,1,21]],"date-time":"2025-01-21T00:00:00Z","timestamp":1737417600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"European Regional Development Fund (ERDF)","award":["TED2021-130890B-C21"],"award-info":[{"award-number":["TED2021-130890B-C21"]}]},{"name":"European Regional Development Fund (ERDF)","award":["TSI-100121-2024-24"],"award-info":[{"award-number":["TSI-100121-2024-24"]}]},{"name":"European Regional Development Fund (ERDF)","award":["FPU21\/00414"],"award-info":[{"award-number":["FPU21\/00414"]}]},{"name":"European Regional Development Fund (ERDF)","award":["FPU22\/04200"],"award-info":[{"award-number":["FPU22\/04200"]}]},{"name":"European Regional Development Fund (ERDF)","award":["FPU23\/00532"],"award-info":[{"award-number":["FPU23\/00532"]}]},{"name":"European Regional Development Fund (ERDF)","award":["CIACIF\/2021\/430"],"award-info":[{"award-number":["CIACIF\/2021\/430"]}]},{"name":"European Regional Development Fund (ERDF)","award":["CIACIF\/2022\/175"],"award-info":[{"award-number":["CIACIF\/2022\/175"]}]},{"name":"Spanish Ministry of Digital Processing and by the European Union NextGeneration EU","award":["TED2021-130890B-C21"],"award-info":[{"award-number":["TED2021-130890B-C21"]}]},{"name":"Spanish Ministry of Digital Processing and by the European Union NextGeneration EU","award":["TSI-100121-2024-24"],"award-info":[{"award-number":["TSI-100121-2024-24"]}]},{"name":"Spanish Ministry of Digital Processing and by the European Union NextGeneration EU","award":["FPU21\/00414"],"award-info":[{"award-number":["FPU21\/00414"]}]},{"name":"Spanish Ministry of Digital Processing and by the 
European Union NextGeneration EU","award":["FPU22\/04200"],"award-info":[{"award-number":["FPU22\/04200"]}]},{"name":"Spanish Ministry of Digital Processing and by the European Union NextGeneration EU","award":["FPU23\/00532"],"award-info":[{"award-number":["FPU23\/00532"]}]},{"name":"Spanish Ministry of Digital Processing and by the European Union NextGeneration EU","award":["CIACIF\/2021\/430"],"award-info":[{"award-number":["CIACIF\/2021\/430"]}]},{"name":"Spanish Ministry of Digital Processing and by the European Union NextGeneration EU","award":["CIACIF\/2022\/175"],"award-info":[{"award-number":["CIACIF\/2022\/175"]}]},{"name":"three Spanish national and two regional grants for PhD studies","award":["TED2021-130890B-C21"],"award-info":[{"award-number":["TED2021-130890B-C21"]}]},{"name":"three Spanish national and two regional grants for PhD studies","award":["TSI-100121-2024-24"],"award-info":[{"award-number":["TSI-100121-2024-24"]}]},{"name":"three Spanish national and two regional grants for PhD studies","award":["FPU21\/00414"],"award-info":[{"award-number":["FPU21\/00414"]}]},{"name":"three Spanish national and two regional grants for PhD studies","award":["FPU22\/04200"],"award-info":[{"award-number":["FPU22\/04200"]}]},{"name":"three Spanish national and two regional grants for PhD studies","award":["FPU23\/00532"],"award-info":[{"award-number":["FPU23\/00532"]}]},{"name":"three Spanish national and two regional grants for PhD studies","award":["CIACIF\/2021\/430"],"award-info":[{"award-number":["CIACIF\/2021\/430"]}]},{"name":"three Spanish national and two regional grants for PhD studies","award":["CIACIF\/2022\/175"],"award-info":[{"award-number":["CIACIF\/2022\/175"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],"abstract":"<jats:p>The rapid loss of biodiversity significantly impacts birds\u2019 environments and behaviors, highlighting the importance of analyzing bird behavior for 
ecological insights. With the growing adoption of Machine Learning (ML) algorithms in the Internet of Things (IoT) domain, edge computing has become essential to ensure data privacy and enable real-time predictions by processing high-dimensional data, such as video streams, efficiently. This paper introduces a set of dimensionality reduction techniques tailored for video sequences based on cutting-edge methods for this data representation. These methods drastically compress video data, reducing bandwidth and storage requirements while enabling the creation of compact ML models with faster inference speeds. Comprehensive experiments on bird behavior classification in rural environments demonstrate the effectiveness of the proposed techniques. The experiments incorporate state-of-the-art deep learning techniques, including pre-trained video vision models, Autoencoders, and single-frame feature extraction. These methods demonstrated superior performance to the baseline, achieving up to a 6000-fold reduction in data size while reaching a classification accuracy of 60.7% on the Visual WetlandBirds Dataset and obtaining state-of-the-art performance on this dataset. 
These findings underline the potential of using dimensionality reduction to enhance the scalability and efficiency of bird behavior analysis.<\/jats:p>","DOI":"10.3390\/fi17020053","type":"journal-article","created":{"date-parts":[[2025,1,21]],"date-time":"2025-01-21T11:47:32Z","timestamp":1737460052000},"page":"53","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Optimizing IoT Video Data: Dimensionality Reduction for Efficient Deep Learning on Edge Computing"],"prefix":"10.3390","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-4890-8217","authenticated-orcid":false,"given":"David","family":"Ortiz-Perez","sequence":"first","affiliation":[{"name":"Department of Computer Science and Technology, University of Alicante, 03690 Alicante, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-6434-2701","authenticated-orcid":false,"given":"Pablo","family":"Ruiz-Ponce","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Technology, University of Alicante, 03690 Alicante, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1712-7265","authenticated-orcid":false,"given":"David","family":"Mulero-P\u00e9rez","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Technology, University of Alicante, 03690 Alicante, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8809-8476","authenticated-orcid":false,"given":"Manuel","family":"Benavent-Lledo","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Technology, University of Alicante, 03690 Alicante, 
Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-0241-5221","authenticated-orcid":false,"given":"Javier","family":"Rodriguez-Juan","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Technology, University of Alicante, 03690 Alicante, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hugo","family":"Hernandez-Lopez","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Technology, University of Alicante, 03690 Alicante, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Anatoli","family":"Iarovikov","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Technology, University of Alicante, 03690 Alicante, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7310-6063","authenticated-orcid":false,"given":"Srdjan","family":"Krco","sequence":"additional","affiliation":[{"name":"DunavNet DOO, Bulevar Oslobo\u0111enja 133\/2, 21000 Novi Sad, Serbia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Daliborka","family":"Nedic","sequence":"additional","affiliation":[{"name":"DunavNet DOO, Bulevar Oslobo\u0111enja 133\/2, 21000 Novi Sad, Serbia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dejan","family":"Vukobratovic","sequence":"additional","affiliation":[{"name":"Faculty of Technical Sciences, University of Novi Sad, Trg Dositeja Obradovi\u0107a 6, 21000 Novi Sad, Serbia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7798-3055","authenticated-orcid":false,"given":"Jose","family":"Garcia-Rodriguez","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Technology, University of Alicante, 03690 Alicante, 
Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,1,21]]},"reference":[{"key":"ref_1","unstructured":"O\u2019Riordan, T. (1995). Environmental Science for Environmental Management, Longman."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"668","DOI":"10.1016\/j.tree.2006.08.007","article-title":"Monitoring for conservation","volume":"21","author":"Nichols","year":"2006","journal-title":"Trends Ecol. Evol."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1016\/0006-3207(81)90073-2","article-title":"Criteria used in assessing wildlife conservation potential: A review","volume":"21","author":"Margules","year":"1981","journal-title":"Biol. Conserv."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"421","DOI":"10.1007\/s002679900244","article-title":"Using the best scientific data for endangered species conservation","volume":"24","author":"Smallwood","year":"1999","journal-title":"Environ. Manag."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"106728","DOI":"10.1016\/j.ecolind.2020.106728","article-title":"A state-of-the-art review on birds as indicators of biodiversity: Advances, challenges, and future directions","volume":"118","author":"Fraixedas","year":"2020","journal-title":"Ecol. Indic."},{"key":"ref_6","unstructured":"Bellman, R., Bellman, R., and Corporation, R. (1957). Dynamic Programming, Princeton University Press. Rand Corporation Research Study."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Tran, D., Wang, H., Torresani, L., Ray, J., LeCun, Y., and Paluri, M. (2018, January 18\u201322). A closer look at spatiotemporal convolutions for action recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00675"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Xie, S., Sun, C., Huang, J., Tu, Z., and Murphy, K. 
(2018, January 8\u201314). Rethinking spatiotemporal feature learning: Speed-accuracy trade-offs in video classification. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01267-0_19"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Liu, Z., Ning, J., Cao, Y., Wei, Y., Zhang, Z., Lin, S., and Hu, H. (2022, January 18\u201324). Video swin transformer. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00320"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Li, Y., Wu, C.Y., Fan, H., Mangalam, K., Xiong, B., Malik, J., and Feichtenhofer, C. (2022, January 18\u201324). Mvitv2: Improved multiscale vision transformers for classification and detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00476"},{"key":"ref_11","unstructured":"Rodriguez-Juan, J., Ortiz-Perez, D., Benavent-Lledo, M., Mulero-P\u00e9rez, D., Ruiz-Ponce, P., Orihuela-Torres, A., Garcia-Rodriguez, J., and Sebasti\u00e1n-Gonz\u00e1lez, E. (2025). Visual WetlandBirds Dataset: Bird Species Identification and Behavior Recognition in Videos. arXiv."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Guo, C., and Wu, D. (2018, January 20\u201322). Feature Dimensionality Reduction for Video Affect Classification: A Comparative Study. Proceedings of the 2018 First Asian Conference on Affective Computing and Intelligent Interaction (ACII Asia), Beijing, China.","DOI":"10.1109\/ACIIAsia.2018.8470329"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1016\/j.inffus.2020.01.005","article-title":"Overview and comparative study of dimensionality reduction techniques for high dimensional data","volume":"59","author":"Ayesha","year":"2020","journal-title":"Inf. 
Fusion"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"54776","DOI":"10.1109\/ACCESS.2020.2980942","article-title":"Analysis of Dimensionality Reduction Techniques on Big Data","volume":"8","author":"Reddy","year":"2020","journal-title":"IEEE Access"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"56","DOI":"10.38094\/jastt1224","article-title":"A Comprehensive Review of Dimensionality Reduction Techniques for Feature Selection and Feature Extraction","volume":"1","author":"Zebari","year":"2020","journal-title":"J. Appl. Sci. Technol. Trends"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1080\/14786440109462720","article-title":"LIII. On lines and planes of closest fit to systems of points in space","volume":"2","author":"Pearson","year":"1901","journal-title":"Lond. Edinburgh Dublin Philos. Mag. J. Sci."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1037\/h0071325","article-title":"Analysis of a complex of statistical variables into principal components","volume":"24","author":"Hotelling","year":"1933","journal-title":"J. Educ. Psychol."},{"key":"ref_18","unstructured":"Cohen, J. (1983). Applied multiple regression. Correlation Analysis for the Behavioral Sciences\/Hillsdale, Lawrence Erlbaum Associates, Inc., Publishers."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/BF02289565","article-title":"Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis","volume":"29","author":"Kruskal","year":"1964","journal-title":"Psychometrika"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"164","DOI":"10.1109\/TAC.1980.1102314","article-title":"The singular value decomposition: Its computation and some applications","volume":"25","author":"Klema","year":"1980","journal-title":"IEEE Trans. Autom. 
Control"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1299","DOI":"10.1162\/089976698300017467","article-title":"Nonlinear component analysis as a kernel eigenvalue problem","volume":"10","author":"Smola","year":"1998","journal-title":"Neural Comput."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"2323","DOI":"10.1126\/science.290.5500.2323","article-title":"Nonlinear dimensionality reduction by locally linear embedding","volume":"290","author":"Roweis","year":"2000","journal-title":"Science"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"2319","DOI":"10.1126\/science.290.5500.2319","article-title":"A global geometric framework for nonlinear dimensionality reduction","volume":"290","author":"Tenenbaum","year":"2000","journal-title":"Science"},{"key":"ref_24","first-page":"283","article-title":"Generalized nonnegative matrix approximations with Bregman divergences","volume":"18","author":"Sra","year":"2005","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"341","DOI":"10.3390\/make1010020","article-title":"Recent Advances in Supervised Dimension Reduction: A Survey","volume":"1","author":"Chao","year":"2019","journal-title":"Mach. Learn. Knowl. Extr."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_27","first-page":"5998","article-title":"Attention is all you need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_28","unstructured":"Rumelhart, D.E., Hinton, G.E., and Williams, R.J. (1986). Learning internal representations by error propagation, parallel distributed processing, explorations in the microstructure of cognition, ed. de rumelhart and j. mcclelland. vol. 1. 1986. 
Biometrika, 71."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"504","DOI":"10.1126\/science.1127647","article-title":"Reducing the dimensionality of data with neural networks","volume":"313","author":"Hinton","year":"2006","journal-title":"Science"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1109\/MGRS.2018.2853555","article-title":"A review of the autoencoder and its variants: A comparative perspective from target recognition in synthetic-aperture radar images","volume":"6","author":"Dong","year":"2018","journal-title":"IEEE Geosci. Remote Sens. Mag."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"110176","DOI":"10.1016\/j.asoc.2023.110176","article-title":"A comprehensive survey on design and application of autoencoder in deep learning","volume":"138","author":"Li","year":"2023","journal-title":"Appl. Soft Comput."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"5947","DOI":"10.4249\/scholarpedia.5947","article-title":"Deep belief networks","volume":"4","author":"Hinton","year":"2009","journal-title":"Scholarpedia"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1038\/s41524-020-0276-y","article-title":"Deep learning approach based on dimensionality reduction for designing electromagnetic nanostructures","volume":"6","author":"Kiarashinejad","year":"2020","journal-title":"npj Comput. Mater."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Sun, G., Wang, C., Zhang, Z., Deng, J., Zafeiriou, S., and Hua, Y. (2023, January 4\u20136). Spatio-temporal Prompting Network for Robust Video Feature Extraction. 
Proceedings of the 2023 IEEE\/CVF International Conference on Computer Vision (ICCV), Paris, France.","DOI":"10.1109\/ICCV51070.2023.01250"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"352","DOI":"10.1109\/TPAMI.2017.2670560","article-title":"Exploiting Feature and Class Relationships in Video Categorization with Regularized Deep Neural Networks","volume":"40","author":"Jiang","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_36","unstructured":"Nottebaum, M., Roth, S., and Schaub-Meyer, S. (2022). Efficient Feature Extraction for High-resolution Video Frame Interpolation. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Lan, Z., Zhu, Y., Hauptmann, A.G., and Newsam, S. (2017, January 21\u201326). Deep local video feature for action recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.161"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"2635","DOI":"10.1007\/s10462-019-09743-2","article-title":"A survey of feature extraction and fusion of deep learning for detection of abnormalities in video endoscopy of gastrointestinal-tract","volume":"53","author":"Ali","year":"2020","journal-title":"Artif. Intell. Rev."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Abdulhussain, S.H., Rahman Ramli, A., Mahmmod, B.M., Iqbal Saripan, M., Al-Haddad, S., Baker, T., Flayyih, W.N., and Jassim, W.A. (2019, January 14\u201319). A Fast Feature Extraction Algorithm for Image and Video Processing. 
Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary.","DOI":"10.1109\/IJCNN.2019.8851750"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"458","DOI":"10.1109\/TSMCB.2011.2167750","article-title":"Robust CoHOG Feature Extraction in Human-Centered Image\/Video Management System","volume":"42","author":"Pang","year":"2012","journal-title":"IEEE Trans. Syst. Man Cybern. Part B (Cybern.)"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Zin, T.T., Kobayashi, I., Tin, P., and Hama, H. (2016, January 20\u201322). A General Video Surveillance Framework for Animal Behavior Analysis. Proceedings of the 2016 Third International Conference on Computing Measurement Control and Sensor Network (CMCSN), Matsue, Japan.","DOI":"10.1109\/CMCSN.2016.55"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Fan, J., Jiang, N., and Wu, Y. (2010, January 26\u201329). Automatic video-based analysis of animal behaviors. Proceedings of the 2010 IEEE International Conference on Image Processing, Hong Kong.","DOI":"10.1109\/ICIP.2010.5652495"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Stern, U., He, R., and Yang, C.H. (2015). Analyzing animal behavior via classifying each video frame using convolutional neural networks. Sci. Rep., 5.","DOI":"10.1038\/srep14351"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Can, C., Yan, X., and Baoping, Y. (2013, January 29). Morphology classification and behaviors identification of birds in scientific video. Proceedings of the 3rd International Conference on Multimedia Technology (ICMT-13), Guangzhou, China.","DOI":"10.2991\/icmt-13.2013.178"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"109141","DOI":"10.1016\/j.ecolind.2022.109141","article-title":"Video-based bird posture recognition using dual feature-rates deep fusion convolutional neural network","volume":"141","author":"Lin","year":"2022","journal-title":"Ecol. 
Indic."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Saito, T., Kanezaki, A., and Harada, T. (2016, January 11\u201315). IBC127: Video dataset for fine-grained bird classification. Proceedings of the 2016 IEEE International Conference on Multimedia and Expo (ICME), Seattle, WA, USA.","DOI":"10.1109\/ICME.2016.7552915"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1016\/j.ecoinf.2018.07.005","article-title":"Classification of bird species from video using appearance and motion features","volume":"48","author":"Atanbori","year":"2018","journal-title":"Ecol. Inform."},{"key":"ref_48","unstructured":"Ge, Z., McCool, C., Sanderson, C., Wang, P., Liu, L., Reid, I., and Corke, P. (December, January 30). Exploiting Temporal Information for DCNN-based Fine-Grained Object Classification. Proceedings of the International Conference on Digital Image Computing: Techniques and Applications, Gold Coast, Australia."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Ng, X.L., Ong, K.E., Zheng, Q., Ni, Y., Yeo, S.Y., and Liu, J. (2022, January 18\u201324). Animal Kingdom: A Large and Diverse Dataset for Animal Behavior Understanding. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01844"},{"key":"ref_50","unstructured":"Wah, C., Branson, S., Welinder, P., Perona, P., and Belongie, S. (2023). The Caltech-UCSD Birds-200-2011 Dataset, California Institute of Technology."},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Van Horn, G., Branson, S., Farrell, R., Haber, S., Barry, J., Ipeirotis, P., Perona, P., and Belongie, S. (2015, January 7\u201312). Building a bird recognition app and large scale dataset with citizen scientists: The fine print in fine-grained dataset collection. 
Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298658"},{"key":"ref_52","unstructured":"Devlin, J., Chang, M., Lee, K., and Toutanova, K. (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv."},{"key":"ref_53","unstructured":"Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.A., Lacroix, T., Rozi\u00e8re, B., Goyal, N., Hambro, E., and Azhar, F. (2023). LLaMA: Open and Efficient Foundation Language Models. arXiv."},{"key":"ref_54","unstructured":"(2024). Gemini Team Google. Gemini: A Family of Highly Capable Multimodal Models. arXiv."},{"key":"ref_55","unstructured":"Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., and Natsev, P. (2017). The Kinetics Human Action Video Dataset. arXiv."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Tran, D., Bourdev, L., Fergus, R., Torresani, L., and Paluri, M. (2015, January 7\u201313). Learning spatiotemporal features with 3d convolutional networks. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.510"},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"12922","DOI":"10.1109\/TPAMI.2023.3243465","article-title":"Video Transformers: A Survey","volume":"45","author":"Selva","year":"2023","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_58","unstructured":"Tschannen, M., Bachem, O., and Lucic, M. (2018). Recent advances in autoencoder-based representation learning. arXiv."},{"key":"ref_59","unstructured":"Zhang, Y. A better autoencoder for image: Convolutional autoencoder. In Proceedings of the ICONIP17-DCEC. 
Available online: https:\/\/users.cecs.anu.edu.au\/~Tom.Gedeon\/conf\/ABCs2018\/paper\/ABCs2018_paper_58.pdf."},{"key":"ref_60","unstructured":"Oquab, M., Darcet, T., Moutakanni, T., Vo, H.V., Szafraniec, M., Khalidov, V., Fernandez, P., Haziza, D., Massa, F., and El-Nouby, A. (2023). DINOv2: Learning Robust Visual Features without Supervision. arXiv."},{"key":"ref_61","unstructured":"Darcet, T., Oquab, M., Mairal, J., and Bojanowski, P. (2023). Vision Transformers Need Registers. arXiv."},{"key":"ref_62","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201325). Histograms of oriented gradients for human detection. Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201905), San Diego, CA, USA."},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_64","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv."},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., and Weinberger, K.Q. (2017, January 21\u201326). Densely Connected Convolutional Networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_66","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_67","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). 
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv."},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 11\u201317). Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Virtual.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_69","doi-asserted-by":"crossref","first-page":"1598","DOI":"10.1016\/j.patrec.2011.01.004","article-title":"Face recognition using histograms of oriented gradients","volume":"32","author":"Bueno","year":"2011","journal-title":"Pattern Recognit. Lett."},{"key":"ref_70","doi-asserted-by":"crossref","first-page":"102747","DOI":"10.1016\/j.media.2023.102747","article-title":"Histogram of oriented gradients meet deep learning: A novel multi-task deep network for 2D surgical image semantic segmentation","volume":"85","author":"Bhattarai","year":"2023","journal-title":"Med. Image Anal."}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/17\/2\/53\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,8]],"date-time":"2025-10-08T10:33:04Z","timestamp":1759919584000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/17\/2\/53"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,1,21]]},"references-count":70,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2025,2]]}},"alternative-id":["fi17020053"],"URL":"https:\/\/doi.org\/10.3390\/fi17020053","relation":{},"ISSN":["1999-5903"],"issn-type":[{"value":"1999-5903","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1,21]]}}}