{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T02:20:39Z","timestamp":1760235639283,"version":"build-2065373602"},"reference-count":66,"publisher":"MDPI AG","issue":"18","license":[{"start":{"date-parts":[[2021,9,18]],"date-time":"2021-09-18T00:00:00Z","timestamp":1631923200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100010665","name":"H2020 Marie Sk\u0142odowska-Curie Actions","doi-asserted-by":"publisher","award":["754382"],"award-info":[{"award-number":["754382"]}],"id":[{"id":"10.13039\/100010665","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Spanish Ministry of Science, Innovation and Universities","award":["2017-90035-R"],"award-info":[{"award-number":["2017-90035-R"]}]},{"name":"Community Region of Madrid","award":["2018\/EMT-4362 SEGVAUTO"],"award-info":[{"award-number":["2018\/EMT-4362 SEGVAUTO"]}]},{"name":"H2020","award":["723021"],"award-info":[{"award-number":["723021"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Understanding the scene in front of a vehicle is crucial for self-driving vehicles and Advanced Driver Assistance Systems, and in urban scenarios, intersection areas are one of the most critical, concentrating between 20% to 25% of road fatalities. This research presents a thorough investigation on the detection and classification of urban intersections as seen from onboard front-facing cameras. Different methodologies aimed at classifying intersection geometries have been assessed to provide a comprehensive evaluation of state-of-the-art techniques based on Deep Neural Network (DNN) approaches, including single-frame approaches and temporal integration schemes. A detailed analysis of most popular datasets previously used for the application together with a comparison with ad hoc recorded sequences revealed that the performances strongly depend on the field of view of the camera rather than other characteristics or temporal-integrating techniques. Due to the scarcity of training data, a new dataset is created by performing data augmentation from real-world data through a Generative Adversarial Network (GAN) to increase generalizability as well as to test the influence of data quality. Despite being in the relatively early stages, mainly due to the lack of intersection datasets oriented to the problem, an extensive experimental activity has been performed to analyze the individual performance of each proposed systems.<\/jats:p>","DOI":"10.3390\/s21186269","type":"journal-article","created":{"date-parts":[[2021,9,21]],"date-time":"2021-09-21T22:35:20Z","timestamp":1632263720000},"page":"6269","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Urban Intersection Classification: A Comparative Analysis"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6688-5081","authenticated-orcid":false,"given":"Augusto Luis","family":"Ballardini","sequence":"first","affiliation":[{"name":"Computer Engineering Department, Universidad de Alcal\u00e1, 28805 Alcal\u00e1 de Henares, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3779-6474","authenticated-orcid":false,"given":"\u00c1lvaro","family":"Hern\u00e1ndez Saz","sequence":"additional","affiliation":[{"name":"Computer Engineering Department, Universidad de Alcal\u00e1, 28805 Alcal\u00e1 de Henares, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4782-7166","authenticated-orcid":false,"given":"Sandra","family":"Carrasco Limeros","sequence":"additional","affiliation":[{"name":"Computer Engineering Department, Universidad de Alcal\u00e1, 28805 Alcal\u00e1 de Henares, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6350-2460","authenticated-orcid":false,"given":"Javier","family":"Lorenzo","sequence":"additional","affiliation":[{"name":"Computer Engineering Department, Universidad de Alcal\u00e1, 28805 Alcal\u00e1 de Henares, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3889-018X","authenticated-orcid":false,"given":"Ignacio","family":"Parra Alonso","sequence":"additional","affiliation":[{"name":"Computer Engineering Department, Universidad de Alcal\u00e1, 28805 Alcal\u00e1 de Henares, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6644-9498","authenticated-orcid":false,"given":"Noelia","family":"Hern\u00e1ndez Parra","sequence":"additional","affiliation":[{"name":"Computer Engineering Department, Universidad de Alcal\u00e1, 28805 Alcal\u00e1 de Henares, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8940-6434","authenticated-orcid":false,"given":"Iv\u00e1n","family":"Garc\u00eda Daza","sequence":"additional","affiliation":[{"name":"Computer Engineering Department, Universidad de Alcal\u00e1, 28805 Alcal\u00e1 de Henares, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8809-2103","authenticated-orcid":false,"given":"Miguel \u00c1ngel","family":"Sotelo","sequence":"additional","affiliation":[{"name":"Computer Engineering Department, Universidad de Alcal\u00e1, 28805 Alcal\u00e1 de Henares, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,9,18]]},"reference":[{"key":"ref_1","unstructured":"(2021, May 21). European Commission Road Safety Key Figures 2020. Available online: https:\/\/ec.europa.eu\/transport\/road_safety\/sites\/roadsafety\/files\/pdf\/scoreboard_2020.pdf."},{"key":"ref_2","unstructured":"(2021, May 21). European Union Annual Accident Report 2018. Available online: https:\/\/ec.europa.eu\/transport\/road_safety\/specialist\/observatory\/statistics\/annual_accident_report_archive_en."},{"key":"ref_3","unstructured":"Fatality and Injury Reporting System Tool of, U.S. (2021, May 21). National Highway Traffic Safety Administration, Available online: https:\/\/cdan.dot.gov\/query."},{"key":"ref_4","unstructured":"(2021, September 03). European Union Report on Mobility and Transportation: ITS & Vulnerable Road Users. Available online: https:\/\/ec.europa.eu\/transport\/themes\/its\/road\/action_plan\/its_and_vulnerable_road_users_en."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Huang, X., Cheng, X., Geng, Q., Cao, B., Zhou, D., Wang, P., Lin, Y., and Yang, R. (2018, January 18\u201322). The ApolloScape Dataset for Autonomous Driving. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPRW.2018.00141"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Chang, M.F., Lambert, J.W., Sangkloy, P., Singh, J., Bak, S., Hartnett, A., Wang, D., Carr, P., Lucey, S., and Ramanan, D. (2019, January 16\u201320). Argoverse: 3D Tracking and Forecasting with Rich Maps. Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00895"},{"key":"ref_7","unstructured":"Houston, J., Zuidhof, G., Bergamini, L., Ye, Y., Jain, A., Omari, S., Iglovikov, V., and Ondruska, P. (2020). One Thousand and One Hours: Self-driving Motion Prediction Dataset. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Caesar, H., Bankiti, V., Lang, A.H., Vora, S., Liong, V.E., Xu, Q., Krishnan, A., Pan, Y., Baldan, G., and Beijbom, O. (2019). nuScenes: A multimodal dataset for autonomous driving. arXiv.","DOI":"10.1109\/CVPR42600.2020.01164"},{"key":"ref_9","unstructured":"(2021, May 21). PandaSet (2020) PandaSet Dataset. Available online: https:\/\/scale.com\/open-datasets\/pandaset."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1231","DOI":"10.1177\/0278364913491297","article-title":"Vision meets Robotics: The KITTI Dataset","volume":"32","author":"Geiger","year":"2013","journal-title":"Int. J. Robot. Res. (IJRR)"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Xie, J., Kiefel, M., Sun, M.T., and Geiger, A. (2016, January 27\u201330). Semantic Instance Annotation of Street Scenes by 3D to 2D Label Transfer. Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.401"},{"key":"ref_12","unstructured":"Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014). Generative Adversarial Networks. arXiv."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"188","DOI":"10.1109\/TITS.2006.890070","article-title":"A Traffic Accident Recording and Reporting Model at Intersections","volume":"8","author":"Ki","year":"2007","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_14","unstructured":"Golembiewski, G., and Chandler, B.E. (2011). Intersection Safety: A Manual for Local Rural Road Owners, Technical Report."},{"key":"ref_15","first-page":"19","article-title":"Progress in road intersection detection for autonomous vehicle navigation","volume":"Volume 852","author":"Kushner","year":"1987","journal-title":"Mobile Robots II. International Society for Optics and Photonics"},{"key":"ref_16","unstructured":"Geiger, A. (2013). Probabilistic Models for 3D Urban Scene Understanding from Movable Platforms, KIT Scientific Publishing."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"An, J., Choi, B., Sim, K.B., and Kim, E. (2016). Novel intersection type recognition for autonomous vehicles using a multi-layer laser scanner. Sensors, 16.","DOI":"10.3390\/s16071123"},{"key":"ref_18","unstructured":"Ballardini, A.L., Cattaneo, D., Fontana, S., and Sorrenti, D.G. (June, January 29). An online probabilistic road intersection detector. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Singapore."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Habermann, D., Vido, C.E., Os\u00f3rio, F.S., and Ramos, F. (2016, January 24\u201329). Road junction detection from 3d point clouds. Proceedings of the 2016 International Joint Conference on Neural Networks (IJCNN), Vancouver, BC, Canada.","DOI":"10.1109\/IJCNN.2016.7727849"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Baumann, U., Huang, Y.Y., Gl\u00e4ser, C., Herman, M., Banzhaf, H., and Z\u00f6llner, J.M. (2018, January 4\u20137). Classifying road intersections using transfer-learning on a deep neural network. Proceedings of the 2018 21st International Conference on Intelligent Transportation Systems (ITSC), Maui, HI, USA.","DOI":"10.1109\/ITSC.2018.8569916"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Bhatt, D., Sodhi, D., Pal, A., Balasubramanian, V., and Krishna, M. (2017, January 24\u201328). Have i reached the intersection: A deep learning-based approach for intersection detection from monocular cameras. Proceedings of the 2017 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada.","DOI":"10.1109\/IROS.2017.8206317"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long Short-term Memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Koji, T., and Kanji, T. (2019, January 9\u201312). Deep intersection classification using first and third person views. Proceedings of the 2019 IEEE Intelligent Vehicles Symposium (IV), Paris, France.","DOI":"10.1109\/IVS.2019.8813859"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"331","DOI":"10.1007\/s00138-014-0649-7","article-title":"Multimodal information fusion for urban scene understanding","volume":"27","author":"Xu","year":"2016","journal-title":"Mach. Vis. Appl."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Vinyals, O., Toshev, A., Bengio, S., and Erhan, D. (2015). Show and Tell: A Neural Image Caption Generator. arXiv.","DOI":"10.1109\/CVPR.2015.7298935"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Donahue, J., Hendricks, L.A., Rohrbach, M., Venugopalan, S., Guadarrama, S., Saenko, K., and Darrell, T. (2016). Long-term Recurrent Convolutional Networks for Visual Recognition and Description. arXiv.","DOI":"10.1109\/CVPR.2015.7298878"},{"key":"ref_27","unstructured":"Ng, J.Y.H., Hausknecht, M., Vijayanarasimhan, S., Vinyals, O., Monga, R., and Toderici, G. (2015). Beyond Short Snippets: Deep Networks for Video Classification. arXiv."},{"key":"ref_28","unstructured":"Shi, X., Chen, Z., Wang, H., Yeung, D.Y., Wong, W., and Woo, W. (2015). Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"16884","DOI":"10.1038\/s41598-019-52737-x","article-title":"Data augmentation using generative adversarial networks (CycleGAN) to improve generalizability in CT segmentation tasks","volume":"9","author":"Sandfort","year":"2019","journal-title":"Sci. Rep."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1016\/j.neucom.2018.09.013","article-title":"GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification","volume":"321","author":"Diamant","year":"2018","journal-title":"Neurocomputing"},{"key":"ref_31","unstructured":"Perez, L., and Wang, J. (2017). The Effectiveness of Data Augmentation in Image Classification using Deep Learning. arXiv."},{"key":"ref_32","unstructured":"Antoniou, A., Storkey, A., and Edwards, H. (2018). Data Augmentation Generative Adversarial Networks. arXiv."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Stoyanov, D., Taylor, Z., Kainz, B., Maicas, G., Beichel, R.R., Martel, A., Maier-Hein, L., Bhatia, K., Vercauteren, T., and Oktay, O. (2018). Conditional Infilling GANs for Data Augmentation in Mammogram Classification. Image Analysis for Moving Organ, Breast, and Thoracic Images, Springer International Publishing.","DOI":"10.1007\/978-3-030-00946-5"},{"key":"ref_34","unstructured":"Bowles, C., Gunn, R., Hammers, A., and Rueckert, D. (2018). GANsfer Learning: Combining labelled and unlabelled data for GAN based data augmentation. arXiv."},{"key":"ref_35","unstructured":"Li, C.L., Zaheer, M., Zhang, Y., Poczos, B., and Salakhutdinov, R. (2018). Point Cloud GAN. arXiv."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Gadelha, M., Rai, A., Maji, S., and Wang, R. (2019). Inferring 3D Shapes from Image Collections using Adversarial Networks. arXiv.","DOI":"10.1007\/s11263-020-01335-w"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Saito, M., Matsumoto, E., and Saito, S. (2017). Temporal Generative Adversarial Nets with Singular Value Clipping. arXiv.","DOI":"10.1109\/ICCV.2017.308"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Karras, T., Laine, S., and Aila, T. (2019). A Style-Based Generator Architecture for Generative Adversarial Networks. arXiv.","DOI":"10.1109\/CVPR.2019.00453"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., and Aila, T. (2020). Analyzing and Improving the Image Quality of StyleGAN. arXiv.","DOI":"10.1109\/CVPR42600.2020.00813"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Gal, R., Cohen, D., Bermano, A., and Cohen-Or, D. (2021). SWAGAN: A Style-based Wavelet-driven Generative Model. arXiv.","DOI":"10.1145\/3476576.3476707"},{"key":"ref_41","unstructured":"Karras, T., Aittala, M., Hellsten, J., Laine, S., Lehtinen, J., and Aila, T. (2020). Training Generative Adversarial Networks with Limited Data. arXiv."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Ballardini, A.L., Hern\u00e1ndez, A., and Sotelo, M.A. (2021). Model Guided Road Intersection Classification. arXiv.","DOI":"10.1109\/IV48863.2021.9575605"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_44","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Howard, A., Sandler, M., Chu, G., Chen, L.C., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., and Vasudevan, V. (2019, January 27\u201328). Searching for mobilenetv3. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00140"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016, January 27\u201330). Rethinking the inception architecture for computer vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.308"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Bellet, A., Habrard, A., and Sebban, M. (2014). A Survey on Metric Learning for Feature Vectors and Structured Data. arXiv.","DOI":"10.1007\/978-3-031-01572-4"},{"key":"ref_48","unstructured":"Musgrave, K., Belongie, S., and Lim, S.N. (2020). PyTorch Metric Learning. arXiv."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Yuan, T., Deng, W., Tang, J., Tang, Y., and Chen, B. (2019, January 16\u201320). Signal-to-noise ratio: A robust distance metric for deep metric learning. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00495"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Schroff, F., Kalenichenko, D., and Philbin, J. (2015, January 7\u201312). FaceNet: A unified embedding for face recognition and clustering. Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298682"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Musgrave, K., Belongie, S., and Lim, S.N. (2020). A metric learning reality check. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-030-58595-2_41"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Ballardini, A.L., Cattaneo, D., and Sorrenti, D.G. (2019, January 20\u201324). Visual Localization at Intersections with Digital Maps. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","DOI":"10.1109\/ICRA.2019.8794413"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Xu, H., and Zhang, J. (2020, January 14\u201319). AANet: Adaptive Aggregation Network for Efficient Stereo Matching. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00203"},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Hern\u00e1ndez, A., Woo, S., Corrales, H., Parra, I., Kim, E., Llorca, D.F., and Sotelo, M.A. (November, January 19). 3D-DEEP: 3-Dimensional Deep-learning based on elevation patterns for road scene interpretation. Proceedings of the 2020 IEEE Intelligent Vehicles Symposium (IV), Las Vegas, NV, USA.","DOI":"10.1109\/IV47402.2020.9304601"},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Cattaneo, D., Vaghi, M., Fontana, S., Ballardini, A.L., and Sorrenti, D.G. (2020, January 23\u201327). Global visual localization in LiDAR-maps through shared 2D-3D embedding space. Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA.","DOI":"10.1109\/ICRA40945.2020.9196859"},{"key":"ref_56","unstructured":"Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., and Lerer, A. (2021, May 21). Automatic differentiation in PyTorch. Advances in Neural Information Processing Systems, Autodiff Workshop. Available online: https:\/\/openreview.net\/forum?id=BJJsrmfCZ."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Carreira, J., and Zisserman, A. (2017, January 21\u201326). Quo vadis, action recognition? A new model and the kinetics dataset. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.502"},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Feichtenhofer, C. (2020, January 14\u201319). X3d: Expanding architectures for efficient video recognition. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00028"},{"key":"ref_59","unstructured":"Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., and Natsev, P. (2017). The kinetics human action video dataset. arXiv."},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Sigurdsson, G.A., Varol, G., Wang, X., Farhadi, A., Laptev, I., and Gupta, A. (2016). Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding. arXiv.","DOI":"10.1007\/978-3-319-46448-0_31"},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Damen, D., Doughty, H., Farinella, G.M., Fidler, S., Furnari, A., Kazakos, E., Moltisanti, D., Munro, J., Perrett, T., and Price, W. (2020). The EPIC-KITCHENS Dataset: Collection, Challenges and Baselines. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI).","DOI":"10.1109\/TPAMI.2020.2991965"},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Goyal, R., Kahou, S.E., Michalski, V., Materzynska, J., Westphal, S., Kim, H., Haenel, V., Fruend, I., Yianilos, P., and Mueller-Freitag, M. (2017, January 22\u201329). The \u201cSomething Something\u201d Video Database for Learning and Evaluating Visual Common Sense. Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.622"},{"key":"ref_63","unstructured":"Fan, H., Li, Y., Xiong, B., Lo, W.Y., and Feichtenhofer, C. (2021, May 21). PySlowFast. Available online: https:\/\/github.com\/facebookresearch\/slowfast."},{"key":"ref_64","unstructured":"Biewald, L. (2021, May 21). Experiment Tracking with Weights and Biases. 2020. Software. Available online: wandb.com."},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"Feichtenhofer, C., Fan, H., Malik, J., and He, K. (2019, January 27\u201328). SlowFast Networks for Video Recognition. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00630"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/18\/6269\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:02:01Z","timestamp":1760166121000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/18\/6269"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,18]]},"references-count":66,"journal-issue":{"issue":"18","published-online":{"date-parts":[[2021,9]]}},"alternative-id":["s21186269"],"URL":"https:\/\/doi.org\/10.3390\/s21186269","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2021,9,18]]}}}