{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T01:04:20Z","timestamp":1760231060379,"version":"build-2065373602"},"reference-count":30,"publisher":"MDPI AG","issue":"17","license":[{"start":{"date-parts":[[2022,8,23]],"date-time":"2022-08-23T00:00:00Z","timestamp":1661212800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Korean Government (MSIT)","award":["2021-0-00087","4199990214394"],"award-info":[{"award-number":["2021-0-00087","4199990214394"]}]},{"name":"Ministry of Education, School of Computer Science and Engineering, Kyungpook National University, Korea","award":["2021-0-00087","4199990214394"],"award-info":[{"award-number":["2021-0-00087","4199990214394"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>In this paper, we propose an object-cooperated decision method for efficient ternary tree (TT) partitioning that reduces the encoding complexity of versatile video coding (VVC). In most previous studies, the VVC complexity was reduced using decision schemes based on the encoding context, which do not apply object detection models. We assume that high-level objects are important for deciding whether complex TT partitioning is required because they can provide hints on the characteristics of a video. Herein, we apply an object detection model that discovers and extracts the high-level object features\u2014the number and ratio of objects from frames in a video sequence. Using the extracted features, we propose machine learning (ML)-based classifiers for each TT-split direction to efficiently reduce the encoding complexity of VVC and decide whether the TT-split process can be skipped in the vertical or horizontal direction. The TT-split decision of classifiers is formulated as a binary classification problem. 
Experimental results show that the proposed method more effectively decreases the encoding complexity of VVC than a state-of-the-art model based on ML.<\/jats:p>","DOI":"10.3390\/s22176328","type":"journal-article","created":{"date-parts":[[2022,8,24]],"date-time":"2022-08-24T02:55:34Z","timestamp":1661309734000},"page":"6328","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Object-Cooperated Ternary Tree Partitioning Decision Method for Versatile Video Coding"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0258-3196","authenticated-orcid":false,"given":"Sujin","family":"Lee","sequence":"first","affiliation":[{"name":"School of Computer Science and Engineering, Kyungpook National University, Daegu 41566, Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sang-hyo","family":"Park","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Kyungpook National University, Daegu 41566, Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dongsan","family":"Jun","sequence":"additional","affiliation":[{"name":"Department of Computer Engineering, Dong-A University, Busan 49315, Korea"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2022,8,23]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_2","unstructured":"Pereira, F., Burges, C., Bottou, L., and Weinberger, K. (2012). ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Collobert, R., and Weston, J. (2008, January 5\u20139). 
A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning. Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland.","DOI":"10.1145\/1390156.1390177"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1109\/MSP.2012.2205597","article-title":"Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups","volume":"29","author":"Hinton","year":"2012","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Tissier, A., Hamidouche, W., Mdalsi, S.B.D., Vanne, J., Galpin, F., and Menard, D. (2021). Machine Learning based Efficient QT-MTT Partitioning Scheme for VVC Intra Encoders. arXiv.","DOI":"10.1109\/ICIP46576.2022.9898052"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"3736","DOI":"10.1109\/TCSVT.2021.3101953","article-title":"Overview of the Versatile Video Coding (VVC) Standard and its Applications","volume":"31","author":"Bross","year":"2021","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1649","DOI":"10.1109\/TCSVT.2012.2221191","article-title":"Overview of the High Efficiency Video Coding (HEVC) Standard","volume":"22","author":"Sullivan","year":"2012","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"172597","DOI":"10.1109\/ACCESS.2019.2956196","article-title":"Context-Based Ternary Tree Decision Method in Versatile Video Coding for Fast Intra Coding","volume":"7","author":"Park","year":"2019","journal-title":"IEEE Access"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Lee, S., and Park, S.H. (2022, January 6\u20139). Study on Machine Learning Models for Tree Partitioning Method of Versatile Video Coding. 
Proceedings of the 2022 International Conference on Electronics, Information, and Communication (ICEIC), Jeju, Korea.","DOI":"10.1109\/ICEIC54506.2022.9748428"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1668","DOI":"10.1109\/TCSVT.2019.2904198","article-title":"Low-Complexity CTU Partition Structure Decision and Fast Intra Mode Decision for Versatile Video Coding","volume":"30","author":"Yang","year":"2020","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"4388","DOI":"10.1109\/TMM.2020.3042062","article-title":"Fast Multi-Type Tree Partitioning for Versatile Video Coding Using a Lightweight Neural Network","volume":"23","author":"Park","year":"2021","journal-title":"IEEE Trans. Multimed."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"119289","DOI":"10.1109\/ACCESS.2021.3108238","article-title":"Fast CU Decision-Making Algorithm Based on DenseNet Network for VVC","volume":"9","author":"Zhang","year":"2021","journal-title":"IEEE Access"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"5377","DOI":"10.1109\/TIP.2021.3083447","article-title":"DeepQTMT: A Deep Learning Approach for Fast QTMT-Based CU Partition of Intra-Mode VVC","volume":"30","author":"Li","year":"2021","journal-title":"IEEE Trans. Image Process."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"5638","DOI":"10.1109\/TCSVT.2022.3146061","article-title":"HG-FCN: Hierarchical grid fully convolutional network for fast VVC intra coding","volume":"32","author":"Wu","year":"2022","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_15","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201325). Histograms of oriented gradients for human detection. 
Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201905), San Diego, CA, USA."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2015). You Only Look Once: Unified, Real-Time Object Detection. arXiv.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016). SSD: Single Shot MultiBox Detector. European Conference on Computer Vision, Springer International Publishing.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017). Focal Loss for Dense Object Detection. arXiv.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_19","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017). Mask R-CNN. arXiv.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Cai, Z., and Vasconcelos, N. (2017). Cascade R-CNN: Delving into High Quality Object Detection. arXiv.","DOI":"10.1109\/CVPR.2018.00644"},{"key":"ref_22","unstructured":"(2022, May 03). YOLOv5. Available online: https:\/\/github.com\/ultralytics\/yolov5\/."},{"key":"ref_23","unstructured":"(2022, May 03). COCO 2017 Dataset. Available online: https:\/\/cocodataset.org\/#overview."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Xu, X., Liu, S., and Li, Z. (2021). Tencent Video Dataset (TVD): A Video Dataset for Learning-based Visual Data Compression and Analysis. arXiv.","DOI":"10.1109\/VCIP53242.2021.9675343"},{"key":"ref_25","unstructured":"Boyce, J., Suehring, K., Li, X., and Seregin, V. 
(2022, May 03). JVET-J1010: JVET Common test Conditions and Software Reference Configurations. Available online: https:\/\/jvet.hhi.fraunhofer.de\/."},{"key":"ref_26","first-page":"120","article-title":"The OpenCV Library","volume":"25","author":"Bradski","year":"2000","journal-title":"Dr. Dobb\u2019s J. Softw. Tools Prof. Program."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Zoph, B., Cubuk, E.D., Ghiasi, G., Lin, T.Y., Shlens, J., and Le, Q.V. (2019). Learning Data Augmentation Strategies for Object Detection. arXiv.","DOI":"10.1109\/CVPR.2019.00020"},{"key":"ref_28","unstructured":"Adams, K. (2022, May 03). Tutorial The Gini Impurity Index and What It Means and How to Calculate It. Available online: https:\/\/www.researchgate.net\/publication\/327110793_Tutorial_The_Gini_Impurity_index_and_what_it_means_and_how_to_calculate_it."},{"key":"ref_29","unstructured":"Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., and Devin, M. (2022, May 03). TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Available online: https:\/\/www.tensorflow.org\/."},{"key":"ref_30","first-page":"2825","article-title":"Scikit-learn: Machine Learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. 
Res."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/17\/6328\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:13:57Z","timestamp":1760141637000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/17\/6328"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,23]]},"references-count":30,"journal-issue":{"issue":"17","published-online":{"date-parts":[[2022,9]]}},"alternative-id":["s22176328"],"URL":"https:\/\/doi.org\/10.3390\/s22176328","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2022,8,23]]}}}