{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,10]],"date-time":"2025-12-10T08:49:03Z","timestamp":1765356543874,"version":"build-2065373602"},"reference-count":38,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2019,5,2]],"date-time":"2019-05-02T00:00:00Z","timestamp":1556755200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>Visual quality and algorithm efficiency are two main interests in video frame interpolation. We propose a hybrid task-based convolutional neural network for fast and accurate frame interpolation of 4K videos. The proposed method synthesizes low-resolution frames, then reconstructs high-resolution frames in a coarse-to-fine fashion. We also propose edge loss, to preserve high-frequency information and make the synthesized frames look sharper. Experimental results show that the proposed method achieves state-of-the-art performance and performs 2.69x faster than the existing methods that are operable for 4K videos, while maintaining comparable visual and quantitative quality.<\/jats:p>","DOI":"10.3390\/sym11050619","type":"journal-article","created":{"date-parts":[[2019,5,7]],"date-time":"2019-05-07T03:15:46Z","timestamp":1557198946000},"page":"619","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":16,"title":["A Fast 4K Video Frame Interpolation Using a Hybrid Task-Based Convolutional Neural Network"],"prefix":"10.3390","volume":"11","author":[{"given":"Ha-Eun","family":"Ahn","sequence":"first","affiliation":[{"name":"Department of Electronic Engineering, Kwangwoon University, Seoul 01897, Korea"},{"name":"Korea Electronics Technology Institute, Sungnam 13509, Korea"}]},{"given":"Jinwoo","family":"Jeong","sequence":"additional","affiliation":[{"name":"Korea Electronics Technology Institute, Sungnam 13509, Korea"}]},{"given":"Je Woo","family":"Kim","sequence":"additional","affiliation":[{"name":"Korea Electronics Technology Institute, Sungnam 13509, Korea"}]}],"member":"1968","published-online":{"date-parts":[[2019,5,2]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Werlberger, M., Pock, T., Unger, M., and Bischof, H. (2011). Optical flow guided TV-L 1 video interpolation and restoration. International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition, Springer.","DOI":"10.1007\/978-3-642-23094-3_20"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1235","DOI":"10.1109\/TCSVT.2013.2242631","article-title":"Multi-level video frame interpolation: Exploiting the interaction among different levels","volume":"23","author":"Yu","year":"2013","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Brox, T., Bruhn, A., Papenberg, N., and Weickert, J. (2004). High accuracy optical flow estimation based on a theory for warping. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-540-24673-2_3"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., Van Der Smagt, P., Cremers, D., and Brox, T. (2015, January 7\u201313). Flownet: Learning optical flow with convolutional networks. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.316"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., and Brox, T. (2017, January 21\u201326). Flownet 2.0: Evolution of optical flow estimation with deep networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.179"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Ranjan, A., and Black, M.J. (2017, January 21\u201326). Optical flow estimation using a spatial pyramid network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.291"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Ren, Z., Yan, J., Ni, B., Liu, B., Yang, X., and Zha, H. (2017, January 12). Unsupervised deep learning for optical flow estimation. Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA.","DOI":"10.1609\/aaai.v31i1.10723"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Sun, D., Yang, X., Liu, M.Y., and Kautz, J. (2018, January 18\u201322). Pwc-net: Cnns for optical flow using pyramid, warping, and cost volume. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00931"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Long, G., Kneip, L., Alvarez, J.M., Li, H., Zhang, X., and Yu, Q. (2016). Learning image matching by simply watching video. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46466-4_26"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Liu, Z., Yeh, R.A., Tang, X., Liu, Y., and Agarwala, A. (2017, January 22\u201329). Video frame synthesis using deep voxel flow. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.478"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Jiang, H., Sun, D., Jampani, V., Yang, M.H., Learned-Miller, E., and Kautz, J. (2018, January 18\u201322). Super slomo: High quality estimation of multiple intermediate frames for video interpolation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00938"},{"key":"ref_12","unstructured":"Liu, Y.L., Liao, Y.T., Lin, Y.Y., and Chuang, Y.Y. (February, January 27). Deep Video Frame Interpolation using Cyclic Frame Generation. Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Niklaus, S., Mai, L., and Liu, F. (2017, January 21\u201326). Video frame interpolation via adaptive convolution. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.244"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Niklaus, S., Mai, L., and Liu, F. (2017, January 22\u201329). Video frame interpolation via adaptive separable convolution. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.37"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Niklaus, S., and Liu, F. (2018, January 18\u201322). Context-aware synthesis for video frame interpolation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00183"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Meyer, S., Wang, O., Zimmer, H., Grosse, M., and Sorkine-Hornung, A. (2015, January 7\u201312). Phase-based frame interpolation for video. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298747"},{"key":"ref_17","unstructured":"Mathieu, M., Couprie, C., and LeCun, Y. (2015). Deep multi-scale video prediction beyond mean square error. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Kim, J., Kwon Lee, J., and Mu Lee, K. (2016, January 1). Accurate image super-resolution using very deep convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.182"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Dong, C., Loy, C.C., He, K., and Tang, X. (2014). Learning a deep convolutional network for image super-resolution. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-10593-2_13"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Lim, B., Son, S., Kim, H., Nah, S., and Mu Lee, K. (2017, January 21\u201326). Enhanced deep residual networks for single image super-resolution. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.151"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Lai, W.S., Huang, J.B., Ahuja, N., and Yang, M.H. (2017, January 21\u201326). Deep laplacian pyramid networks for fast and accurate super-resolution. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.618"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_23","unstructured":"Clevert, D.A., Unterthiner, T., and Hochreiter, S. (2015). Fast and accurate deep network learning by exponential linear units (elus). arXiv."},{"key":"ref_24","unstructured":"Nair, V., and Hinton, G.E. (2010, January 21\u201324). Rectified linear units improve restricted boltzmann machines. Proceedings of the 27th International Conference on Machine Learning (ICML-10), Haifa, Israel."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015). U-net: Convolutional networks for biomedical image segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_26","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Liao, R., Tao, X., Li, R., Ma, Z., and Jia, J. (2015, January 11\u201318). Video super-resolution via deep draft-ensemble learning. Proceedings of the IEEE International Conference on Computer Vision, Araucano Park, Las Condes, Chile.","DOI":"10.1109\/ICCV.2015.68"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Caballero, J., Ledig, C., Aitken, A., Acosta, A., Totz, J., Wang, Z., and Shi, W. (2017, January 21\u201326). Real-time video super-resolution with spatio-temporal networks and motion compensation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.304"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Johnson, J., Alahi, A., and Fei-Fei, L. (2016). Perceptual losses for real-time style transfer and super-resolution. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46475-6_43"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Fei-Fei, L. (2009, January 20\u201325). Imagenet: A large-scale hierarchical image database. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Canny, J. (1987). A computational approach to edge detection. Readings in Computer Vision, Morgan Kaufmann.","DOI":"10.1016\/B978-0-08-051581-6.50024-6"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Xie, S., and Tu, Z. (2015, January 7\u201313). Holistically-nested edge detection. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.164"},{"key":"ref_33","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Tao, M., Bai, J., Kohli, P., and Paris, S. (2012). Simple Flow: A Non-iterative, Sublinear Optical Flow Algorithm. Computer Graphics Forum, Blackwell Publishing Ltd.","DOI":"10.1111\/j.1467-8659.2012.03013.x"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s11263-010-0390-2","article-title":"A database and evaluation methodology for optical flow","volume":"92","author":"Baker","year":"2011","journal-title":"Int. J. Comput. Vis."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Le Feuvre, J., Thiesse, J.M., Parmentier, M., Raulet, M., and Daguet, C. (2014, January 19). Ultra high definition HEVC DASH data set. Proceedings of the 5th ACM Multimedia Systems Conference, Singapore.","DOI":"10.1145\/2557642.2563672"},{"key":"ref_37","unstructured":"Song, L., Tang, X., Zhang, W., Yang, X., and Xia, P. (2013, January 3). The SJTU 4K video sequence dataset. Proceedings of the IEEE 2013 Fifth International Workshop on Quality of Multimedia Experience (QoMEX), Klagenfurt am W\u00f6rthersee, Austria."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"600","DOI":"10.1109\/TIP.2003.819861","article-title":"Image quality assessment: From error visibility to structural similarity","volume":"13","author":"Wang","year":"2004","journal-title":"IEEE Trans. Image Process."}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/11\/5\/619\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:48:51Z","timestamp":1760186931000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/11\/5\/619"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,5,2]]},"references-count":38,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2019,5]]}},"alternative-id":["sym11050619"],"URL":"https:\/\/doi.org\/10.3390\/sym11050619","relation":{},"ISSN":["2073-8994"],"issn-type":[{"type":"electronic","value":"2073-8994"}],"subject":[],"published":{"date-parts":[[2019,5,2]]}}}