{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,17]],"date-time":"2026-01-17T18:00:58Z","timestamp":1768672858761,"version":"3.49.0"},"reference-count":31,"publisher":"MDPI AG","issue":"16","license":[{"start":{"date-parts":[[2021,8,9]],"date-time":"2021-08-09T00:00:00Z","timestamp":1628467200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100003725","name":"National Research Foundation of Korea","doi-asserted-by":"publisher","award":["NRF-2019R1A2C4070681"],"award-info":[{"award-number":["NRF-2019R1A2C4070681"]}],"id":[{"id":"10.13039\/501100003725","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>For accurate and fast detection of facial landmarks, we propose a new facial landmark detection method. Previous facial landmark detection models generally perform a face detection step before landmark detection. This greatly affects landmark detection performance depending on which face detection model is used. Therefore, we propose a model that can simultaneously detect a face region and a landmark without performing the face detection step before landmark detection. The proposed single-shot detection model is based on the framework of YOLOv3, a one-stage object detection method, and the loss function and structure are altered to learn faces and landmarks at the same time. In addition, EfficientNet-B0 was utilized as the backbone network to increase processing speed and accuracy. The learned database used 300W-LP with 64 facial landmarks. The average normalized error of the proposed model was 2.32 pixels. The processing time per frame was about 15 milliseconds, and the average precision of face detection was about 99%. 
As a result of the evaluation, it was confirmed that the single-shot detection model has better performance and speed than the previous methods. In addition, as a result of using the COFW database, which has 29 landmarks instead of 64 to verify the proposed method, the average normalization error was 2.56 pixels, which was also confirmed to show promising performance.<\/jats:p>","DOI":"10.3390\/s21165360","type":"journal-article","created":{"date-parts":[[2021,8,9]],"date-time":"2021-08-09T09:03:53Z","timestamp":1628499833000},"page":"5360","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Detecting Facial Region and Landmarks at Once via Deep Network"],"prefix":"10.3390","volume":"21","author":[{"given":"Taehyung","family":"Kim","sequence":"first","affiliation":[{"name":"Department of AI & Informatics, Graduate School, Sangmyung University, Seoul 03016, Korea"}]},{"given":"Jiwon","family":"Mok","sequence":"additional","affiliation":[{"name":"Department of AI & Informatics, Graduate School, Sangmyung University, Seoul 03016, Korea"}]},{"given":"Euichul","family":"Lee","sequence":"additional","affiliation":[{"name":"Department of Human-Centered Artificial Intelligence, Sangmyung University, Seoul 03016, Korea"}]}],"member":"1968","published-online":{"date-parts":[[2021,8,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Kim, T., Mok, J.W., and Lee, E.C. (2020, January 24\u201326). 1-Stage Face Landmark Detection using Deep Learning. Proceedings of the 12th International Conference on Intelligent Human Computer Interaction, LNCS, Daegu, Korea.","DOI":"10.1007\/978-3-030-68452-5_25"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"2037","DOI":"10.1109\/TPAMI.2006.244","article-title":"Face description with local binary patterns: Application to face recognition","volume":"28","author":"Ahonen","year":"2006","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Sun, Y., Wang, X., and Tang, X. (2014, January 23\u201328). Deep learning face representation from predicting 10,000 classes. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.244"},{"key":"ref_4","first-page":"394","article-title":"3D face reconstruction from a single image using a single reference face shape","volume":"33","author":"Basri","year":"2010","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Lee, W.H., and Lee, R. (2016, January 18). Implicit sensor-based authentication of smartphone users with smartwatch. Proceedings of the Hardware and Architectural Support for Security and Privacy, Seoul, Korea.","DOI":"10.1145\/2948618.2948627"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Wang, T., Song, Z., Ma, J., Xiong, Y., and Jie, Y. (2013, January 20\u201322). An anti-fake iris authentication mechanism for smart glasses. Proceedings of the 2013 3rd International Conference on Consumer Electronics, Communications and Networks, Xianning, China.","DOI":"10.1109\/CECNet.2013.6703278"},{"key":"ref_7","first-page":"85","article-title":"Mobile augmented reality research trends and prospects","volume":"11","author":"Hwang","year":"2013","journal-title":"Korea Inst. Inf. Technol. Mag."},{"key":"ref_8","first-page":"127","article-title":"Development of virtual makeup tool based on mobile augmented reality","volume":"26","author":"Song","year":"2021","journal-title":"J. Korea Soc. Comput. Inf."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1349","DOI":"10.9728\/dcs.2018.19.7.1349","article-title":"Implementation of multi-channel network platform based augmented reality facial emotion sticker using deep learning","volume":"19","author":"Kim","year":"2018","journal-title":"J. Digit. 
Contents Soc."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Firmanda, M.R., Dewantara, B.S.B., and Sigit, R. (2020, January 17\u201319). Implementation of illumination invariant face recognition for accessing user record in healthcare Kiosk. Proceedings of the 2020 International Electronics Symposium (IES), Delft, The Netherlands.","DOI":"10.1109\/IES50839.2020.9231644"},{"key":"ref_11","first-page":"486","article-title":"Kiosk system development using eye tracking and face-recognition technology","volume":"27","author":"Woongsup","year":"2020","journal-title":"Korea Inf. Process. Soc."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"681","DOI":"10.1109\/34.927467","article-title":"Active appearance models","volume":"23","author":"Cootes","year":"2001","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_13","unstructured":"Zhu, X., and Ramanan, D. (2012, January 16\u201321). Face detection, pose estimation, and landmark localization in the wild. Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Asthana, A., Zafeiriou, S., Cheng, S., and Pantic, M. (2013, January 23\u201328). Robust discriminative response map fitting with constrained local models. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.442"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Tzimiropoulos, G., and Pantic, M. (2014, January 23\u201328). Gauss-newton deformable part models for face alignment in-the-wild. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.239"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"2930","DOI":"10.1109\/TPAMI.2013.23","article-title":"Localizing parts of faces using a consensus of exemplars","volume":"35","author":"Belhumeur","year":"2013","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Burgos-Artizzu, X.P., Perona, P., and Dollar, P. (2013, January 1\u20138). Robust face landmark estimation under occlusion. Proceedings of the IEEE International Conference on Computer Vision, ICCV, Sydney, Australia.","DOI":"10.1109\/ICCV.2013.191"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Xiong, X., and De la Torre, F. (2013, January 23\u201328). Supervised descent method and its applications to face alignment. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.75"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Zhang, J., Shan, S., Kan, M., and Chen, X. (2014). Coarse-to-fine auto-encoder networks (CFAN) for real-time face alignment. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-10605-2_1"},{"key":"ref_20","unstructured":"Zhu, S., Li, C., Change Loy, C., and Tang, X. (2015, January 7\u201312). Face alignment by coarse-to-fine shape searching. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR, Boston, MA, USA."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.imavis.2016.01.002","article-title":"300 faces in-the-wild challenge: Database and results","volume":"47","author":"Sagonas","year":"2016","journal-title":"Image Vis. Comput."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Luo, P., Loy, C.C., and Tang, X. (2014). 
Facial landmark detection by deep multi-task learning. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-10599-4_7"},{"key":"ref_23","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016). Ssd: Single shot multibox detector. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR, Honolulu, Hawaii.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_26","unstructured":"Tan, M., and Le, Q. (2019, January 9\u201315). EfficientNet: Rethinking model scaling for convolutional neural networks. Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA."},{"key":"ref_27","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., and Adam, H. (2017). Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Zhu, X., Lei, Z., Liu, X., Shi, H., and Li, S.Z. (2016, January 27\u201330). Face alignment across large poses: A 3d solution. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.23"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Le, V., Brandt, J., Lin, Z., Bourdev, L., and Huang, T.S. (2012). Interactive facial feature localization. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-642-33712-3_49"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1007\/s11263-018-1097-z","article-title":"Facial landmark detection: A literature survey","volume":"127","author":"Wu","year":"2019","journal-title":"Int. J. Comput. Vis."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/16\/5360\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T06:42:49Z","timestamp":1760164969000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/16\/5360"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,9]]},"references-count":31,"journal-issue":{"issue":"16","published-online":{"date-parts":[[2021,8]]}},"alternative-id":["s21165360"],"URL":"https:\/\/doi.org\/10.3390\/s21165360","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,8,9]]}}}