{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T04:01:54Z","timestamp":1760241714002,"version":"build-2065373602"},"reference-count":33,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2018,8,17]],"date-time":"2018-08-17T00:00:00Z","timestamp":1534464000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],"abstract":"<jats:p>Convolutional neural networks (CNN for short) have made great progress in face detection. They mostly take computation intensive networks as the backbone in order to obtain high precision, and they cannot get a good detection speed without the support of high-performance GPUs (Graphics Processing Units). This limits CNN-based face detection algorithms in real applications, especially in some speed dependent ones. To alleviate this problem, we propose a lightweight face detector in this paper, which takes a fast residual network as backbone. Our method can run fast even on cheap and ordinary GPUs. To guarantee its detection precision, multi-scale features and multi-context are fully exploited in efficient ways. Specifically, feature fusion is used to obtain semantic strongly multi-scale features firstly. Then multi-context including both local and global context is added to these multi-scale features without extra computational burden. The local context is added through a depthwise separable convolution based approach, and the global context by a simple global average pooling way. Experimental results show that our method can run at about 110 fps on VGA (Video Graphics Array)-resolution images, while still maintaining competitive precision on WIDER FACE and FDDB (Face Detection Data Set and Benchmark) datasets as compared with its state-of-the-art counterparts.<\/jats:p>","DOI":"10.3390\/fi10080080","type":"journal-article","created":{"date-parts":[[2018,8,17]],"date-time":"2018-08-17T10:54:25Z","timestamp":1534503265000},"page":"80","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["A Fast and Lightweight Method with Feature Fusion and Multi-Context for Face Detection"],"prefix":"10.3390","volume":"10","author":[{"given":"Lei","family":"Zhang","sequence":"first","affiliation":[{"name":"School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China"}]},{"given":"Xiaoli","family":"Zhi","sequence":"additional","affiliation":[{"name":"School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China"},{"name":"Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai 200444, China"}]}],"member":"1968","published-online":{"date-parts":[[2018,8,17]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Chen, D., Ren, S., Wei, Y., Cao, X., and Sun, J. (2014). Joint cascade face detection and alignment. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-10599-4_8"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Zhang, S., Zhu, X., Lei, Z., Shi, H., Wang, X., and Li, S.Z. (2017, January 22\u201329). S3fd: Single shot scale invariant face detector. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Venice, Italy.","DOI":"10.1109\/ICCV.2017.30"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 13\u201316). Fast R-CNN. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 24\u201327). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Hu, P., and Ramanan, D. (2017, January 21\u201326). Finding tiny faces. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.166"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., and Berg, A.C. (2016). SSD: Single shot multibox detector. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_7","unstructured":"Simonyan, K., and Zisserman, A. (arXiv, 2014). Very deep convolutional networks for large-scale image recognition, arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (arXiv, 2016). Deep residual learning for image recognition, arXiv.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"640","DOI":"10.1109\/TCSVT.2016.2630731","article-title":"Incremental Learning with Saliency Map for Moving Object Detection","volume":"28","author":"Pang","year":"2018","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Chen, B.-H., Shi, L.-F., and Ke, X. (2018). A Robust Moving Object Detection in Multi-Scenario Big Data for Video Surveillance. IEEE Trans. Circuits Syst. Video Technol.","DOI":"10.1109\/TCSVT.2018.2828606"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (arXiv, 2017). Yolo9000: Better, faster, stronger, arXiv.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 8\u201310). Fully convolutional networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Bell, S., Zitnick, C.L., Bala, K., and Girshick, R. (2016, January 27\u201330). Inside-outside net: Detecting objects in context with skip pooling and recurrent neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.314"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Zagoruyko, S., Lerer, A., Lin, T.-Y., Pinheiro, P.O., Gross, S., Chintala, S., and Dollar, P. (arXiv, 2016). A multipath network for object detection, arXiv.","DOI":"10.5244\/C.30.15"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Zhu, C., Zheng, Y., Luu, K., and Savvides, M. (arXiv, 2017). CMS-RCNN: Contextual multi-scale region-based CNN for unconstrained face detection, arXiv.","DOI":"10.1007\/978-3-319-61657-5_3"},{"key":"ref_16","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (arXiv, 2017). MobileNets: Efficient convolutional neural networks for mobile vision applications, arXiv."},{"key":"ref_17","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems, IEEE Computer Society."},{"key":"ref_18","unstructured":"Zhu, Q., Yeh, M.C., Cheng, K.T., and Avidan, S. (2006, January 17\u201322). Fast human detection using a cascade of histograms of oriented gradients. Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New York, NY, USA."},{"key":"ref_19","unstructured":"Viola, P., and Jones, M. (2001, January 8\u201314). Rapid object detection using a boosted cascade of simple features. Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2001, Kauai, HI, USA."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Li, H., Lin, Z., Shen, X., Brandt, J., and Hua, G. (2015, January 7\u201312). A convolutional neural network cascade for face detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299170"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Yang, S., Luo, P., Loy, C.-C., and Tang, X. (2015, January 7\u201313). From facial parts responses to face detection: A deep learning approach. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.419"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Najibi, M., Samangouei, P., Chellappa, R., and Davis, L.S. (2017, January 21\u201326). SSH: Single stage headless face detector. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/ICCV.2017.522"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Dollar, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_24","unstructured":"Liu, W., Rabinovich, A., and Berg, A.C. (arXiv, 2015). ParseNet: Looking wider to see better, arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L., Li, K., and Li, F.-F. (2009, January 20\u201325). ImageNet: A large-scale hierarchical image database. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Pinheiro, P.O., Lin, T.-Y., Collobert, R., and Dollar, P. (arXiv, 2016). Learning to refine object segments, arXiv.","DOI":"10.1007\/978-3-319-46448-0_5"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Yang, S., Luo, P., Loy, C.-C., and Tang, X. (2016, January 27\u201330). Wider face: A face detection benchmark. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.596"},{"key":"ref_29","unstructured":"Jain, V., and Learned-Miller, E.G. (2010). FDDB: A Benchmark for Face Detection in Unconstrained Settings, University of Massachusetts. UMass Amherst Technical Report."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1016\/j.bdr.2017.06.002","article-title":"Fast deep convolutional face detection in the wild exploiting hard sample mining","volume":"11","author":"Triantafyllidou","year":"2018","journal-title":"Big Data Res."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Cai, Z., Fan, Q., Feris, R.S., and Vasconcelos, N. (2016). A unified multi-scale deep convolutional neural network for fast object detection. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46493-0_22"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1499","DOI":"10.1109\/LSP.2016.2603342","article-title":"Joint face detection and alignment using multi-task cascaded convolutional networks","volume":"23","author":"Zhang","year":"2016","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1016\/j.neucom.2018.03.030","article-title":"Face detection using deep learning: An improved faster RCNN approach","volume":"299","author":"Sun","year":"2018","journal-title":"Neurocomputing"}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/10\/8\/80\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T15:19:22Z","timestamp":1760195962000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/10\/8\/80"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,8,17]]},"references-count":33,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2018,8]]}},"alternative-id":["fi10080080"],"URL":"https:\/\/doi.org\/10.3390\/fi10080080","relation":{},"ISSN":["1999-5903"],"issn-type":[{"type":"electronic","value":"1999-5903"}],"subject":[],"published":{"date-parts":[[2018,8,17]]}}}