{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T02:12:36Z","timestamp":1772676756457,"version":"3.50.1"},"reference-count":52,"publisher":"MDPI AG","issue":"18","license":[{"start":{"date-parts":[[2020,9,15]],"date-time":"2020-09-15T00:00:00Z","timestamp":1600128000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004663","name":"Ministry of Science and Technology, Taiwan","doi-asserted-by":"publisher","award":["MOST 108-3017-F-009-001"],"award-info":[{"award-number":["MOST 108-3017-F-009-001"]}],"id":[{"id":"10.13039\/501100004663","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004663","name":"Ministry of Science and Technology, Taiwan","doi-asserted-by":"publisher","award":["MOST 109-2634-F-009-017"],"award-info":[{"award-number":["MOST 109-2634-F-009-017"]}],"id":[{"id":"10.13039\/501100004663","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100005144","name":"Qualcomm","doi-asserted-by":"publisher","award":["408929"],"award-info":[{"award-number":["408929"]}],"id":[{"id":"10.13039\/100005144","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>This paper proposes a deep-learning model with task-specific bounding box regressors (TSBBRs) and conditional back-propagation mechanisms for detection of objects in motion for advanced driver assistance system (ADAS) applications. The proposed model separates the object detection networks for objects of different sizes and applies the proposed algorithm to achieve better detection results for both larger and tinier objects. For larger objects, a neural network with a larger visual receptive field is used to acquire information from larger areas. 
For the detection of tinier objects, a network with a smaller receptive field uses fine-grained features. A conditional back-propagation mechanism yields different types of TSBBRs to perform data-driven learning for the set criterion and learn the representation of different object sizes without degrading each other. The design of dual-path object bounding box regressors can simultaneously detect objects of dissimilar scales and aspect ratios. Only a single neural network inference is needed for each frame to support the detection of multiple object types, such as bicycles, motorbikes, cars, buses, trucks, and pedestrians, and to locate their exact positions. The proposed model was developed and implemented on different NVIDIA devices such as 1080 Ti, DRIVE-PX2, and Jetson TX-2, with processing performance of 67 frames per second (fps), 19.4 fps, and 8.9 fps, respectively, for video input of 448 \u00d7 448 resolution. The proposed model can detect objects as small as 13 \u00d7 13 pixels and achieves 86.54% accuracy on a publicly available Pascal Visual Object Class (VOC) car database and 82.4% mean average precision (mAP) on a large database of common real road scenes (iVS database).<\/jats:p>","DOI":"10.3390\/s20185269","type":"journal-article","created":{"date-parts":[[2020,9,15]],"date-time":"2020-09-15T10:24:09Z","timestamp":1600165449000},"page":"5269","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["A Deep-Learning Model with Task-Specific Bounding Box Regressors and Conditional Back-Propagation for Moving Object Detection in ADAS Applications"],"prefix":"10.3390","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9895-0060","authenticated-orcid":false,"given":"Guan-Ting","family":"Lin","sequence":"first","affiliation":[{"name":"Department of Electronics Engineering and Institute of Electronics, National Chiao Tung 
University, Hsinchu 30010, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9434-5899","authenticated-orcid":false,"given":"Vinay","family":"Malligere Shivanna","sequence":"additional","affiliation":[{"name":"Department of Electronics Engineering and Institute of Electronics, National Chiao Tung University, Hsinchu 30010, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0402-2621","authenticated-orcid":false,"given":"Jiun-In","family":"Guo","sequence":"additional","affiliation":[{"name":"Department of Electronics Engineering and Institute of Electronics, National Chiao Tung University, Hsinchu 30010, Taiwan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,9,15]]},"reference":[{"key":"ref_1","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G. (2012, January 3\u20136). Imagenet Classification with Deep Convolutional Neural Networks. Proceedings of the Advances in Neural Information Processing Systems (NIPS), Lake Tahoe, NV, USA."},{"key":"ref_2","unstructured":"Simonyan, K., and Zisserman, A. (2015, January 7\u20139). Very Deep Convolutional Networks for Large-Scale Image Recognition. Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA."},{"key":"ref_3","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep residual learning for image recognition. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_5","unstructured":"Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (July, January 26). Rethinking the inception architecture for computer vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"432","DOI":"10.1109\/TIP.2017.2762591","article-title":"Multi-Task Vehicle Detection with Region-of-Interest Voting","volume":"27","author":"Chu","year":"2018","journal-title":"IEEE Trans. Image Process."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1973","DOI":"10.1109\/TITS.2017.2740303","article-title":"Fast Automatic Vehicle Annotation for Urban Traffic Surveillance","volume":"19","author":"Zhou","year":"2018","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1627","DOI":"10.1109\/TPAMI.2009.167","article-title":"Object Detection with Discriminatively Trained Part-Based Models","volume":"32","author":"Felzenszwal","year":"2010","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"342","DOI":"10.1109\/TCYB.2013.2255271","article-title":"Occlusion Handling via Random Subspace Classifiers for Human Detection","volume":"44","author":"Amores","year":"2014","journal-title":"IEEE Trans. Cybern."},{"key":"ref_10","unstructured":"Zhang, S., Benenson, R., Omran, M., Hosang, J., and Schiele, B. (July, January 26). How Far Are We from Solving Pedestrian Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA."},{"key":"ref_11","unstructured":"Coley, G., Wesley, A., Reed, N., and Parry, I. (2018, July 12). Driver Reaction Times to Familiar but Unexpected Events. 
Available online: https:\/\/trl.co.uk\/sites\/default\/files\/PPR313_new.pdf."},{"key":"ref_12","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015, January 7\u201312). Faster R-CNN: Towards real-time object detection with region proposal networks. Proceedings of the Advances in Neural Information Processing Systems (NIPS), Montreal, QC, Canada."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Lin, T., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature Pyramid Networks for Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_14","unstructured":"Bell, S., Zitnick, C., Bala, K., and Girshick, R. (July, January 26). Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Kong, T., Sun, F., Yao, A., Liu, H., Lu, M., and Chen, Y. (2017, January 21\u201326). RON: Reverse Connection with Objectness Prior Networks for Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.557"},{"key":"ref_16","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (July, January 26). You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., and Reed, S.E. (2016, January 11\u201314). SSD: Single shot multibox detector. 
Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 21\u201326). YOLO9000: Better, Faster, Stronger. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2015, January 11\u201318). Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.123"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Ozuysal, M., Lepetit, V., and Fua, P. (2009, January 20\u201325). Pose estimation for category specific multiview object localization. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206633"},{"key":"ref_21","unstructured":"Villamizar, M., Grabner, H., Moreno-Noguer, F., Andrade-Cetto, J., Van Gool, L.J., and Sanfeliu, A. (September, January 29). Efficient 3D Object Detection using Multiple PoseSpecific Classifiers. Proceedings of the British Machine Vision Conference, University of Dundee, Dundee, Scotland."},{"key":"ref_22","unstructured":"Choi, J. (2006). Realtime On-Road Vehicle Detection with Optical Flows and Haar-Like Feature Detectors, University of Illinois at Urbana-Champaign. A Final Report of a Course CS543 (Computer Vision, Prof. Li FeiFei)."},{"key":"ref_23","unstructured":"Zheng, W., and Liang, L. (2009, January 20\u201325). Fast car detection using image strip features. 
Proceedings of the Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), Miami, FL, USA."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Leibe, B., and Schiele, B. (2003, January 9\u201311). Interleaved object categorization and segmentation. Proceedings of the British Machine Vision Conference, Norwich, UK.","DOI":"10.5244\/C.17.78"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1270","DOI":"10.1109\/TPAMI.2007.70772","article-title":"Multiscale categorical object recognition using contour fragments","volume":"30","author":"Shotton","year":"2008","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_26","unstructured":"Zheng, W., Song, S., Chang, H., and Chen, X. (2012, January 5\u20139). Grouping active contour fragments for object recognition. Proceedings of the Asian Conference on Computer Vision, Daejeon, Korea."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Lim, J.J., Zitnick, C.L., and Doll\u00e1r, P. (2013, January 25\u201327). Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.406"},{"key":"ref_28","unstructured":"Wu, B., and Nevatia, R. (2008, January 24\u201326). Optimizing discrimination efficiency tradeoff in integrating heterogeneous local features for object detection. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), Anchorage, AK, USA."},{"key":"ref_29","unstructured":"Mathieu, M., LeCun, Y., Fergus, R., Eigen, D., Sermanet, P., and Zhang, X. (2014, January 14\u201316). Over Feat: Integrated Recognition, Localization and Detection using Convolutional Networks. 
Proceedings of the International Conference on Learning Representations, Banff, AB, Canada."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2014, January 6\u201312). Spatial pyramid pooling in deep convolutional networks for visual Recognition. Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10578-9_23"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 11\u201318). Fast R-CNN. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"944","DOI":"10.1109\/TMM.2016.2642789","article-title":"Attentive Contexts for Object Detection","volume":"19","author":"Li","year":"2017","journal-title":"IEEE Trans. Multimed."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1899","DOI":"10.1109\/TMM.2015.2476660","article-title":"RGB-D object recognition via incorporating latent data structure and prior Knowledge","volume":"17","author":"Tang","year":"2015","journal-title":"IEEE Trans. Multimed."},{"key":"ref_35","unstructured":"Hong, S., Roh, B., Kim, K., Cheon, Y., and Park, M. (2016). Pvanet: Deep but lightweight neural networks for real-time object detection. arXiv."},{"key":"ref_36","unstructured":"Redmon, J., and Farhadi, A. (2018). YOLOv3: An Incremental Improvement. arXiv."},{"key":"ref_37","unstructured":"Fu, C.Y., Liu, W., Ranga, A., Tyagi, A., and Berg, A.C. (2017). DSSD: Deconvolutional Single Shot Detector. 
arXiv."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Lin, T., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2018). Focal Loss for Dense Object Detection. arXiv.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1007\/s11263-014-0733-5","article-title":"The pascal visual object classes challenge: A retrospective","volume":"111","author":"Everingham","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","article-title":"Imagenet large scale visual recognition challenge","volume":"115","author":"Russakovsky","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_41","unstructured":"Ioffe, S., and Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., and Darrell, T. (2014). Caffe: Convolutional architecture for fast feature embedding. arXiv.","DOI":"10.1145\/2647868.2654889"},{"key":"ref_43","unstructured":"Jonathan, H. (2018, July 12). mAP (mean Average Precision) for Object Detection. Available online: https:\/\/medium.com\/@jonathan_hui\/map-mean-average-precision-for-object-detection-45c121a31173."},{"key":"ref_44","unstructured":"Salton, G., and McGill, M.J. (1986). Introduction to Modern Information Retrieval, McGraw-Hill."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"640","DOI":"10.1109\/TPAMI.2016.2572683","article-title":"Fully Convolutional Networks for Semantic Segmentation","volume":"39","author":"Shelhamer","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). 
Fully convolutional networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_47","unstructured":"Srivastava, R.K., Masci, J., Gomez, F., and Schmidhuber, J. (2015, January 7\u20139). Understanding Locally Competitive Networks. Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Xie, S., Girshick, R., Doll\u00e1r, P., Tu, Z., and He, K. (2017, January 21\u201326). Aggregated Residual Transformations for Deep Neural Networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.634"},{"key":"ref_49","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.C. (2018). MobileNetV2: Inverted Residuals and Linear Bottlenecks. arXiv.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_51","unstructured":"Mechanical Simulation Corporation (2018, March 12). CarSim\u00ae Mechanical Simulation. Available online: https:\/\/www.carsim.com\/products\/carsim\/."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Yang, L., Luo, P., Loy, C.C., and Tang, X. (2015, January 7\u201312). A large-scale car dataset for fine-grained categorization and verification. 
Proceedings of the IEEE International Conference of Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299023"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/18\/5269\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:10:10Z","timestamp":1760177410000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/18\/5269"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9,15]]},"references-count":52,"journal-issue":{"issue":"18","published-online":{"date-parts":[[2020,9]]}},"alternative-id":["s20185269"],"URL":"https:\/\/doi.org\/10.3390\/s20185269","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,9,15]]}}}