{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,4]],"date-time":"2026-06-04T15:53:23Z","timestamp":1780588403219,"version":"3.54.1"},"reference-count":34,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2020,4,12]],"date-time":"2020-04-12T00:00:00Z","timestamp":1586649600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>For the development of intelligent transportation systems, if real-time information on the number of people on buses can be obtained, it will not only help transport operators to schedule buses but also improve the convenience for passengers to schedule their travel times accordingly. This study proposes a method for estimating the number of passengers on a bus. The method is based on deep learning to estimate passenger occupancy in different scenarios. Two deep learning methods are used to accomplish this: the first is a convolutional autoencoder, mainly used to extract features from crowds of passengers and to determine the number of people in a crowd; the second is the you only look once version 3 architecture, mainly for detecting the area in which head features are clearer on a bus. The results obtained by the two methods are summed to calculate the current passenger occupancy rate of the bus. To demonstrate the algorithmic performance, experiments for estimating the number of passengers at different bus times and bus stops were performed. 
The results indicate that the proposed system performs better than some existing methods.<\/jats:p>","DOI":"10.3390\/s20082178","type":"journal-article","created":{"date-parts":[[2020,4,13]],"date-time":"2020-04-13T10:41:52Z","timestamp":1586774512000},"page":"2178","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":39,"title":["Estimation of the Number of Passengers in a Bus Using Deep Learning"],"prefix":"10.3390","volume":"20","author":[{"given":"Ya-Wen","family":"Hsu","sequence":"first","affiliation":[{"name":"Department of Mechanical and Electro-Mechanical Engineering, National Sun Yat-sen University, Kaohsiung 804201, Taiwan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yen-Wei","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Mechanical and Electro-Mechanical Engineering, National Sun Yat-sen University, Kaohsiung 804201, Taiwan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jau-Woei","family":"Perng","sequence":"additional","affiliation":[{"name":"Department of Mechanical and Electro-Mechanical Engineering, National Sun Yat-sen University, Kaohsiung 804201, Taiwan"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2020,4,12]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Luo, Y., Tan, J., Tian, X., and Xiang, H. (2013, January 28\u201330). A device for counting the passenger flow is introduced. Proceedings of the IEEE International Conference on Vehicular Electronics and Safety, Dongguan, China.","DOI":"10.1109\/ICVES.2013.6619593"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"748","DOI":"10.1109\/TITS.2010.2048429","article-title":"Performance evaluation of UHF RFID technologies for real-time passenger recognition in intelligent public transportation systems","volume":"11","author":"Oberli","year":"2010","journal-title":"IEEE Trans. 
Intell. Transp. Syst."},{"key":"ref_3","unstructured":"Chen, C.H., Chang, Y.C., Chen, T.Y., and Wang, D.J. (2008, January 26\u201328). People counting system for getting in\/out of a bus based on video processing. Proceedings of the International Conference on Intelligent Systems Design and Applications, Kaohsiung, Taiwan."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Yang, T., Zhang, Y., Shao, D., and Li, Y. (2010). Clustering method for counting passengers getting in a bus with single camera. Opt. Eng., 49.","DOI":"10.1117\/1.3374439"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Chen, J., Wen, Q., Zhuo, C., and Mete, M. (2013, January 28\u201330). Automatic head detection for passenger flow analysis in bus surveillance videos. Proceedings of the IEEE International Conference on Vehicular Electronics and Safety, Dongguan, China.","DOI":"10.1109\/CISP.2012.6469669"},{"key":"ref_6","unstructured":"Hu, B., Xiong, G., Li, Y., Chen, Z., Zhou, W., Wang, X., and Wang, Q. (2014, January 8\u201311). Research on passenger flow counting based on embedded system. Proceedings of the International IEEE Conference on Intelligent Transportation Systems (ITSC), Qingdao, China."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Mukherjee, S., Saha, B., Jamal, I., Leclerc, R., and Ray, N. (2011, January 11\u201314). A novel framework for automatic passenger counting. Proceedings of the IEEE International Conference on Image Processing, Brussels, Belgium.","DOI":"10.1109\/ICIP.2011.6116284"},{"key":"ref_8","unstructured":"Xu, H., Lv, P., and Meng, L. (2010, January 25\u201327). A people counting system based on head-shoulder detection and tracking in surveillance video. Proceedings of the International Conference On Computer Design and Applications, Qinhuangdao, China."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Zeng, C., and Ma, H. (2010, January 23\u201326). 
Robust head-shoulder detection by PCA-based multilevel HOG-LBP detector for people counting. Proceedings of the International Conference on Pattern Recognition, Istanbul, Turkey.","DOI":"10.1109\/ICPR.2010.509"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"102","DOI":"10.1016\/j.knosys.2017.02.016","article-title":"Passenger flow estimation based on convolutional neural network in public transportation system","volume":"123","author":"Liu","year":"2017","journal-title":"Knowl. Base Syst."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"108","DOI":"10.1016\/j.neucom.2016.01.097","article-title":"People counting based on head detection combining Adaboost and CNN in crowded surveillance environment","volume":"208","author":"Gao","year":"2016","journal-title":"Neurocomputing"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Wang, Z., Cai, G., Zheng, C., and Fang, C. (2018, January 28\u201331). Bus-crowdedness estimation by shallow convolutional neural network. Proceedings of the International Conference on Sensor Networks and Signal Processing (SNSP), Xi\u2019an, China.","DOI":"10.1109\/SNSP.2018.00029"},{"key":"ref_13","unstructured":"Chan, A.B., and Vasconcelos, N. (October, January 29). Bayesian Poisson regression for crowd counting. Proceedings of the IEEE International Conference on Computer Vision, Kyoto, Japan."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Chen, K., Loy, C.C., Gong, S., and Xiang, T. (2012, January 3\u20137). Feature mining for localised crowd counting. Proceedings of the British Machine Vision Conference (BMVC), Surrey, England.","DOI":"10.5244\/C.26.21"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Xu, B., and Qiu, G. (2016, January 7\u201310). Crowd density estimation based on rich features and random projection forest. 
Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Placid, NY, USA.","DOI":"10.1109\/WACV.2016.7477682"},{"key":"ref_16","unstructured":"Borstel, M., Kandemir, M., Schmidt, P., Rao, M., Rajamani, K., and Hamprecht, F. (2016, January 11\u201314). Gaussian process density counting from weak supervision. Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Chan, A.B., Liang, Z.S.J., and Vasconcelos, N. (2008, January 23\u201328). Privacy preserving crowd monitoring: Counting people without people models or tracking. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA.","DOI":"10.1109\/CVPR.2008.4587569"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"2160","DOI":"10.1109\/TIP.2011.2172800","article-title":"Counting people with low-level features and Bayesian regression","volume":"21","author":"Chan","year":"2012","journal-title":"IEEE Trans. Image Process."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Idrees, H., Saleemi, I., Seibert, C., and Shah, M. (2013, January 23\u201328). Multi-source multi-scale counting in extremely dense crowd images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.329"},{"key":"ref_20","unstructured":"Lempitsky, V., and Zisserman, A. (2010, January 6\u201311). Learning to count objects in images. Proceedings of the International Conference on Neural Information Processing Systems (NIPS), Hyatt Regency, Vancouver, BC, Canada."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Wang, J., Wang, L., and Yang, F. (2017, January 17\u201319). Counting crowd with fully convolutional networks. 
Proceedings of the International Conference on Multimedia and Image Processing (ICMIP), Wuhan, China.","DOI":"10.1109\/ICMIP.2017.25"},{"key":"ref_22","unstructured":"Zhang, C., Li, H., Wang, X., and Yang, X. (2015, January 7\u201312). Cross-scene crowd counting via deep convolutional neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Sindagi, V.A., and Patel, V.M. (2017). CNN-based cascaded multi-task learning of high-level prior and density estimation for crowd counting. arXiv.","DOI":"10.1109\/AVSS.2017.8078491"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Sindagi, V.A., and Patel, V.M. (2017). Generating high-quality crowd density maps using contextual pyramid CNNs. arXiv.","DOI":"10.1109\/ICCV.2017.206"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Zhang, L., Shi, M., and Chen, Q. (2017). Crowd counting via scale-adaptive convolutional neural network. arXiv.","DOI":"10.1109\/WACV.2018.00127"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Zhou, D., Chen, S., Gao, S., and Ma, Y. (2016, January 27\u201330). Single-image crowd counting via multi-column convolutional neural network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.70"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Weng, W.T., and Lin, D.T. (2018, January 8\u201313). Crowd density estimation based on a modified multicolumn convolutional neural network. 
Proceedings of the International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil.","DOI":"10.1109\/IJCNN.2018.8489238"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1049","DOI":"10.1109\/TIP.2017.2740160","article-title":"Body structure aware deep crowd counting","volume":"27","author":"Huang","year":"2018","journal-title":"IEEE Trans. Image Process."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"24411","DOI":"10.1109\/ACCESS.2019.2899939","article-title":"Improved crowd counting method based on scale-adaptive convolutional neural network","volume":"7","author":"Sang","year":"2019","journal-title":"IEEE Access"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"118","DOI":"10.1016\/j.image.2018.03.004","article-title":"Counting challenging crowds robustly using a multi-column multi-task convolutional neural network","volume":"64","author":"Yang","year":"2018","journal-title":"Signal Process. Image Commun."},{"key":"ref_31","unstructured":"Redmon, J., and Farhadi, A. (2018). YOLOv3: An incremental improvement. arXiv."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Olmschenk, G., Tang, H., and Zhu, Z. (2019). Improving dense crowd counting convolutional neural networks using inverse k-nearest neighbor maps and multiscale upsampling. arXiv.","DOI":"10.5220\/0009156201850195"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Masci, J., Meier, U., Ciresan, D., and Schmidhuber, J. (2011, January 14\u201317). Stacked convolutional auto-encoders for hierarchical feature extraction. Proceedings of the Artificial Neural Networks and Machine Learning (ICANN), Espoo, Finland.","DOI":"10.1007\/978-3-642-21735-7_7"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Li, Y., Zhang, X., and Chen, D. (2018). CSRNet: Dilated convolutional neural networks for understanding the highly congested scenes. 
arXiv.","DOI":"10.1109\/CVPR.2018.00120"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/8\/2178\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T13:44:47Z","timestamp":1760363087000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/8\/2178"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,4,12]]},"references-count":34,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2020,4]]}},"alternative-id":["s20082178"],"URL":"https:\/\/doi.org\/10.3390\/s20082178","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,4,12]]}}}