{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T15:05:48Z","timestamp":1774451148220,"version":"3.50.1"},"reference-count":49,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2020,4,2]],"date-time":"2020-04-02T00:00:00Z","timestamp":1585785600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["71621001 and 71131001."],"award-info":[{"award-number":["71621001 and 71131001."]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Various traffic-sensing technologies have been employed to facilitate traffic control. Due to certain factors, e.g., malfunctioning devices and artificial mistakes, missing values typically occur in the Intelligent Transportation System (ITS) sensing datasets, resulting in a decrease in the data quality. In this study, an integrated imputation algorithm based on fuzzy C-means (FCM) and the genetic algorithm (GA) is proposed to improve the accuracy of the estimated values. The GA is applied to optimize the parameter of the membership degree and the number of cluster centroids in the FCM model. An experimental test of the taxi global positioning system (GPS) data in Manhattan, New York City, is employed to demonstrate the effectiveness of the integrated imputation approach. Three evaluation criteria, the root mean squared error (RMSE), correlation coefficient (R), and relative accuracy (RA), are used to verify the experimental results. 
Under the \u00b15% and \u00b110% thresholds, the average RAs obtained by the integrated imputation method are 0.576 and 0.785, which remain the highest among different methods, indicating that the integrated imputation method outperforms the history imputation method and the conventional FCM method. On the other hand, the clustering imputation performance with the Euclidean distance is better than that with the Manhattan distance. Thus, our proposed integrated imputation method can be employed to estimate the missing values in the daily traffic management.<\/jats:p>","DOI":"10.3390\/s20071992","type":"journal-article","created":{"date-parts":[[2020,4,2]],"date-time":"2020-04-02T11:57:14Z","timestamp":1585828634000},"page":"1992","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":27,"title":["An Integrated Fuzzy C-Means Method for Missing Data Imputation Using Taxi GPS Data"],"prefix":"10.3390","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8960-1432","authenticated-orcid":false,"given":"Junsheng","family":"Huang","sequence":"first","affiliation":[{"name":"School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China"},{"name":"Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, Beijing Jiaotong University, Beijing 100044, China"}]},{"given":"Baohua","family":"Mao","sequence":"additional","affiliation":[{"name":"School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China"},{"name":"Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, Beijing Jiaotong University, Beijing 100044, China"},{"name":"Integrated Transportation Research Centre of China, Beijing Jiaotong University, Beijing 100044, China"}]},{"given":"Yun","family":"Bai","sequence":"additional","affiliation":[{"name":"School of Traffic and Transportation, Beijing Jiaotong 
University, Beijing 100044, China"},{"name":"Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, Beijing Jiaotong University, Beijing 100044, China"}]},{"given":"Tong","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China"},{"name":"Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, Beijing Jiaotong University, Beijing 100044, China"}]},{"given":"Changjun","family":"Miao","sequence":"additional","affiliation":[{"name":"Signal &amp; Communication Research Institute, China Academy of Railway Sciences Corporation Limited, Beijing 100081, China"}]}],"member":"1968","published-online":{"date-parts":[[2020,4,2]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"590","DOI":"10.1016\/j.physa.2016.03.047","article-title":"Understanding taxi travel patterns","volume":"457","author":"Cai","year":"2016","journal-title":"Physica A"},{"key":"ref_2","first-page":"189","article-title":"Rapid traffic congestion monitoring based on floating car data","volume":"51","author":"Wu","year":"2014","journal-title":"J. Comput. Res. Dev."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1.1","DOI":"10.1155\/2018\/6197549","article-title":"Taxi driver\u2019s operation behavior and passengers\u2019 demand analysis based on GPS data","volume":"2018","author":"Hu","year":"2018","journal-title":"J. Adv. 
Transp."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"2579","DOI":"10.1016\/j.ijleo.2015.12.006","article-title":"Efficient vehicles path planning algorithm based on taxi GPS big data","volume":"127","author":"Zhang","year":"2016","journal-title":"Optik"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1016\/j.trb.2014.06.002","article-title":"Estimation of mean and covariance of peak hour origin\u2013destination demands from day-to-day traffic counts","volume":"68","author":"Shao","year":"2014","journal-title":"Transp. Res. Part B Methodol."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"557","DOI":"10.1016\/j.trc.2010.12.003","article-title":"Smart card data use in public transit: A literature review","volume":"19","author":"Pelletier","year":"2011","journal-title":"Transp. Res. Part C Emerg. Technol."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"490","DOI":"10.1016\/j.trc.2016.05.004","article-title":"Validating and improving public transport origin\u2013destination estimation algorithm using smart card fare data","volume":"68","author":"Alsger","year":"2016","journal-title":"Transp. Res. Part C Emerg. Technol."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1016\/j.ins.2013.01.021","article-title":"A hybrid method for imputation of missing values using optimized fuzzy c-means with support vector regression and a genetic algorithm","volume":"223","author":"Aydilek","year":"2013","journal-title":"Inf. Sci."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"168","DOI":"10.1016\/j.trc.2016.09.015","article-title":"An efficient realization of deep learning for traffic data imputation","volume":"72","author":"Duan","year":"2016","journal-title":"Transp. Res. Part C Emerg. 
Technol."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"152","DOI":"10.1016\/j.neucom.2016.04.015","article-title":"Missing data imputation using fuzzy-rough methods","volume":"205","author":"Amiri","year":"2016","journal-title":"Neurocomputing"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1007\/s00521-009-0295-6","article-title":"Pattern classification with missing data: A review","volume":"19","year":"2010","journal-title":"Neural Comput. Appl."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1016\/j.trc.2014.11.003","article-title":"A hybrid approach to integrate fuzzy C-means based imputation method with genetic algorithm for missing traffic volume data estimation","volume":"51","author":"Tang","year":"2015","journal-title":"Transp. Res. Part C Emerg. Technol."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1016\/j.eswa.2019.04.032","article-title":"A hierarchical prediction model for lane-changes based on combination of fuzzy C-means and adaptive neural network","volume":"130","author":"Tang","year":"2019","journal-title":"Expert Syst. Appl."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Choi, Y.Y., Shon, H., Byon, Y.J., Kim, D.Y., and Kang, S. (2019). Enhanced application of principal component analysis in machine learning for imputation of missing traffic data. Appl. Sci., 9.","DOI":"10.3390\/app9102149"},{"key":"ref_15","first-page":"81","article-title":"Random Forest Based Operational Missing Data Imputation for Highway Tunnel","volume":"16","author":"Qian","year":"2016","journal-title":"J. Transp. Syst. Eng. Inf. Technol."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Offor, K.J., Vaci, L., and Mihaylova, L.S. (2019). Traffic Estimation for Large Urban Road Network with High Missing Data Ratio. 
Sensors, 19.","DOI":"10.3390\/s19122813"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1787","DOI":"10.1007\/s00500-013-0997-7","article-title":"A hybrid genetic algorithm\u2013fuzzy c-means approach for incomplete data clustering based on nearest-neighbor intervals","volume":"17","author":"Li","year":"2013","journal-title":"Soft. Comput."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"6793","DOI":"10.1016\/j.eswa.2010.12.067","article-title":"Missing data analysis with fuzzy C-Means: A study of its application in a psychological scenario","volume":"38","author":"Nuovo","year":"2011","journal-title":"Expert Syst. Appl."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1016\/j.neucom.2018.08.067","article-title":"LSTM-based traffic flow prediction with missing data","volume":"318","author":"Tian","year":"2018","journal-title":"Neurocomputing"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1016\/j.trc.2018.11.003","article-title":"A Bayesian tensor decomposition approach for spatiotemporal traffic data imputation","volume":"98","author":"Chen","year":"2019","journal-title":"Transp. Res. Part C Emerg. Technol."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"931","DOI":"10.1061\/(ASCE)0733-947X(2005)131:12(931)","article-title":"Multiple Imputation Scheme for Overcoming the Missing Values and Variability Issues in ITS Data","volume":"131","author":"Ni","year":"2005","journal-title":"J. Transp. 
Eng."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"406","DOI":"10.1016\/j.neunet.2009.11.014","article-title":"A study on the use of imputation methods for experimentation with radial basis function network classifiers handling missing attribute values: The good synergy between RBFNs and event covering method","volume":"23","author":"Luengo","year":"2010","journal-title":"Neural Netw."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"432","DOI":"10.1177\/0962280217727033","article-title":"Improved conditional imputation for linear regression with a randomly censored predictor","volume":"28","author":"Atem","year":"2017","journal-title":"Stat. Methods Med. Res."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1016\/j.atmosenv.2018.05.055","article-title":"A novel regression imputation framework for Tehran air pollution monitoring network using outputs from WRF and CAMX models","volume":"187","author":"Shahbazi","year":"2018","journal-title":"Atmos. Environ."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1016\/j.csda.2015.04.009","article-title":"Improved methods for the imputation of missing data by nearest neighbor methods","volume":"90","author":"Tutz","year":"2015","journal-title":"Comput. Stat. Data Anal."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1016\/j.eswa.2017.07.026","article-title":"An extensive analysis of the interaction between missing data types, imputation methods, and supervised classifiers","volume":"89","author":"Garciarena","year":"2017","journal-title":"Expert Syst. Appl."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"106175","DOI":"10.1016\/j.asoc.2020.106175","article-title":"A new incomplete pattern belief classification method with multiple estimations based on KNN","volume":"90","author":"Ma","year":"2020","journal-title":"Appl. 
Soft Comput."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"105122","DOI":"10.1016\/j.cmpb.2019.105122","article-title":"R-Ensembler: A greedy rough set based ensemble attribute selection algorithm with kNN imputation for classification of medical data","volume":"184","author":"Bania","year":"2020","journal-title":"Comput. Methods Programs Biomed."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"2541","DOI":"10.1016\/j.jss.2012.05.073","article-title":"Nearest neighbor selection for iteratively kNN imputation","volume":"85","author":"Zhang","year":"2012","journal-title":"J. Syst. Softw."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1016\/j.nutres.2020.01.001","article-title":"Missing data imputation via the expectation-maximization algorithm can improve principal component analysis aimed at deriving biomarker profiles and dietary patterns","volume":"75","author":"Malan","year":"2020","journal-title":"Nutr. Res."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1016\/j.knosys.2018.01.005","article-title":"FROG: Inference from knowledge base for missing value imputation","volume":"145","author":"Qi","year":"2018","journal-title":"Knowl. Based Syst."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"124","DOI":"10.1016\/j.knosys.2018.03.026","article-title":"A class center based approach for missing value imputation","volume":"151","author":"Tsai","year":"2018","journal-title":"Knowl. Based Syst."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1016\/j.knosys.2016.01.048","article-title":"Fuzzy C-Means clustering of incomplete data based on probabilistic information granules of missing values","volume":"99","author":"Zhang","year":"2016","journal-title":"Knowl. 
Based Syst."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1117","DOI":"10.1016\/j.asoc.2010.02.011","article-title":"Autonomous and deterministic supervised fuzzy clustering with data imputation capabilities","volume":"11","author":"Ming","year":"2011","journal-title":"Appl. Soft Comput."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1016\/j.eswa.2018.07.057","article-title":"Missing value imputation using a novel grey based fuzzy c-means, mutual information based feature selection, and regression model","volume":"115","author":"Sefidian","year":"2019","journal-title":"Expert Syst. Appl."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"160","DOI":"10.3141\/1855-20","article-title":"Detecting Errors and Imputing Missing Data for Single-Loop Surveillance Systems","volume":"1855","author":"Chen","year":"2003","journal-title":"Transp. Res. Record."},{"key":"ref_37","unstructured":"Boyles, S. (2011, January 23\u201327). A comparison of interpolation methods for missing traffic volume data. Proceedings of the 90th Annual Meeting of the Transportation Research Board, Washington, DC, USA."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"519","DOI":"10.1080\/713827181","article-title":"An analysis of four missing data treatment methods for supervised learning","volume":"17","author":"Batista","year":"2003","journal-title":"Appl. Artif. Intell."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1093\/bioinformatics\/17.6.520","article-title":"Missing value estimation methods for DNA microarrays","volume":"17","author":"Troyanskaya","year":"2001","journal-title":"Bioinformatics"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1016\/j.jmva.2018.03.010","article-title":"An expectation\u2013maximization algorithm for the matrix normal distribution with an application in remote sensing","volume":"167","author":"Glanz","year":"2018","journal-title":"J. Multivar. 
Anal."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1016\/j.ymssp.2018.10.020","article-title":"An approach based on expectation-maximization algorithm for parameter estimation of Lamb wave signals","volume":"120","author":"Jia","year":"2019","journal-title":"Mech. Syst. Signal Process."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"104805","DOI":"10.1016\/j.knosys.2019.06.013","article-title":"Similarity-learning information-fusion schemes for missing data imputation","volume":"187","author":"Cheng","year":"2020","journal-title":"Knowl. Based Syst."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"6942","DOI":"10.1016\/j.eswa.2010.03.028","article-title":"A fuzzy c-means clustering algorithm based on nearest-neighbor intervals for incomplete data","volume":"37","author":"Li","year":"2010","journal-title":"Expert Syst. Appl."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"103435","DOI":"10.1016\/j.engappai.2019.103435","article-title":"An Evolutionary Neuro-Fuzzy C-means Clustering Technique","volume":"89","author":"Pantula","year":"2020","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"140","DOI":"10.1016\/j.neucom.2017.03.068","article-title":"Hybrid fuzzy clustering methods based on improved self-adaptive cellular genetic algorithm and optimal-selection-based fuzzy c-means","volume":"249","author":"Jie","year":"2017","journal-title":"Neurocomputing"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"768","DOI":"10.1016\/j.cie.2013.09.025","article-title":"A fuzzy c-means based hybrid evolutionary approach to the clustering of supply chain","volume":"66","author":"Xiao","year":"2013","journal-title":"Comput. Ind. 
Eng."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"512","DOI":"10.1109\/TITS.2009.2026312","article-title":"PPCA-Based Missing Data Imputation for Traffic Flow Volume: A Systematical Approach","volume":"10","author":"Qu","year":"2009","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"998","DOI":"10.1016\/j.apenergy.2018.05.054","article-title":"Missing value imputation for short to mid-term horizontal solar irradiance Data","volume":"225","author":"Demirhan","year":"2018","journal-title":"Appl. Energy"},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"274","DOI":"10.1016\/j.ins.2016.01.018","article-title":"Missing value imputation for the analysis of incomplete traffic accident data","volume":"339","author":"Deb","year":"2016","journal-title":"Inf. Sci."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/7\/1992\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T09:14:49Z","timestamp":1760174089000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/7\/1992"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,4,2]]},"references-count":49,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2020,4]]}},"alternative-id":["s20071992"],"URL":"https:\/\/doi.org\/10.3390\/s20071992","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,4,2]]}}}