{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,27]],"date-time":"2026-05-27T22:29:04Z","timestamp":1779920944032,"version":"3.53.1"},"reference-count":41,"publisher":"Springer Science and Business Media LLC","issue":"12","license":[{"start":{"date-parts":[[2022,2,11]],"date-time":"2022-02-11T00:00:00Z","timestamp":1644537600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,2,11]],"date-time":"2022-02-11T00:00:00Z","timestamp":1644537600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100000923","name":"Australian Research Council","doi-asserted-by":"crossref","award":["DP150100294"],"award-info":[{"award-number":["DP150100294"]}],"id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100000923","name":"Australian Research Council","doi-asserted-by":"crossref","award":["DP150104251"],"award-info":[{"award-number":["DP150104251"]}],"id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001801","name":"University of Western Australia","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100001801","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Neural Comput & Applic"],"published-print":{"date-parts":[[2022,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Missing data is a major problem in real-world datasets, which hinders the performance of data analytics. Conventional data imputation schemes such as univariate single imputation replace missing values in each column with the same approximated value. These univariate single imputation techniques underestimate the variance of the imputed values. 
On the other hand, multivariate imputation explores the relationships between different columns of data, to impute the missing values. Reinforcement Learning (RL) is a machine learning paradigm where the agent learns by taking actions and receiving rewards in response, to achieve its goal. In this work, we propose an RL-based approach to impute missing data by learning a policy to impute data through an action-reward-based experience. Our approach imputes missing values in a column by working only on the same column (similar to univariate single imputation) but imputes the missing values in the column with different values thus keeping the variance in the imputed values. We report superior performance of our approach, compared with other imputation techniques, on a number of datasets.<\/jats:p>","DOI":"10.1007\/s00521-022-06958-3","type":"journal-article","created":{"date-parts":[[2022,2,11]],"date-time":"2022-02-11T03:02:31Z","timestamp":1644548551000},"page":"9701-9716","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":30,"title":["A reinforcement learning-based approach for imputing missing data"],"prefix":"10.1007","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5759-6832","authenticated-orcid":false,"given":"Saqib 
Ejaz","family":"Awan","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Mohammed","family":"Bennamoun","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ferdous","family":"Sohel","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Frank","family":"Sanfilippo","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Girish","family":"Dwivedi","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2022,2,11]]},"reference":[{"key":"6958_CR1","doi-asserted-by":"crossref","unstructured":"Altameem T, Amoon M, Altameem A (2020) A deep reinforcement learning process based on robotic training to assist mental health patients. Neural Comput Appl 1\u201310","DOI":"10.1007\/s00521-020-04855-1"},{"issue":"1","key":"6958_CR2","doi-asserted-by":"publisher","first-page":"40","DOI":"10.1111\/j.1751-5823.2010.00103.x","volume":"78","author":"RR Andridge","year":"2010","unstructured":"Andridge RR, Little RJ (2010) A review of hot deck imputation for survey non-response. Int Stat Rev 78(1):40\u201364","journal-title":"Int Stat Rev"},{"key":"6958_CR3","doi-asserted-by":"publisher","first-page":"164","DOI":"10.1016\/j.neucom.2021.04.010","volume":"453","author":"SE Awan","year":"2021","unstructured":"Awan SE, Bennamoun M, Sohel F, Sanfilippo F, Dwivedi G (2021) Imputation of missing data with class imbalance using conditional generative adversarial networks. Neurocomputing 453:164\u2013171","journal-title":"Neurocomputing"},{"issue":"3","key":"6958_CR4","doi-asserted-by":"publisher","first-page":"74","DOI":"10.1186\/s12911-016-0318-z","volume":"16","author":"L Beretta","year":"2016","unstructured":"Beretta L, Santaniello A (2016) Nearest neighbor imputation algorithms: a critical evaluation. 
BMC Med Inf Decis Mak 16(3):74","journal-title":"BMC Med Inf Decis Mak"},{"key":"6958_CR5","first-page":"1","volume":"45","author":"S Van Buuren","year":"2010","unstructured":"Van Buuren S, Groothuis-Oudshoorn K (2010) MICE: multivariate imputation by chained equations in R. J Stat Softw 45:1\u201368","journal-title":"J Stat Softw"},{"issue":"4","key":"6958_CR6","doi-asserted-by":"publisher","first-page":"1956","DOI":"10.1137\/080738970","volume":"20","author":"JF Cai","year":"2010","unstructured":"Cai JF, Cand\u00e8s EJ, Shen Z (2010) A singular value thresholding algorithm for matrix completion. SIAM J Optim 20(4):1956\u20131982","journal-title":"SIAM J Optim"},{"issue":"3","key":"6958_CR7","doi-asserted-by":"publisher","first-page":"1247","DOI":"10.5194\/gmd-7-1247-2014","volume":"7","author":"T Chai","year":"2014","unstructured":"Chai T, Draxler RR (2014) Root mean square error (RMSE) or mean absolute error (MAE)?-arguments against avoiding RMSE in the literature. Geosci Model Dev 7(3):1247\u20131250","journal-title":"Geosci Model Dev"},{"issue":"10","key":"6958_CR8","doi-asserted-by":"publisher","first-page":"1087","DOI":"10.1016\/j.jclinepi.2006.01.014","volume":"59","author":"ART Donders","year":"2006","unstructured":"Donders ART, Van Der Heijden GJ, Stijnen T, Moons KG (2006) A gentle introduction to imputation of missing values. J Clin Epidemiol 59(10):1087\u20131091","journal-title":"J Clin Epidemiol"},{"key":"6958_CR9","unstructured":"Dua D, Graff C (2017) UCI machine learning repository. http:\/\/archive.ics.uci.edu\/ml"},{"key":"6958_CR10","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1016\/j.chemolab.2014.02.007","volume":"134","author":"M G\u00f3mez-Carracedo","year":"2014","unstructured":"G\u00f3mez-Carracedo M, Andrade J, L\u00f3pez-Mah\u00eda P, Muniategui S, Prada D (2014) A practical comparison of single and multiple imputation methods to handle complex missing data in air quality datasets. 
Chemom Intell Lab Syst 134:23\u201333","journal-title":"Chemom Intell Lab Syst"},{"key":"6958_CR11","doi-asserted-by":"crossref","unstructured":"Gondara L, Wang K (2018) MIDA: multiple imputation using denoising autoencoders. In: Pacific-Asia conference on knowledge discovery and data mining (PAKDD 2018). Springer, pp 260\u2013272","DOI":"10.1007\/978-3-319-93040-4_21"},{"issue":"11","key":"6958_CR12","doi-asserted-by":"publisher","first-page":"139","DOI":"10.1145\/3422622","volume":"63","author":"I Goodfellow","year":"2020","unstructured":"Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2020) Generative adversarial networks. Commun ACM 63(11):139\u2013144","journal-title":"Commun ACM"},{"issue":"1","key":"6958_CR13","doi-asserted-by":"publisher","first-page":"98","DOI":"10.1161\/CIRCOUTCOMES.109.875658","volume":"3","author":"Y He","year":"2010","unstructured":"He Y (2010) Missing data analysis using multiple imputation: getting to the heart of the matter. Circ Cardiovasc Qual Outcomes 3(1):98\u2013105","journal-title":"Circ Cardiovasc Qual Outcomes"},{"key":"6958_CR14","first-page":"123","volume":"20","author":"JJ Hox","year":"1999","unstructured":"Hox JJ (1999) A review of current software for handling missing data. Kwant Methoden 20:123\u2013138","journal-title":"Kwant Methoden"},{"issue":"5","key":"6958_CR15","doi-asserted-by":"publisher","first-page":"402","DOI":"10.4097\/kjae.2013.64.5.402","volume":"64","author":"H Kang","year":"2013","unstructured":"Kang H (2013) The prevention and handling of the missing data. Korean J Anesthesiol 64(5):402","journal-title":"Korean J Anesthesiol"},{"key":"6958_CR16","unstructured":"Kim JK, Fuller W (2013) Hot deck imputation for multivariate missing data. 
In: Proceedings 59th ISI world statistics congress, pp 25\u201330"},{"issue":"2","key":"6958_CR17","doi-asserted-by":"publisher","first-page":"1487","DOI":"10.1007\/s10462-019-09709-4","volume":"53","author":"WC Lin","year":"2020","unstructured":"Lin WC, Tsai CF (2020) Missing value imputation: a review and analysis of the literature (2006\u20132017). Artif Intell Rev 53(2):1487\u20131509","journal-title":"Artif Intell Rev"},{"key":"6958_CR18","unstructured":"Lodder P (2013) To impute or not impute: that\u2019s the question. Advis Res Methods Sel Top 1\u20137"},{"key":"6958_CR19","doi-asserted-by":"crossref","unstructured":"Mahboob T, Ijaz A, Shahzad A, Kalsoom M (2018) Handling missing values in chronic kidney disease datasets using KNN, K-means and K-medoids algorithms. In: 12th international conference on open source systems and technologies (ICOSST), pp 76\u201381. IEEE","DOI":"10.1109\/ICOSST.2018.8632179"},{"key":"6958_CR20","unstructured":"McKnight PE, McKnight KM, Sidani S, Figueredo AJ (2007) Missing data: a gentle introduction, vol\u00a01. Guilford Press"},{"issue":"4","key":"6958_CR21","doi-asserted-by":"publisher","first-page":"353","DOI":"10.1076\/edre.7.4.353.8937","volume":"7","author":"TD Pigott","year":"2001","unstructured":"Pigott TD (2001) A review of methods for missing data. Educ Res Eval 7(4):353\u2013383","journal-title":"Educ Res Eval"},{"issue":"3","key":"6958_CR22","doi-asserted-by":"publisher","first-page":"227","DOI":"10.1177\/1536867X0400400301","volume":"4","author":"P Royston","year":"2004","unstructured":"Royston P (2004) Multiple imputation of missing values. Stata J 4(3):227\u2013241","journal-title":"Stata J"},{"issue":"3","key":"6958_CR23","doi-asserted-by":"publisher","first-page":"581","DOI":"10.1093\/biomet\/63.3.581","volume":"63","author":"DB Rubin","year":"1976","unstructured":"Rubin DB (1976) Inference and missing data. 
Biometrika 63(3):581\u2013592","journal-title":"Biometrika"},{"issue":"17","key":"6958_CR24","doi-asserted-by":"publisher","first-page":"13233","DOI":"10.1007\/s00521-019-04013-2","volume":"32","author":"A S\u00e1nchez-Morales","year":"2020","unstructured":"S\u00e1nchez-Morales A, Sancho-G\u00f3mez JL, Mart\u00ednez-Garc\u00eda JA, Figueiras-Vidal AR (2020) Improving deep learning performance with missing values via deletion and compensation. Neural Comput Appl 32(17):13233\u201313244","journal-title":"Neural Comput Appl"},{"key":"6958_CR25","doi-asserted-by":"crossref","unstructured":"Schafer JL (1997) Analysis of incomplete multivariate data, vol\u00a01. CRC press","DOI":"10.1201\/9781439821862"},{"issue":"6","key":"6958_CR26","doi-asserted-by":"publisher","first-page":"764","DOI":"10.1093\/aje\/kwt312","volume":"179","author":"AD Shah","year":"2014","unstructured":"Shah AD, Bartlett JW, Carpenter J, Nicholas O, Hemingway H (2014) Comparison of random forest and parametric imputation models for imputing missing data using MICE: a CALIBER study. Am J Epidemiol 179(6):764\u2013774","journal-title":"Am J Epidemiol"},{"key":"6958_CR27","doi-asserted-by":"publisher","first-page":"150","DOI":"10.1016\/j.knosys.2019.02.034","volume":"173","author":"M \u015amieja","year":"2019","unstructured":"\u015amieja M, Struski \u0141, Tabor J, Marzec M (2019) Generalized RBF kernel for incomplete data. Knowl Based Syst 173:150\u2013162","journal-title":"Knowl Based Syst"},{"key":"6958_CR28","unstructured":"\u015amieja M, Struski \u0141, Tabor J, Zieli\u0144ski B, Spurek P (2018) Processing of missing data by neural networks. In: Advances in neural information processing systems, pp 2719\u20132729"},{"key":"6958_CR29","doi-asserted-by":"crossref","unstructured":"Sterne JA, White IR, Carlin JB, Spratt M, Royston P, Kenward MG, Wood AM, Carpenter JR (2009) Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. 
BMJ 338","DOI":"10.1136\/bmj.b2393"},{"key":"6958_CR30","doi-asserted-by":"crossref","unstructured":"Stuart EA, Azur M, Frangakis C, Leaf P (2009) Multiple imputation with large data sets: a case study of the children\u2019s mental health initiative. Am J Epidemiol 169(9):1133\u20131139","DOI":"10.1093\/aje\/kwp026"},{"issue":"9","key":"6958_CR31","doi-asserted-by":"publisher","first-page":"2610","DOI":"10.1177\/0962280216683570","volume":"27","author":"TR Sullivan","year":"2018","unstructured":"Sullivan TR, White IR, Salter AB, Ryan P, Lee KJ (2018) Should multiple imputation be the method of choice for handling missing data in randomized trials? Stat Methods Med Res 27(9):2610\u20132626","journal-title":"Stat Methods Med Res"},{"key":"6958_CR32","unstructured":"Sutton RS, Barto AG (2018) Reinforcement learning: an introduction, vol\u00a02. MIT Press"},{"issue":"6","key":"6958_CR33","doi-asserted-by":"publisher","first-page":"363","DOI":"10.1002\/sam.11348","volume":"10","author":"F Tang","year":"2017","unstructured":"Tang F, Ishwaran H (2017) Random forest missing data algorithms. Stat Anal Data Min ASA Data Sci J 10(6):363\u2013377","journal-title":"Stat Anal Data Min ASA Data Sci J"},{"key":"6958_CR34","doi-asserted-by":"crossref","unstructured":"Tran L, Liu X, Zhou J, Jin R (2017) Missing modalities imputation via cascaded residual autoencoder. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1405\u20131414","DOI":"10.1109\/CVPR.2017.528"},{"issue":"12","key":"6958_CR35","doi-asserted-by":"publisher","first-page":"1049","DOI":"10.1080\/10629360600810434","volume":"76","author":"S Van Buuren","year":"2006","unstructured":"Van Buuren S, Brand JP, Groothuis-Oudshoorn CG, Rubin DB (2006) Fully conditional specification in multivariate imputation. 
J Stat Comput Simul 76(12):1049\u20131064","journal-title":"J Stat Comput Simul"},{"issue":"3\u20134","key":"6958_CR36","first-page":"279","volume":"8","author":"CJ Watkins","year":"1992","unstructured":"Watkins CJ, Dayan P (1992) Q-learning. Mach Learn 8(3\u20134):279\u2013292","journal-title":"Mach Learn"},{"issue":"4","key":"6958_CR37","doi-asserted-by":"publisher","first-page":"377","DOI":"10.1002\/sim.4067","volume":"30","author":"IR White","year":"2011","unstructured":"White IR, Royston P, Wood AM (2011) Multiple imputation using chained equations: issues and guidance for practice. Stat Med 30(4):377\u2013399","journal-title":"Stat Med"},{"issue":"3","key":"6958_CR38","doi-asserted-by":"publisher","first-page":"5866","DOI":"10.1016\/j.eswa.2008.07.018","volume":"36","author":"IC Yeh","year":"2009","unstructured":"Yeh IC, Yang KJ, Ting TM (2009) Knowledge discovery on RFM model using Bernoulli sequence. Expert Syst Appl 36(3):5866\u20135871","journal-title":"Expert Syst Appl"},{"key":"6958_CR39","unstructured":"Yoon J, Jordon J, Schaar M (2018) GAIN: missing data imputation using generative adversarial nets. In: International conference on machine learning, pp 5689\u20135698. PMLR"},{"key":"6958_CR40","unstructured":"Zhang H, Xie P, Xing E (2018) Missing value imputation based on deep generative models. arXiv preprint arXiv:1808.01684"},{"issue":"1","key":"6958_CR41","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1007\/s10489-010-0244-1","volume":"36","author":"B Zhu","year":"2012","unstructured":"Zhu B, He C, Liatsis P (2012) A robust missing value imputation method for noisy data. 
Appl Intell 36(1):61\u201374","journal-title":"Appl Intell"}],"container-title":["Neural Computing and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-022-06958-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00521-022-06958-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-022-06958-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,16]],"date-time":"2022-05-16T06:10:48Z","timestamp":1652681448000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00521-022-06958-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,11]]},"references-count":41,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2022,6]]}},"alternative-id":["6958"],"URL":"https:\/\/doi.org\/10.1007\/s00521-022-06958-3","relation":{},"ISSN":["0941-0643","1433-3058"],"issn-type":[{"value":"0941-0643","type":"print"},{"value":"1433-3058","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,2,11]]},"assertion":[{"value":"15 May 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 January 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 February 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declaration"}},{"value":"The authors declare that they have no conflict of 
interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}