{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,24]],"date-time":"2025-11-24T16:41:27Z","timestamp":1764002487648,"version":"build-2065373602"},"reference-count":52,"publisher":"MDPI AG","issue":"24","license":[{"start":{"date-parts":[[2021,12,13]],"date-time":"2021-12-13T00:00:00Z","timestamp":1639353600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Robotics Program","award":["1922500051","1922200058"],"award-info":[{"award-number":["1922500051","1922200058"]}]},{"DOI":"10.13039\/501100001348","name":"Agency for Science, Technology and Research","doi-asserted-by":"publisher","award":["1922200108"],"award-info":[{"award-number":["1922200108"]}],"id":[{"id":"10.13039\/501100001348","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Cleaning is one of the fundamental tasks with prime importance given in our day-to-day life. Moreover, the importance of cleaning drives the research efforts towards bringing leading edge technologies, including robotics, into the cleaning domain. However, an effective method to assess the quality of cleaning is an equally important research problem to be addressed. The primary footstep towards addressing the fundamental question of \u201cHow clean is clean\u201d is addressed using an autonomous cleaning-auditing robot that audits the cleanliness of a given area. This research work focuses on a novel reinforcement learning-based experience-driven dirt exploration strategy for a cleaning-auditing robot. The proposed approach uses proximal policy approximation (PPO) based on-policy learning method to generate waypoints and sampling decisions to explore the probable dirt accumulation regions in a given area. 
The policy network is trained in multiple environments with simulated dirt patterns. Experiment trials have been conducted to validate the trained policy in both simulated and real-world environments using an in-house developed cleaning audit robot called BELUGA.<\/jats:p>","DOI":"10.3390\/s21248331","type":"journal-article","created":{"date-parts":[[2021,12,14]],"date-time":"2021-12-14T01:22:05Z","timestamp":1639444925000},"page":"8331","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["A Reinforcement Learning Based Dirt-Exploration for Cleaning-Auditing Robot"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4010-570X","authenticated-orcid":false,"given":"Thejus","family":"Pathmakumar","sequence":"first","affiliation":[{"name":"Engineering Product Development Pillar, Singapore University of Technology and Design (SUTD), Singapore 487372, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6504-1530","authenticated-orcid":false,"given":"Mohan Rajesh","family":"Elara","sequence":"additional","affiliation":[{"name":"Engineering Product Development Pillar, Singapore University of Technology and Design (SUTD), Singapore 487372, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1425-2319","authenticated-orcid":false,"given":"Braulio F\u00e9lix","family":"G\u00f3mez","sequence":"additional","affiliation":[{"name":"Engineering Product Development Pillar, Singapore University of Technology and Design (SUTD), Singapore 487372, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3243-9814","authenticated-orcid":false,"given":"Balakrishnan","family":"Ramalingam","sequence":"additional","affiliation":[{"name":"Engineering Product Development Pillar, Singapore University of Technology and Design (SUTD), Singapore 487372, 
Singapore"}]}],"member":"1968","published-online":{"date-parts":[[2021,12,13]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"608","DOI":"10.1108\/IJCTHR-11-2016-0111","article-title":"The influences of cleanliness and employee attributes on perceived service quality in restaurants in a developing country","volume":"11","author":"Truong","year":"2017","journal-title":"Int. J. Cult. Tour. Hosp. Res."},{"key":"ref_2","unstructured":"(2021, March 02). Cleaning a Nation: Cultivating a Healthy Living Environment, Available online: https:\/\/www.clc.gov.sg\/research-publications\/publications\/urban-systems-studies\/view\/cleaning-a-nation-cultivating-a-healthy-living-environment."},{"key":"ref_3","unstructured":"(2021, June 23). Cleaning Industry Analysis 2020-Cost & Trends. Available online: https:\/\/www.franchisehelp.com\/industry-reports\/cleaning-industry-analysis-2020-cost-trends\/."},{"key":"ref_4","unstructured":"(2021, June 23). Top Three Commercial Cleaning Trends in 2019. Available online: https:\/\/www.wilburncompany.com\/top-three-commercial-cleaning-trends-in-2019\/."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1186\/s13756-020-00878-4","article-title":"Ultraviolet disinfection robots to improve hospital cleaning: Real promise or just a gimmick?","volume":"10","author":"Zingg","year":"2021","journal-title":"Antimicrob. Resist. Infect. Control"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"188","DOI":"10.1039\/C6EW00241B","article-title":"LED revolution: Fundamentals and prospects for UV disinfection applications","volume":"3","author":"Chen","year":"2017","journal-title":"Environ. Sci. Water Res. Technol."},{"key":"ref_7","unstructured":"Arnott, B., and Arnott, M. (2018). Automatic Floor Cleaning Machine and Process. (U.S. 
Patent 10,006,192)."},{"key":"ref_8","first-page":"834","article-title":"New device for air disinfection with a shielded UV radiation and ozone","volume":"19","author":"Martinovs","year":"2021","journal-title":"Agron. Res."},{"key":"ref_9","unstructured":"Dammkoehler, D., and Jin, Z. (2017). Floor Cleaning Machine. (U.S. Patent App. 29\/548,203)."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1016\/j.ajic.2017.09.027","article-title":"Deployment of a touchless ultraviolet light robot for terminal room disinfection: The importance of audit and feedback","volume":"46","author":"Fleming","year":"2018","journal-title":"Am. J. Infect. Control"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Prabakaran, V., Mohan, R.E., Sivanantham, V., Pathmakumar, T., and Kumar, S.S. (2018). Tackling area coverage problems in a reconfigurable floor cleaning robot based on polyomino tiling theory. Appl. Sci., 8.","DOI":"10.3390\/app8030342"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Muthugala, M., Vega-Heredia, M., Mohan, R.E., and Vishaal, S.R. (2020). Design and control of a wall cleaning robot with adhesion-awareness. Symmetry, 12.","DOI":"10.3390\/sym12010122"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Sivanantham, V., Le, A.V., Shi, Y., Elara, M.R., and Sheu, B.J. (2021). Adaptive Floor Cleaning Strategy by Human Density Surveillance Mapping with a Reconfigurable Multi-Purpose Service Robot. Sensors, 21.","DOI":"10.3390\/s21092965"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Chang, C.L., Chang, C.Y., Tang, Z.Y., and Chen, S.T. (2018). High-efficiency automatic recharging mechanism for cleaning robot using multi-sensor. Sensors, 18.","DOI":"10.3390\/s18113911"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Pathmakumar, T., Sivanantham, V., Anantha Padmanabha, S.G., Elara, M.R., and Tun, T.T. (2021). 
Towards an Optimal Footprint Based Area Coverage Strategy for a False-Ceiling Inspection Robot. Sensors, 21.","DOI":"10.3390\/s21155168"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"269","DOI":"10.1016\/j.foodcont.2019.01.029","article-title":"Experimental study of effectiveness of robotic cleaning for fish-processing plants","volume":"100","author":"Giske","year":"2019","journal-title":"Food Control"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"156","DOI":"10.1016\/j.jhin.2008.03.013","article-title":"A modified ATP benchmark for evaluating the cleaning of some hospital environmental surfaces","volume":"69","author":"Lewis","year":"2008","journal-title":"J. Hosp. Infect."},{"key":"ref_18","first-page":"3365","article-title":"Step by step how to do cleaning validation","volume":"5","author":"Asgharian","year":"2014","journal-title":"Int. J. Pharm. Life Sci."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1259","DOI":"10.14260\/jemds\/2018\/287","article-title":"Assessment of disinfection and cleaning validation in central laboratory, MBS hospital, Kota","volume":"7","author":"Malav","year":"2018","journal-title":"J. Evol. Med Dent. Sci."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"328","DOI":"10.1016\/j.jhin.2008.08.006","article-title":"How clean is clean? Proposed methods for hospital cleaning assessment","volume":"70","author":"Maxwell","year":"2008","journal-title":"J. Hosp. Infect."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1002","DOI":"10.1016\/j.ajic.2014.04.025","article-title":"How clean is clean\u2014Is a new microbiology standard required?","volume":"42","author":"Spratt","year":"2014","journal-title":"Am. J. Infect. Control"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Pathmakumar, T., Kalimuthu, M., Elara, M.R., and Ramalingam, B. (2021). An Autonomous Robot-Aided Auditing Scheme for Floor Cleaning. 
Sensors, 21.","DOI":"10.3390\/s21134332"},{"key":"ref_23","unstructured":"Smart, W.D., and Kaelbling, L.P. (2002, January 11\u201315). Effective reinforcement learning for mobile robots. Proceedings of the 2002 IEEE International Conference on Robotics and Automation (Cat. No. 02CH37292), Washington, DC, USA."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Tai, L., Paolo, G., and Liu, M. (2017, January 24\u201328). Virtual-to-real deep reinforcement learning: Continuous control of mobile robots for mapless navigation. Proceedings of the 2017 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada.","DOI":"10.1109\/IROS.2017.8202134"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Rivera, P., Valarezo A\u00f1azco, E., and Kim, T.S. (2021). Object Manipulation with an Anthropomorphic Robotic Hand via Deep Reinforcement Learning with a Synergy Space of Natural Hand Poses. Sensors, 21.","DOI":"10.3390\/s21165301"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Kozjek, D., Malus, A., and Vrabi\u010d, R. (2021). Reinforcement-Learning-Based Route Generation for Heavy-Traffic Autonomous Mobile Robot Systems. Sensors, 21.","DOI":"10.3390\/s21144809"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Pi, C.H., Dai, Y.W., Hu, K.C., and Cheng, S. (2021). General Purpose Low-Level Reinforcement Learning Control for Multi-Axis Rotor Aerial Vehicles. Sensors, 21.","DOI":"10.3390\/s21134560"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"79","DOI":"10.3389\/fnbot.2020.591128","article-title":"Perception-action coupling target tracking control for a snake robot via reinforcement learning","volume":"14","author":"Bing","year":"2020","journal-title":"Front. Neurorobot."},{"key":"ref_29","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal policy optimization algorithms. 
arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"3098","DOI":"10.1109\/LRA.2020.2974648","article-title":"A two-stage reinforcement learning approach for multi-UAV collision avoidance under imperfect sensing","volume":"5","author":"Wang","year":"2020","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_31","unstructured":"Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., and Riedmiller, M. (2014, January 22\u201324). Deterministic policy gradient algorithms. Proceedings of the International Conference on Machine Learning, PMLR, Beijing, China."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Mousavi, H.K., Liu, G., Yuan, W., Tak\u00e1\u010d, M., Mu\u00f1oz-Avila, H., and Motee, N. (2019). A layered architecture for active perception: Image classification using deep reinforcement learning. arXiv.","DOI":"10.1109\/IROS40897.2019.8968129"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Hase, H., Azampour, M.F., Tirindelli, M., Paschali, M., Simson, W., Fatemizadeh, E., and Navab, N. (2020, January 25\u201329). Ultrasound-guided robotic navigation with deep reinforcement learning. Proceedings of the 2020 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA.","DOI":"10.1109\/IROS45743.2020.9340913"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Choi, J., Park, K., Kim, M., and Seok, S. (2019, January 20\u201324). Deep reinforcement learning of navigation in a complex and crowded environment with a limited field of view. 
Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","DOI":"10.1109\/ICRA.2019.8793979"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"4423","DOI":"10.1109\/LRA.2018.2869644","article-title":"Reinforced imitation: Sample efficient deep reinforcement learning for mapless navigation by leveraging prior demonstrations","volume":"3","author":"Pfeiffer","year":"2018","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1109\/LRA.2019.2891991","article-title":"Deep reinforcement learning robot for search and rescue applications: Exploration in unknown cluttered environments","volume":"4","author":"Niroui","year":"2019","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Zuluaga, J.G.C., Leidig, J.P., Trefftz, C., and Wolffe, G. (2018, January 23\u201326). Deep reinforcement learning for autonomous search and rescue. Proceedings of the NAECON 2018-IEEE National Aerospace and Electronics Conference, Dayton, OH, USA.","DOI":"10.1109\/NAECON.2018.8556642"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"14413","DOI":"10.1109\/TVT.2020.3034800","article-title":"Voronoi-based multi-robot autonomous exploration in unknown environments via deep reinforcement learning","volume":"69","author":"Hu","year":"2020","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"601","DOI":"10.1007\/s10846-018-0898-1","article-title":"A fully-autonomous aerial robot for search and rescue applications in indoor environments using learning-based techniques","volume":"95","author":"Sampedro","year":"2019","journal-title":"J. Intell. Robot. 
Syst."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"600","DOI":"10.1109\/TIP.2003.819861","article-title":"Image quality assessment: From error visibility to structural similarity","volume":"13","author":"Wang","year":"2004","journal-title":"IEEE Trans. Image Process."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Alqaraawi, A., Schuessler, M., Wei\u00df, P., Costanza, E., and Berthouze, N. (2020, January 17\u201320). Evaluating saliency map explanations for convolutional neural networks: A user study. Proceedings of the 25th International Conference on Intelligent User Interfaces, Cagliari, Italy.","DOI":"10.1145\/3377325.3377519"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Yoo, S., Jeong, S., Kim, S., and Jang, Y. (2021). Saliency-Based Gaze Visualization for Eye Movement Analysis. Sensors, 21.","DOI":"10.3390\/s21155178"},{"key":"ref_43","unstructured":"Schulman, J., Moritz, P., Levine, S., Jordan, M., and Abbeel, P. (2015). High-dimensional continuous control using generalized advantage estimation. arXiv."},{"key":"ref_44","first-page":"111","article-title":"Performance analysis of various activation functions in generalized MLP architectures of neural networks","volume":"1","author":"Karlik","year":"2011","journal-title":"Int. J. Artif. Intell. Expert Syst."},{"key":"ref_45","unstructured":"Liang, E., Liaw, R., Nishihara, R., Moritz, P., Fox, R., Gonzalez, J., Goldberg, K., and Stoica, I. (2017). Ray rllib: A composable and scalable reinforcement learning library. arXiv."},{"key":"ref_46","unstructured":"Liang, E., Liaw, R., Nishihara, R., Moritz, P., Fox, R., Goldberg, K., Gonzalez, J., Jordan, M., and Stoica, I. (2018, January 10\u201315). RLlib: Abstractions for distributed reinforcement learning. Proceedings of the International Conference on Machine Learning, PMLR, Stockholm, Sweden."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Ketkar, N. (2017). Introduction to pytorch. 
Deep Learning with Python, Springer.","DOI":"10.1007\/978-1-4842-2766-4"},{"key":"ref_48","unstructured":"Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., and Zaremba, W. (2016). Openai gym. arXiv."},{"key":"ref_49","unstructured":"Moritz, P., Nishihara, R., Wang, S., Tumanov, A., Liaw, R., Liang, E., Elibol, M., Yang, Z., Paul, W., and Jordan, M.I. (2018, January 8\u201310). Ray: A distributed framework for emerging AI applications. Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), Carlsbad, CA, USA."},{"key":"ref_50","first-page":"5","article-title":"ROS: An open-source Robot Operating System","volume":"Volume 3","author":"Quigley","year":"2009","journal-title":"IEEE International Conference on Robotics and Automation (ICRA) Workshop on Open Source Software"},{"key":"ref_51","first-page":"6392697","article-title":"Global and local path planning study in a ROS-based research platform for autonomous vehicles","volume":"2018","author":"Hussein","year":"2018","journal-title":"J. Adv. Transp."},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1109\/100.580977","article-title":"The dynamic window approach to collision avoidance","volume":"4","author":"Fox","year":"1997","journal-title":"IEEE Robot. Autom. 
Mag."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/24\/8331\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:46:47Z","timestamp":1760168807000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/24\/8331"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,13]]},"references-count":52,"journal-issue":{"issue":"24","published-online":{"date-parts":[[2021,12]]}},"alternative-id":["s21248331"],"URL":"https:\/\/doi.org\/10.3390\/s21248331","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2021,12,13]]}}}