{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T13:48:10Z","timestamp":1775051290936,"version":"3.50.1"},"reference-count":47,"publisher":"MDPI AG","issue":"14","license":[{"start":{"date-parts":[[2024,7,13]],"date-time":"2024-07-13T00:00:00Z","timestamp":1720828800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"MSIT (Ministry of Science and ICT)","award":["IITP-2023-RS-2023-00266615"],"award-info":[{"award-number":["IITP-2023-RS-2023-00266615"]}]},{"name":"MSIT (Ministry of Science and ICT)","award":["5120200313836"],"award-info":[{"award-number":["5120200313836"]}]},{"name":"MSIT (Ministry of Science and ICT)","award":["RS-2022-00155911"],"award-info":[{"award-number":["RS-2022-00155911"]}]},{"name":"MSIT (Ministry of Science and ICT)","award":["20015440"],"award-info":[{"award-number":["20015440"]}]},{"name":"MSIT (Ministry of Science and ICT)","award":["20025094"],"award-info":[{"award-number":["20025094"]}]},{"name":"Ministry of Education of Korea","award":["IITP-2023-RS-2023-00266615"],"award-info":[{"award-number":["IITP-2023-RS-2023-00266615"]}]},{"name":"Ministry of Education of Korea","award":["5120200313836"],"award-info":[{"award-number":["5120200313836"]}]},{"name":"Ministry of Education of Korea","award":["RS-2022-00155911"],"award-info":[{"award-number":["RS-2022-00155911"]}]},{"name":"Ministry of Education of Korea","award":["20015440"],"award-info":[{"award-number":["20015440"]}]},{"name":"Ministry of Education of Korea","award":["20025094"],"award-info":[{"award-number":["20025094"]}]},{"name":"Korea government (MSIT)","award":["IITP-2023-RS-2023-00266615"],"award-info":[{"award-number":["IITP-2023-RS-2023-00266615"]}]},{"name":"Korea government (MSIT)","award":["5120200313836"],"award-info":[{"award-number":["5120200313836"]}]},{"name":"Korea government 
(MSIT)","award":["RS-2022-00155911"],"award-info":[{"award-number":["RS-2022-00155911"]}]},{"name":"Korea government (MSIT)","award":["20015440"],"award-info":[{"award-number":["20015440"]}]},{"name":"Korea government (MSIT)","award":["20025094"],"award-info":[{"award-number":["20025094"]}]},{"name":"Ministry of Trade, Industry and Energy (MOTIE)","award":["IITP-2023-RS-2023-00266615"],"award-info":[{"award-number":["IITP-2023-RS-2023-00266615"]}]},{"name":"Ministry of Trade, Industry and Energy (MOTIE)","award":["5120200313836"],"award-info":[{"award-number":["5120200313836"]}]},{"name":"Ministry of Trade, Industry and Energy (MOTIE)","award":["RS-2022-00155911"],"award-info":[{"award-number":["RS-2022-00155911"]}]},{"name":"Ministry of Trade, Industry and Energy (MOTIE)","award":["20015440"],"award-info":[{"award-number":["20015440"]}]},{"name":"Ministry of Trade, Industry and Energy (MOTIE)","award":["20025094"],"award-info":[{"award-number":["20025094"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Robot navigation has moved beyond avoiding static obstacles toward socially aware strategies for coexisting with humans, and socially aware navigation in dynamic, human-centric environments has therefore gained prominence in robotics. Reinforcement learning is one approach that has driven this advancement. However, defining appropriate reward functions, particularly in congested environments, poses a significant challenge. These reward functions, which are crucial for guiding robot actions, cannot be set automatically and therefore require intricate manual design. The many manually designed reward functions suffer from issues such as hyperparameter redundancy, imbalance, and inadequate representation of the unique characteristics of individual objects. 
To address these challenges, we introduce a transformable Gaussian reward function (TGRF). The TGRF has two main features. First, it reduces the tuning burden by using a small number of hyperparameters that act independently. Second, its transformability allows it to realize a variety of reward functions. Consequently, it achieves high performance and accelerated learning within the deep reinforcement learning (DRL) framework. We validated the performance of the TGRF through simulations and experiments.<\/jats:p>","DOI":"10.3390\/s24144540","type":"journal-article","created":{"date-parts":[[2024,7,15]],"date-time":"2024-07-15T14:15:49Z","timestamp":1721052949000},"page":"4540","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Transformable Gaussian Reward Function for Socially Aware Navigation Using Deep Reinforcement Learning"],"prefix":"10.3390","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0009-0005-2349-0801","authenticated-orcid":false,"given":"Jinyeob","family":"Kim","sequence":"first","affiliation":[{"name":"Department of Artificial Intelligence, College of Software, Kyung Hee University, Yongin 17104, Republic of Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6422-5781","authenticated-orcid":false,"given":"Sumin","family":"Kang","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering (AgeTech-Service Convergence Major), College of Electronics & Information, Kyung Hee University, Yongin 17104, Republic of Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3393-0269","authenticated-orcid":false,"given":"Sungwoo","family":"Yang","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering (AgeTech-Service Convergence Major), College of Electronics & Information, Kyung Hee University, Yongin 17104, Republic of 
Korea"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-1856-2495","authenticated-orcid":false,"given":"Beomjoon","family":"Kim","sequence":"additional","affiliation":[{"name":"Department of Artificial Intelligence, College of Software, Kyung Hee University, Yongin 17104, Republic of Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0832-8740","authenticated-orcid":false,"given":"Jargalbaatar","family":"Yura","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering (AgeTech-Service Convergence Major), College of Electronics & Information, Kyung Hee University, Yongin 17104, Republic of Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0477-3847","authenticated-orcid":false,"given":"Donghan","family":"Kim","sequence":"additional","affiliation":[{"name":"Department of Electronic Engineering (AgeTech-Service Convergence Major), College of Electronics & Information, Kyung Hee University, Yongin 17104, Republic of Korea"}]}],"member":"1968","published-online":{"date-parts":[[2024,7,13]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1016\/S0921-8890(97)00051-1","article-title":"Mobile robot obstacle avoidance via depth from focus","volume":"22","author":"Nourbakhsh","year":"1997","journal-title":"Robot. Auton. Syst."},{"key":"ref_2","unstructured":"Ulrich, I., and Borenstein, J. (1998, January 20). VFH+: Reliable obstacle avoidance for fast mobile robots. Proceedings of the 1998 IEEE International Conference on Robotics and Automation (Cat. No. 98CH36146), Leuven, Belgium."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1142\/S0219843611002381","article-title":"Stereovision-based fuzzy obstacle avoidance method","volume":"8","author":"Nalpantidis","year":"2011","journal-title":"Int. J. 
Humanoid Robot."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"114027","DOI":"10.1088\/0957-0233\/22\/11\/114027","article-title":"Non-probabilistic cellular automata-enhanced stereo vision simultaneous localization and mapping","volume":"22","author":"Nalpantidis","year":"2011","journal-title":"Meas. Sci. Technol."},{"key":"ref_5","unstructured":"Pritsker, A.A.B. (1995). Introduction to Simulation and SLAM II, John Wiley & Sons, Inc."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1109\/MITS.2010.939925","article-title":"A tutorial on graph-based SLAM","volume":"2","author":"Grisetti","year":"2010","journal-title":"IEEE Intell. Transp. Syst. Mag."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"162335","DOI":"10.1109\/ACCESS.2020.2991441","article-title":"DDL-SLAM: A robust RGB-D SLAM in dynamic environments combined with deep learning","volume":"8","author":"Ai","year":"2020","journal-title":"IEEE Access"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"95301","DOI":"10.1109\/ACCESS.2020.2994348","article-title":"SDF-SLAM: Semantic depth filter SLAM for dynamic environments","volume":"8","author":"Cui","year":"2020","journal-title":"IEEE Access"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1179","DOI":"10.1109\/21.44033","article-title":"Real-time obstacle avoidance for fast mobile robots","volume":"19","author":"Borenstein","year":"1989","journal-title":"IEEE Trans. Syst. Man, Cybern."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Van Den Berg, J., Guy, S.J., Lin, M., and Manocha, D. (2011). Reciprocal n-body collision avoidance. Robotics Research: The 14th International Symposium ISRR, Springer.","DOI":"10.1007\/978-3-642-19457-3_1"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"4282","DOI":"10.1103\/PhysRevE.51.4282","article-title":"Social force model for pedestrian dynamics","volume":"51","author":"Helbing","year":"1995","journal-title":"Phys. Rev. 
E"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Patel, U., Kumar, N.K.S., Sathyamoorthy, A.J., and Manocha, D. (June, January 30). Dwa-rl: Dynamically feasible deep reinforcement learning policy for robot navigation among mobile obstacles. Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Virtual.","DOI":"10.1109\/ICRA48506.2021.9561462"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Liu, S., Chang, P., Huang, Z., Chakraborty, N., Hong, K., Liang, W., and Driggs-Campbell, K. (June, January 29). Intention aware robot crowd navigation with attention-based interaction graph. Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK.","DOI":"10.1109\/ICRA48891.2023.10160660"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Chen, C., Liu, Y., Kreiss, S., and Alahi, A. (2019, January 20\u201324). Crowd-robot interaction: Crowd-aware robot navigation with attention-based deep reinforcement learning. Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","DOI":"10.1109\/ICRA.2019.8794134"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Van Den Berg, J., Lin, M., and Manocha, D. (2008, January 19\u201323). Reciprocal velocity obstacles for real-time multi-agent navigation. Proceedings of the 2008 IEEE International Conference on Robotics and Automation, Pasadena, CA, USA.","DOI":"10.1109\/ROBOT.2008.4543489"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Oh, J., Heo, J., Lee, J., Lee, G., Kang, M., Park, J., and Oh, S. (June, January 29). Scan: Socially-aware navigation using monte carlo tree search. 
Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK.","DOI":"10.1109\/ICRA48891.2023.10160270"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Liu, S., Chang, P., Liang, W., Chakraborty, N., and Driggs-Campbell, K. (June, January 30). Decentralized structural-rnn for robot crowd navigation with deep reinforcement learning. Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Virtual.","DOI":"10.1109\/ICRA48506.2021.9561595"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1007\/s13218-010-0034-2","article-title":"Lifelong map learning for graph-based slam in static environments","volume":"24","author":"Kretzschmar","year":"2010","journal-title":"KI-K\u00fcnstliche Intell."},{"key":"ref_19","unstructured":"Brown, N. (2001). Edward T. Hall: Proxemic Theory, 1966, Center for Spatially Integrated Social Science, University of California, Santa Barbara. Available online: http:\/\/www.csiss.org\/classics\/content\/13."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1007\/s12369-014-0251-1","article-title":"From proxemics theory to socially-aware navigation: A survey","volume":"7","author":"Spalanzani","year":"2015","journal-title":"Int. J. Soc. Robot."},{"key":"ref_21","first-page":"679","article-title":"A Markovian decision process","volume":"6","author":"Bellman","year":"1957","journal-title":"J. Math. Mech."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1093\/biomet\/57.1.97","article-title":"Monte Carlo sampling methods using Markov chains and their applications","volume":"57","author":"Hastings","year":"1970","journal-title":"Biometrika"},{"key":"ref_23","unstructured":"Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing atari with deep reinforcement learning. 
arXiv."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1007\/BF00115009","article-title":"Learning to predict by the methods of temporal differences","volume":"3","author":"Sutton","year":"1988","journal-title":"Mach. Learn."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Jeong, H., Hassani, H., Morari, M., Lee, D.D., and Pappas, G.J. (June, January 30). Deep reinforcement learning for active target tracking. Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Virtual.","DOI":"10.1109\/ICRA48506.2021.9561258"},{"key":"ref_26","unstructured":"Gleave, A., Dennis, M., Legg, S., Russell, S., and Leike, J. (2020). Quantifying differences in reward functions. arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Mataric, M.J. (1994, January 10\u201313). Reward functions for accelerated learning. Proceedings of the Machine Learning Proceedings 1994, New Brunswick, NJ, USA.","DOI":"10.1016\/B978-1-55860-335-6.50030-1"},{"key":"ref_28","unstructured":"Laud, A.D. (2004). Theory and Application of Reward Shaping in Reinforcement Learning. [Ph.D. Thesis, University of Illinois at Urbana-Champaign]."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1149","DOI":"10.1007\/s40747-023-01216-y","article-title":"Dynamic warning zone and a short-distance goal for autonomous robot navigation using deep reinforcement learning","volume":"10","author":"Montero","year":"2024","journal-title":"Complex Intell. Syst."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"5223","DOI":"10.1109\/LRA.2021.3071954","article-title":"Socially compliant robot navigation in crowded environment by human behavior resemblance using deep reinforcement learning","volume":"6","author":"Samsani","year":"2021","journal-title":"IEEE Robot. Autom. 
Lett."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"2147","DOI":"10.1007\/s40747-022-00906-3","article-title":"Memory-based crowd-aware robot navigation using deep reinforcement learning","volume":"9","author":"Samsani","year":"2023","journal-title":"Complex Intell. Syst."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"663","DOI":"10.1007\/s11370-021-00387-2","article-title":"Reinforcement learning-based dynamic obstacle avoidance and integration of path planning","volume":"14","author":"Choi","year":"2021","journal-title":"Intell. Serv. Robot."},{"key":"ref_33","unstructured":"Liu, S., Chang, P., Huang, Z., Chakraborty, N., Liang, W., Geng, J., and Driggs-Campbell, K. (2022). Socially aware robot crowd navigation with interaction graphs and human trajectory prediction. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"P\u00e9rez-D\u2019Arpino, C., Liu, C., Goebel, P., Mart\u00edn-Mart\u00edn, R., and Savarese, S. (June, January 30). Robot navigation in constrained pedestrian environments using reinforcement learning. Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Virtual.","DOI":"10.1109\/ICRA48506.2021.9560893"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Scholz, J., Jindal, N., Levihn, M., Isbell, C.L., and Christensen, H.I. (2016, January 9\u201314). Navigation among movable obstacles with learned dynamic constraints. Proceedings of the 2016 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Republic of Korea.","DOI":"10.1109\/IROS.2016.7759546"},{"key":"ref_36","unstructured":"Cassandra, A.R. (1998, January 22\u201324). A survey of POMDP applications. 
Proceedings of the Working Notes of AAAI 1998 Fall Symposium on Planning with Partially Observable Markov Decision Processes, Orlando, FL, USA."},{"key":"ref_37","first-page":"15931","article-title":"Learning to utilize shaping rewards: A new approach of reward shaping","volume":"33","author":"Hu","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1613\/jair.1.12440","article-title":"Reward machines: Exploiting reward function structure in reinforcement learning","volume":"73","author":"Icarte","year":"2022","journal-title":"J. Artif. Intell. Res."},{"key":"ref_39","unstructured":"Yuan, M., Li, B., Jin, X., and Zeng, W. (2023, January 23\u201329). Automatic intrinsic reward shaping for exploration in deep reinforcement learning. Proceedings of the International Conference on Machine Learning, Honolulu, HI, USA."},{"key":"ref_40","unstructured":"Zhang, S., Wan, Y., Sutton, R.S., and Whiteson, S. (2021, January 18\u201324). Average-reward off-policy policy evaluation with function approximation. Proceedings of the International Conference on Machine Learning, Virtual."},{"key":"ref_41","unstructured":"Rucker, M.A., Watson, L.T., Gerber, M.S., and Barnes, L.E. (2020). Reward shaping for human learning via inverse reinforcement learning. arXiv."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Goyal, P., Niekum, S., and Mooney, R.J. (2019). Using natural language for reward shaping in reinforcement learning. arXiv.","DOI":"10.24963\/ijcai.2019\/331"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Trautman, P., and Krause, A. (2010, January 18\u201322). Unfreezing the robot: Navigation in dense, interacting crowds. 
Proceedings of the 2010 IEEE\/RSJ International Conference on Intelligent Robots and Systems, Taipei, Taiwan.","DOI":"10.1109\/IROS.2010.5654369"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"1198","DOI":"10.1109\/LRA.2021.3138547","article-title":"Learning sparse interaction graphs of partially detected pedestrians for trajectory prediction","volume":"7","author":"Huang","year":"2021","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1016\/j.neucom.2021.03.091","article-title":"A review on the attention mechanism of deep learning","volume":"452","author":"Niu","year":"2021","journal-title":"Neurocomputing"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Fu, R., Zhang, Z., and Li, L. (2016, January 11\u201313). Using LSTM and GRU neural network methods for traffic flow prediction. Proceedings of the 2016 31st Youth Academic Annual Conference of Chinese Association of Automation (YAC), Wuhan, China.","DOI":"10.1109\/YAC.2016.7804912"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"152","DOI":"10.1214\/aoms\/1177704250","article-title":"Statistical analysis based on a certain multivariate complex Gaussian distribution (an introduction)","volume":"34","author":"Goodman","year":"1963","journal-title":"Ann. Math. 
Stat."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/14\/4540\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T15:16:17Z","timestamp":1760109377000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/14\/4540"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,13]]},"references-count":47,"journal-issue":{"issue":"14","published-online":{"date-parts":[[2024,7]]}},"alternative-id":["s24144540"],"URL":"https:\/\/doi.org\/10.3390\/s24144540","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,13]]}}}