{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T21:35:35Z","timestamp":1777671335611,"version":"3.51.4"},"reference-count":22,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2021,10,25]],"date-time":"2021-10-25T00:00:00Z","timestamp":1635120000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>A recurrent neural network (RNN) combines variable-length input data with a hidden state that depends on previous time steps to generate output data. RNNs have been widely used in time-series data analysis, and various RNN algorithms have been proposed, such as the standard RNN, long short-term memory (LSTM), and gated recurrent units (GRUs). In particular, it has been experimentally proven that LSTM and GRU have higher validation accuracy and prediction accuracy than the standard RNN. The learning ability is a measure of the effectiveness of gradient of error information that would be backpropagated. This study provided a theoretical and experimental basis for the result that LSTM and GRU have more efficient gradient descent than the standard RNN by analyzing and experimenting the gradient vanishing of the standard RNN, LSTM, and GRU. As a result, LSTM and GRU are robust to the degradation of gradient descent even when LSTM and GRU learn long-range input data, which means that the learning ability of LSTM and GRU is greater than standard RNN when learning long-range input data. Therefore, LSTM and GRU have higher validation accuracy and prediction accuracy than the standard RNN. In addition, it was verified whether the experimental results of river-level prediction models, solar power generation prediction models, and speech signal models using the standard RNN, LSTM, and GRUs are consistent with the analysis results of gradient vanishing.<\/jats:p>","DOI":"10.3390\/info12110442","type":"journal-article","created":{"date-parts":[[2021,10,25]],"date-time":"2021-10-25T21:40:21Z","timestamp":1635198021000},"page":"442","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":173,"title":["Analysis of Gradient Vanishing of RNNs and Performance Comparison"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1720-0012","authenticated-orcid":false,"given":"Seol-Hyun","family":"Noh","sequence":"first","affiliation":[{"name":"Department of Statistical Data Science, ICT Convergence Engineering, Anyang University, Anyang 14028, Korea"}]}],"member":"1968","published-online":{"date-parts":[[2021,10,25]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1038\/323533a0","article-title":"Learning representations by back-propagating errors","volume":"323","author":"Rumelhart","year":"1986","journal-title":"Nature"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1207\/s15516709cog1402_1","article-title":"Finding structure in time","volume":"14","author":"Elman","year":"1990","journal-title":"Cogn. 
Sci."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"339","DOI":"10.1016\/0893-6080(88)90007-X","article-title":"Generalization of backpropagation with application to a recurrent gas market model","volume":"1","author":"Werbos","year":"1988","journal-title":"Neural Netw."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Cho, K., Merrienboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., and Bengio, Y. (2014, January 25\u201329). Learning phrase representations using RNN encoder-decoder for statistical machine translation. Proceedings of the Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar.","DOI":"10.3115\/v1\/D14-1179"},{"key":"ref_6","unstructured":"Chung, J., Gulcehre, C., Cho, K.H., and Bengio, Y. (2014). Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"607","DOI":"10.5626\/JOK.2017.44.6.607","article-title":"Water level forecasting based on deep learning: A use case of Trinity River-Texas-The United States","volume":"44","author":"Tran","year":"2020","journal-title":"J. KIISE"},{"key":"ref_8","unstructured":"Cho, W., and Kang, D. (2017, January 20\u201322). Estimation method of river water level using LSTM. Proceedings of the Korea Conference on Software Engineering, Busan, Korea."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"506","DOI":"10.5626\/JOK.2019.46.6.506","article-title":"Design of photovoltaic power generation prediction model with recurrent neural network","volume":"46","author":"Kim","year":"2019","journal-title":"J. KIISE"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"435","DOI":"10.5626\/KTCP.2020.26.10.435","article-title":"LSTM-based 24-h solar power forecasting model using weather forecast data","volume":"26","author":"Son","year":"2020","journal-title":"KIISE Trans. Comput. Pract."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"292","DOI":"10.5626\/JOK.2020.47.3.292","article-title":"A deep learning LSTM framework for urban traffic flow and fine dust prediction","volume":"47","author":"Yi","year":"2020","journal-title":"J. KIISE"},{"key":"ref_12","unstructured":"Jo, S., Jeong, M., Lee, J., Oh, I., and Han, Y. (2020, January 2\u20134). Analysis of correlation of wind direction\/speed and particulate matter (PM10) and prediction of particulate matter using LSTM. Proceedings of the Korea Computer Congress, Busan, Korea."},{"key":"ref_13","unstructured":"Munir, M.S., Abedin, S.F., Alam, G.R., Kim, D.H., and Hong, C.S. (2017, January 20\u201322). RNN based energy demand prediction for smart-home in smart-grid framework. Proceedings of the Korea Conference on Software Engineering, Busan, Korea."},{"key":"ref_14","unstructured":"Hidasi, B., Karatzoglou, A., Baltrumas, L., and Tikk, D. (2016, January 2\u20134). Session-based recommendations with recurrent neural networks. Proceedings of the International Conference on Learning Representations (ICLR), San Juan, Puerto Rico."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Wu, S., Ren, W., Yu, C., Chen, G., Zhang, D., and Zhu, J. (2016, January 16\u201320). Personal recommendation using deep recurrent neural networks in NetEase. 