{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,6]],"date-time":"2026-04-06T03:57:35Z","timestamp":1775447855931,"version":"3.50.1"},"reference-count":137,"publisher":"MIT Press","issue":"7","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Neural Computation"],"published-print":{"date-parts":[[2019,7]]},"abstract":"<jats:p> Recurrent neural networks (RNNs) have been widely adopted in research areas concerned with sequential data, such as text, audio, and video. However, RNNs consisting of sigma cells or tanh cells are unable to learn the relevant information of input data when the input gap is large. By introducing gate functions into the cell structure, the long short-term memory (LSTM) could handle the problem of long-term dependencies well. Since its introduction, almost all the exciting results based on RNNs have been achieved by the LSTM. The LSTM has become the focus of deep learning. We review the LSTM cell and its variants to explore the learning capacity of the LSTM cell. Furthermore, the LSTM networks are divided into two broad categories: LSTM-dominated networks and integrated LSTM networks. In addition, their various applications are discussed. Finally, future research directions are presented for LSTM networks. <\/jats:p>","DOI":"10.1162\/neco_a_01199","type":"journal-article","created":{"date-parts":[[2019,5,22]],"date-time":"2019-05-22T00:21:10Z","timestamp":1558484470000},"page":"1235-1270","source":"Crossref","is-referenced-by-count":4884,"title":["A Review of Recurrent Neural Networks: LSTM Cells and Network Architectures"],"prefix":"10.1162","volume":"31","author":[{"given":"Yong","family":"Yu","sequence":"first","affiliation":[{"name":"Department of Automation, Xi'an Institute of High-Technology, Xi'an 710025, China, and Institute No. 25, Second Academy of China, Aerospace Science and Industry Corporation, Beijing 100854, China"}]},{"given":"Xiaosheng","family":"Si","sequence":"additional","affiliation":[{"name":"Department of Automation, Xi'an Institute of High-Technology, Xi'an 710025, China"}]},{"given":"Changhua","family":"Hu","sequence":"additional","affiliation":[{"name":"Department of Automation, Xi'an Institute of High-Technology, Xi'an 710025, China"}]},{"given":"Jianxun","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Automation, Xi'an Institute of High-Technology, Xi'an 710025, China"}]}],"member":"281","reference":[{"key":"B1","author":"Achanta R.","year":"2010","journal-title":"SLIC superpixels"},{"key":"B2","doi-asserted-by":"publisher","DOI":"10.1109\/ITSC.2017.8317913"},{"key":"B3","doi-asserted-by":"publisher","DOI":"10.1561\/2200000006"},{"key":"B4","doi-asserted-by":"publisher","DOI":"10.1109\/72.279181"},{"key":"B6","author":"Britz D.","year":"2017","journal-title":"Massive exploration of neural machine translation architectures"},{"key":"B7","author":"Brown B.","year":"2004","journal-title":"Circuits, Signals, and Systems: IASTED International Conference Proceedings"},{"key":"B8","doi-asserted-by":"publisher","DOI":"10.1155\/2017\/3296874"},{"key":"B9","doi-asserted-by":"publisher","DOI":"10.1109\/ICNN.1996.549199"},{"key":"B10","author":"Chen X.","year":"2015","journal-title":"Microsoft COCO captions: Data collection and evaluation server"},{"key":"B11","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.254"},{"key":"B12","volume-title":"Proceedings of the IEEE 24th International Conference on Network Protocols","author":"Cheng M.","year":"2016"},{"key":"B14","author":"Chung J.","year":"2014","journal-title":"Empirical evaluation of gated recurrent neural networks on sequence modeling"},{"key":"B16","volume-title":"APSIPA transactions on signal and information processing","author":"Deng L.","year":"2013"},{"key":"B17","doi-asserted-by":"publisher","DOI":"10.1109\/MWSCAS.2017.8053243"},{"key":"B18","doi-asserted-by":"publisher","DOI":"10.1109\/VTCFall.2017.8288312"},{"key":"B19","doi-asserted-by":"publisher","DOI":"10.1207\/s15516709cog1402_1"},{"key":"B20","volume-title":"Proceedings of the 20th International Joint Conference on Artificial Intelligence","author":"Fern\u00e1ndez S.","year":"2007"},{"key":"B21","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-74695-9_23"},{"key":"B22","first-page":"104","volume-title":"Proceedings of the International Workshop on Graphics Recognition","author":"Francesconi E.","year":"1997"},{"key":"B23","doi-asserted-by":"publisher","DOI":"10.1109\/72.712151"},{"key":"B24","doi-asserted-by":"publisher","DOI":"10.1007\/BF00344251"},{"key":"B25","doi-asserted-by":"publisher","DOI":"10.1109\/CEC.2005.1555002"},{"key":"B26","author":"Gers F.","year":"2001","journal-title":"Long short-term memory in recurrent neural networks"},{"key":"B27","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN.2000.861302"},{"key":"B28","doi-asserted-by":"publisher","DOI":"10.1109\/72.963769"},{"key":"B29","doi-asserted-by":"publisher","DOI":"10.1162\/089976600300015015"},{"key":"B30","first-page":"115","volume":"3","author":"Gers F. A.","year":"2002","journal-title":"Journal of Machine Learning Research"},{"key":"B31","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-11179-7_28"},{"key":"B32","first-page":"347","volume":"1","author":"Goller C.","year":"1996","journal-title":"Neural Networks"},{"key":"B33","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-24797-2"},{"key":"B34","author":"Graves A.","year":"2014","journal-title":"Generating sequences with recurrent neural networks"},{"key":"B35","unstructured":"Graves, A., Fern\u00e1ndez, S. & Schmidhuber, J. (2007). Multi-dimensional recurrent neural networks. In Proceedings of the International Conference on Artificial Neural Networks. Berlin: Springer."},{"key":"B36","author":"Graves A.","year":"2014","journal-title":"Neural Turing machines"},{"key":"B37","doi-asserted-by":"publisher","DOI":"10.1038\/nature20101"},{"key":"B38","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2005.06.042"},{"key":"B39","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2016.2582924"},{"key":"B40","first-page":"1","volume":"2","author":"Guo Y.","year":"2017","journal-title":"International Journal of Multimedia Information Retrieval"},{"key":"B41","doi-asserted-by":"publisher","DOI":"10.1145\/3123266.3123394"},{"key":"B42","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2016.7472718"},{"key":"B43","doi-asserted-by":"publisher","DOI":"10.1109\/MWSCAS.2017.8053242"},{"key":"B44","author":"Hochreiter S.","year":"1991","journal-title":"Untersuchungen zu dynamischen neuronalen Netzen"},{"key":"B45","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"issue":"1","key":"B46","first-page":"395","volume":"50","author":"Hsu W. N.","year":"2016","journal-title":"Cell"},{"key":"B47","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2016-491"},{"key":"B48","doi-asserted-by":"publisher","DOI":"10.1109\/TSMC.1971.4308320"},{"key":"B49","volume-title":"Cybernetic predicting devices","author":"Ivakhnenko A. G.","year":"1965"},{"key":"B50","author":"Jing L.","year":"2017","journal-title":"Gated orthogonal recurrent units: On learning to forget"},{"key":"B51","first-page":"531","volume-title":"Proceedings of the Annual Conference of the Cognitive Science Society","author":"Jordan M.","year":"1986"},{"key":"B52","first-page":"2342","volume-title":"Proceedings of the International Conference on International Conference on Machine Learning","author":"Jozefowicz R.","year":"2015"},{"key":"B53","author":"Kalchbrenner N.","year":"2015","journal-title":"Grid long short-term memory"},{"key":"B54","author":"Karpathy A.","year":"2015","journal-title":"Visualizing and understanding recurrent networks"},{"key":"B55","doi-asserted-by":"publisher","DOI":"10.1016\/j.ymssp.2017.11.024"},{"key":"B56","author":"Kim J.","year":"2017","journal-title":"Residual LSTM: Design of a deep recurrent architecture for distant speech recognition"},{"key":"B57","author":"Koutnik J.","year":"2014","journal-title":"A clockwork RNN"},{"key":"B58","author":"Krause B.","year":"2016","journal-title":"Multiplicative LSTM for sequence modelling"},{"key":"B59","doi-asserted-by":"publisher","DOI":"10.1145\/2820783.2820801"},{"key":"B60","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1989.1.4.541"},{"key":"B61","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"B62","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2017-1164"},{"key":"B63","author":"Li J.","year":"2015","journal-title":"When are tree structures necessary for deep learning of representations"},{"key":"B64","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2016.7472617"},{"key":"B65","first-page":"187","volume-title":"Proceedings of the Conference on Automatic Speech Recognition and Understanding","author":"Li J.","year":"2016"},{"key":"B66","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00572"},{"key":"B67","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.234"},{"key":"B68","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2015.2408360"},{"key":"B69","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46448-0_8"},{"key":"B70","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.347"},{"key":"B71","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"B72","author":"Lipton Z. C.","year":"2015","journal-title":"A critical review of recurrent neural networks for sequence learning"},{"key":"B73","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1098"},{"key":"B74","author":"Liu P.","year":"2016","journal-title":"Modelling interaction of sentence pair with coupled-LSTMs"},{"key":"B75","doi-asserted-by":"publisher","DOI":"10.3390\/rs9121330"},{"key":"B76","author":"Mallinar N.","year":"2018","journal-title":"Deep canonically correlated LSTMs"},{"key":"B77","author":"McCarter G.","year":"2007","journal-title":"Air freight image segmentation database"},{"key":"B78","author":"Miwa M.","year":"2016","journal-title":"End-to-end relation extraction using LSTMs on sequences and tree structures"},{"key":"B79","author":"Moniz J. R. A.","year":"2018","journal-title":"Nested LSTMs"},{"key":"B80","first-page":"863","volume-title":"Advances in neural information processing systems","volume":"5","author":"Mozer M. C.","year":"1993"},{"key":"B81","first-page":"807","volume-title":"Proceedings of the International Conference on International Conference on Machine Learning","author":"Nair V.","year":"2010"},{"key":"B82","first-page":"3882","volume-title":"Advances in neural information processing systems","volume":"16","author":"Neil D.","year":"2016"},{"key":"B83","doi-asserted-by":"publisher","DOI":"10.1109\/DSC.2016.72"},{"key":"B84","first-page":"1","volume-title":"Proceedings of the International Conference on Information, Communications and Signal Processing","author":"Nina O.","year":"2016"},{"key":"B85","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.208"},{"key":"B86","author":"Oord A. V. D.","year":"2016","journal-title":"Pixel recurrent neural networks"},{"key":"B87","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2016.2520371"},{"key":"B88","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1989.1.2.263"},{"key":"B89","first-page":"3439","volume-title":"Proceedings of the International Joint Conference on Artificial Intelligence","author":"Peng Z.","year":"2016"},{"key":"B90","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-016-0965-7"},{"key":"B91","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN.2017.7965940"},{"key":"B92","doi-asserted-by":"publisher","DOI":"10.1109\/ASRU.2017.8268932"},{"key":"B93","first-page":"1","volume-title":"Proceedings of the International Conference on Electrical Engineering and Information Communication Technology","author":"Rahman L.","year":"2017"},{"key":"B94","author":"Ranzato M. A.","year":"2014","journal-title":"Video (language) modeling: A baseline for generative models of natural videos"},{"key":"B95","doi-asserted-by":"publisher","DOI":"10.1162\/neco_a_00990"},{"key":"B96","volume-title":"The utility driven dynamic error propagation network","author":"Robinson A. J.","year":"1987"},{"key":"B97","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2015.7178838"},{"key":"B98","author":"Sak H. I.","year":"2014","journal-title":"Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition"},{"key":"B99","first-page":"327","volume-title":"Proceedings of the IEEE International Conference on Intelligent Transportation Systems","author":"Saleh K.","year":"2018"},{"key":"B100","author":"Schlag I.","year":"2017","journal-title":"NIPS Metalearning Workshop"},{"key":"B101","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4471-2063-6_110"},{"key":"B102","author":"Schmidhuber J.","year":"2012","journal-title":"Self-delimiting neural networks"},{"key":"B103","doi-asserted-by":"publisher","DOI":"10.1162\/neco.2007.19.3.757"},{"key":"B104","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-40602-7_18"},{"key":"B105","doi-asserted-by":"publisher","DOI":"10.1109\/78.650093"},{"key":"B106","author":"Shabanian S.","year":"2017","journal-title":"Variational Bi-LSTMs"},{"key":"B107","doi-asserted-by":"publisher","DOI":"10.1109\/ICCCNT.2017.8203938"},{"key":"B108","first-page":"802","volume-title":"Advances in neural information processing systems","author":"Shi X.","year":"2015"},{"key":"B109","doi-asserted-by":"publisher","DOI":"10.1007\/s41095-016-0059-z"},{"key":"B110","doi-asserted-by":"publisher","DOI":"10.1109\/72.572108"},{"key":"B111","author":"Srivastava R. K.","year":"2015","journal-title":"Highway networks"},{"key":"B112","doi-asserted-by":"publisher","DOI":"10.1007\/s11063-012-9259-4"},{"key":"B113","first-page":"577","volume-title":"Proceedings of the International Joint Conference on Neural Networks","author":"Sun G.","year":"1990"},{"key":"B114","first-page":"3104","volume-title":"Advances in neural information processing systems","author":"Sutskever I.","year":"2014"},{"issue":"1","key":"B115","first-page":"36","volume":"5","author":"Tai K. S.","year":"2015","journal-title":"Computer Science"},{"key":"B116","author":"Teng Z.","year":"2016","journal-title":"Bidirectional tree-structured LSTM with head lexicalization"},{"key":"B117","doi-asserted-by":"publisher","DOI":"10.1109\/tcbb.2007.1015"},{"key":"B118","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.460"},{"key":"B119","doi-asserted-by":"publisher","DOI":"10.1109\/DSAA.2015.7344820"},{"key":"B120","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298788"},{"key":"B121","doi-asserted-by":"publisher","DOI":"10.1145\/3184558.3191571"},{"key":"B122","author":"Weiss G.","year":"2018","journal-title":"On the practical computational power of finite precision RNNs for language recognition"},{"key":"B123","first-page":"121","volume-title":"Proceedings of the Fourth International Conference on Computer Vision","author":"Weng J. J.","year":"1993"},{"key":"B124","doi-asserted-by":"publisher","DOI":"10.1016\/0893-6080(88)90007-X"},{"key":"B125","volume-title":"Complexity of exact gradient computation algorithms for recurrent neural networks","author":"Williams R. J.","year":"1989"},{"key":"B126","author":"Wu H.","year":"2016","journal-title":"An empirical exploration of skip connections for sequential tagging"},{"key":"B127","doi-asserted-by":"publisher","DOI":"10.12677\/CSA.2018.81008"},{"key":"B128","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2012.6248101"},{"key":"B129","first-page":"1","volume":"99","author":"Yang Y.","year":"2017","journal-title":"IEEE Geoscience and Remote Sensing Letters"},{"key":"B130","author":"Yao K.","year":"2015","journal-title":"Depth-gated LSTM"},{"key":"B131","doi-asserted-by":"publisher","DOI":"10.1109\/DSC.2018.00019"},{"key":"B132","author":"Zaremba W.","year":"2014","journal-title":"Learning to execute"},{"key":"B133","first-page":"1655","volume-title":"Proceedings of the IEEE Conference on Acoustics, Speech, and Signal Processing","author":"Zhang J.","year":"2016"},{"key":"B134","author":"Zhang X.","year":"2015","journal-title":"Tree recurrent neural networks with application to language modeling"},{"key":"B135","first-page":"5755","volume-title":"Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing","author":"Zhang Y.","year":"2015"},{"key":"B136","doi-asserted-by":"publisher","DOI":"10.1109\/ICSensT.2016.7796266"},{"issue":"4","key":"B137","first-page":"39","volume":"1","author":"Zhou C.","year":"2016","journal-title":"Computer Science"},{"key":"B138","doi-asserted-by":"publisher","DOI":"10.1007\/s11633-016-1006-2"},{"key":"B139","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2017.2684186"},{"key":"B140","author":"Zhu X.","year":"2015","journal-title":"Long short-term memory over tree structures"}],"container-title":["Neural Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mitpressjournals.org\/doi\/pdf\/10.1162\/neco_a_01199","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,12]],"date-time":"2021-03-12T21:43:12Z","timestamp":1615585392000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/neco\/article\/31\/7\/1235-1270\/8500"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,7]]},"references-count":137,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2019,7]]}},"alternative-id":["10.1162\/neco_a_01199"],"URL":"https:\/\/doi.org\/10.1162\/neco_a_01199","relation":{},"ISSN":["0899-7667","1530-888X"],"issn-type":[{"value":"0899-7667","type":"print"},{"value":"1530-888X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,7]]}}}