{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,28]],"date-time":"2026-01-28T21:49:52Z","timestamp":1769636992292,"version":"3.49.0"},"reference-count":33,"publisher":"MIT Press - Journals","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Neural Computation"],"published-print":{"date-parts":[[2020,1]]},"abstract":"<jats:p> Catastrophic forgetting and capacity saturation are the central challenges of any parametric lifelong learning system. In this work, we study these challenges in the context of sequential supervised learning with an emphasis on recurrent neural networks. To evaluate the models in the lifelong learning setting, we propose a curriculum-based, simple, and intuitive benchmark where the models are trained on tasks with increasing levels of difficulty. To measure the impact of catastrophic forgetting, the model is tested on all the previous tasks as it completes any task. As a step toward developing true lifelong learning systems, we unify gradient episodic memory (a catastrophic forgetting alleviation approach) and Net2Net (a capacity expansion approach). Both models are proposed in the context of feedforward networks, and we evaluate the feasibility of using them for recurrent networks. Evaluation on the proposed benchmark shows that the unified model is more suitable than the constituent models for lifelong learning setting. <\/jats:p>","DOI":"10.1162\/neco_a_01246","type":"journal-article","created":{"date-parts":[[2019,11,9]],"date-time":"2019-11-09T00:19:30Z","timestamp":1573258770000},"page":"1-35","source":"Crossref","is-referenced-by-count":39,"title":["Toward Training Recurrent Neural Networks for Lifelong Learning"],"prefix":"10.1162","volume":"32","author":[{"given":"Shagun","family":"Sodhani","sequence":"first","affiliation":[{"name":"Mila, University of Montr\u00e9al, Montreal, Quebec H3T 1J4, Canada"}]},{"given":"Sarath","family":"Chandar","sequence":"additional","affiliation":[{"name":"Mila, University of Montr\u00e9al, Montreal, Quebec H3T 1J4, Canada"}]},{"given":"Yoshua","family":"Bengio","sequence":"additional","affiliation":[{"name":"Mila, University of Montr\u00e9al, Montreal, Quebec H3T 1J4, Canada, and CIFAR"}]}],"member":"281","reference":[{"key":"B1","author":"Aljundi R.","year":"2016","journal-title":"Expert Gate: Lifelong learning with a network of experts"},{"key":"B2","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553380"},{"key":"B3","first-page":"1306","volume-title":"Proceedings of the 24th AAAI Conference on Artificial Intelligence","author":"Carlson A.","year":"2010"},{"key":"B4","author":"Chaudhry A.","year":"2018","journal-title":"Riemannian Walk for incremental learning: Understanding forgetting and intransigence"},{"key":"B5","author":"Chen T.","year":"2015","journal-title":"Net2net: Accelerating learning via knowledge transfer"},{"key":"B6","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"B7","doi-asserted-by":"publisher","DOI":"10.1090\/qam\/112751"},{"key":"B8","author":"Furlanello T.","year":"2018","journal-title":"Born again neural networks"},{"key":"B11","doi-asserted-by":"publisher","DOI":"10.1142\/S0218213008004059"},{"key":"B12","author":"Hinton G.","year":"2015","journal-title":"Distilling the knowledge in a neural network"},{"key":"B13","author":"Kingma D. P.","year":"2014","journal-title":"Adam: A method for stochastic optimization"},{"key":"B14","author":"Kirkpatrick J.","year":"2016","journal-title":"Overcoming catastrophic forgetting in neural networks"},{"key":"B15","first-page":"4652","volume-title":"Advances in neural information processing systems","volume":"30","author":"Lee S.-W.","year":"2017"},{"key":"B16","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46493-0_37"},{"key":"B17","author":"Liu X.","year":"2018","journal-title":"Rotate your networks: Better weight consolidation and less catastrophic forgetting"},{"key":"B18","unstructured":"Lomonaco, V. & Maltoni, D. (2017). Core50: A new dataset and benchmark for continuous object recognition. In S. Levine, V. Vanhoucke, & K. Goldberg (Eds.), Proceedings of the 1st Annual Conference on Robot Learning (vol. 78, pp. 17\u201326). http:\/\/proceedings.mlr.press\/v78\/lomonaco17a.html"},{"key":"B19","first-page":"6467","volume-title":"Advances in neural information processing systems","volume":"30","author":"Lopez-Paz D.","year":"2017"},{"key":"B20","first-page":"67","author":"Mallya A.","year":"2018","journal-title":"Proceedings of the European Conference on Computer Vision"},{"key":"B21","first-page":"109","volume-title":"Psychology of learning and motivation","author":"McCloskey M.","year":"1989"},{"key":"B22","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33709-3_35"},{"key":"B23","author":"Paszke A.","year":"2017","journal-title":"Proceedings of the 2017 NIPS Autodiff Workshop"},{"key":"B24","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.587"},{"key":"B25","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007331723572"},{"key":"B26","author":"Romero A.","year":"2014","journal-title":"Fitnets: Hints for thin deep nets"},{"key":"B27","author":"Rusu A. A.","year":"2016","journal-title":"Progressive neural networks"},{"key":"B28","author":"Serr\u00e0 J.","year":"2018","journal-title":"Overcoming catastrophic forgetting with hard attention to the task"},{"key":"B29","first-page":"90","author":"Silver D. L.","year":"2002","journal-title":"Conference of the Canadian Society for Computational Studies of Intelligence"},{"key":"B30","first-page":"5","volume-title":"Proceedings of the AAAI spring symposium: Lifelong machine learning","author":"Silver D. L.","year":"2013"},{"key":"B31","first-page":"515","author":"Solomonoff R. J.","year":"1989","journal-title":"Proceedings of the Sixth Israeli Conference on Artificial Intelligence, Computer Vision and Pattern Recognition"},{"key":"B32","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4613-1381-6"},{"key":"B33","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4615-5529-2_8"},{"key":"B34","volume-title":"Explanation-based neural network learning: A lifelong learning approach","author":"Thrun S.","year":"2012"},{"key":"B35","author":"Zenke F.","year":"2017","journal-title":"Continual learning through synaptic intelligence"}],"container-title":["Neural Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mitpressjournals.org\/doi\/pdf\/10.1162\/neco_a_01246","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,12]],"date-time":"2021-03-12T21:43:26Z","timestamp":1615585406000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/neco\/article\/32\/1\/1-35\/95562"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,1]]},"references-count":33,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,1]]}},"alternative-id":["10.1162\/neco_a_01246"],"URL":"https:\/\/doi.org\/10.1162\/neco_a_01246","relation":{},"ISSN":["0899-7667","1530-888X"],"issn-type":[{"value":"0899-7667","type":"print"},{"value":"1530-888X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,1]]}}}