{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,3]],"date-time":"2026-06-03T09:42:09Z","timestamp":1780479729967,"version":"3.54.1"},"reference-count":66,"publisher":"Springer Science and Business Media LLC","issue":"15","license":[{"start":{"date-parts":[[2021,1,26]],"date-time":"2021-01-26T00:00:00Z","timestamp":1611619200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,1,26]],"date-time":"2021-01-26T00:00:00Z","timestamp":1611619200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Neural Comput &amp; Applic"],"published-print":{"date-parts":[[2021,8]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Recurrent neural networks (RNNs) have achieved state-of-the-art performances on various applications. However, RNNs are prone to be memory-bandwidth limited in practical applications and need both long periods of training and inference time. The aforementioned problems are at odds with training and deploying RNNs on resource-limited devices where the memory and floating-point operations (FLOPs) budget are strictly constrained. To address this problem, conventional model compression techniques usually focus on reducing inference costs, operating on a costly pre-trained model. Recently, dynamic sparse training has been proposed to accelerate the training process by directly training sparse neural networks from scratch. However, previous sparse training techniques are mainly designed for convolutional neural networks and multi-layer perceptron. In this paper, we introduce a method to train intrinsically sparse RNN models with a fixed number of parameters and floating-point operations (FLOPs) during training. 
We demonstrate state-of-the-art sparse performance with long short-term memory and recurrent highway networks on widely used tasks, language modeling, and text classification. We simply use the results to advocate that, contrary to the general belief that training a sparse neural network from scratch leads to worse performance than dense networks, sparse training with adaptive connectivity can usually achieve better performance than dense models for RNNs.<\/jats:p>","DOI":"10.1007\/s00521-021-05727-y","type":"journal-article","created":{"date-parts":[[2021,1,26]],"date-time":"2021-01-26T09:03:21Z","timestamp":1611651801000},"page":"9625-9636","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":32,"title":["Efficient and effective training of sparse recurrent neural networks"],"prefix":"10.1007","volume":"33","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6195-771X","authenticated-orcid":false,"given":"Shiwei","family":"Liu","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Iftitahu","family":"Ni\u2019mah","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Vlado","family":"Menkovski","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Decebal Constantin","family":"Mocanu","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Mykola","family":"Pechenizkiy","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2021,1,26]]},"reference":[{"key":"5727_CR1","unstructured":"Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M et al (2016) Tensorflow: a system for large-scale machine learning. 
In: 12th USENIX symposium on operating systems design and implementation (OSDI 16), pp 265\u2013283"},{"key":"5727_CR2","unstructured":"Tessera K, Hooker S, Rosman B (2021) Keep the gradients flowing: using gradient flow to study sparse network optimization. https:\/\/openreview.net\/forum?id=HI0j7omXTaG"},{"key":"5727_CR3","unstructured":"Liu S, Mocanu DC, Pei Y, Pechenizkiy M (2021) Selfish sparse RNN training. In: Submitted to international conference on learning representations. https:\/\/openreview.net\/forum?id=5wmNjjvGOXh"},{"key":"5727_CR4","doi-asserted-by":"crossref","unstructured":"Antol S, Agrawal A, Lu J, Mitchell M, Batra D, Lawrence ZC, Parikh D (2015) VQA: visual question answering. In: Proceedings of the IEEE international conference on computer vision, pp 2425\u20132433","DOI":"10.1109\/ICCV.2015.279"},{"key":"5727_CR5","doi-asserted-by":"publisher","first-page":"46324","DOI":"10.1109\/ACCESS.2020.2979141","volume":"8","author":"G Aquino","year":"2020","unstructured":"Aquino G, Rubio JDJ, Pacheco J, Gutierrez GJ, Ochoa G, Balcazar R, Cruz DR, Garcia E, Novoa JF, Zacarias A (2020) Novel nonlinear hypothesis for the delta parallel robot modeling. IEEE Access 8:46324\u201346334","journal-title":"IEEE Access"},{"key":"5727_CR6","doi-asserted-by":"publisher","first-page":"107159","DOI":"10.1016\/j.patcog.2019.107159","volume":"100","author":"WJ Baddar","year":"2020","unstructured":"Baddar WJ, Ro YM (2020) Encoding features robust to unseen modes of variation with attentive long short-term memory. Pattern Recognit 100:107159","journal-title":"Pattern Recognit"},{"key":"5727_CR7","unstructured":"Bellec G, Kappel D, Maass W, Legenstein R (2018) Deep rewiring: training very sparse deep networks. In: International conference on learning representations. 
https:\/\/openreview.net\/forum?id=BJ_wN01C-"},{"key":"5727_CR8","doi-asserted-by":"publisher","first-page":"172","DOI":"10.1016\/j.patcog.2018.07.034","volume":"85","author":"AK Bhunia","year":"2019","unstructured":"Bhunia AK, Konwer A, Bhunia AK, Bhowmick A, Roy PP, Pal U (2019) Script identification in natural scene image and video frames using an attention based convolutional-LSTM network. Pattern Recognit 85:172\u2013184","journal-title":"Pattern Recognit"},{"key":"5727_CR9","doi-asserted-by":"publisher","first-page":"118","DOI":"10.1016\/j.patrec.2017.05.003","volume":"94","author":"SB Bhushan","year":"2017","unstructured":"Bhushan SB, Danti A (2017) Classification of text documents based on score level fusion approach. Pattern Recognit Lett 94:118\u2013126","journal-title":"Pattern Recognit Lett"},{"key":"5727_CR10","doi-asserted-by":"crossref","unstructured":"Chebotar Y, Waters A (2016) Distilling knowledge from ensembles of neural networks for speech recognition. In: Interspeech, pp 3439\u20133443","DOI":"10.21437\/Interspeech.2016-1190"},{"key":"5727_CR11","doi-asserted-by":"publisher","first-page":"103255","DOI":"10.1109\/ACCESS.2019.2929266","volume":"7","author":"HS Chiang","year":"2019","unstructured":"Chiang HS, Chen MY, Huang YJ (2019) Wavelet-based EEG processing for epilepsy detection using fuzzy entropy and associative petri net. IEEE Access 7:103255\u2013103262","journal-title":"IEEE Access"},{"key":"5727_CR12","doi-asserted-by":"crossref","unstructured":"Conneau A, Schwenk H, Barrault L, Lecun Y (2017) Very deep convolutional networks for text classification. In: Proceedings of the 15th conference of the European chapter of the association for computational linguistics: volume 1, long papers. Association for Computational Linguistics, Valencia, Spain, pp 1107\u20131116. 
https:\/\/www.aclweb.org\/anthology\/E17-1104","DOI":"10.18653\/v1\/E17-1104"},{"issue":"6","key":"5727_CR13","doi-asserted-by":"publisher","first-page":"1296","DOI":"10.1109\/TFUZZ.2009.2029569","volume":"17","author":"J de Jes\u00fas Rubio","year":"2009","unstructured":"de Jes\u00fas Rubio J (2009) Sofmls: online self-organizing fuzzy modified least-squares network. IEEE Trans Fuzzy Syst 17(6):1296\u20131309","journal-title":"IEEE Trans Fuzzy Syst"},{"key":"5727_CR14","doi-asserted-by":"crossref","unstructured":"de Rubio JJ (2020) Stability analysis of the modified Levenberg\u2013Marquardt algorithm for the artificial neural network training. IEEE Trans Neural Netw Learn Syst","DOI":"10.1109\/TNNLS.2020.3015200"},{"key":"5727_CR15","unstructured":"Dettmers T, Zettlemoyer L (2019) Sparse networks from scratch: faster training without losing performance. arXiv preprint arXiv:1907.04840"},{"key":"5727_CR16","doi-asserted-by":"crossref","unstructured":"Donahue J, Anne HL, Guadarrama S, Rohrbach M, Venugopalan S, Saenko K, Darrell T (2015) Long-term recurrent convolutional networks for visual recognition and description. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2625\u20132634","DOI":"10.1109\/CVPR.2015.7298878"},{"issue":"10","key":"5727_CR17","doi-asserted-by":"publisher","first-page":"2279","DOI":"10.1016\/S0031-3203(01)00178-9","volume":"35","author":"M Egmont-Petersen","year":"2002","unstructured":"Egmont-Petersen M, de Ridder D, Handels H (2002) Image processing with neural networks\u2014a review. Pattern Recognit 35(10):2279\u20132301","journal-title":"Pattern Recognit"},{"key":"5727_CR18","unstructured":"Evci U, Gale T, Menick J, Castro, PS, Elsen, E (2019) Rigging the lottery: making all tickets winners. arXiv preprint arXiv:1911.11134"},{"key":"5727_CR19","unstructured":"Evci U, Ioannou YA, Keskin C, Dauphin Y (2020) Gradient flow in sparse neural networks and how lottery tickets win. 
arXiv preprint arXiv:2010.03533"},{"key":"5727_CR20","doi-asserted-by":"publisher","first-page":"109","DOI":"10.1016\/j.patrec.2015.07.028","volume":"65","author":"G Feng","year":"2015","unstructured":"Feng G, Guo J, Jing BY, Sun T (2015) Feature subset selection using Naive Bayes for text classification. Pattern Recognit Lett 65:109\u2013115","journal-title":"Pattern Recognit Lett"},{"key":"5727_CR21","unstructured":"Frankle J, Carbin M (2019) The lottery ticket hypothesis: finding sparse, trainable neural networks. In: International conference on learning representations. https:\/\/openreview.net\/forum?id=rJl-b3RcF7"},{"issue":"5","key":"5727_CR22","doi-asserted-by":"publisher","first-page":"848","DOI":"10.1109\/72.317740","volume":"5","author":"CL Giles","year":"1994","unstructured":"Giles CL, Omlin CW (1994) Pruning recurrent neural networks for improved generalization performance. IEEE Trans Neural Netw 5(5):848\u2013851","journal-title":"IEEE Trans Neural Netw"},{"key":"5727_CR23","unstructured":"Guo Y, Yao A, Chen Y (2016) Dynamic network surgery for efficient DNNs. In: Advances in neural information processing systems, pp 1379\u20131387"},{"key":"5727_CR24","doi-asserted-by":"crossref","unstructured":"Han S, Kang J, Mao H, Hu Y, Li X, Li Y, Xie D, Luo H, Yao S, Wang Y et al (2017) ESE: efficient speech recognition engine with sparse LSTM on FPGA. In: Proceedings of the 2017 ACM\/SIGDA international symposium on field-programmable gate arrays. ACM, pp 75\u201384","DOI":"10.1145\/3020078.3021745"},{"key":"5727_CR25","unstructured":"Han S, Pool J, Tran J, Dally W (2015) Learning both weights and connections for efficient neural network. 
In: Advances in neural information processing systems, pp 1135\u20131143"},{"key":"5727_CR26","doi-asserted-by":"publisher","first-page":"327","DOI":"10.1016\/j.neucom.2019.08.095","volume":"390","author":"G Hern\u00e1ndez","year":"2020","unstructured":"Hern\u00e1ndez G, Zamora E, Sossa H, T\u00e9llez G, Furl\u00e1n F (2020) Hybrid neural networks for big data classification. Neurocomputing 390:327\u2013340","journal-title":"Neurocomputing"},{"issue":"8","key":"5727_CR27","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"S Hochreiter","year":"1997","unstructured":"Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735\u20131780","journal-title":"Neural Comput"},{"key":"5727_CR28","doi-asserted-by":"crossref","unstructured":"Joulin A, Grave E, Bojanowski P, Mikolov T (2017) Bag of tricks for efficient text classification. In: Proceedings of the 15th conference of the European chapter of the association for computational linguistics: volume 2, short papers. Association for Computational Linguistics, Valencia, Spain, pp 427\u2013431. https:\/\/www.aclweb.org\/anthology\/E17-2068","DOI":"10.18653\/v1\/E17-2068"},{"key":"5727_CR29","doi-asserted-by":"crossref","unstructured":"Jouppi NP, Young C, Patil N, Patterson D, Agrawal G, Bajwa R, Bates S, Bhatia S, Boden N, Borchers A et al (2017) In-datacenter performance analysis of a tensor processing unit. In: 2017 ACM\/IEEE 44th annual international symposium on computer architecture (ISCA). IEEE, pp 1\u201312","DOI":"10.1145\/3079856.3080246"},{"issue":"12","key":"5727_CR30","doi-asserted-by":"publisher","first-page":"2705","DOI":"10.1016\/S0031-3203(01)00242-4","volume":"35","author":"A Juan","year":"2002","unstructured":"Juan A, Vidal E (2002) On the use of Bernoulli mixture models for text classification. 
Pattern Recognit 35(12):2705\u20132710","journal-title":"Pattern Recognit"},{"key":"5727_CR31","doi-asserted-by":"crossref","unstructured":"Kisel\u2019\u00e1k J, Lu Y, \u0160vihra J, Sz\u00e9pe P, Stehl\u00edk M (2020) \u201cSPOCU\u201d: scaled polynomial constant unit activation function. Neural Comput Appl 1\u201317","DOI":"10.1007\/s00521-020-05412-6"},{"key":"5727_CR32","first-page":"971","volume":"30","author":"G Klambauer","year":"2017","unstructured":"Klambauer G, Unterthiner T, Mayr A, Hochreiter S (2017) Self-normalizing neural networks. Adv Neural Inf Process Syst 30:971\u2013980","journal-title":"Adv Neural Inf Process Syst"},{"key":"5727_CR33","unstructured":"LeCun Y, Denker JS, Solla SA (1990) Optimal brain damage. In: Advances in neural information processing systems, pp 598\u2013605"},{"key":"5727_CR34","unstructured":"Lee N, Ajanthan T, Gould S, Torr PH (2019) A signal propagation perspective for pruning neural networks at initialization. arXiv preprint https:\/\/openreview.net\/forum?id=HI0j7omXTaG0"},{"key":"5727_CR35","doi-asserted-by":"crossref","unstructured":"Liu S, van der Lee T, Yaman A, Atashgahi Z, Ferrar D, Sokar G, Pechenizkiy M, Mocanu D (2020) Topological insights into sparse neural networks. In: Joint European conference on machine learning and knowledge discovery in databases","DOI":"10.1007\/978-3-030-67664-3_17"},{"key":"5727_CR36","doi-asserted-by":"crossref","unstructured":"Liu S, Mocanu DC, Matavalam ARR, Pei Y, Pechenizkiy M (2020) Sparse evolutionary deep learning with over one million artificial neurons on commodity hardware. Neural Comput Appl 1\u201316","DOI":"10.1007\/s00521-020-05136-7"},{"key":"5727_CR37","unstructured":"Louizos C, Welling M, Kingma DP (2018) Learning sparse neural networks through $$l_0$$ regularization. In: International conference on learning representations. 
https:\/\/openreview.net\/forum?id=H1Y8hhg0b"},{"key":"5727_CR38","unstructured":"Lu G, Zhao X, Yin J, Yang W, Li B (2018) Multi-task learning using variational auto-encoder for sentiment classification. Pattern Recognit Lett"},{"key":"5727_CR39","unstructured":"Maas AL, Daly RE, Pham PT, Huang D, Ng AY, Potts C (2011) Learning word vectors for sentiment analysis. In: Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies, vol 1. Association for Computational Linguistics, pp 142\u2013150"},{"issue":"2","key":"5727_CR40","first-page":"313","volume":"19","author":"MP Marcus","year":"1993","unstructured":"Marcus MP, Santorini B, Marcinkiewicz MA (1993) Building a large annotated corpus of English: the Penn Treebank. Comput Linguist 19(2):313\u2013330","journal-title":"Comput Linguist"},{"key":"5727_CR41","doi-asserted-by":"publisher","first-page":"31968","DOI":"10.1109\/ACCESS.2018.2846483","volume":"6","author":"JA Meda-Campa\u00f1a","year":"2018","unstructured":"Meda-Campa\u00f1a JA (2018) On the estimation and control of nonlinear systems with parametric uncertainties and noisy outputs. IEEE Access 6:31968\u201331973","journal-title":"IEEE Access"},{"key":"5727_CR42","unstructured":"Merity S, Keskar NS, Socher, R (2017) Regularizing and optimizing LSTM language models. arXiv preprint arXiv:1708.02182"},{"key":"5727_CR43","unstructured":"Michael H, Zhu SG (2018) To prune, or not to prune: exploring the efficacy of pruning for model compression. In: International conference on learning representations. https:\/\/openreview.net\/forum?id=S1lN69AT-"},{"key":"5727_CR44","doi-asserted-by":"crossref","unstructured":"Mikolov T, Karafi\u00e1t M, Burget L, \u010cernock\u1ef3 J, Khudanpur S (2010) Recurrent neural network based language model. 
In: Eleventh annual conference of the international speech communication association","DOI":"10.1109\/ICASSP.2011.5947611"},{"key":"5727_CR45","doi-asserted-by":"publisher","first-page":"325","DOI":"10.1016\/j.patcog.2017.04.017","volume":"69","author":"DC Mocanu","year":"2017","unstructured":"Mocanu DC, Ammar HB, Puig L, Eaton E, Liotta A (2017) Estimating 3D trajectories from 2D projections via disjunctive factored four-way conditional restricted Boltzmann machines. Pattern Recognit 69:325\u2013335","journal-title":"Pattern Recognit"},{"issue":"2","key":"5727_CR46","doi-asserted-by":"publisher","first-page":"243","DOI":"10.1007\/s10994-016-5570-z","volume":"104","author":"DC Mocanu","year":"2016","unstructured":"Mocanu DC, Mocanu E, Nguyen PH, Gibescu M, Liotta A (2016) A topological insight into restricted Boltzmann machines. Mach Learn 104(2):243\u2013270","journal-title":"Mach Learn"},{"issue":"1","key":"5727_CR47","doi-asserted-by":"publisher","first-page":"2383","DOI":"10.1038\/s41467-018-04316-3","volume":"9","author":"DC Mocanu","year":"2018","unstructured":"Mocanu DC, Mocanu E, Stone P, Nguyen PH, Gibescu M, Liotta A (2018) Scalable training of artificial neural networks with adaptive sparse connectivity inspired by network science. Nat Commun 9(1):2383","journal-title":"Nat Commun"},{"key":"5727_CR48","unstructured":"Molchanov D, Ashukha A, Vetrov D (2017) Variational dropout sparsifies deep neural networks. In: Proceedings of the 34th international conference on machine learning, vol 70. JMLR.org, pp 2498\u20132507"},{"key":"5727_CR49","unstructured":"Mostafa H, Wang X (2019) Parameter efficient training of deep convolutional neural networks by dynamic sparse reparameterization. In: Proceedings of the 36th international conference on machine learning, vol 97. JMLR.org, pp 4646\u20134655"},{"key":"5727_CR50","unstructured":"Narang S, Elsen E, Diamos G, Sengupta S (2017) Exploring sparsity in recurrent neural networks. 
In: International conference on learning representations. https:\/\/openreview.net\/forum?id=BylSPv9gx"},{"key":"5727_CR51","unstructured":"Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L et al (2019) Pytorch: an imperative style, high-performance deep learning library. In: Advances in neural information processing systems, pp 8026\u20138037"},{"key":"5727_CR52","doi-asserted-by":"publisher","first-page":"179","DOI":"10.1016\/j.patcog.2019.04.015","volume":"93","author":"H Ren","year":"2019","unstructured":"Ren H, Wang W, Liu C (2019) Recognizing online handwritten Chinese characters using RNNs with new computing architectures. Pattern Recognit 93:179\u2013192","journal-title":"Pattern Recognit"},{"key":"5727_CR53","unstructured":"Shen Y, Tan S, Sordoni A, Courville A (2018) Ordered neurons: integrating tree structures into recurrent neural networks. arXiv preprint arXiv:1810.09536"},{"key":"5727_CR54","unstructured":"Srivastav, RK, Greff K, Schmidhuber, J (2015) Highway networks. arXiv preprint https:\/\/openreview.net\/forum?id=HI0j7omXTaG7"},{"key":"5727_CR55","doi-asserted-by":"publisher","first-page":"397","DOI":"10.1016\/j.patcog.2016.10.016","volume":"63","author":"B Su","year":"2017","unstructured":"Su B, Lu S (2017) Accurate recognition of words in scenes without character segmentation using recurrent neural network. Pattern Recognit 63:397\u2013405","journal-title":"Pattern Recognit"},{"key":"5727_CR56","unstructured":"Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In: Advances in neural information processing systems, pp 3104\u20133112"},{"key":"5727_CR57","unstructured":"Wen W, He Y, Rajbhandari S, Zhang M, Wang W, Liu F, Hu B, Chen Y, Li H (2018) Learning intrinsic sparse structures within long short-term memory. In: International conference on learning representations. 
https:\/\/openreview.net\/forum?id=rk6cfpRjZ"},{"key":"5727_CR58","unstructured":"Xiao Y, Cho K (2016) Efficient character-level document classification by combining convolution and recurrent layers. arXiv preprint https:\/\/openreview.net\/forum?id=HI0j7omXTaG9"},{"key":"5727_CR59","unstructured":"Yang Z, Dai Z, Salakhutdinov R, Cohen WW (2017) Breaking the softmax bottleneck: a high-rank RNN language model. arXiv preprint arXiv:1711.03953"},{"key":"5727_CR60","doi-asserted-by":"publisher","first-page":"245","DOI":"10.1016\/j.patcog.2016.11.011","volume":"64","author":"S Yousfi","year":"2017","unstructured":"Yousfi S, Berrani SA, Garcia C (2017) Contribution of recurrent connectionist language models in improving LSTM-based Arabic text recognition in videos. Pattern Recognit 64:245\u2013254","journal-title":"Pattern Recognit"},{"key":"5727_CR61","unstructured":"Zaremba W, Sutskever I, Vinyals O (2014) Recurrent neural network regularization. arXiv preprint https:\/\/openreview.net\/forum?id=5wmNjjvGOXh1"},{"key":"5727_CR62","unstructured":"Zhang X, LeCun Y (2015) Text understanding from scratch. arXiv preprint https:\/\/openreview.net\/forum?id=5wmNjjvGOXh2"},{"key":"5727_CR63","unstructured":"Zhang X, Zhao J, LeCun Y (2015) Character-level convolutional networks for text classification. In: Advances in neural information processing systems, pp 649\u2013657"},{"key":"5727_CR64","doi-asserted-by":"crossref","unstructured":"Zhang Y, Chen G, Yu D, Yaco K, Khudanpur S, Glass J (2016) Highway long short-term memory RNNs for distant speech recognition. In: 2016 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 5755\u20135759","DOI":"10.1109\/ICASSP.2016.7472780"},{"key":"5727_CR65","unstructured":"Zhou H, Lan J, Liu R, Yosinski J (2019) Deconstructing lottery tickets: zeros, signs, and the supermask. 
In: Advances in neural information processing systems, pp 3592\u20133602"},{"key":"5727_CR66","unstructured":"Zilly JG, Srivastava RK, Koutn\u00edk J, Schmidhuber, J (2017) Recurrent highway networks. In: Proceedings of the 34th international conference on machine learning, vol 70. JMLR.org, pp 4189\u20134198"}],"container-title":["Neural Computing and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-021-05727-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00521-021-05727-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-021-05727-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,7,7]],"date-time":"2021-07-07T21:13:17Z","timestamp":1625692397000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00521-021-05727-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,1,26]]},"references-count":66,"journal-issue":{"issue":"15","published-print":{"date-parts":[[2021,8]]}},"alternative-id":["5727"],"URL":"https:\/\/doi.org\/10.1007\/s00521-021-05727-y","relation":{},"ISSN":["0941-0643","1433-3058"],"issn-type":[{"value":"0941-0643","type":"print"},{"value":"1433-3058","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,1,26]]},"assertion":[{"value":"28 October 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 January 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 January 2021","order":3,"name":"first_online","label":"First 
Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Compliance with ethical standards"}},{"value":"All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}