{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:10:08Z","timestamp":1750201808776,"version":"3.41.0"},"reference-count":41,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2017,4,18]],"date-time":"2017-04-18T00:00:00Z","timestamp":1492473600000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"funder":[{"DOI":"10.13039\/501100003030","name":"AGAUR","doi-asserted-by":"crossref","award":["FI-DGR ECO\/1551\/2012"],"award-info":[{"award-number":["FI-DGR ECO\/1551\/2012"]}],"id":[{"id":"10.13039\/501100003030","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Appl Intell"],"published-print":{"date-parts":[[2017,10]]},"DOI":"10.1007\/s10489-017-0910-7","type":"journal-article","created":{"date-parts":[[2017,4,18]],"date-time":"2017-04-18T02:18:04Z","timestamp":1492481884000},"page":"670-704","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Bellman residuals minimization using online support vector machines"],"prefix":"10.1007","volume":"47","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9700-2971","authenticated-orcid":false,"given":"Gennaro","family":"Esposito","sequence":"first","affiliation":[]},{"given":"Mario","family":"Martin","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2017,4,18]]},"reference":[{"issue":"2","key":"910_CR1","doi-asserted-by":"crossref","first-page":"493","DOI":"10.1016\/j.jspi.2011.08.007","volume":"142","author":"F Amir-massoud","year":"2012","unstructured":"Amir-massoud F, Szepesv\u00e1ri CC (2012) Regularized least-squares regression: Learning from a \u03b2-mixing sequence. 
Journal of Statistical Planning and Inference 142(2):493\u2013505. http:\/\/www.sciencedirect.com\/science\/article\/pii\/S0378375811003181","journal-title":"Journal of Statistical Planning and Inference"},{"key":"910_CR2","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1007\/s10994-007-5038-2","volume":"71","author":"A Antos","year":"2008","unstructured":"Antos A, Szepesv\u00e1ri C, Munos R (2008) Learning near-optimal policies with bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning Journal 71:89\u2013129","journal-title":"Machine Learning Journal"},{"key":"910_CR3","doi-asserted-by":"crossref","unstructured":"Baird L (1995) Residual algorithms: Reinforcement learning with function approximation. In: Proceedings of the 12th international conference on machine learning. Morgan kaufmann, pp 30\u201337","DOI":"10.1016\/B978-1-55860-377-6.50013-X"},{"key":"910_CR4","volume-title":"Dynamic programming and optimal control vol II","author":"DP Bertsekas","year":"2007","unstructured":"Bertsekas DP (2007) Dynamic programming and optimal control, vol II. Athena Scientific, Boston"},{"key":"910_CR5","volume-title":"Kernel-based approximate dynamic programming using bellman residual elimination","author":"BM Bethke","year":"2010","unstructured":"Bethke BM (2010) Kernel-based approximate dynamic programming using bellman residual elimination. Ph.D. thesis, Massachusetts Institute of Technology, Department of Aeronautics and Astronautics, Cambridge MA. http:\/\/acl.mit.edu\/papers\/BethkePhD.pdf"},{"key":"910_CR6","doi-asserted-by":"crossref","unstructured":"Busoniu L, Ernst D, De Schutter B, Babuska R (2010) Online least-squares policy iteration for reinforcement learning control. 
In: Baltimore U (ed) Proceedings of American Control Conference ACC-10, pp 486\u2013491","DOI":"10.1109\/ACC.2010.5530856"},{"key":"910_CR7","doi-asserted-by":"publisher","unstructured":"Busoniu L, Lazaric A, Ghavamzadeh M, Munos R, Babuska R, Schutter B (2012) Least-squares methods for policy iteration. In: Wiering M., Otterlo M (eds) reinforcement learning, adaptation, learning, and optimization, vol 12. Springer, Berlin Heidelberg, pp 75\u2013109. doi: 10.1007\/978-3-642-27645-3_3","DOI":"10.1007\/978-3-642-27645-3_3"},{"key":"910_CR8","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1017\/S0266466602181023","volume":"18","author":"M Carrasco","year":"2002","unstructured":"Carrasco M, Chen X (2002) Mixing and moment properties of various garch and stochastic volatility model. Econ Theory 18:17\u201339","journal-title":"Econ Theory"},{"key":"910_CR9","doi-asserted-by":"publisher","unstructured":"Cherkassky V, Ma Y (2004) Comparison of loss functions for linear regression. In: Proceedings of the 2004 IEEE International Joint Conference on Neural Networks, 2004, vol 1, p 400. doi: 10.1109\/IJCNN.2004.1379938","DOI":"10.1109\/IJCNN.2004.1379938"},{"key":"910_CR10","first-page":"1007","volume":"5","author":"A Christmann","year":"2004","unstructured":"Christmann A, Steinwart I (2004) On robust properties of convex risk minimization methods for pattern recognition. J Mach Learn Res 5:1007\u20131034","journal-title":"J Mach Learn Res"},{"key":"910_CR11","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1090\/S0273-0979-01-00923-5","volume":"39","author":"F Cucker","year":"2002","unstructured":"Cucker F, Smale S (2002) On the mathematical foundations of learning. Bull Am Math Soc 39:1\u201349","journal-title":"Bull Am Math Soc"},{"key":"910_CR12","first-page":"503","volume":"6","author":"E Daniel","year":"2005","unstructured":"Daniel E, Pierre G, Luis W (2005) Tree based batch mode reinforcement learning. 
J Mach Learn Res 6:503\u2013556","journal-title":"J Mach Learn Res"},{"key":"910_CR13","first-page":"321","volume":"18","author":"YA Davydov","year":"1973","unstructured":"Davydov YA (1973) Mixing conditions for markov chains. Teor Veroyatnost i Primenen 18:321\u2013338","journal-title":"Teor Veroyatnost i Primenen"},{"key":"910_CR14","volume-title":"Regularized Approximate Policy Iteration using kernel for on-line Reinforcement Learning","author":"G Esposito","year":"2015","unstructured":"Esposito G (2015) Regularized approximate policy iteration using Kernel for on-line reinforcement learning. PhD Thesis, Universitat Politecnica de Catalunya"},{"key":"910_CR15","volume-title":"A unified framework for regularization networks and support vector machines","author":"T Evgeniou","year":"1999","unstructured":"Evgeniou T, Pontil M, Poggio T (1999) A unified framework for regularization networks and support vector machines. Tech. rep., MIT, Cambridge, MA, USA. http:\/\/www.ncstrl.org:8900\/ncstrl\/servlet\/search?formname=detail&id=oai"},{"key":"910_CR16","volume-title":"Regularization in reinforcement learning","author":"Am Farahmand","year":"2011","unstructured":"Farahmand Am (2011) Regularization in reinforcement learning. Ph.D. thesis, University of Alberta"},{"key":"910_CR17","unstructured":"Farahmand Am, Munos R, Szepesv\u00e1ri C (2010) Error propagation for approximate policy and value iteration. In: Lafferty JD, Williams CKI, Shawe-Taylor J, Zemel RS, Culotta A (eds) NIPS. Curran Associates, Inc, pp 568\u2013576"},{"key":"910_CR18","unstructured":"van de Geer S (2009) Empirical Processes in M-Estimation. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press. http:\/\/books.google.es\/books?id=0VEcQAAACAAJ"},{"key":"910_CR19","doi-asserted-by":"crossref","unstructured":"Gy\u00f6rfi L, Kohler M, Krzyzak A, Walk H (2002) A Distribution-Free theory of nonparametric regression. 
Springer","DOI":"10.1007\/b97848"},{"issue":"5","key":"910_CR20","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1016\/0893-6080(89)90020-8","volume":"2","author":"K Hornik","year":"1989","unstructured":"Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2(5):359\u2013366","journal-title":"Neural Netw"},{"key":"910_CR21","unstructured":"Jung T, Polani D (2006) Least squares svm for least squares td learning. In: Proceedings of 17th european conference on artificial intelligence, pp 499\u2013503"},{"key":"910_CR22","unstructured":"Karandikar RL, Vidyasagar M (2004) Probably approximately correct learning with beta mixing input sequences"},{"key":"910_CR23","unstructured":"Kohler M, Krzyzak A, Sch\u00e4fer D (2000) Application of structural risk minimization to multivariate smoothing spline regression estimates"},{"key":"910_CR24","first-page":"1107","volume":"4","author":"MG Lagoudakis","year":"2003","unstructured":"Lagoudakis MG, Parr R (2003) Least-squares policy iteration. J Mach Learn Res 4:1107\u20131149. http:\/\/dblp.uni-trier.de\/db\/journals\/jmlr\/jmlr4.html#LagoudakisP03","journal-title":"J Mach Learn Res"},{"key":"910_CR25","first-page":"3041","volume":"13","author":"A Lazaric","year":"2012","unstructured":"Lazaric A, Ghavamzadeh M, Munos R (2012) Finite-sample analysis of least-squares policy iteration. J Mach Learn Res 13:3041\u20133074","journal-title":"J Mach Learn Res"},{"key":"910_CR26","doi-asserted-by":"crossref","unstructured":"Lee DH, Kim JJ, Lee JJ (2010) Online support vector regression based actor-critic method. In: IECON 2010 - 36th annual conference on IEEE industrial electronics society, pp 193\u2013198","DOI":"10.1109\/IECON.2010.5675206"},{"key":"910_CR27","unstructured":"Maillard OA, Munos R, Lazaric A, Ghavamzadeh M (2010) Finite-sample analysis of bellman residual minimization. In: Sugiyama M, Q.Y. 0001 (eds) Proceedings of the ACML, JMLR, pp 299\u2013314. 
JMLR.org"},{"key":"910_CR28","doi-asserted-by":"crossref","unstructured":"Martin M (2002) On-line support vector machine regression. In: Proceedings of the 13th European conference on machine learning, ECML \u201902. Springer-Verlag, London, UK, pp 282\u2013294. http:\/\/dl.acm.org\/citation.cfm?id=645329.650050","DOI":"10.1007\/3-540-36755-1_24"},{"key":"910_CR29","doi-asserted-by":"crossref","unstructured":"Meir R, Hellerstein L (2000) Nonparametric time series prediction through adaptive model selection. Mach Learn:5\u201334","DOI":"10.1023\/A:1007602715810"},{"key":"910_CR30","first-page":"789","volume":"11","author":"M Mohri","year":"2010","unstructured":"Mohri M, Rostamizadeh A (2010) Stability bounds for stationary \u03b2\u2212mixing and \u03b1-mixing processes. J Mach Learn Res 11:789\u2013814. http:\/\/dl.acm.org\/citation.cfm?id=1756006.1756032","journal-title":"J Mach Learn Res"},{"issue":"3","key":"910_CR31","first-page":"199","volume":"21","author":"AW Moore","year":"1995","unstructured":"Moore AW, Atkeson CG (1995) The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces. Mach Learn 21(3):199\u2013233","journal-title":"Mach Learn"},{"key":"910_CR32","unstructured":"Randlov J, Alstrom P (1998) Learning to drive a bicycle using reinforcement learning and shaping. In: Proceeding of the 5th international conference on machine learning, pp 463\u2013471"},{"key":"910_CR33","doi-asserted-by":"crossref","unstructured":"Sch\u00f6lkopf B, Herbrich R, Smola AJ (2001) A generalized representer theorem. In: Helmbold DP, Williamson B (eds) COLT\/EuroCOLT, lecture notes in computer science, vol 2111. Springer, pp 416\u2013426. http:\/\/dblp.uni-trier.de\/db\/conf\/colt\/colt2001.html#ScholkopfHS01","DOI":"10.1007\/3-540-44581-1_27"},{"key":"910_CR34","doi-asserted-by":"crossref","unstructured":"van Seijen H, van Hasselt H, Whiteson S, Wiering M (2009) A theoretical and empirical analysis of expected sarsa. 
In: ADPRL 2009: Proceedings of the IEEE symposium on adaptive dynamic programming and reinforcement learning, pp 177\u2013184","DOI":"10.1109\/ADPRL.2009.4927542"},{"key":"910_CR35","unstructured":"Steinwart I, Christmann A (2008) Sparsity of svms that use the epsilon-insensitive loss. In: Koller D, Schuurmans D, Bengio Y, Bottou L (eds) NIPS. Curran Associates, Inc., pp 1569\u20131576. http:\/\/dblp.uni-trier.de\/db\/conf\/nips\/nips2008.html#SteinwartC08"},{"key":"910_CR36","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-77242-4","volume-title":"Support Vector Machines","author":"I Steinwart","year":"2008","unstructured":"Steinwart I, Christmann A (2008) Support vector machines, 1st edn. Springer Publishing Company, Incorporated","edition":"1st edn."},{"issue":"1","key":"910_CR37","doi-asserted-by":"publisher","first-page":"175","DOI":"10.1016\/j.jmva.2008.04.001","volume":"100","author":"I Steinwart","year":"2009","unstructured":"Steinwart I, Hush D, Scovel C (2009) Learning from dependent observations. J Multivar Anal 100 (1):175\u2013194. doi: 10.1016\/j.jmva.2008.04.001","journal-title":"J Multivar Anal"},{"key":"910_CR38","doi-asserted-by":"crossref","unstructured":"Taylor G, Parr R (2009) Kernelized value function approximation for reinforcement learning. In: Proceedings of the 26th annual international conference on machine learning, pp 1017\u20131024","DOI":"10.1145\/1553374.1553504"},{"issue":"2","key":"910_CR39","doi-asserted-by":"publisher","first-page":"171","DOI":"10.1007\/s10208-004-0155-9","volume":"6","author":"Q Wu","year":"2006","unstructured":"Wu Q, Ying Y, Zhou DX (2006) Learning rates of least-square regularized regression. Found Comput Math 6(2):171\u2013192. 
doi: 10.1007\/s10208-004-0155-9","journal-title":"Found Comput Math"},{"issue":"1","key":"910_CR40","doi-asserted-by":"publisher","first-page":"94","DOI":"10.1214\/aop\/1176988849","volume":"22","author":"B Yu","year":"1994","unstructured":"Yu B (1994) Rates of convergence for empirical processes of stationary mixing sequences. Ann Probab 22(1):94\u2013116. doi: 10.1214\/aop\/1176988849","journal-title":"Ann Probab"},{"key":"910_CR41","doi-asserted-by":"publisher","unstructured":"Zhu DX, Smale S (2003) Estimating the approximation error in learning theory. Anal Appl 1-1:1\u201349. doi: 10.1142\/S0219530503000089","DOI":"10.1142\/S0219530503000089"}],"container-title":["Applied Intelligence"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s10489-017-0910-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10489-017-0910-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10489-017-0910-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:41:29Z","timestamp":1750200089000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s10489-017-0910-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,4,18]]},"references-count":41,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2017,10]]}},"alternative-id":["910"],"URL":"https:\/\/doi.org\/10.1007\/s10489-017-0910-7","relation":{},"ISSN":["0924-669X","1573-7497"],"issn-type":[{"type":"print","value":"0924-669X"},{"type":"electronic","value":"1573-7497"}],"subject":[],"published":{"date-parts":[[2017,4,18]]}}}