{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T08:23:35Z","timestamp":1775031815214,"version":"3.50.1"},"reference-count":22,"publisher":"Springer Science and Business Media LLC","issue":"3-4","license":[{"start":{"date-parts":[[1992,5,1]],"date-time":"1992-05-01T00:00:00Z","timestamp":704678400000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Mach Learn"],"published-print":{"date-parts":[[1992,5]]},"DOI":"10.1007\/bf00992701","type":"journal-article","created":{"date-parts":[[2005,1,9]],"date-time":"2005-01-09T16:35:16Z","timestamp":1105288516000},"page":"341-362","source":"Crossref","is-referenced-by-count":109,"title":["The convergence of TD(\u03bb) for general \u03bb"],"prefix":"10.1007","volume":"8","author":[{"given":"Peter","family":"Dayan","sequence":"first","affiliation":[]}],"member":"297","reference":[{"key":"CR1","doi-asserted-by":"crossref","first-page":"220","DOI":"10.1115\/1.3450344","volume":"97","author":"J.S. Albus","year":"1975","unstructured":"Albus, J.S. (1975). A new approach to manipulator control: The Cerebellar Model Articulation Controller (CMAC). Transactions of the ASME: Journal of Dynamical Systems, Measurement and Control, 97, 220\u2013227.","journal-title":"Transactions of the ASME: Journal of Dynamical Systems, Measurement and Control"},{"key":"CR2","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TSMC.1983.6313077","volume":"13","author":"A.G. Barto","year":"1983","unstructured":"Barto, A.G., Sutton, R.S. & Anderson, C.W. (1983). 
Neuronlike elements that can solve difficult learning problems. IEEE Transactions on Systems, Man, and Cybernetics, 13, 834\u2013846.","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics"},{"key":"CR3","volume-title":"Learning and computational neuroscience: Foundations of adaptive networks","author":"A.G. Barto","year":"1990","unstructured":"Barto, A.G., Sutton, R.S. & Watkins, C.J.C.H. (1990). Learning and sequential decision making. In M. Gabriel & J. Moore (Eds.), Learning and computational neuroscience: Foundations of adaptive networks. Cambridge, MA: MIT Press, Bradford Books."},{"key":"CR4","doi-asserted-by":"crossref","unstructured":"Bellman, R.E. & Dreyfus, S.E. (1962). Applied dynamic programming. RAND Corporation.","DOI":"10.1515\/9781400874651"},{"key":"CR5","volume-title":"Reinforcing connectionism: Learning the statistical way","author":"P. Dayan","year":"1991","unstructured":"Dayan, P. (1991). Reinforcing connectionism: Learning the statistical way. Ph.D. Thesis, University of Edinburgh, Scotland."},{"key":"CR6","volume-title":"A neural model of adaptive behavior","author":"S.E. Hampson","year":"1983","unstructured":"Hampson, S.E. (1983). A neural model of adaptive behavior. Ph.D. Thesis. University of California, Irvine, CA."},{"key":"CR7","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4684-6770-3","volume-title":"Connectionistic problem solving: computational aspects of biological learning","author":"S.E. Hampson","year":"1990","unstructured":"Hampson, S.E. (1990). Connectionistic problem solving: computational aspects of biological learning. Boston, MA: Birkh\u00e4user Boston."},{"key":"CR8","volume-title":"Machine learning: An artificial intelligence approach, 2","author":"J.H. Holland","year":"1986","unstructured":"Holland, J.H. (1986). Escaping brittleness: The possibilities of general-purpose learning algorithms applied to parallel rule-based systems. In R.S. Michalski, J.G. Carbonell & T.M. 
Mitchell (Eds.), Machine learning: An artificial intelligence approach, 2. Los Altos, CA: Morgan Kaufmann."},{"key":"CR9","unstructured":"Klopf, A.H. (1972). Brain function and adaptive systems\u2014A heterostatic theory. Air Force Research Laboratories Research Report, AFCRL-72-0164. Bedford, MA."},{"key":"CR10","volume-title":"The hedonistic neuron: A theory of memory, learning, and intelligence","author":"A.H. Klopf","year":"1982","unstructured":"Klopf, A.H. (1982). The hedonistic neuron: A theory of memory, learning, and intelligence. Washington, DC: Hemisphere."},{"key":"CR11","first-page":"137","volume":"2","author":"D. Michie","year":"1968","unstructured":"Michie, D. & Chambers, R.A. (1968). BOXES: An experiment in adaptive control. Machine Intelligence, 2, 137\u2013152.","journal-title":"Machine Intelligence"},{"key":"CR12","volume-title":"Efficient memory-based learning for robot control","author":"A.W. Moore","year":"1990","unstructured":"Moore, A.W. (1990). Efficient memory-based learning for robot control. Ph.D. Thesis, University of Cambridge Computer Laboratory, Cambridge, England."},{"key":"CR13","first-page":"273","volume":"1","author":"S. Omohundro","year":"1987","unstructured":"Omohundro, S. (1987). Efficient algorithms with neural network behaviour. Complex Systems, 1, 273\u2013347.","journal-title":"Complex Systems"},{"key":"CR14","doi-asserted-by":"crossref","unstructured":"Samuel, A.L. (1959). Some studies in machine learning using the game of checkers. Reprinted in E.A. Feigenbaum & J. Feldman (Eds.) (1963). Computers and thought. McGraw-Hill.","DOI":"10.1147\/rd.33.0210"},{"key":"CR15","doi-asserted-by":"crossref","first-page":"601","DOI":"10.1147\/rd.116.0601","volume":"11","author":"A.L. Samuel","year":"1967","unstructured":"Samuel, A.L. (1967). 
Some studies in machine learning using the game of checkers II: Recent progress. IBM Journal of Research and Development, 11, 601\u2013617.","journal-title":"IBM Journal of Research and Development"},{"key":"CR16","volume-title":"Temporal credit assignment in reinforcement learning","author":"R.S. Sutton","year":"1984","unstructured":"Sutton, R.S. (1984). Temporal credit assignment in reinforcement learning. Ph.D. Thesis, University of Massachusetts, Amherst, MA."},{"key":"CR17","first-page":"9","volume":"3","author":"R.S. Sutton","year":"1988","unstructured":"Sutton, R.S. (1988). Learning to predict by the methods of temporal difference. Machine Learning, 3, 9\u201344.","journal-title":"Machine Learning"},{"key":"CR18","volume-title":"Matrix iterative analysis","author":"R.S. Varga","year":"1962","unstructured":"Varga, R.S. (1962). Matrix iterative analysis. Englewood Cliffs, NJ: Prentice-Hall."},{"key":"CR19","volume-title":"Learning from delayed rewards","author":"C.J.C.H. Watkins","year":"1989","unstructured":"Watkins, C.J.C.H. (1989). Learning from delayed rewards. Ph.D. Thesis. University of Cambridge, England."},{"key":"CR20","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1016\/0893-6080(90)90088-3","volume":"3","author":"P.J. Werbos","year":"1990","unstructured":"Werbos, P.J. (1990). Consistency of HDP applied to a simple reinforcement learning problem. Neural Networks, 3, 179\u2013189.","journal-title":"Neural Networks"},{"key":"CR21","volume-title":"Adaptive signal processing","author":"B. Widrow","year":"1985","unstructured":"Widrow, B. & Stearns, S.D. (1985). Adaptive signal processing. Englewood Cliffs, NJ: Prentice-Hall."},{"key":"CR22","doi-asserted-by":"crossref","first-page":"286","DOI":"10.1016\/S0019-9958(77)90354-0","volume":"34","author":"I.H. Witten","year":"1977","unstructured":"Witten, I.H. (1977). 
An adaptive optimal controller for discrete-time Markov environments. Information and Control, 34, 286\u2013295.","journal-title":"Information and Control"}],"container-title":["Machine Learning"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/BF00992701.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/BF00992701\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/BF00992701","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,4,29]],"date-time":"2019-04-29T22:58:34Z","timestamp":1556578714000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/BF00992701"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[1992,5]]},"references-count":22,"journal-issue":{"issue":"3-4","published-print":{"date-parts":[[1992,5]]}},"alternative-id":["BF00992701"],"URL":"https:\/\/doi.org\/10.1007\/bf00992701","relation":{},"ISSN":["0885-6125","1573-0565"],"issn-type":[{"value":"0885-6125","type":"print"},{"value":"1573-0565","type":"electronic"}],"subject":[],"published":{"date-parts":[[1992,5]]}}}