References and models cited by this paper
Baum LE, Petrie T, Soules G, Weiss N (1970) A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains Ann Math Stat 41:164-171
Bouton ME, Nelson JB (1998) Mechanisms of feature-positive and feature-negative discrimination learning in an appetitive conditioning paradigm Occasion setting: Associative learning and cognition in animals, Schmajuk NA:Holland PC, ed. pp.69
Bradtke SJ, Duff MO (1995) Reinforcement learning methods for continuous-time Markov decision problems Advances in neural information processing systems, Tesauro G:Touretzky DS:Leen TK, ed. pp.393
Brown J, Bullock D, Grossberg S (1999) How the basal ganglia use parallel excitatory and inhibitory learning pathways to selectively respond to unexpected rewarding cues. J Neurosci 19:10502-11
Chrisman L (1992) Reinforcement learning with perceptual aliasing: The perceptual distinctions approach Proceedings of the Tenth National Conference on Artificial Intelligence :183-188
Courville AC, Daw ND, Gordon GJ, Touretzky DS (2003) Model uncertainty in classical conditioning Advances in neural information processing systems, Thrun S:Saul LK:Schölkopf B, ed.
Courville AC, Daw ND, Touretzky DS (2004) Similarity and discrimination in classical conditioning: A latent variable account Advances in neural information processing systems, Saul LK:Weiss Y:Bottou L, ed.
Courville AC, Touretzky DS (2001) Modeling temporal structure in classical conditioning Advances in neural information processing systems, Dietterich TG:Becker S:Ghahramani Z, ed. pp.3
Das T, Gosavi A, Mahadevan S, Marchalleck N (1999) Solving semi-Markov decision problems using average reward reinforcement learning Management Science 45:560-574
Daw N, Touretzky D, Skaggs W (2004) Contrasting neuronal correlates between dorsal and ventral striatum in the rat Cosyne04 Comput Sys Neurosci Abstr
Daw ND (2003) Reinforcement learning models of the dopamine system and their behavioral implications Unpublished doctoral dissertation
Daw ND, Niv Y, Dayan P (2006) Actions, values, policies, and the basal ganglia Recent breakthroughs in basal ganglia research, Bezard E, ed.
Dayan P (2002) Motivated reinforcement learning Advances in neural information processing systems, Dietterich T:Becker S:Ghahramani Z, ed. pp.11
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc B 39:1-38
Deneve S (2004) Bayesian inference in spiking neurons Advances in neural information processing systems, Saul LK:Weiss Y:Bottou L, ed.
Dickinson A, Balleine B (2002) The role of learning in motivation Stevens handbook of experimental psychology (3rd ed), Gallistel CR, ed. pp.497
Dickinson A, Hall G, Mackintosh NJ (1976) Surprise and the attenuation of blocking J Exp Psychol: Animal Behav Process 2:313-322
Dickinson A, Mackintosh NJ (1979) Reinforcer specificity in the enhancement of conditioning by posttrial surprise J Exp Psychol: Animal Behav Process 5:162-177
Faure A, Haberland U, Condé F, El Massioui N (2005) Lesion to the nigrostriatal dopamine system disrupts stimulus-response habit formation. J Neurosci 25:2771-80
Fiorillo CD, Schultz W (2001) The reward responses of dopamine neurons persist when prediction of reward is probabilistic with respect to time or occurrence Soc Neurosci Abstr 27:827
Gibbon J (1977) Scalar expectancy theory and Weber's law in animal timing Psychol Rev 84:279-325
Guedon Y, Cocozza-Thivent C (1990) Explicit state occupancy modeling by hidden semi-Markov models: Application of Derin's scheme Computer Speech And Language 4:167-192
Holland PC, Lamoureux JA, Han JS, Gallagher M (1999) Hippocampal lesions interfere with Pavlovian negative occasion setting. Hippocampus 9:143-57
Houk JC, Adams JL, Barto AG (1995) A model of how the basal ganglia generate and use neural signals that predict reinforcement. Models Of Information Processing In The Basal Ganglia, Houk JC:Davis JL:Beiser DG, ed. pp.249
Kaelbling LP, Littman ML, Cassandra AR (1998) Planning and acting in partially observable stochastic domains Art Intell 101:99-134
Kakade S, Dayan P (2000) Acquisition in autoshaping Advances in neural information processing systems, Solla SA:Leen TK:Muller KR, ed.
Kurth-Nelson Z, Redish A (2004) µagents: Action-selection in temporally dependent phenomena using temporal difference learning over a collective belief structure Soc Neurosci Abstr 30:207
Levinson SE (1986) Continuously variable duration hidden Markov models for automatic speech recognition Computer Speech And Language 1:29-45
Lewicki MS, Olshausen BA (1999) A probabilistic framework for the adaptation and comparison of image codes J Opt Soc Am A: Optics, Image, Science And Vision 16:1587-1601
Mahadevan S, Marchalleck N, Das T, Gosavi A (1997) Self-improving factory simulation using continuous-time average-reward reinforcement learning Proceedings of the 14th International Conference on Machine Learning
Mellon RC, Leak TM, Fairhurst S, Gibbon J (1995) Timing processes in the reinforcement-omission effect Animal Learn Behav 23:286-296
Moore AW, Atkeson CG (1993) Prioritized sweeping: Reinforcement learning with less data and less real time Mach Learn 13:103-130
Morris G, Arkadir D, Nevet A, Vaadia E, Bergman H (2004) Coincident but distinct messages of midbrain dopamine and striatal tonically active neurons. Neuron 43:133-43
Niv Y, Daw ND, Dayan P (2005) How fast to work: Response vigor, motivation, and tonic dopamine Advances in neural information processing systems, Saul LK:Weiss Y:Bottou L, ed.
Niv Y, Duff MO, Dayan P (2004) The effects of uncertainty on TD learning Cosyne04-Comput Sys Neurosci Abstr
O'Doherty J, Dayan P, Schultz J, Deichmann R, Friston K, Dolan RJ (2004) Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science 304:452-4
Pan WX, Schmidt R, Wickens JR, Hyland BI (2005) Dopamine cells respond to predicted events during classical conditioning: evidence for eligibility traces in the reward-learning network. J Neurosci 25:6235-42
Parkinson JA, Dalley JW, Cardinal RN, Bamford A, Fehnert B, Lachenal G, Rudarakanchana N, Halkerston KM, Robbins TW, Everitt BJ (2002) Nucleus accumbens dopamine depletion impairs both acquisition and performance of appetitive Pavlovian approach behaviour: implications for mesoaccumbens dopamine function. Behav Brain Res 137:149-63
Rao RPN (2004) Hierarchical Bayesian inference in networks of spiking neurons Advances in neural information processing systems, Saul LK:Weiss Y:Bottou L, ed.
Rao RPN, Olshausen BA, Lewicki MS (2002) Probabilistic models of the brain: Perception and neural function
Schultz W, Apicella P, Ljungberg T (1993) Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task. J Neurosci 13:900-13
Smith AJ, Becker S, Kapur S (2005) A computational model of the functional role of the ventral-striatal D2 receptor in the expression of previously acquired behaviors. Neural Comput 17:361-95
Sutton RS (1984) Temporal credit assignment in reinforcement learning Unpublished doctoral dissertation
Sutton RS (1988) Learning to predict by the methods of temporal differences Machine Learning 3:9-44
Sutton RS (1990) Integrated architectures for learning, planning, and reacting based on approximating dynamic programming Proceedings of the Seventh International Conference on Machine Learning :216-224
Sutton RS, Barto AG (1990) Time-derivative models of Pavlovian reinforcement Learning and computational neuroscience: Foundations of adaptive networks, Gabriel M:Moore J, ed. pp.497
Tsitsiklis JN, Van Roy B (2002) On average versus discounted reward temporal-difference learning Mach Learn 49:179-191
Voorn P, Vanderschuren LJ, Groenewegen HJ, Robbins TW, Pennartz CM (2004) Putting a spin on the dorsal-ventral divide of the striatum. Trends Neurosci 27:468-74
Zemel R, Huys Q, Natarajan R, Dayan P (2004) Probabilistic computation in spiking neurons Advances in neural information processing systems, Saul LK:Weiss Y:Bottou L, ed.