Citation Relationships



Daw ND, Courville AC, Touretzky DS (2006) Representation and timing in theories of the dopamine system. Neural Comput 18:1637-77 [PubMed]

References and models cited by this paper

References and models that cite this paper

Baum LE, Petrie T, Soules G, Weiss N (1970) A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains Ann Math Stat 41:164-171

Bayer HM, Glimcher PW (2005) Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47:129-41 [Journal] [PubMed]

Bouton ME, Nelson JB (1998) Mechanisms of feature-positive and feature-negative discrimination learning in an appetitive conditioning paradigm Occasion setting: Associative learning and cognition in animals, Schmajuk NA:Holland PC, ed. pp.69

Bradtke SJ, Duff MO (1995) Reinforcement learning methods for continuous-time Markov decision problems Advances in neural information processing systems, Tesauro G:Touretzky DS:Leen TK, ed. pp.393

Brown J, Bullock D, Grossberg S (1999) How the basal ganglia use parallel excitatory and inhibitory learning pathways to selectively respond to unexpected rewarding cues. J Neurosci 19:10502-11 [PubMed]

Chrisman L (1992) Reinforcement learning with perceptual aliasing: The perceptual distinctions approach Proceedings of the Tenth National Conference on Artificial Intelligence :183-188

Courville AC, Daw ND, Gordon GJ, Touretzky DS (2003) Model uncertainty in classical conditioning Advances in neural information processing systems, Thrun S:Saul LK:Scholkopf B, ed.

Courville AC, Daw ND, Touretzky DS (2004) Similarity and discrimination in classical conditioning: A latent variable account Advances in neural information processing systems, Saul LK:Weiss Y:Bottou L, ed.

Courville AC, Touretzky DS (2001) Modeling temporal structure in classical conditioning Advances in neural information processing systems, Dietterich TG:Becker S:Ghahramani Z, ed. pp.3

Das T, Gosavi A, Mahadevan S, Marchalleck N (1999) Solving semi-Markov decision problems using average reward reinforcement learning Management Science 45:560-574

Daw N, Touretzky D, Skaggs W (2004) Contrasting neuronal correlates between dorsal and ventral striatum in the rat Cosyne04 Comput Sys Neurosci Abstr

Daw ND (2003) Reinforcement learning models of the dopamine system and their behavioral implications Unpublished doctoral dissertation

Daw ND, Kakade S, Dayan P (2002) Opponent interactions between serotonin and dopamine. Neural Netw 15:603-16 [PubMed]

Daw ND, Niv Y, Dayan P (2005) Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat Neurosci 8:1704-11 [Journal] [PubMed]

Daw ND, Niv Y, Dayan P (2006) Actions, values, policies, and the basal ganglia Recent breakthroughs in basal ganglia research, Bezard E, ed.

Daw ND, Touretzky DS (2002) Long-term reward prediction in TD models of the dopamine system. Neural Comput 14:2567-83 [Journal] [PubMed]

Dayan P (2002) Motivated reinforcement learning Advances in neural information processing systems, Dietterich TG:Becker S:Ghahramani Z, ed. pp.11

Dayan P, Balleine BW (2002) Reward, motivation, and reinforcement learning. Neuron 36:285-98 [PubMed]

Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc B 39:1-38

Deneve S (2004) Bayesian inference in spiking neurons Advances in neural information processing systems, Saul LK:Weiss Y:Bottou L, ed.

Dickinson A, Balleine B (2002) The role of learning in motivation Stevens' handbook of experimental psychology (3rd ed), Gallistel CR, ed. pp.497

Dickinson A, Hall G, Mackintosh NJ (1976) Surprise and the attenuation of blocking J Exp Psychol: Animal Behav Process 2:313-322

Dickinson A, Mackintosh NJ (1979) Reinforcer specificity in the enhancement of conditioning by posttrial surprise J Exp Psychol: Animal Behav Process 5:162-177

Dickinson A, Smith J, Mirenowicz J (2000) Dissociation of Pavlovian and instrumental incentive learning under dopamine antagonists. Behav Neurosci 114:468-83 [PubMed]

Doya K (1999) What are the computations of the cerebellum, the basal ganglia and the cerebral cortex? Neural Netw 12:961-974 [PubMed]

Doya K (2000) Complementary roles of basal ganglia and cerebellum in learning and motor control. Curr Opin Neurobiol 10:732-9 [PubMed]

Faure A, Haberland U, Condé F, El Massioui N (2005) Lesion to the nigrostriatal dopamine system disrupts stimulus-response habit formation. J Neurosci 25:2771-80 [Journal] [PubMed]

Fiorillo CD, Schultz W (2001) The reward responses of dopamine neurons persist when prediction of reward is probabilistic with respect to time or occurrence Soc Neurosci Abstr 27:827

Fiorillo CD, Tobler PN, Schultz W (2003) Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299:1898-902 [Journal] [PubMed]

Gallistel CR, Gibbon J (2000) Time, rate, and conditioning. Psychol Rev 107:289-344 [PubMed]

Gallistel CR, King A, McDonald R (2004) Sources of variability and systematic error in mouse timing behavior. J Exp Psychol Anim Behav Process 30:3-16 [Journal] [PubMed]

Gibbon J (1977) Scalar expectancy theory and Weber's law in animal timing Psychol Rev 84:279-325

Gold JI, Shadlen MN (2002) Banburismus and the brain: decoding the relationship between sensory stimuli, decisions, and reward. Neuron 36:299-308 [PubMed]

Guedon Y, Cocozza-Thivent C (1990) Explicit state occupancy modeling by hidden semi-Markov models: Application of Derin's scheme Computer Speech And Language 4:167-192

Holland PC (1988) Excitation and inhibition in unblocking. J Exp Psychol Anim Behav Process 14:261-79 [PubMed]

Holland PC, Kenmuir C (2005) Variations in unconditioned stimulus processing in unblocking. J Exp Psychol Anim Behav Process 31:155-71 [Journal] [PubMed]

Holland PC, Lamoureux JA, Han JS, Gallagher M (1999) Hippocampal lesions interfere with Pavlovian negative occasion setting. Hippocampus 9:143-57 [Journal] [PubMed]

Hollerman JR, Schultz W (1998) Dopamine neurons report an error in the temporal prediction of reward during learning. Nat Neurosci 1:304-9 [Journal] [PubMed]

Houk JC, Adams JL, Barto AG (1995) A model of how the basal ganglia generate and use neural signals that predict reinforcement. Models Of Information Processing In The Basal Ganglia, Houk JC:Davis JL:Beiser DG, ed. pp.249

Joel D, Niv Y, Ruppin E (2002) Actor-critic models of the basal ganglia: new anatomical and computational perspectives. Neural Netw 15:535-47 [PubMed]

Kaelbling LP, Littman ML, Cassandra AR (1998) Planning and acting in partially observable stochastic domains Art Intell 101:99-134

Kakade S, Dayan P (2000) Acquisition in autoshaping Advances in neural information processing systems, Solla SA:Leen TK:Muller KR, ed.

Kakade S, Dayan P (2002) Acquisition and extinction in autoshaping. Psychol Rev 109:533-44 [PubMed]

Killeen PR, Fetterman JG (1988) A behavioral theory of timing. Psychol Rev 95:274-95 [PubMed]

Kurth-Nelson Z, Redish A (2004) µagents: Action-selection in temporally dependent phenomena using temporal difference learning over a collective belief structure Soc Neurosci Abstr 30:207

Levinson SE (1986) Continuously variable duration hidden Markov models for automatic speech recognition Computer Speech And Language 1:29-45

Lewicki MS (2002) Efficient coding of natural sounds. Nat Neurosci 5:356-63 [Journal] [PubMed]

Lewicki MS, Olshausen BA (1999) A probabilistic framework for the adaptation and comparison of image codes J Opt Soc Am A: Optics, Image, Science And Vision 16:1587-1601

Ljungberg T, Apicella P, Schultz W (1992) Responses of monkey dopamine neurons during learning of behavioral reactions. J Neurophysiol 67:145-63 [Journal] [PubMed]

Machado A (1997) Learning the temporal dynamics of behavior. Psychol Rev 104:241-65 [PubMed]

Mahadevan S, Marchalleck N, Das T, Gosavi A (1997) Self-improving factory simulation using continuous-time average-reward reinforcement learning Proceedings of the 14th International Conference on Machine Learning

Matell MS, Meck WH (1999) Reinforcement-induced within-trial resetting of an internal clock. Behav Processes 45:159-71 [PubMed]

McClure SM, Daw ND, Montague PR (2003) A computational substrate for incentive salience. Trends Neurosci 26:423-8 [PubMed]

Mellon RC, Leak TM, Fairhurst S, Gibbon J (1995) Timing processes in the reinforcement-omission effect Animal Learn Behav 23:286-296

Mirenowicz J, Schultz W (1996) Preferential activation of midbrain dopamine neurons by appetitive rather than aversive stimuli. Nature 379:449-51 [Journal] [PubMed]

Montague PR, Dayan P, Sejnowski TJ (1996) A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J Neurosci 16:1936-47 [PubMed]

Moore AW, Atkeson CG (1993) Prioritized sweeping: Reinforcement learning with less data and less real time Mach Learn 13:103-130

Morris G, Arkadir D, Nevet A, Vaadia E, Bergman H (2004) Coincident but distinct messages of midbrain dopamine and striatal tonically active neurons. Neuron 43:133-43 [Journal] [PubMed]

Niv Y, Daw ND, Dayan P (2005) How fast to work: Response vigor, motivation, and tonic dopamine Advances in neural information processing systems, Saul LK:Weiss Y:Bottou L, ed.

Niv Y, Duff MO, Dayan P (2004) The effects of uncertainty on TD learning Cosyne04-Comput Sys Neurosci Abstr

Niv Y, Duff MO, Dayan P (2005) Dopamine, uncertainty and TD learning. Behav Brain Funct 1:6 [Journal] [PubMed]

O'Doherty J, Dayan P, Schultz J, Deichmann R, Friston K, Dolan RJ (2004) Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science 304:452-4 [Journal] [PubMed]

Owen AM (1997) Cognitive planning in humans: neuropsychological, neuroanatomical and neuropharmacological perspectives. Prog Neurobiol 53:431-50 [PubMed]

Pan WX, Schmidt R, Wickens JR, Hyland BI (2005) Dopamine cells respond to predicted events during classical conditioning: evidence for eligibility traces in the reward-learning network. J Neurosci 25:6235-42 [Journal] [PubMed]

Parkinson JA, Dalley JW, Cardinal RN, Bamford A, Fehnert B, Lachenal G, Rudarakanchana N, Halkerston KM, Robbins TW, Everitt BJ (2002) Nucleus accumbens dopamine depletion impairs both acquisition and performance of appetitive Pavlovian approach behaviour: implications for mesoaccumbens dopamine function. Behav Brain Res 137:149-63 [PubMed]

Rao RPN (2004) Hierarchical Bayesian inference in networks of spiking neurons Advances in neural information processing systems, Saul LK:Weiss Y:Bottou L, ed.

Rao RPN, Olshausen BA, Lewicki MS (2002) Probabilistic models of the brain: Perception and neural function

Satoh T, Nakai S, Sato T, Kimura M (2003) Correlated coding of motivation and outcome of decision by dopamine neurons. J Neurosci 23:9913-23 [PubMed]

Schultz W (1998) Predictive reward signal of dopamine neurons. J Neurophysiol 80:1-27 [Journal] [PubMed]

Schultz W, Apicella P, Ljungberg T (1993) Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task. J Neurosci 13:900-13 [PubMed]

Schultz W, Dayan P, Montague PR (1997) A neural substrate of prediction and reward. Science 275:1593-9 [PubMed]

Schultz W, Romo R (1990) Dopamine neurons of the monkey midbrain: contingencies of responses to stimuli eliciting immediate behavioral reactions. J Neurophysiol 63:607-24 [Journal] [PubMed]

Smith AJ, Becker S, Kapur S (2005) A computational model of the functional role of the ventral-striatal D2 receptor in the expression of previously acquired behaviors. Neural Comput 17:361-95 [Journal] [PubMed]

Staddon JE, Cerutti DT (2003) Operant conditioning. Annu Rev Psychol 54:115-44 [Journal] [PubMed]

Staddon JE, Higa JJ (1999) Time and memory: towards a pacemaker-free theory of interval timing. J Exp Anal Behav 71:215-51 [Journal] [PubMed]

Staddon JE, Innis NK (1969) Reinforcement omission on fixed-interval schedules. J Exp Anal Behav 12:689-700 [PubMed]

Suri RE (2001) Anticipatory responses of dopamine neurons and cortical neurons reproduced by internal model. Exp Brain Res 140:234-40 [Journal] [PubMed]

Suri RE, Schultz W (1998) Learning of sequential movements by neural network model with dopamine-like reinforcement signal. Exp Brain Res 121:350-4 [PubMed]

Suri RE, Schultz W (1999) A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task. Neuroscience 91:871-90 [PubMed]

Sutton RS (1984) Temporal credit assignment in reinforcement learning Unpublished doctoral dissertation

Sutton RS (1988) Learning to predict by the method of temporal differences Machine Learning 3:9-44

Sutton RS (1990) Integrated architectures for learning, planning, and reacting based on approximating dynamic programming Proceedings of the Seventh International Conference on Machine Learning :216-224

Sutton RS, Barto AG (1990) Time-derivative models of Pavlovian reinforcement Learning and computational neuroscience: Foundations of adaptive networks, Gabriel M:Moore J, ed. pp.497

Sutton RS, Barto AG (1998) Reinforcement learning: an introduction [Journal]

   A reinforcement learning example (Sutton and Barto 1998) [Model]

Szita I, Lorincz A (2004) Kalman filter control embedded into the reinforcement learning framework. Neural Comput 16:491-9 [PubMed]

Tobler PN, Dickinson A, Schultz W (2003) Coding of predicted reward omission by dopamine neurons in a conditioned inhibition paradigm. J Neurosci 23:10402-10 [PubMed]

Tsitsiklis JN, Van Roy B (2002) On average versus discounted reward temporal-difference learning Mach Learn 49:179-191

Ungless MA, Magill PJ, Bolam JP (2004) Uniform inhibition of dopamine neurons in the ventral tegmental area by aversive stimuli. Science 303:2040-2 [Journal] [PubMed]

Voorn P, Vanderschuren LJ, Groenewegen HJ, Robbins TW, Pennartz CM (2004) Putting a spin on the dorsal-ventral divide of the striatum. Trends Neurosci 27:468-74 [Journal] [PubMed]

Waelti P, Dickinson A, Schultz W (2001) Dopamine responses comply with basic assumptions of formal learning theory. Nature 412:43-8 [Journal] [PubMed]

Yin H, Barnet RC, Miller RR (1994) Second-order conditioning and Pavlovian conditioned inhibition: operational similarities and differences. J Exp Psychol Anim Behav Process 20:419-28 [PubMed]

Zemel R, Huys Q, Natarajan R, Dayan P (2004) Probabilistic computation in spiking neurons Advances in neural information processing systems, Saul LK:Weiss Y:Bottou L, ed.

References and models that cite this paper

Fuhs MC, Touretzky DS (2007) Context learning in the rodent hippocampus. Neural Comput 19:3173-215 [Journal] [PubMed]

Rivest F, Kalaska JF, Bengio Y (2010) Alternative time representation in dopamine models. J Comput Neurosci 28:107-30 [Journal] [PubMed]

   Alternative time representation in dopamine models (Rivest et al. 2009) [Model]

(94 refs)