Striatal dopamine ramping: an explanation by reinforcement learning with decay (Morita & Kato, 2014)

Accession: 153573
Incorporating decay of learned values into temporal-difference (TD) learning (Sutton & Barto, 1998, Reinforcement Learning (MIT Press)) causes the TD reward prediction error (RPE) to ramp. Given the hypothesis that dopamine represents TD RPE (Montague et al., 1996, J Neurosci 16:1936; Schultz et al., 1997, Science 275:1593), this could explain the reported ramping of dopamine concentration in the striatum during a reward-associated spatial navigation task (Howe et al., 2013, Nature 500:575).
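The effect described above can be illustrated with a minimal Python sketch. This is not the authors' MATLAB code from the zip archive. It assumes a linear track of states with a reward only at the final state, a standard TD(0) value update, and a multiplicative decay applied to every learned value at every time step. The function name `simulate_td_with_decay` and all parameter values (`n_states`, `alpha`, `gamma`, `decay`, `n_trials`) are hypothetical choices made for illustration and are not taken from the paper.

```python
def simulate_td_with_decay(n_states=7, alpha=0.5, gamma=1.0,
                           decay=0.01, reward=1.0, n_trials=500):
    """TD(0) on a linear track with value decay (illustrative assumptions only).

    Returns the learned state values and the RPE observed at each state
    during the final trial.
    """
    values = [0.0] * n_states   # learned value of each state on the track
    rpe = [0.0] * n_states      # TD error recorded at each state (last trial)

    for _ in range(n_trials):
        for s in range(n_states):
            # Reward is delivered only at the final (goal) state.
            r = reward if s == n_states - 1 else 0.0
            # Value of the next state; zero after the goal (end of trial).
            v_next = values[s + 1] if s + 1 < n_states else 0.0

            # Standard TD reward prediction error and value update.
            delta = r + gamma * v_next - values[s]
            values[s] += alpha * delta
            rpe[s] = delta

            # Assumed decay rule: every learned value shrinks a little
            # at every time step (decay=0 recovers plain TD learning).
            values = [(1.0 - decay) * v for v in values]

    return values, rpe


if __name__ == "__main__":
    _, rpe_decay = simulate_td_with_decay()
    _, rpe_plain = simulate_td_with_decay(decay=0.0)
    print("RPE along the track with decay:   ", [round(d, 3) for d in rpe_decay])
    print("RPE along the track without decay:", [round(d, 3) for d in rpe_plain])
```

Under these assumptions the RPE with decay stays positive and grows toward the rewarded state, whereas without decay it vanishes once the values are learned. Because decay keeps eroding the values, every state is slightly under-predicted at each visit, and the resulting error is largest near the reward.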
Reference:
1. Morita K, Kato A (2014) Striatal dopamine ramping may indicate flexible reinforcement learning with forgetting in the cortico-basal ganglia circuits. Front. Neural Circuits 8:36
Model Information
Model Type: Realistic Network;
Brain Region(s)/Organism:
Cell Type(s):
Channel(s):
Gap Junctions:
Receptor(s):
Gene(s):
Transmitter(s): Dopamine;
Simulation Environment: MATLAB;
Model Concept(s): Reinforcement Learning;
Implementer(s): Morita, Kenji [morita at p.u-tokyo.ac.jp];

References and models cited by this paper

Bayer HM, Glimcher PW (2005) Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47:129-41

Bayer HM, Lau B, Glimcher PW (2007) Statistics of midbrain dopamine neuron spike trains in the awake primate. J Neurophysiol 98:1428-39

Bjorklund A, Dunnett SB (2007) Dopamine neuron systems in the brain: an update. Trends Neurosci 30:194-202

Boeijinga PH, Mulder AB, Pennartz CM, Manshanden I, Lopes da Silva FH (1993) Responses of the nucleus accumbens following fornix/fimbria stimulation in the rat. Identification and long-term potentiation of mono- and polysynaptic pathways. Neuroscience 53:1049-58

Bolam JP, Pissadaki EK (2012) Living on the edge with too many mouths to feed: why dopamine neurons die. Mov Disord 27:1478-83

Bromberg-Martin ES, Matsumoto M, Hikosaka O (2010) Dopamine in motivational control: rewarding, aversive, and alerting. Neuron 68:815-34

Calabresi P, Maj R, Pisani A, Mercuri NB, Bernardi G (1992) Long-term synaptic depression in the striatum: physiological and pharmacological characterization. J Neurosci 12:4224-33

Doya K (2000) Complementary roles of basal ganglia and cerebellum in learning and motor control. Curr Opin Neurobiol 10:732-9

Fiorillo CD, Tobler PN, Schultz W (2003) Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299:1898-902

Gerfen CR, Surmeier DJ (2011) Modulation of striatal projection systems by dopamine. Annu Rev Neurosci 34:441-66

Gershman SJ (2014) Dopamine ramps are a consequence of reward prediction errors. Neural Comput 26:467-71

Glimcher PW (2011) Understanding dopamine and reinforcement learning: the dopamine reward prediction error hypothesis. Proc Natl Acad Sci U S A 108 Suppl 3:15647-54

Gustafsson B, Asztely F, Hanse E, Wigstrom H (1989) Onset characteristics of long-term potentiation in the guinea-pig hippocampal CA1 region in vitro. Eur J Neurosci 1:382-394

Hardt O, Nader K, Nadel L (2013) Decay happens: the role of active forgetting in memory. Trends Cogn Sci 17:111-20

Hardt O, Nader K, Wang YT (2014) GluA2-dependent AMPA receptor endocytosis and the decay of early and late long-term potentiation: possible mechanisms for forgetting of short- and long-term memories. Philos Trans R Soc Lond B Biol Sci 369:20130141-20

Hart AS, Rutledge RB, Glimcher PW, Phillips PE (2014) Phasic dopamine release in the rat nucleus accumbens symmetrically encodes a reward prediction error term. J Neurosci 34:698-704

Howe MW, Tierney PL, Sandberg SG, Phillips PE, Graybiel AM (2013) Prolonged dopamine signalling in striatum signals proximity and value of distant rewards. Nature 500:575-9

Ito M, Doya K (2011) Multiple representations and algorithms for reinforcement learning in the cortico-basal ganglia circuit. Curr Opin Neurobiol 21:368-73

Kawagoe R, Takikawa Y, Hikosaka O (2004) Reward-predicting activity of dopamine and caudate neurons--a possible mechanism of motivational control of saccadic eye movement. J Neurophysiol 91:1013-24

Laughlin SB (2001) Energy as a constraint on the coding and processing of sensory information. Curr Opin Neurobiol 11:475-80

Matsuzaki M, Honkura N, Ellis-Davies GC, Kasai H (2004) Structural basis of long-term potentiation in single dendritic spines. Nature 429:761-6

Montague PR, Dayan P, Sejnowski TJ (1996) A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J Neurosci 16:1936-47

Montague PR, Hyman SE, Cohen JD (2004) Computational roles for dopamine in behavioural control. Nature 431:760-7

Morita K (2014) Differential cortical activation of the striatal direct and indirect-pathway cells: reconciling the anatomical and optogenetic results by a computational method. J Neurophysiol 21: -73

Morita K, Morishima M, Sakai K, Kawaguchi Y (2012) Reinforcement learning: computing the temporal difference of values via distinct corticostriatal pathways. Trends Neurosci 35:457-67

Morita K, Morishima M, Sakai K, Kawaguchi Y (2013) Dopaminergic control of motivation and reinforcement learning: a closed-circuit account for reward-oriented behavior. J Neurosci 33:8866-90

Morris G, Nevet A, Arkadir D, Vaadia E, Bergman H (2006) Midbrain dopamine neurons encode decisions for future action. Nat Neurosci 9:1057-63

Niv Y (2013) Neuroscience: Dopamine ramps up. Nature 500:533-5

Niv Y, Daw ND, Dayan P (2006) Choice values. Nat Neurosci 9:987-8

Niv Y, Daw ND, Joel D, Dayan P (2007) Tonic dopamine: opportunity costs and the control of response vigor. Psychopharmacology (Berl) 191:507-20

Niv Y, Duff MO, Dayan P (2005) Dopamine, uncertainty and TD learning. Behav Brain Funct 1:6-43

O'Doherty JP, Hampton A, Kim H (2007) Model-based fMRI and its application to reward learning and decision making. Ann N Y Acad Sci 1104:35-53

Pennartz CM, Ito R, Verschure PF, Battaglia FP, Robbins TW (2011) The hippocampal-striatal axis in learning, prediction and goal-directed behavior. Trends Neurosci 34:548-59

Pissadaki EK, Bolam JP (2013) The energy cost of action potential propagation in dopamine neurons: clues to susceptibility in Parkinson's disease. Front Comput Neurosci 7:13-59

Potjans W, Diesmann M, Morrison A (2011) An imperfect dopaminergic error signal can drive temporal-difference learning. PLoS Comput Biol 7:e1001133-59

Rangel A, Camerer C, Montague PR (2008) A framework for studying the neurobiology of value-based decision making. Nat Rev Neurosci 9:545-56

Reynolds JN, Hyland BI, Wickens JR (2001) A cellular mechanism of reward-related learning. Nature 413:67-70

Roesch MR, Calu DJ, Schoenbaum G (2007) Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nat Neurosci 10:1615-24

Rummery GA, Niranjan M (1994) On-line Q-learning using connectionist systems. Technical Report CUED/F-INFENG/TR 166

Samejima K, Ueda Y, Doya K, Kimura M (2005) Representation of action-specific reward values in the striatum. Science 310:1337-40

Schultz W, Dayan P, Montague PR (1997) A neural substrate of prediction and reward. Science 275:1593-9

Shen W, Flajolet M, Greengard P, Surmeier DJ (2008) Dichotomous dopaminergic control of striatal synaptic plasticity. Science 321:848-51

Steinberg EE, Keiflin R, Boivin JR, Witten IB, Deisseroth K, Janak PH (2013) A causal link between prediction errors, dopamine neurons and learning. Nat Neurosci 16:966-73

Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. MIT Press

   A reinforcement learning example (Sutton and Barto 1998) [Model]

Threlfell S, Lalic T, Platt NJ, Jennings KA, Deisseroth K, Cragg SJ (2012) Striatal dopamine release is triggered by synchronized activity in cholinergic interneurons. Neuron 75:58-64

Ungerstedt U (1971) Stereotaxic mapping of the monoamine pathways in the rat brain. Acta Physiol Scand Suppl 367:1-48

Watabe-Uchida M, Zhu L, Ogawa SK, Vamanrao A, Uchida N (2012) Whole-brain mapping of direct inputs to midbrain dopamine neurons. Neuron 74:858-73

Watkins CJCH (1989) Learning from delayed rewards. Unpublished doctoral dissertation

Xiao MY, Niu YP, Wigstrom H (1996) Activity-dependent decay of early LTP revealed by dual EPSP recording in hippocampal slices from young rats. Eur J Neurosci 8:1916-23

(49 refs)