Citation Relationships



Sutton RS, Barto AG (1998) Reinforcement learning: an introduction

   A reinforcement learning example (Sutton and Barto 1998)

References and models cited by this paper

References and models that cite this paper

Anastasio TJ, Gad YP (2007) Sparse cerebellar innervation can morph the dynamics of a model oculomotor neural integrator. J Comput Neurosci 22:239-54 [Journal] [PubMed]

Baras D, Meir R (2007) Reinforcement learning, spike-time-dependent plasticity, and the BCM rule. Neural Comput 19:2245-79 [Journal] [PubMed]

Bogacz R, Gurney K (2007) The basal ganglia and cortex implement optimal decision making between alternative actions. Neural Comput 19:442-77 [Journal] [PubMed]

Brzosko Z, Zannone S, Schultz W, Clopath C, Paulsen O (2017) Sequential neuromodulation of Hebbian plasticity offers mechanism for effective reward-based navigation. eLife [Journal] [PubMed]

   Sequential neuromodulation of Hebbian plasticity in reward-based navigation (Brzosko et al 2017) [Model]

Chadderdon GL, Neymotin SA, Kerr CC, Lytton WW (2012) Reinforcement learning of targeted movement in a spiking neuronal model of motor cortex. PLoS One 7:e47251 [Journal] [PubMed]

   Reinforcement learning of targeted movement (Chadderdon et al. 2012) [Model]

Clopath C, Ziegler L, Vasilaki E, Büsing L, Gerstner W (2008) Tag-trigger-consolidation: a model of early and late long-term-potentiation and depression. PLoS Comput Biol 4:e1000248 [Journal] [PubMed]

   Tag Trigger Consolidation (Clopath and Ziegler et al. 2008) [Model]

Daw ND, Courville AC, Touretzky DS (2006) Representation and timing in theories of the dopamine system. Neural Comput 18:1637-77 [Journal] [PubMed]

Florian RV (2007) Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity. Neural Comput 19:1468-502 [Journal] [PubMed]

Fujita H, Ishii S (2007) Model-based reinforcement learning for partially observable games with sampling-based state estimation. Neural Comput 19:3051-87 [Journal] [PubMed]

Gutkin BS, Dehaene S, Changeux JP (2006) A neurocomputational hypothesis for nicotine addiction. Proc Natl Acad Sci U S A 103:1106-11 [Journal] [PubMed]

Hasselmo ME (2005) A model of prefrontal cortical mechanisms for goal-directed behavior. J Cogn Neurosci 17:1115-29 [Journal] [PubMed]

   Prefrontal cortical mechanisms for goal-directed behavior (Hasselmo 2005) [Model]

Hasselmo ME, Eichenbaum H (2005) Hippocampal mechanisms for the context-dependent retrieval of episodes. Neural Netw 18:1172-90 [Journal] [PubMed]

   Hippocampal context-dependent retrieval (Hasselmo and Eichenbaum 2005) [Model]

Hazy TE, Frank MJ, O'Reilly RC (2007) Towards an executive without a homunculus: computational models of the prefrontal cortex/basal ganglia system. Philos Trans R Soc Lond B Biol Sci 362:1601-13 [Journal] [PubMed]

Izhikevich EM (2007) Solving the distal reward problem through linkage of STDP and dopamine signaling. Cereb Cortex 17:2443-52 [Journal] [PubMed]

   Linking STDP and Dopamine action to solve the distal reward problem (Izhikevich 2007) [Model]

Kulvicius T, Tamosiunaite M, Ainge J, Dudchenko P, Wörgötter F (2008) Odor supported place cell model and goal navigation in rodents. J Comput Neurosci 25:481-500 [Journal] [PubMed]

   Odor supported place cell model and goal navigation in rodents (Kulvicius et al. 2008) [Model]

Low KH, Leow WK, Ang MH Jr (2005) An ensemble of cooperative extended Kohonen maps for complex robot motion tasks. Neural Comput 17:1411-45

Morimoto J, Doya K (2007) Reinforcement learning state estimator. Neural Comput 19:730-56 [Journal] [PubMed]

Morita K, Kato A (2014) Striatal dopamine ramping may indicate flexible reinforcement learning with forgetting in the cortico-basal ganglia circuits. Front Neural Circuits 8:36 [Journal] [PubMed]

   Striatal dopamine ramping: an explanation by reinforcement learning with decay (Morita & Kato, 2014) [Model]

Moustafa AA, Cohen MX, Sherman SJ, Frank MJ (2008) A role for dopamine in temporal decision making and reward maximization in parkinsonism. J Neurosci 28:12294-304 [Journal] [PubMed]

Nakano T, Otsuka M, Yoshimoto J, Doya K (2015) A spiking neural network model of model-free reinforcement learning with high-dimensional sensory input and perceptual ambiguity. PLoS One 10:e0115620 [Journal] [PubMed]

   A spiking neural network model of model-free reinforcement learning (Nakano et al 2015) [Model]

O'Reilly RC, Frank MJ (2005) Making working memory work: a computational model of learning in the prefrontal cortex and basal ganglia. Neural Comput 18:283-328

O'Reilly RC, Frank MJ (2006) Making working memory work: a computational model of learning in the prefrontal cortex and basal ganglia. Neural Comput 18:283-328 [Journal] [PubMed]

Richmond P, Buesing L, Giugliano M, Vasilaki E (2011) Democratic population decisions result in robust policy-gradient learning: a parametric study with GPU simulations. PLoS One 6:e18539 [Journal] [PubMed]

   Democratic population decisions result in robust policy-gradient learning (Richmond et al. 2011) [Model]

Rivest F, Kalaska JF, Bengio Y (2010) Alternative time representation in dopamine models. J Comput Neurosci 28:107-30 [Journal] [PubMed]

   Alternative time representation in dopamine models (Rivest et al. 2009) [Model]

Roelfsema PR, van Ooyen A (2005) Attention-gated reinforcement learning of internal representations for classification. Neural Comput 17:2176-214 [Journal] [PubMed]

Sakai Y, Fukai T (2008) The actor-critic learning is behind the matching law: matching versus optimal behaviors. Neural Comput 20:227-51 [Journal] [PubMed]

Smith AJ, Becker S, Kapur S (2005) A computational model of the functional role of the ventral-striatal D2 receptor in the expression of previously acquired behaviors. Neural Comput 17:361-95 [Journal] [PubMed]

Soltani A, Wang XJ (2006) A biophysically based neural model of matching law behavior: melioration by stochastic synapses. J Neurosci 26:3731-44 [Journal] [PubMed]

Todorov E (2005) Stochastic optimal control and estimation methods adapted to the noise characteristics of the sensorimotor system. Neural Comput 17:1084-108 [Journal] [PubMed]

Toussaint M (2006) A sensorimotor map: modulating lateral interactions for anticipation and planning. Neural Comput 18:1132-55 [Journal] [PubMed]

Triesch J (2007) Synergies between intrinsic and synaptic plasticity mechanisms. Neural Comput 19:885-909 [Journal] [PubMed]

Troyer TW, Doupe AJ (2000) An associational model of birdsong sensorimotor learning I. Efference copy and the learning of song syllables. J Neurophysiol 84:1204-23 [Journal] [PubMed]

Troyer TW, Doupe AJ (2000) An associational model of birdsong sensorimotor learning II. Temporal hierarchies and the learning of song sequence. J Neurophysiol 84:1224-39 [Journal] [PubMed]

Wörgötter F, Porr B (2005) Temporal sequence learning, prediction, and control: a review of different models and their relation to biological mechanisms. Neural Comput 17:245-319 [Journal] [PubMed]

(34 refs)