Citation Relationships

Sakai Y, Fukai T (2008) The actor-critic learning is behind the matching law: matching versus optimal behaviors. Neural Comput 20:227-51 [PubMed]

References and models cited by this paper

References and models that cite this paper

Barraclough DJ, Conroy ML, Lee D (2004) Prefrontal cortex and decision making in a mixed-strategy game. Nat Neurosci 7:404-10 [Journal] [PubMed]

Baum WM (1981) Optimization and the matching law as accounts of instrumental behavior. J Exp Anal Behav 36:387-403 [PubMed]

Baum WM, Rachlin HC (1969) Choice as time allocation. J Exp Anal Behav 12:861-74 [PubMed]

Breiter HC, Aharon I, Kahneman D, Dale A, Shizgal P (2001) Functional imaging of neural responses to expectancy and experience of monetary gains and losses. Neuron 30:619-39 [PubMed]

Davison M, McCarthy D (1987) The matching law: A research review

Daw ND, Touretzky DS (2001) Operant behavior suggests attentional gating of dopamine system inputs. Neurocomputing 38:1161-1167

Daw ND, Touretzky DS (2002) Long-term reward prediction in TD models of the dopamine system. Neural Comput 14:2567-83 [Journal] [PubMed]

Dayan P, Abbott LF (2001) Theoretical Neuroscience. Computational and Mathematical Modeling of Neural Systems

Dayan P, Balleine BW (2002) Reward, motivation, and reinforcement learning. Neuron 36:285-98 [PubMed]

DeCarlo LT (1985) Matching and maximizing with variable-time schedules. J Exp Anal Behav 43:75-81 [Journal] [PubMed]

Doya K (2000) Complementary roles of basal ganglia and cerebellum in learning and motor control. Curr Opin Neurobiol 10:732-9 [PubMed]

Gallistel CR, Mark TA, King AP, Latham PE (2001) The rat approximates an ideal detector of changes in rates of reward: implications for the law of effect. J Exp Psychol Anim Behav Process 27:354-72 [PubMed]

Haruno M, Kuroda T, Doya K, Toyama K, Kimura M, Samejima K, Imamizu H, Kawato M (2004) A neural correlate of reward-based behavioral learning in caudate nucleus: a functional magnetic resonance imaging study of a stochastic decision task. J Neurosci 24:1660-5 [Journal] [PubMed]

Herrnstein RJ, Heyman GM (1979) Is matching compatible with reinforcement maximization on concurrent variable interval variable ratio? J Exp Anal Behav 31:209-23 [PubMed]

Herrnstein RJ, Rachlin H, Laibson DI (1997) The matching law: papers in psychology and economics

Herrnstein RJ, Vaughan WJ (1980) Melioration and behavioral allocation. Limits to action: the allocation of individual behavior, Staddon JER, ed. pp.143

Heyman GM (1979) A Markov model description of changeover probabilities on concurrent variable-interval schedules. J Exp Anal Behav 31:41-51 [PubMed]

Heyman GM, Monaghan MM (1994) Reinforcer magnitude (sucrose concentration) and the matching law theory of response strength. J Exp Anal Behav 61:505-16 [PubMed]

Houk JC, Davis JL, Beiser DG (1995) Models of Information Processing in the Basal Ganglia

Houston AI, McNamara J (1981) How to maximize reward rate on two variable-interval paradigms. J Exp Anal Behav 35:367-96 [PubMed]

Jacobs EA, Hackenberg TD (1996) Humans' choices in situations of time-based diminishing returns: effects of fixed-interval duration and progressive-interval step size. J Exp Anal Behav 65:5-19 [Journal] [PubMed]

Knutson B, Adams CM, Fong GW, Hommer D (2001) Anticipation of increasing monetary reward selectively recruits nucleus accumbens. J Neurosci 21:RC159 [PubMed]

Mazur JE (1981) Optimization theory fails to predict performance of pigeons in a two-response situation. Science 214:823-5 [PubMed]

Mazur JE (2005) Learning and behavior (6th ed)

McClure SM, Berns GS, Montague PR (2003) Temporal prediction errors in a passive learning task activate human striatum. Neuron 38:339-46 [PubMed]

Montague PR, Berns GS (2002) Neural economics and the biological substrates of valuation. Neuron 36:265-84 [PubMed]

Montague PR, Dayan P, Sejnowski TJ (1996) A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J Neurosci 16:1936-47 [PubMed]

Morris G, Arkadir D, Nevet A, Vaadia E, Bergman H (2004) Coincident but distinct messages of midbrain dopamine and striatal tonically active neurons. Neuron 43:133-43 [Journal] [PubMed]

Platt ML, Glimcher PW (1999) Neural correlates of decision variables in parietal cortex. Nature 400:233-8 [Journal] [PubMed]

Rachlin H, Green L, Kagel J, Battalio R (1976) Economic demand theory and psychological studies of choice. The psychology of learning and motivation, Bower G, ed. pp.129

Sakagami T, Hursh SR, Christensen J, Silberberg A (1989) Income maximizing in concurrent interval-ratio schedules. J Exp Anal Behav 52:41-6 [PubMed]

Samejima K, Ueda Y, Doya K, Kimura M (2005) Representation of action-specific reward values in the striatum. Science 310:1337-40 [Journal] [PubMed]

Savastano HI, Fantino E (1994) Human choice in concurrent ratio-interval schedules of reinforcement. J Exp Anal Behav 61:453-63 [Journal] [PubMed]

Schultz W (2004) Neural coding of basic reward terms of animal learning theory, game theory, microeconomics and behavioural ecology. Curr Opin Neurobiol 14:139-47 [Journal] [PubMed]

Schultz W, Dayan P, Montague PR (1997) A neural substrate of prediction and reward. Science 275:1593-9 [PubMed]

Seung HS (2003) Learning in spiking neural networks by reinforcement of stochastic synaptic transmission. Neuron 40:1063-73 [PubMed]

Silberberg A, Thomas JR, Berendzen N (1991) Human choice on concurrent variable-interval variable-ratio schedules. J Exp Anal Behav 56:575-84 [Journal] [PubMed]

Staddon JE, Hinson JM (1983) Optimization: a result or a mechanism? Science 221:976-7

Stubbs DA, Pliskoff SS, Reid HM (1977) Concurrent schedules: a quantitative relation between changeover behavior and its consequences. J Exp Anal Behav 27:85-96 [PubMed]

Sugrue LP, Corrado GS, Newsome WT (2004) Matching behavior and the representation of value in the parietal cortex. Science 304:1782-7 [Journal] [PubMed]

Sutton RS, Barto AG (1998) Reinforcement learning: an introduction [Journal]

   A reinforcement learning example (Sutton and Barto 1998) [Model]

Tanaka SC, Doya K, Okada G, Ueda K, Okamoto Y, Yamawaki S (2004) Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops. Nat Neurosci 7:887-93 [Journal] [PubMed]

Vyse SA, Belke TW (1992) Maximizing versus matching on concurrent variable-interval schedules. J Exp Anal Behav 58:325-34 [PubMed]

Wang XJ (2002) Probabilistic decision making by slow reverberation in cortical circuits. Neuron 36:955-68 [PubMed]

(44 refs)