This article provides tools for an analytic treatment of reward-modulated
STDP, which allows us to predict under which conditions reward-modulated STDP will achieve a desired learning effect.
These analytical results imply that neurons can learn through reward-modulated STDP to classify not only spatial but
also temporal firing patterns of presynaptic neurons.
They can also learn to respond to specific presynaptic firing patterns
with particular spike patterns.
Finally, the resulting learning theory predicts that even difficult credit-assignment problems,
where it is very hard to tell which synaptic weights should be modified in order to increase the global reward for the system,
can be solved in a self-organizing manner through reward-modulated STDP.
This yields an explanation for a fundamental
experimental result on biofeedback in monkeys by Fetz and Baker.
In this experiment, monkeys were rewarded for
increasing the firing rate of a particular neuron in the cortex and were able to solve this extremely difficult credit-assignment problem.
In addition, our model
demonstrates that reward-modulated STDP can be applied to all synapses in a large recurrent neural network without
endangering the stability of the network dynamics.