Reward-modulated STDP (Legenstein et al. 2008)

Accession:116837
"... This article provides tools for an analytic treatment of reward-modulated STDP, which allows us to predict under which conditions reward-modulated STDP will achieve a desired learning effect. These analytical results imply that neurons can learn through reward-modulated STDP to classify not only spatial but also temporal firing patterns of presynaptic neurons. They also can learn to respond to specific presynaptic firing patterns with particular spike patterns. Finally, the resulting learning theory predicts that even difficult credit-assignment problems, where it is very hard to tell which synaptic weights should be modified in order to increase the global reward for the system, can be solved in a self-organizing manner through reward-modulated STDP. This yields an explanation for a fundamental experimental result on biofeedback in monkeys by Fetz and Baker. In this experiment monkeys were rewarded for increasing the firing rate of a particular neuron in the cortex and were able to solve this extremely difficult credit assignment problem. ... In addition our model demonstrates that reward-modulated STDP can be applied to all synapses in a large recurrent neural network without endangering the stability of the network dynamics."
Reference:
1. Legenstein R, Pecevski D, Maass W (2008) A learning theory for reward-modulated spike-timing-dependent plasticity with application to biofeedback. PLoS Comput Biol 4(10):e1000180
Model Information:
Model Type: Realistic Network;
Brain Region(s)/Organism: Neocortex;
Simulation Environment: Python; PCSIM;
Model Concept(s): Pattern Recognition; Spatio-temporal Activity Patterns; Reinforcement Learning; STDP; Biofeedback; Reward-modulated STDP;
This directory contains the scripts for computer simulation 1
(with the weight-dependent RM-STDP rule) from

    Legenstein R, Pecevski D, Maass W (2008) A learning theory for
    reward-modulated spike-timing-dependent plasticity with application
    to biofeedback. PLoS Comput Biol 4(10):e1000180

The scripts produce supplementary figures 3 and 4.
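
For orientation, the core idea of the learning rule can be sketched in a few
lines of Python. This is only an illustrative sketch of reward-modulated STDP
in general (it is not the PCSIM implementation used by these scripts, and the
parameter values and names are placeholders): ordinary STDP pairings are
accumulated in an eligibility trace, and the weight only changes when a
reward signal gates that trace.

    # Minimal reward-modulated STDP sketch (illustrative values only,
    # not the parameters of the paper or of the scripts in this directory).
    import numpy as np

    def rmstdp_step(w, c, t, pre_spike, post_spike, last_pre, last_post, reward,
                    A_plus=0.01, A_minus=0.012, tau_stdp=20.0, tau_c=500.0,
                    dt=1.0, w_min=0.0, w_max=1.0):
        """Advance one synapse by one time step dt (times in ms).

        w          -- synaptic weight
        c          -- eligibility trace
        pre_spike  -- True if the presynaptic neuron fired in this step
        post_spike -- True if the postsynaptic neuron fired in this step
        last_pre   -- time of the most recent presynaptic spike (or None)
        last_post  -- time of the most recent postsynaptic spike (or None)
        reward     -- reward signal d(t), e.g. deviation from a running baseline
        """
        # STDP pairings do not change the weight directly; they are
        # accumulated in the eligibility trace c.
        stdp = 0.0
        if post_spike and last_pre is not None:
            stdp += A_plus * np.exp(-(t - last_pre) / tau_stdp)    # pre-before-post
        if pre_spike and last_post is not None:
            stdp -= A_minus * np.exp(-(t - last_post) / tau_stdp)  # post-before-pre
        c = c + dt * (-c / tau_c) + stdp                 # leaky eligibility trace
        w = np.clip(w + dt * reward * c, w_min, w_max)   # reward gates the weight update
        return w, c

A weight-dependent variant, as used in computer simulation 1, would in
addition scale the potentiation and depression amplitudes with the current
weight w.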

To create these figures, follow these steps:

1. The computer simulation is set up to run as an MPI application
   on 16 computing nodes of a cluster, with 2 processes per computing node.
   To set the list of machine names you want to use, edit the file
   start_simulation.py (a hypothetical sketch of such a machine list is
   given after this list of steps).
   
2. Start mpdboot on the cluster machines; see the mpich2 documentation on how
   to do this (an example invocation is shown after this list of steps).
   
3. Execute:

    start_simulation.py
    
    This is an executable file; you do not need to run 'python start_simulation.py'.

    Wait until the simulation finishes. The script will produce one HDF5 file in the current directory.

4. To create supplementary figure 3, run:
   
      ipython -pylab figure_draft_journal.py
      
5. The next script uses IPython's parallel computing capabilities to parallelize
   some of the calculations. To set which machines the IPython cluster should
   run on, edit the clusterconf.py file (a hypothetical sketch is given after
   this list of steps).
      
      
6. Then, to create supplementary figure 4, run:
   
      ipython -pylab figure_journal_ai_analysis.py
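
Notes on steps 1 and 2: the exact configuration depends on your cluster. As a
rough illustration (the variable name and host names below are hypothetical,
not necessarily those used in start_simulation.py), a machine list might look
like:

    machines = ["node01", "node02", "node03", "node04",
                "node05", "node06", "node07", "node08",
                "node09", "node10", "node11", "node12",
                "node13", "node14", "node15", "node16"]

With MPICH2's mpd process manager, the daemons are typically started with
something like (consult the mpich2 documentation for the options that match
your installation):

    mpdboot -n 16 -f mpd.hosts

where mpd.hosts lists one cluster host per line.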
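
Note on step 5: clusterconf.py tells the scripts which machines the IPython
cluster should use. Its exact contents depend on your setup and IPython
version; as a purely hypothetical sketch, it may boil down to a host list of
the form below (the actual variable names expected by the scripts are defined
in clusterconf.py itself):

    # hypothetical sketch; check clusterconf.py for the real variable names
    cluster_machines = ["node01", "node02", "node03", "node04"]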
      

      
