Reinforcement learning of targeted movement (Chadderdon et al. 2012)

Accession: 144538
"Sensorimotor control has traditionally been considered from a control theory perspective, without relation to neurobiology. In contrast, here we utilized a spiking-neuron model of motor cortex and trained it to perform a simple movement task, which consisted of rotating a single-joint “forearm” to a target. Learning was based on a reinforcement mechanism analogous to that of the dopamine system. This provided a global reward or punishment signal in response to decreasing or increasing distance from hand to target, respectively. Output was partially driven by Poisson motor babbling, creating stochastic movements that could then be shaped by learning. The virtual forearm consisted of a single segment rotated around an elbow joint, controlled by flexor and extensor muscles. ..."
Reference:
1. Chadderdon GL, Neymotin SA, Kerr CC, Lytton WW (2012) Reinforcement learning of targeted movement in a spiking neuronal model of motor cortex. PLoS One 7:e47251 [PubMed]
Model Information:
Model Type: Realistic Network;
Brain Region(s)/Organism: Neocortex;
Cell Type(s): Neocortex fast spiking (FS) interneuron; Neocortex spiking regular (RS) neuron; Neocortex spiking low threshold (LTS) neuron;
Receptor(s): GabaA; AMPA; NMDA;
Transmitter(s): Dopamine; Gaba; Glutamate;
Simulation Environment: NEURON;
Model Concept(s): Simplified Models; Synaptic Plasticity; Long-term Synaptic Plasticity; Reinforcement Learning; Reward-modulated STDP;
Implementer(s): Neymotin, Sam [Samuel.Neymotin at nki.rfmh.org]; Chadderdon, George [gchadder3 at gmail.com];
arm1d/
README
drspk.mod *
infot.mod *
intf6_.mod *
intfsw.mod *
misc.mod *
nstim.mod *
stats.mod *
updown.mod *
vecst.mod *
arm.hoc
basestdp.hoc
col.hoc *
colors.hoc *
declist.hoc *
decmat.hoc *
decnqs.hoc *
decvec.hoc *
default.hoc *
drline.hoc *
filtutils.hoc *
geom.hoc
grvec.hoc *
hinton.hoc *
infot.hoc *
init.hoc
intfsw.hoc *
labels.hoc *
local.hoc *
misc.h *
mosinit.hoc
network.hoc
nload.hoc
nqs.hoc *
nqsnet.hoc *
nrnoc.hoc *
params.hoc
run.hoc
samutils.hoc *
sense.hoc *
setup.hoc *
sim.hoc
simctrl.hoc *
stats.hoc *
stim.hoc
syncode.hoc *
units.hoc *
xgetargs.hoc *
                            
This code generates figures related to the model defined in

Chadderdon GL, Neymotin SA, Kerr CC, Lytton WW (2012) 
Reinforcement learning of targeted movement in a spiking neuronal 
model of motor cortex.  PLoS ONE.

This document provides brief installation and usage instructions.

INSTALLATION

Note: the code was designed to work on Linux machines. Following the
20160921 update (see CHANGELOG below), it also runs on Mac OS X; it
will not work on Windows. The simulation also requires a substantial
amount of memory: it will run on a machine with ~8 GB of RAM, but not
on one with ~3 GB.

Instructions:

1. Unzip all files.
2. Compile the mod files by typing "nrnivmodl *.mod" in the model
directory. This should create a directory called either i686 or
x86_64, depending on your computer's architecture, and put a file
called "special" in that directory.


USAGE

Run the simulation by typing "nrngui sim.hoc" in the model directory.
After a brief initialization, a window should appear showing the arm
moving with respect to the target X. After 200 s of simulated time,
two more windows are generated on top of this: one showing the
spiking of all of the units in the model in red for the first 1 s of
the simulation, and another showing the trajectory of the elbow joint
in blue as it learns to reach for 35 degrees. Note that the raster
plot shows 96 PR and 96 EM cells; 48 of each are related to
shoulder-joint activity of the virtual arm, and the firing of these
cells is not shown in the figures of the paper.
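
Concretely, assuming the mod files were compiled as described under
INSTALLATION, a typical session from a terminal is:

  cd arm1d
  nrngui sim.hoc   # starts the GUI and the 200 s training run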

To run with a different learning target, go into arm.hoc, line 66,
and change the value of 'targid' to pick a different elbow angle
target (0 = 135, 1 = 105, 2 = 75, 3 = 35, 4 = 0 degrees). To choose a
different learning condition, go to line 54 and change the value of
'RLMode' (0 = no learning, 1 = reward only, 2 = punishment only,
3 = reward and punishment). The random seed for the network wiring
can be changed by setting 'dvseed' on line 91 of network.hoc, and the
random seed for the babbling noise by setting 'inputseed' on line 13
of stim.hoc. After changing these parameters and saving the files,
start the simulation as above.
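
As an illustration, the edited assignments might look like the sketch
below (the values shown are examples drawn from the options listed
above; the exact line numbers and defaults in your copy of the files
may differ):

  // arm.hoc, line 54: learning condition
  RLMode = 3        // reward-and-punishment
  // arm.hoc, line 66: elbow angle target
  targid = 3        // 35 degrees
  // network.hoc, line 91: network wiring seed (example value)
  dvseed = 43
  // stim.hoc, line 13: babbling-noise seed (example value)
  inputseed = 7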

CHANGELOG

20160921 Updates from the Lytton lab to allow the model to run on Mac
OS X.

20220517 Updated MOD files to contain valid C++ and be compatible with
the upcoming versions 8.2 and 9.0 of NEURON. Updated to use the
post-2011 signature of the mcell_ran4_init function and fixed the
hashseed2 argument.

20230420 Updated MOD files for compatibility with the new data
structures in the upcoming version 9.0 of NEURON.