Circuits that contain the Model Concept: Reward-modulated STDP

Models and descriptions:
1. Acetylcholine-modulated plasticity in reward-driven navigation (Zannone et al. 2018)
"Neuromodulation plays a fundamental role in the acquisition of new behaviours. In previous experimental work, we showed that acetylcholine biases hippocampal synaptic plasticity towards depression, and the subsequent application of dopamine can retroactively convert depression into potentiation. We also demonstrated that incorporating this sequentially neuromodulated Spike- Timing-Dependent Plasticity (STDP) rule in a network model of navigation yields effective learning of changing reward locations. Here, we employ computational modelling to further characterize the effects of cholinergic depression on behaviour. We find that acetylcholine, by allowing learning from negative outcomes, enhances exploration over the action space. We show that this results in a variety of effects, depending on the structure of the model, the environment and the task. Interestingly, sequentially neuromodulated STDP also yields flexible learning, surpassing the performance of other reward-modulated plasticity rules."
2. Cortical model with reinforcement learning drives realistic virtual arm (Dura-Bernal et al 2015)
We developed a 3-layer sensorimotor cortical network consisting of 704 spiking model-neurons, including excitatory, fast-spiking and low-threshold spiking interneurons. Neurons were interconnected with AMPA/NMDA and GABAA synapses. We trained our model using spike-timing-dependent reinforcement learning to control a virtual musculoskeletal human arm, with realistic anatomical and biomechanical properties, to reach a target. Virtual arm position was used to simultaneously control a robot arm via a network interface.
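
The training scheme gates per-synapse eligibility traces with a single global reward signal. A minimal sketch of one such update loop, assuming simple exponential traces and toy reward values rather than the published implementation:

    import numpy as np

    rng = np.random.default_rng(0)
    n_pre, n_post = 20, 10
    w = rng.uniform(0.0, 0.5, (n_pre, n_post))   # synaptic weights
    elig = np.zeros_like(w)                      # per-synapse eligibility traces
    TAU_ELIG, LR, DT = 0.1, 0.01, 0.001          # illustrative constants

    def update(pre_spikes, post_spikes, reward):
        """One step of reward-modulated STDP: tag coincidences, gate by reward."""
        global elig
        elig *= np.exp(-DT / TAU_ELIG)             # traces decay
        elig += np.outer(pre_spikes, post_spikes)  # tag coincident pre/post activity
        w[:] += LR * reward * elig                 # global reward gates plasticity
        np.clip(w, 0.0, 1.0, out=w)

    for t in range(1000):
        pre = (rng.random(n_pre) < 0.05).astype(float)
        post = (rng.random(n_post) < 0.05).astype(float)
        reward = 1.0 if rng.random() < 0.5 else -1.0   # stand-in for arm-to-target feedback
        update(pre, post, reward)
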
3. First-Spike-Based Visual Categorization Using Reward-Modulated STDP (Mozafari et al. 2018)
"...Here, for the first time, we show that (Reinforcement Learning) RL can be used efficiently to train a spiking neural network (SNN) to perform object recognition in natural images without using an external classifier. We used a feedforward convolutional SNN and a temporal coding scheme where the most strongly activated neurons fire first, while less activated ones fire later, or not at all. In the highest layers, each neuron was assigned to an object category, and it was assumed that the stimulus category was the category of the first neuron to fire. ..."
4. Motor system model with reinforcement learning drives virtual arm (Dura-Bernal et al. 2017)
"We implemented a model of the motor system with the following components: dorsal premotor cortex (PMd), primary motor cortex (M1), spinal cord and musculoskeletal arm (Figure 1). PMd modulated M1 to select the target to reach, M1 excited the descending spinal cord neurons that drove the arm muscles, and received arm proprioceptive feedback (information about the arm position) via the ascending spinal cord neurons. The large-scale model of M1 consisted of 6,208 spiking Izhikevich model neurons [37] of four types: regular-firing and bursting pyramidal neurons, and fast-spiking and low-threshold-spiking interneurons. These were distributed across cortical layers 2/3, 5A, 5B and 6, with cell properties, proportions, locations, connectivity, weights and delays drawn primarily from mammalian experimental data [38], [39], and described in detail in previous work [29]. The network included 486,491 connections, with synapses modeling properties of four different receptors ..."
5. Reinforcement learning of targeted movement (Chadderdon et al. 2012)
"Sensorimotor control has traditionally been considered from a control theory perspective, without relation to neurobiology. In contrast, here we utilized a spiking-neuron model of motor cortex and trained it to perform a simple movement task, which consisted of rotating a single-joint “forearm” to a target. Learning was based on a reinforcement mechanism analogous to that of the dopamine system. This provided a global reward or punishment signal in response to decreasing or increasing distance from hand to target, respectively. Output was partially driven by Poisson motor babbling, creating stochastic movements that could then be shaped by learning. The virtual forearm consisted of a single segment rotated around an elbow joint, controlled by flexor and extensor muscles. ..."
6. Reward modulated STDP (Legenstein et al. 2008)
"... This article provides tools for an analytic treatment of reward-modulated STDP, which allows us to predict under which conditions reward-modulated STDP will achieve a desired learning effect. These analytical results imply that neurons can learn through reward-modulated STDP to classify not only spatial but also temporal firing patterns of presynaptic neurons. They also can learn to respond to specific presynaptic firing patterns with particular spike patterns. Finally, the resulting learning theory predicts that even difficult credit-assignment problems, where it is very hard to tell which synaptic weights should be modified in order to increase the global reward for the system, can be solved in a self-organizing manner through reward-modulated STDP. This yields an explanation for a fundamental experimental result on biofeedback in monkeys by Fetz and Baker. In this experiment monkeys were rewarded for increasing the firing rate of a particular neuron in the cortex and were able to solve this extremely difficult credit assignment problem. ... In addition our model demonstrates that reward-modulated STDP can be applied to all synapses in a large recurrent neural network without endangering the stability of the network dynamics."
7. Sensorimotor cortex reinforcement learning of 2-joint virtual arm reaching (Neymotin et al. 2013)
"... We developed a model of sensory and motor neocortex consisting of 704 spiking model-neurons. Sensory and motor populations included excitatory cells and two types of interneurons. Neurons were interconnected with AMPA/NMDA, and GABAA synapses. We trained our model using spike-timing-dependent reinforcement learning to control a 2-joint virtual arm to reach to a fixed target. ... "
