Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity

Răzvan V. Florian

Neural Computation, 19(6), 1468–1502 (2007).

Full text: http://dx.doi.org/10.1162/neco.2007.19.6.1468

Author-provided full text:

Publisher's version/PDF, on author's personal website or repository - added by Răzvan Valentin Florian

Abstract

The persistent modification of synaptic efficacy as a function of the relative timing of pre- and postsynaptic spikes is a phenomenon known as spike-timing-dependent plasticity (STDP). Here we show that the modulation of STDP by a global reward signal leads to reinforcement learning. We first derive analytically learning rules involving reward-modulated spike-timing-dependent synaptic and intrinsic plasticity, by applying a reinforcement learning algorithm to the stochastic spike response model of spiking neurons. These rules have several features common to plasticity mechanisms experimentally found in the brain. We then demonstrate in simulations of networks of integrate-and-fire neurons the efficacy of two simple learning rules involving modulated STDP. One rule is a direct extension of the standard STDP model (modulated STDP), and the other one involves an eligibility trace stored at each synapse that keeps a decaying memory of the relationships between the recent pairs of pre- and postsynaptic spike pairs (modulated STDP with eligibility trace). This latter rule permits learning even if the reward signal is delayed. The proposed rules are able to solve the XOR problem with both rate coded and temporally coded input and to learn a target output firing-rate pattern. These learning rules are biologically plausible, may be used for training generic artificial spiking neural networks, regardless of the neural model used, and suggest the experimental investigation in animals of the existence of reward-modulated STDP.
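The second rule described in the abstract (modulated STDP with eligibility trace) can be illustrated for a single synapse. The following is a minimal sketch, not the paper's reference implementation: it assumes a discrete-time formulation with exponentially decaying spike traces, and the function name `run_mstdpet`, all parameter names, and all parameter values are illustrative assumptions rather than values taken from the paper.

```python
import math


def run_mstdpet(pre_spikes, post_spikes, rewards, dt=1.0,
                tau_plus=20.0, tau_minus=20.0, tau_z=25.0,
                a_plus=1.0, a_minus=-1.0, gamma=0.01, w0=0.5):
    """Simulate one synapse under reward-modulated STDP with an eligibility trace.

    pre_spikes, post_spikes: sequences of 0/1, one entry per time step.
    rewards: global reward signal per time step (may arrive after the spikes).
    Returns the synaptic weight after each time step.
    All defaults are assumed, illustrative values.
    """
    p_plus = 0.0   # decaying trace of presynaptic spikes (potentiation side)
    p_minus = 0.0  # decaying trace of postsynaptic spikes (depression side)
    z = 0.0        # eligibility trace stored at the synapse
    w = w0
    beta = math.exp(-dt / tau_z)  # per-step decay of the eligibility trace
    history = []
    for f_pre, f_post, r in zip(pre_spikes, post_spikes, rewards):
        # Update the spike traces with the spikes of the current step.
        p_plus = p_plus * math.exp(-dt / tau_plus) + a_plus * f_pre
        p_minus = p_minus * math.exp(-dt / tau_minus) + a_minus * f_post
        # STDP term: positive for pre-before-post, negative for post-before-pre.
        zeta = p_plus * f_post + p_minus * f_pre
        # The eligibility trace keeps a decaying memory of recent STDP terms.
        z = beta * z + zeta / tau_z
        # The weight changes only while a nonzero reward is present.
        w += gamma * r * z
        history.append(w)
    return history


# Pre spike at step 2, post spike at step 5, reward delayed until step 15.
steps = 20
pre = [0] * steps
post = [0] * steps
pre[2] = 1
post[5] = 1
reward = [0.0] * steps
reward[15] = 1.0

weights = run_mstdpet(pre, post, reward)
print(weights[14], weights[-1])
```

In this sketch the weight stays at its initial value until the reward arrives, and the sign of the reward decides whether a pre-before-post pairing strengthens or weakens the synapse. Setting `z = zeta` instead of the decaying update would give a sketch of the simpler rule without the eligibility trace, which cannot use a delayed reward.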

Comments

Răzvan Valentin Florian

May 07, 2015

Preliminary results showing that the modulation of STDP leads to reinforcement learning were published in 2005 in this paper:
- Florian, R. V. (2005). A reinforcement learning algorithm for spiking neural networks. In D. Zaharie, D. Petcu, V. Negru, T. Jebelean, G. Ciobanu, A. Cicortaş, et al. (Eds.), Seventh International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC 2005), 25-29 September 2005, Timişoara, Romania (pp. 299–306). IEEE Computer Society.
The paper predicted, through analytical derivations and computer simulations, the control of the polarity of STDP by neuromodulators such as dopamine. This was later found experimentally in the brain; a collection of papers documents the experimental support for the neuromodulation of STDP.
Răzvan Valentin Florian

December 14, 2018

The learning rules introduced in this paper are implemented by BindsNET, a machine learning-oriented spiking neural networks library in Python.

A basic implementation of the learning rules has been provided by Sergio Chevtchenko.
Răzvan Valentin Florian

July 15, 2019

Another open-source implementation of reward-modulated STDP is provided by SpykeTorch.
