A reinforcement learning algorithm for spiking neural networks
Răzvan V. Florian
The paper presents a new reinforcement learning mechanism for spiking neural networks. The algorithm is derived for networks of stochastic integrate-and-fire neurons, but it can be also applied to generic spiking neural networks. Learning is achieved by synaptic changes that depend on the firing of pre- and postsynaptic neurons, and that are modulated with a global reinforcement signal. The efficacy of the algorithm is verified in a biologically-inspired experiment, featuring a simulated worm that searches for food. Our model recovers a form of neural plasticity experimentally observed in animals, combining spike-timing-dependent synaptic changes of one sign with non-associative synaptic changes of the opposite sign determined by presynaptic spikes. The model also predicts that the time constant of spike-timing-dependent synaptic changes is equal to the membrane time constant of the neuron, in agreement with experimental observations in the brain. This study also led to the discovery of a biologically-plausible reinforcement learning mechanism that works by modulating spike-timing-dependent plasticity (STDP) with a global reward signal.