Citation Relationships



Baxter J, Bartlett PL, Weaver L (2001) Experiments with infinite-horizon, policy-gradient estimation. J Artif Intel Res 15:351-381

References and models cited by this paper

References and models that cite this paper

Florian RV (2007) Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity. Neural Comput 19:1468-502

Richmond P, Buesing L, Giugliano M, Vasilaki E (2011) Democratic population decisions result in robust policy-gradient learning: a parametric study with GPU simulations. PLoS One 6:e18539

   Democratic population decisions result in robust policy-gradient learning (Richmond et al. 2011) [Model]

(2 refs)