Citation Relationships



Marbach P, Tsitsiklis JN (2000) Approximate gradient methods in policy-space optimization of Markov reward processes. Discrete Event Dynamic Systems: Theory and Applications 13:111-148

References and models cited by this paper

References and models that cite this paper

Florian RV (2007) Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity. Neural Comput 19:1468-502

(1 ref)