Algorithms for Reinforcement Learning
Unknown
Free
Language
English
ISBN
Unknown
Overview
Markov decision processes
Preliminaries
Markov Decision Processes
Value functions
Dynamic programming algorithms for solving MDPs
Value prediction problems
Temporal difference learning in finite state spaces
Tabular TD(0)
Every-visit Monte-Carlo
TD(λ): Unifying Monte-Carlo and TD(0)
Algorithms for large state spaces
TD(λ) with function approximation
Gradient temporal difference learning
Least-squares methods
The choice of the function space
Control
A catalog of learning problems
Closed-loop interactive learning
Online learning in bandits
Active learning in bandits
Active learning in Markov Decision Processes
Online learning in Markov Decision Processes
Direct methods
Q-learning in finite MDPs
Q-learning with function approximation
Actor-critic methods
Implementing a critic
Implementing an actor
For further exploration
Further reading
Applications
Software
Acknowledgements
The theory of discounted Markovian decision processes
Contractions and Banach's fixed-point theorem
Application to MDPs