This lecture is based on David Silver's course but targets younger students in a shorter 50-minute format: it omits the advanced derivations and adds more examples and Colab code.
Slides: [ Link ]
Twitter: [ Link ]
Next video: [ Link ]
Introduction
- definition
- examples
- comparison
A Brief History
- learning by trial and error
- optimal control and dynamic programming
- Monte Carlo tree search
- temporal difference algorithms
Key Concepts
- designing rewards
- action spaces
- observability
- information states
- policies
- value functions
- model
- taxonomy
#reinforcementlearning #dynamicprogramming #MCTS #TD #history #reinforcement #optimalcontrol #actionspaces #valuefunction