Note: These instructions apply to Udacity students who are enrolled in the Machine Learning 3: Reinforcement Learning course. They do not apply to Georgia Tech OMSCS students.
Reinforcement learning is learning what to do--how to map situations to actions--so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them. In the most interesting and challenging cases, actions may affect not only the immediate reward but also the next situation and, through that, all subsequent rewards. These two characteristics--trial-and-error search and delayed reward--are the two most important distinguishing features of reinforcement learning.
In the 1980s, you had to play Pacman yourself. But it is 2014 now, and we want our Pacman to learn and adapt on its own.
In this project, you will work on two domains: Gridworld and Pacman. You will implement value iteration and Q-learning for the Gridworld problem. You will then use these implementations to create an intelligent Pacman agent that eats all the food while avoiding the monsters.
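As a rough illustration of the two algorithms named above, here is a minimal sketch on a made-up three-state chain. It is not the project's Gridworld code: the `step` function, the state numbering, the reward of 1, and the values of `GAMMA` and `alpha` are all assumptions chosen for illustration.

```python
# Hypothetical toy problem (NOT the project's Gridworld):
# states 0, 1, 2 in a row; state 2 is terminal.
# Entering the terminal state pays reward 1, everything else pays 0.
GAMMA = 0.9                  # discount factor (assumed value)
STATES = [0, 1, 2]
ACTIONS = ["left", "right"]
TERMINAL = 2


def step(state, action):
    """Deterministic transition model: return (next_state, reward)."""
    if state == TERMINAL:
        return state, 0.0
    if action == "right":
        next_state = min(state + 1, TERMINAL)
    else:
        next_state = max(state - 1, 0)
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def value_iteration(iterations=50):
    """Planning with a known model: repeat the Bellman optimality backup."""
    values = {s: 0.0 for s in STATES}
    for _ in range(iterations):
        new_values = {}
        for s in STATES:
            if s == TERMINAL:
                new_values[s] = 0.0
                continue
            # Best one-step lookahead over all actions.
            new_values[s] = max(
                reward + GAMMA * values[next_state]
                for next_state, reward in (step(s, a) for a in ACTIONS)
            )
        values = new_values
    return values


def q_update(q, state, action, reward, next_state, alpha=0.5):
    """Learning from one observed transition: tabular Q-learning update."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    target = reward + GAMMA * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


values = value_iteration()

# Q-table starts at zero; apply one update for the transition 1 -> 2.
q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
next_state, reward = step(1, "right")
q_update(q, 1, "right", reward, next_state)
```

Under these assumptions, `value_iteration()` settles on a value of 1.0 for state 1 and 0.9 for state 0, and the single `q_update` call moves the Q-value for going right from state 1 halfway from 0 toward the reward of 1. The contrast to notice is that value iteration reads the transition model directly, while Q-learning only sees sampled transitions.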
Ready to play?
The original version of the project was developed by the Berkeley Group.