Intermediate

Assumes 6hr/wk (work at your own pace)

Course Summary

This is the third and final course of the 3-course Machine Learning Series and is offered at Georgia Tech as CS7641. Taking this class here does not earn Georgia Tech credit.

Can we program machines to learn like humans? This Reinforcement Learning course will teach you the algorithms for designing self-learning agents like us!

Reinforcement Learning is the area of Machine Learning concerned with the actions a software agent ought to take in an environment to maximize rewards. You can apply Reinforcement Learning to robot control, chess, backgammon, checkers, and other activities that a software agent can learn to perform. Reinforcement Learning draws on behaviorist psychology: agents learn from reward and punishment.

This course includes important Reinforcement Learning approaches like Markov Decision Processes and Game Theory. Please refer to the Syllabus for a detailed breakdown of topics.

Series Information: Machine Learning is a graduate-level series of 3 courses, covering the area of Artificial Intelligence concerned with computer programs that modify and improve their performance through experiences.

If you are new to Machine Learning, we suggest you take these 3 courses in order.

The entire series is taught as a lively and rigorous dialogue between two eminent Machine Learning professors and friends: Professor Charles Isbell (Georgia Tech) and Professor Michael Littman (Brown University).

Why Take This Course?

You will learn about Reinforcement Learning, the field of Machine Learning concerned with the actions software agents ought to take in an environment to maximize rewards.

Michael: Reinforcement Learning is a very popular field.
Charles: Perhaps because you're in it, Michael.
Michael: I don't think that's it.

In this course, you will gain an understanding of topics and methods in Reinforcement Learning, including Markov Decision Processes and Game Theory. You will gain experience implementing Reinforcement Learning techniques in a final project.

In the final project, we’ll bring back the ’80s and design a Pacman agent capable of eating all the food without getting eaten by monsters.

Prerequisites and Requirements

We recommend you take Machine Learning 1: Supervised Learning and Machine Learning 2: Unsupervised Learning prior to taking this course.

An introductory course like Udacity's Introduction to Artificial Intelligence provides helpful background for this course. Programming experience (for example, from Udacity's Introduction to CS) and basic familiarity with statistics and probability theory are required. Udacity's Intro to Statistics, especially Lessons 8, 9, and 10, is helpful.

The most important prerequisite for enjoying and doing well in this class is your interest in the material.

See the Technology Requirements for using Udacity.

What Will I Learn?

Projects

Use a familiar Gridworld domain to train a Reinforcement Learning agent and then design an agent that can play Pacman!

Syllabus

Lesson 1: Markov Decision Processes

  • Decision Making and Reinforcement Learning
  • Markov Decision Processes
  • Sequences of Rewards
  • Assumptions
  • Policies
  • Finding Policies
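
To give a feel for what "Finding Policies" looks like in code, here is a minimal value-iteration sketch. The tiny three-state MDP below (its states, actions, transition table, and rewards) is invented for illustration and is not taken from the course.

```python
# Minimal value iteration on a made-up three-state MDP.
# transitions[s][a] = list of (probability, next_state, reward)

GAMMA = 0.9  # discount factor for sequences of rewards

transitions = {
    "s0": {"left": [(1.0, "s0", 0.0)], "right": [(1.0, "s1", 0.0)]},
    "s1": {"left": [(1.0, "s0", 0.0)], "right": [(1.0, "s2", 1.0)]},
    "s2": {"left": [(1.0, "s2", 0.0)], "right": [(1.0, "s2", 0.0)]},  # absorbing
}

def value_iteration(transitions, gamma=GAMMA, tol=1e-6):
    V = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            # Bellman backup: best expected discounted return over actions
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # extract the greedy policy with respect to the converged values
    policy = {
        s: max(actions, key=lambda a: sum(p * (r + gamma * V[s2])
                                          for p, s2, r in actions[a]))
        for s, actions in transitions.items()
    }
    return V, policy

V, policy = value_iteration(transitions)
print(policy["s0"], policy["s1"])  # right right
```

Because the discount factor weights sequences of rewards, the agent at s0 prefers heading right toward the rewarding transition even though its immediate reward there is zero.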

Lesson 2: Reinforcement Learning

  • Rat Dinosaurs
  • API
  • Three Approaches to RL
  • A New Kind of Value Function
  • Estimating Q from Transitions
  • Q Learning Convergence
  • Greedy Exploration
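
As a rough sketch of this lesson's ideas, here is tabular Q-learning with epsilon-greedy exploration. The environment (a five-state corridor with a goal at the right end), constants, and names are all made up for illustration; this is not the course's code.

```python
import random

N_STATES, ACTIONS = 5, ("left", "right")
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

def step(s, a):
    """Deterministic corridor: reward 1 for reaching the rightmost state."""
    s2 = max(s - 1, 0) if a == "left" else min(s + 1, N_STATES - 1)
    reward = 1.0 if s2 == N_STATES - 1 else 0.0
    done = s2 == N_STATES - 1
    return s2, reward, done

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

random.seed(0)
for episode in range(200):
    s, done = 0, False
    while not done:
        # epsilon-greedy: mostly exploit the current Q estimates,
        # occasionally take a random exploratory action
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r, done = step(s, a)
        # the Q-learning update: estimate Q from observed transitions
        target = r if done else r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

greedy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)]
print(greedy)  # the learned greedy policy heads right, toward the goal
```

The update never needs the environment's transition model, only sampled transitions, which is what distinguishes Q-learning from planning methods like value iteration.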

Lesson 3: Game Theory

  • What is Game Theory
  • Minimax
  • Fundamental Result
  • Game Tree
  • Von Neumann
  • Center Game
  • Snitch
  • A Beautiful Equilibrium
  • The Two Step
  • 2Step2Furious
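
To make the minimax idea concrete, here is a tiny sketch on an invented 2×2 zero-sum payoff matrix (the numbers are made up). It computes the row player's security level (maximin) and the column player's cap on the row player's payoff (minimax); Von Neumann's theorem says the two coincide once mixed strategies are allowed.

```python
# A made-up two-player, zero-sum matrix game. Rows are player A's pure
# strategies, columns are player B's; entries are A's payoff (B gets the
# negation).

payoff = [
    [ 3, -2],
    [-1,  0],
]

def maximin(matrix):
    """A's security level: the best payoff A can guarantee with a pure strategy."""
    return max(min(row) for row in matrix)

def minimax(matrix):
    """The lowest cap B can put on A's payoff, column by column."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    return min(max(matrix[r][c] for r in range(n_rows)) for c in range(n_cols))

print(maximin(payoff), minimax(payoff))  # -1 0
```

Here maximin (−1) is strictly below minimax (0), so this game has no pure-strategy saddle point; closing that gap is exactly what mixed strategies are for.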

Lesson 4: Game Theory, Continued

  • The Sequencing
  • Iterated Prisoner’s Dilemma
  • Uncertain End
  • Tit for Tat
  • Finite State Strategy
  • Folk Theorem
  • Security Level Profile
  • Grim Trigger
  • Implausible Threats
  • Pavlov
  • Computational Folk Theorem
  • Stochastic Games and Multiagent RL
  • Zero Sum Stochastic Games
  • General Sum Games
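
As a sketch of the Tit for Tat strategy from this lesson, here is an iterated Prisoner's Dilemma match against an always-defect opponent, using the conventional payoff values (T=5, R=3, P=1, S=0). The strategy functions and round count are illustrative, not from the course.

```python
# (my move, their move) -> my payoff; "C" cooperate, "D" defect
PAYOFF = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    hist_a, hist_b = [], []  # each side's record of the *opponent's* moves
    score_a = score_b = 0
    for _ in range(rounds):
        a, b = strategy_a(hist_a), strategy_b(hist_b)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        hist_a.append(b)
        hist_b.append(a)
    return score_a, score_b

print(play(tit_for_tat, always_defect))  # (9, 14)
```

Tit for Tat loses one "sucker" round and then mutually defects, while two Tit for Tat players cooperate every round, which is the intuition behind its strong performance in repeated play.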

Reinforcement Learning Project

Instructors & Partners

Charles Isbell is a Professor and Senior Associate Dean at the School of Interactive Computing at Georgia Tech. His research passion is artificial intelligence, particularly on building autonomous agents that must live and interact with large numbers of other intelligent agents, some of whom may be human. Lately, he has turned his energies toward adaptive modeling, especially activity discovery (as distinct from activity recognition), scalable coordination, and development environments that support the rapid prototyping of adaptive agents. He is developing adaptive programming languages, and trying to understand what it means to bring machine learning tools to non-expert authors, designers and developers. He sometimes interacts with the physical world through racquetball, weight-lifting and Ultimate Frisbee.

Michael Littman is a Professor of Computer Science at Brown University. He also teaches Udacity’s Algorithms course (CS215) on crunching social networks. Prior to joining Brown in 2012, he led the Rutgers Laboratory for Real-Life Reinforcement Learning (RL3), where he served as Computer Science Department Chair from 2009 to 2012. He is a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI), served as program chair for AAAI's 2013 conference and for the International Conference on Machine Learning in 2009, and received university-level teaching awards at both Duke and Rutgers. Charles Isbell taught him about racquetball, weight-lifting and Ultimate Frisbee, but he's not that great at any of them. He's pretty good at singing and juggling, though.

Pushkar Kolhe

Course Developer

Pushkar Kolhe is currently pursuing his PhD in Computer Science at Georgia Tech. He believes that Machine Learning is going to help him create AI that will reach the singularity. When he is not working on that problem, he is busy climbing, jumping or skiing on things.

View more courses in Software Engineering