The Deep Reinforcement Learning Nanodegree has four courses: Introduction to Deep Reinforcement Learning, Value-Based Methods, Policy-Based Methods, and Multi-Agent Reinforcement Learning. Students learn to define Markov decision processes, policies, and value functions, derive Bellman equations, and implement classical solution methods: dynamic programming, Monte Carlo methods, and temporal-difference methods. They then apply deep RL techniques to real-world problems, training agents to navigate virtual worlds, generating optimal financial trading strategies, and extending RL to multiple interacting agents.

## Courses In This Program

Course 1 • 1 day

### Introduction to Deep Reinforcement Learning

Lesson 1

#### Welcome to Deep Reinforcement Learning

Welcome to the Deep Reinforcement Learning Nanodegree program!

Lesson 2

#### Getting Help

You are starting a challenging but rewarding journey! Take 5 minutes to read how to get help with projects and content.

Lesson 3

Lesson 4

#### Learning Plan

Obtain helpful resources to accelerate your learning in this first part of the Nanodegree program.

Lesson 5

#### Introduction to RL

Reinforcement learning is a type of machine learning in which a software agent learns, by interacting with its environment, how to maximize its performance at a task.

Lesson 6

#### The RL Framework: The Problem

Learn how to mathematically formulate tasks as Markov Decision Processes.

Lesson 7

#### The RL Framework: The Solution

In reinforcement learning, agents learn to prioritize different decisions based on the rewards and punishments associated with different outcomes.

Lesson 8

#### Monte Carlo Methods

Write your own implementation of Monte Carlo control to teach an agent to play Blackjack!
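As a rough illustration of what Monte Carlo control involves, the sketch below computes discounted returns for one finished episode and applies an every-visit, constant-alpha update to an action-value table. The function names, the `(state, action, reward)` episode layout, and the default step size are assumptions made for this example, not the course's reference solution.

```python
from collections import defaultdict


def discounted_returns(rewards, gamma=1.0):
    """Return G_t for every time step, computed backwards from the episode end."""
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def mc_update(Q, episode, alpha=0.02, gamma=1.0):
    """Every-visit constant-alpha Monte Carlo update.

    `episode` is a list of (state, action, reward) tuples (an assumed layout),
    and `Q` maps (state, action) pairs to value estimates.
    """
    rewards = [reward for _, _, reward in episode]
    for (state, action, _), g in zip(episode, discounted_returns(rewards, gamma)):
        Q[(state, action)] += alpha * (g - Q[(state, action)])
    return Q


# Usage: a two-step episode with hypothetical state and action labels.
Q = defaultdict(float)
mc_update(Q, [("s0", "hit", 0.0), ("s1", "stick", 1.0)], alpha=0.5)
```

Control additionally needs a policy that improves as `Q` changes, typically an epsilon-greedy policy whose epsilon decays over episodes.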

Lesson 9

#### Temporal-Difference Methods

Learn how to apply temporal-difference methods such as SARSA, Q-Learning, and Expected SARSA to solve both episodic and continuing tasks.
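The three methods named here differ only in how they form the one-step target. The sketch below puts them side by side for a tabular `Q` stored as a NumPy array; the function names, the array layout, and the epsilon-greedy behavior policy used for Expected SARSA are assumptions made for illustration.

```python
import numpy as np


def epsilon_greedy_probs(q_row, epsilon):
    """Action probabilities of an epsilon-greedy policy for a single state."""
    n_actions = len(q_row)
    probs = np.full(n_actions, epsilon / n_actions)
    probs[int(np.argmax(q_row))] += 1.0 - epsilon
    return probs


def td_target(method, Q, reward, next_state, next_action=None,
              gamma=1.0, epsilon=0.1):
    """One-step TD target for a non-terminal next state."""
    if method == "sarsa":
        # Uses the action actually taken in the next state.
        return reward + gamma * Q[next_state, next_action]
    if method == "q_learning":
        # Uses the greedy (maximum) action value in the next state.
        return reward + gamma * float(np.max(Q[next_state]))
    if method == "expected_sarsa":
        # Uses the expected action value under the epsilon-greedy policy.
        probs = epsilon_greedy_probs(Q[next_state], epsilon)
        return reward + gamma * float(np.dot(probs, Q[next_state]))
    raise ValueError(f"unknown method: {method}")


def td_update(Q, state, action, target, alpha=0.1):
    """Move Q[state, action] a fraction alpha toward the target."""
    Q[state, action] += alpha * (target - Q[state, action])


# Usage: two states, two actions, one Q-learning step.
Q = np.zeros((2, 2))
Q[1] = [1.0, 3.0]
target = td_target("q_learning", Q, reward=1.0, next_state=1, gamma=0.9)
td_update(Q, state=0, action=0, target=target, alpha=0.5)
```

When the next state is terminal, the target reduces to the reward alone, so a full implementation would branch on the episode's `done` signal before calling `td_target`.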

Lesson 10

#### Solve OpenAI Gym's Taxi-v2 Task

With reinforcement learning now in your toolbox, you're ready to explore a mini project using OpenAI Gym!

Lesson 11

Lesson 12

#### What's Next?

In the next parts of the Nanodegree program, you'll learn how to use neural networks as powerful function approximators in reinforcement learning.

Course 2 • 4 weeks

### Value-Based Methods

Apply deep learning architectures to reinforcement learning tasks. Train your own agent that navigates a virtual world from sensory data.

Lesson 1

#### Study Plan

This lesson covers the study plan and prerequisites for this course.

Lesson 2

#### Deep Q-Networks

Extend value-based reinforcement learning methods to complex problems using deep neural networks.

Lesson 3 • Project

Train an agent to navigate a large world and collect yellow bananas, while avoiding blue bananas.

Course 3 • 4 weeks

### Policy-Based Methods

Lesson 1

#### Study Plan

Obtain helpful resources to accelerate your learning in the third part of the Nanodegree program.

Lesson 2

#### Introduction to Policy-Based Methods

Policy-based methods search for the optimal policy directly, rather than deriving it from an estimated value function.

Lesson 3

Lesson 4

#### Proximal Policy Optimization

Learn what Proximal Policy Optimization (PPO) is and how it improves on policy gradient methods, then implement the algorithm by training an agent to play Atari Pong.
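The core of PPO is its clipped surrogate objective, which limits how far a single update can move the policy away from the one that collected the data. The NumPy sketch below evaluates that objective for a batch of samples. The function name and the default `clip_eps=0.2` are assumptions for this example, and a real implementation would compute the same quantity in an autodiff framework so that it can be maximized by gradient ascent.

```python
import numpy as np


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch of (state, action) samples."""
    new_log_probs = np.asarray(new_log_probs, dtype=float)
    old_log_probs = np.asarray(old_log_probs, dtype=float)
    advantages = np.asarray(advantages, dtype=float)

    # Probability ratio between the updated policy and the data-collecting policy.
    ratio = np.exp(new_log_probs - old_log_probs)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages

    # The element-wise minimum removes any incentive to push the ratio
    # outside the interval [1 - clip_eps, 1 + clip_eps].
    return float(np.mean(np.minimum(unclipped, clipped)))


# Usage: two samples whose ratios (1.5 and 0.5) both fall outside the clip range.
objective = ppo_clipped_objective(
    new_log_probs=np.log([1.5, 0.5]),
    old_log_probs=np.log([1.0, 1.0]),
    advantages=[1.0, -1.0],
)
```

Taking the minimum of the clipped and unclipped terms makes the objective a pessimistic bound: a sample contributes no extra gain once its ratio has moved past the clip range in the direction the advantage favors.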

Lesson 5

#### Actor-Critic Methods

Miguel Morales explains how to combine value-based and policy-based methods, bringing together the best of both worlds, to solve challenging reinforcement learning problems.

Lesson 6

#### Deep RL for Finance (Optional)

Learn how to apply deep reinforcement learning techniques for optimal execution of portfolio transactions.

Lesson 7 • Project

#### Continuous Control

Train a double-jointed arm to reach target locations.

Course 4 • 3 weeks

### Multi-Agent Reinforcement Learning

Lesson 1

#### Study Plan

Obtain helpful resources to accelerate your learning in the fourth part of the Nanodegree program.

Lesson 2

Lesson 3

#### Case Study: AlphaZero

Lesson 4 • Project

#### Collaboration and Competition

Train a pair of agents to play tennis.

