Planning for High Availability and Incident Response

Course

In this course, we will look at how SREs think about the availability and reliability of their infrastructure. We'll learn how to build effective monitoring using SLOs and SLIs and create dashboards in Grafana. Next, we'll identify all of our IT assets, ensure they are configured for high availability, and craft a disaster recovery plan so that failover is seamless and automated. After that, we'll deploy the infrastructure to AWS using Terraform, learn the benefits of infrastructure as code, and see how easy it is to deploy to multiple regions. Finally, we'll learn how to make databases highly available and ready for disaster recovery by looking at recovery strategies and implementing them in AWS via Terraform.

Intermediate

3 weeks

Last Updated December 12, 2024

Skills you'll learn:

Data recovery • Terraform • Prometheus • Data replication

Prerequisites:

No experience required

Course Lessons

Lesson 1

Course Introduction

Introduction to the course. We will look at how the topics all tie into being an SRE and what skills we'll learn and apply.

Lesson 2

SLOs and SLIs

In this lesson, we will learn how SREs monitor systems using SLOs and SLIs. We will write queries in Prometheus and build dashboards in Grafana.
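
To make the relationship between an SLI and an SLO concrete, here is a minimal Python sketch. It is not taken from the course material: the function names, the request counts, and the 99.9% target are all illustrative assumptions. It treats the SLI as the fraction of successful requests and reports how much of the error budget implied by the SLO is still unspent.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Return the SLI as the fraction of requests served successfully."""
    if total_requests == 0:
        # Assumed convention: with no traffic, report full availability.
        return 1.0
    return good_requests / total_requests


def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Return the unspent share of the error budget.

    1.0 means nothing has been spent, 0.0 means the budget is exhausted,
    and a negative value means the SLO has been breached.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    allowed_failure = 1.0 - slo_target  # failure rate the SLO tolerates
    actual_failure = 1.0 - sli          # failure rate actually observed
    return 1.0 - actual_failure / allowed_failure


# Hypothetical numbers: 100,000 requests, 50 of which failed.
sli = availability_sli(good_requests=99_950, total_requests=100_000)
remaining = error_budget_remaining(sli, slo_target=0.999)
print(f"SLI: {sli:.4%}, error budget remaining: {remaining:.0%}")
```

In practice the request counts would come from a monitoring system such as Prometheus rather than being hard-coded, and the result would typically be shown on a dashboard.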

Lesson 3

IT Assets, Availability and Disaster Recovery

In this lesson, we will identify all IT assets, make those assets highly available, and put together a disaster recovery plan for those assets.
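
As a rough illustration of why redundancy raises availability, the following Python sketch computes the combined availability of several redundant copies of an asset and the downtime that implies per year. It is a simplification and not part of the course material: it assumes the copies fail independently and that a single healthy copy is enough to serve traffic. The 99% figure is a made-up example.

```python
HOURS_PER_YEAR = 365 * 24


def combined_availability(single: float, replicas: int) -> float:
    """Availability of `replicas` redundant copies, assuming independent
    failures and that one healthy copy is sufficient."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The system is down only when every copy is down at the same time.
    return 1.0 - (1.0 - single) ** replicas


def annual_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Hypothetical asset that is 99% available on its own.
for n in (1, 2, 3):
    a = combined_availability(0.99, n)
    print(f"{n} copies: {a:.6f} available, "
          f"{annual_downtime_hours(a):.3f} hours down per year")
```

Real failures are often correlated (a shared region, network, or deployment), so these figures are best read as an upper bound on what redundancy alone provides.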

Lesson 4

Creating and Deploying HA and DR Infrastructure Using Terraform

In this lesson, we will use Terraform to deploy our HA/DR infrastructure to AWS.

Lesson 5

High Availability and DR of Databases

In this lesson, we'll learn about database reliability and availability and how to make databases more available. We will then deploy a replicated database cluster to AWS and observe a failover.
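
The idea behind failover in a replicated cluster can be sketched in a few lines of Python. This is a toy model, not how any particular managed database service works: the `Node` class, the node names, and the rule of promoting the healthy replica with the least replication lag are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    healthy: bool
    replication_lag_s: float  # seconds behind the failed primary


def choose_new_primary(replicas: list[Node]) -> Node:
    """Pick the healthy replica with the least lag, which minimizes
    the amount of data lost when it is promoted."""
    candidates = [r for r in replicas if r.healthy]
    if not candidates:
        raise RuntimeError("no healthy replica available for failover")
    return min(candidates, key=lambda r: r.replication_lag_s)


# Hypothetical cluster state at the moment the primary fails.
replicas = [
    Node("replica-a", healthy=True, replication_lag_s=4.0),
    Node("replica-b", healthy=True, replication_lag_s=0.5),
    Node("replica-c", healthy=False, replication_lag_s=0.0),
]
new_primary = choose_new_primary(replicas)
print(f"promoting {new_primary.name}")
```

The lag of the promoted replica is an approximation of the data that could be lost, which is why replication lag is worth monitoring alongside availability.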

Lesson 6 • Project

Deploying High Availability Infrastructure

In this project, you will apply the skills you've learned in this course by defining and implementing a resilient infrastructure on a cloud platform.

Taught By The Best

Travis Scotto

Site Reliability Engineer

Travis has been working in IT for over 10 years and has taught as an adjunct instructor for over 5 years. He loves technology and sharing his knowledge with students. Travis brings his industry experience as an SRE to his classes, blending that expertise with step-by-step teaching to help students excel. Seeing students succeed is what he likes best.

The Udacity Difference

Combine technology training for employees with industry experts, mentors, and projects, for critical thinking that pushes innovation. Our proven upskilling system goes after success—relentlessly.

Demonstrate proficiency with practical projects

Projects are based on real-world scenarios and challenges, allowing you to apply the skills you learn to practical situations, while giving you real hands-on experience.

  • Gain proven experience

  • Retain knowledge longer

  • Apply new skills immediately

Top-tier services to ensure learner success

Reviewers provide timely and constructive feedback on your project submissions, highlighting areas of improvement and offering practical tips to enhance your work.

  • Get help from subject matter experts

  • Learn industry best practices

  • Gain valuable insights and improve your skills

© 2011-2025 Udacity, Inc. "Nanodegree" is a registered trademark of Udacity.