
Site Reliability Engineer

Nanodegree Program

Master the job-ready skills you need to be a successful site reliability engineer and start designing systems to automate responses to software site issues.

  • Estimated time
    4 Months

    At 5-10 hours/week

  • Enroll by
    December 7, 2022

    Get access to the classroom immediately upon enrollment

  • Prerequisites
    Python or Java, Bash or PowerShell, Linux, UNIX shell, and SQL

What you will learn

  1. Site Reliability Engineer

    Estimated 4 months to complete

    Master the skills necessary to become a successful site reliability engineer. Learn to build automation tools that ensure designed solutions meet requirements such as availability, performance, security, and maintainability.

    1. Foundations of Observability

      Get a practical introduction to what observability requires in terms of people and tools. Learn about site reliability engineering, its roles and responsibilities, and how those differ from other teams. See how the role helps an enterprise improve, discuss the associated costs, and learn about the types of team members and the tools a team may use.

    2. Planning for High Availability and Incident Response

      This course covers monitoring, high availability (HA) and disaster recovery (DR), infrastructure as code, and database recovery and availability. Learn the basics of SLOs and SLIs, and how to translate them into queries and, finally, graphs. Also learn how to design and deploy highly available databases on AWS.

    3. Self-Healing Architecture

      Learn how to deploy microservices or cloud architecture that is resilient enough to withstand failures, and predictable enough to resolve issues via automation without human intervention. Understand self-healing system design fundamentals, deployment strategies, implementation steps, and use cases. Learn cloud automation to increase the resiliency of systems.

    4. Establishing a Culture of Reliability

      Learn how to develop processes and frameworks that drive workplaces toward putting reliability first by working through the incident management process and learning how to run effective on-call rotations. Understand how to perform reliability reviews at various phases of your system, how to effectively manage system capacity, and how to reduce toil.

All our programs include:

  • Real-world projects from industry experts

    With real-world projects and immersive content built in partnership with top-tier companies, you’ll master the tech skills companies want.

  • Technical mentor support

    Our knowledgeable mentors guide your learning and are focused on answering your questions, motivating you, and keeping you on track.

  • Career services

    You’ll have access to GitHub portfolio review and LinkedIn profile optimization to help you advance your career and land a high-paying role.

  • Flexible learning program

    Tailor a learning plan that fits your busy life. Learn at your own pace and reach your personal goals on the schedule that works best for you.

Program offerings

  • Class content

    • Real-world projects
    • Project reviews
    • Project feedback from experienced reviewers
  • Student services

    • Technical mentor support
    • Student community
  • Career services

    • GitHub review
    • LinkedIn profile optimization

Succeed with personalized services.

We provide services customized for your needs at every step of your learning journey to ensure your success.

Get timely feedback on your projects.

  • Personalized feedback
  • Unlimited submissions and feedback loops
  • Practical tips and industry best practices
  • Additional suggested resources to improve
  • 1,400+ project reviewers
  • 2.7M projects reviewed
  • 88/100 reviewer rating
  • 1.1 hours avg project review turnaround time

Learn with the best.

  • Nathan Anderson, MBA

    Global Cloud Architect

    Nathan is a Certified Six Sigma Black Belt and has 10+ years of experience in IT across multiple industries. He is also the instructor for two other Udacity courses: Ensuring Quality Releases and Azure Performance.

  • Travis Scotto

    Site Reliability Engineer

    Travis Scotto has worked in technology for 10 years. He has worked in various infrastructure roles: virtualization, databases, and monitoring. As an SRE, he employs automation and monitoring daily. He has also taught IT classes as an adjunct for 4.5 years.

  • Emmanuel Apau

    CTO of

    Emmanuel is a co-founder of the Black Code Collective and a recipient of DC's RealLIST Engineer award. An AWS Certified DevSecOps specialist with 12 years of experience, he has spent his career developing innovative solutions using DevSecOps and site reliability best practices.

  • Sonny Sevin

    Site Reliability Engineer

    Sonny is an SRE with a varied background. He dabbled in research at Lawrence Berkeley National Labs before moving into site reliability engineering to have a more hands-on role. He has been published in several computing journals and has taught introductory programming courses.

Get started today

  • Monthly access

    Pay as you go

    • Maximum flexibility to learn at your own pace.
    • Cancel anytime.
  • Pay upfront

    • 4 months is the average time to complete this course.
    • Switch to the monthly price afterward if more time is needed.
    • Cancel anytime.

Program details

Program overview: Why should I take this program?
  • Why should I enroll?
  • What jobs will this program prepare me for?
  • How do I know if this program is right for me?
Enrollment and admission
  • Do I need to apply? What are the admission criteria?
  • What are the prerequisites for enrollment?
  • If I do not meet the requirements to enroll, what should I do?
Tuition and term of program
  • How is this Nanodegree program structured?
  • How long is this Nanodegree program?
  • Can I switch my start date? Can I get a refund?
Software and hardware: What do I need for this program?
  • What software and versions will I need in this program?
