Real-world projects from industry experts
With real-world projects and immersive content built in partnership with top-tier companies, you’ll master the tech skills companies want.
Master the job-ready skills you need to be a successful site reliability engineer and start designing systems to automate responses to software site issues.
At 5-10 hours/week
Get access to the classroom immediately upon enrollment
Master the skills necessary to become a successful site reliability engineer. Learn to build automation tools that ensure designed solutions respond to requirements such as availability, performance, security, and maintainability.
Python or Java, Bash or PowerShell, Linux, UNIX shell, and SQL.
Get a practical introduction to what observability requires in terms of people and tools. Learn about site reliability engineering, its roles and responsibilities, and how those differ from other teams. See how the role helps an enterprise improve, discuss the associated costs, and learn about the types of team members and the tools a team may use.
This course covers monitoring, high availability (HA) and disaster recovery (DR), infrastructure as code, and database recovery and availability. Learn the basics of service level objectives (SLOs) and service level indicators (SLIs), and how to translate them into queries and, finally, graphs. Also learn how to design and deploy highly available databases to AWS.
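As a rough illustration of what translating an SLO into a query can look like, the sketch below computes an availability SLI from request counts and the share of the error budget still unspent. It is not taken from the course material. The function names, the request counts, and the 99.9% target are hypothetical values chosen for the example.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Return the fraction of requests that were served successfully."""
    if total_requests == 0:
        # Assumption: a window with no traffic is treated as fully available.
        return 1.0
    return good_requests / total_requests


def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Return the unspent share of the error budget (negative if overspent)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    allowed_failure = 1.0 - slo_target  # failure rate the SLO tolerates
    actual_failure = 1.0 - sli          # failure rate actually observed
    return 1.0 - actual_failure / allowed_failure


if __name__ == "__main__":
    # Hypothetical numbers: 100,000 requests in the window, 50 of them failed.
    sli = availability_sli(good_requests=99_950, total_requests=100_000)
    print(f"SLI: {sli:.4%}")
    print(f"Error budget left: {error_budget_remaining(sli, slo_target=0.999):.1%}")
```

In a real monitoring stack the two counts would typically come from a metrics query over a time window, and the resulting ratio is what gets plotted on a dashboard.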
Learn how to deploy microservices or cloud architecture that is resilient enough to withstand failures, and predictable enough to resolve issues via automation without human intervention. Understand self-healing system design fundamentals, deployment strategies, implementation steps, and use cases. Learn cloud automation to increase the resiliency of systems.
Learn how to develop processes and frameworks that drive workplaces toward putting reliability first by working through the incident management process and learning how to run effective on-call shifts. Understand how to perform reliability reviews at various phases of your system, how to manage system capacity effectively, and how to reduce toil.
Our knowledgeable mentors guide your learning and are focused on answering your questions, motivating you, and keeping you on track.
You’ll have access to GitHub portfolio review and LinkedIn profile optimization to help you advance your career and land a high-paying role.
Tailor a learning plan that fits your busy life. Learn at your own pace and reach your personal goals on the schedule that works best for you.
We provide services customized for your needs at every step of your learning journey to ensure your success.
Nathan is a Certified Six Sigma Black Belt with 10+ years of IT experience across multiple industries. He is also the instructor for two other Udacity courses: Ensuring Quality Releases and Azure Performance.
Travis Scotto has worked in technology for 10 years. He has worked in various infrastructure roles: virtualization, databases, and monitoring. As an SRE, he employs automation and monitoring daily. He has also taught IT classes as an adjunct instructor for 4.5 years.
Emmanuel is co-founder of the Black Code Collective and a recipient of DC's Technical.ly RealLIST Engineer award. An AWS Certified DevSecOps specialist with 12 years of experience, he has spent his career developing innovative solutions using DevSecOps and site reliability best practices.
Sonny is an SRE with a varied background. He dabbled in research at Lawrence Berkeley National Labs before moving into site reliability engineering to have a more hands-on role. He has been published in several computing journals and has taught introductory programming courses.
How to create and implement technical solutions by utilizing site reliability engineering principles.
On average, successful students take 4 months to complete this program.
This program is designed to help you take advantage of the growing need for skilled site reliability engineers. Prepare to meet the demand for qualified site reliability engineers who can respond to real-life, high-stakes workplace challenges.
The skills you will gain from this Nanodegree program will qualify you for jobs in several industries as countless companies are trying to incorporate better site reliability practices into their organizations.
The program is for individuals who are looking to advance their site reliability engineering careers with skills in a burgeoning field.
No. This Nanodegree program accepts all applicants regardless of experience and specific background.
A well-prepared learner is already able to:
Students who do not feel comfortable with the above may consider taking one of the web development Nanodegree programs (Cloud Developer, Cloud Developer using Microsoft Azure, or Full Stack Web Developer).
The Site Reliability Nanodegree program consists of content and curriculum to support 4 projects. We estimate that students can complete the program in 4 months working 5-10 hours per week.
Each project will be reviewed by the Udacity reviewer network. Feedback will be provided, and if you do not pass a project, you will be asked to resubmit it until it passes.
Access to this Nanodegree program runs for the length of time specified above. If you do not graduate within that time period, you will continue learning with month-to-month payments. See the Terms of Use and FAQs for other policies regarding the terms of access to our Nanodegree programs.
Please see the Udacity Program FAQs for policies on enrollment in our programs.
There are no software or version requirements to complete this Nanodegree program. All coursework and projects can be completed in the Udacity online classroom. See Udacity’s basic tech requirements for details.