Real-world projects from industry experts
With real-world projects and immersive content built in partnership with top-tier companies, you’ll master the tech skills companies want.
Master the job-ready skills you need to be a successful site reliability engineer and start designing systems to automate responses to software site issues.
At 5-10 hours/week
Get access to the classroom immediately upon enrollment
Master the skills necessary to become a successful site reliability engineer. Learn to build automation tools that ensure solutions meet requirements such as availability, performance, security, and maintainability.
Python or Java, Bash or PowerShell, Linux, UNIX shell, and SQL.
Get a practical introduction to what observability requires in terms of people and tools. Learn about site reliability engineering, its roles and responsibilities, and how those differ from other teams. See how the role helps an enterprise improve, discuss the associated costs, and learn about the types of team members and the tools a team may use.
This course covers monitoring, high availability (HA) and disaster recovery (DR), infrastructure as code, and database recovery and availability. Learn the basics of SLOs and SLIs, as well as how to translate them into queries and, finally, graphs. Also learn how to design and deploy highly available databases to AWS.
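As a rough illustration of the SLO and SLI idea mentioned above, the following minimal sketch computes an availability SLI from request counts and reports how much of the error budget remains against an SLO target. The function names, the request counts, and the 99.9% target are hypothetical and chosen only for this example; they are not taken from the course material.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Availability SLI: the fraction of requests that succeeded."""
    if total_requests == 0:
        # No traffic means nothing failed; treat the window as fully available.
        return 1.0
    return good_requests / total_requests


def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Fraction of the error budget still unspent.

    1.0 means the budget is untouched, 0.0 means it is exhausted,
    and a negative value means the SLO has been violated.
    """
    allowed_failure = 1.0 - slo_target  # failure rate the SLO tolerates
    if allowed_failure <= 0:
        raise ValueError("slo_target must be below 1.0")
    actual_failure = 1.0 - sli
    return 1.0 - actual_failure / allowed_failure


if __name__ == "__main__":
    # Hypothetical window: 100,000 requests, 50 of which failed.
    sli = availability_sli(good_requests=99_950, total_requests=100_000)
    remaining = error_budget_remaining(sli, slo_target=0.999)
    print(f"SLI: {sli:.4%}, error budget remaining: {remaining:.0%}")
```

In a real monitoring setup the two counts would typically come from a metrics query rather than hard-coded numbers, and the remaining budget is the kind of value that ends up on a dashboard graph.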
Learn how to deploy microservices or cloud architecture that is resilient enough to withstand failures, and predictable enough to resolve issues via automation without human intervention. Understand self-healing system design fundamentals, deployment strategies, implementation steps, and use cases. Learn cloud automation to increase the resiliency of systems.
Learn how to develop processes and frameworks that drive workplaces toward putting reliability first by working through the incident management process and learning how to handle on-call effectively. Understand how to perform reliability reviews at various phases of your system, how to manage system capacity effectively, and how to reduce toil.
Our knowledgeable mentors guide your learning and are focused on answering your questions, motivating you and keeping you on track.
You’ll have access to GitHub portfolio review and LinkedIn profile optimization to help you advance your career and land a high-paying role.
Tailor a learning plan that fits your busy life. Learn at your own pace and reach your personal goals on the schedule that works best for you.
We provide services customized for your needs at every step of your learning journey to ensure your success.
Nathan is a Certified Six Sigma Black Belt and has 10+ years of experience in IT across multiple industries. He is also the instructor for two other Udacity courses: Ensuring Quality Releases and Azure Performance.
Travis Scotto has worked in technology for 10 years. He has worked in various infrastructure roles: virtualization, databases, and monitoring. As an SRE, he employs automation and monitoring daily. He has also taught IT classes as an adjunct for 4.5 years.
Emmanuel is a co-founder of the Black Code Collective and a recipient of DC's Technical.ly RealLIST Engineer award. An AWS Certified DevSecOps specialist with 12 years of experience, he has spent his career developing innovative solutions using DevSecOps and site reliability best practices.
Sonny is an SRE with a varied background. He dabbled in research at Lawrence Berkeley National Labs before moving into site reliability engineering to have a more hands-on role. He has been published in several computing journals and has taught introductory programming courses.
A well-prepared learner is already able to: