Lesson 1
Introduction to Establishing a Culture of Reliability
In this lesson, we cover some introductory material to help you start with a solid foundation.
Course
This course covers how to foster a culture based on reliability. We will learn how to apply best practices in several key areas of being a Site Reliability Engineer (SRE) and how they contribute to a culture of reliability. We will cover how to run balanced and effective on-call rotations as well as how to handle incidents. Next, we will discuss how to review your system throughout its lifecycle to find and mitigate potential risk factors. Managing system capacity at all phases of a system's lifecycle is another major component of ensuring that everything operates at maximum reliability. We will round out this course by discussing a thorn in every SRE's side: toil. We will discuss how to identify and reduce toil to maximize time spent performing operational work.
Intermediate
4 weeks
Real-world Projects
Completion Certificate
Last Updated July 19, 2024
Prerequisites:
No experience required
Lesson 1
In this lesson, we cover some introductory material to help you start with a solid foundation.
Lesson 2
A solid on-call rotation is essential to achieving peak reliability. This lesson discusses how to set up balanced on-call shifts backed by a solid incident management process that your team can follow.
Lesson 3
In this lesson, we learn how to review your system from the start to prepare for a release. It is important that you have systems in place to find potential risks and develop mitigations for them.
Lesson 4
System capacity is an essential part of ensuring reliability. This lesson discusses how to balance system capacity with costs to ensure that resources and money are not being wasted.
Lesson 5
Toil is the bane of every SRE team. This lesson covers how to reduce toil so that your team can focus on operational work that improves reliability.
Lesson 6 • Project
To wrap everything up, you will complete the final project, in which you will work through three scenarios that tie together everything you have learned.
Site Reliability Engineer
Sonny is an SRE with a varied background. He dabbled in research at Lawrence Berkeley National Labs before moving into site reliability engineering to have a more hands-on role. He has been published in several computing journals and has taught introductory programming courses.
Combine technology training for employees with industry experts, mentors, and projects, for critical thinking that pushes innovation. Our proven upskilling system goes after success—relentlessly.
Demonstrate proficiency with practical projects
Projects are based on real-world scenarios and challenges, allowing you to apply the skills you learn to practical situations, while giving you real hands-on experience.
Gain proven experience
Retain knowledge longer
Apply new skills immediately
Top-tier services to ensure learner success
Reviewers provide timely and constructive feedback on your project submissions, highlighting areas of improvement and offering practical tips to enhance your work.
Get help from subject matter experts
Learn industry best practices
Gain valuable insights and improve your skills
Full Catalog Access
One subscription opens up this course and our entire catalog of projects and skills.