Sociotechnical considerations after a blown SLO

ChaosWheel

About the speaker

Liz Fong-Jones

Principal Developer Advocate, Honeycomb

Liz is a developer advocate, labor and ethics organizer, and Site Reliability Engineer (SRE) with 16+ years of experience. She is an advocate at Honeycomb.io for the SRE and Observability communities, and previously was an SRE working on products ranging from the Google Cloud Load Balancer to Google Flights.

About the talk

What do you do when you've had a few too many incidents and blown your error budget? Or had a pile of near-misses that burned the team out even though the user-facing SLO wasn't violated? What if the incident trigger was the infrastructure refactoring meant to improve, not harm, reliability and maintainability? We'll discuss two incidents that shaped how we think about error budgets and emergency stops.

Pragmatic tips for incident response
Survival Guide: Black Swan Events
