How-to guide · operations · 10–12 min · Intermediate
Writing Effective Runbooks for SRE Hand-off
Structure your runbooks for clarity, safety, and repeatability.
SRE · Eng leadership · Last updated 2025-11-26
runbooks, operations
Recommended runbook structure
- Summary and when to use this runbook.
- Quick checks / triage steps.
- Safe remediation steps with copy-paste commands where appropriate.
- Escalation instructions and business impact notes.
Minimal YAML example
Runbook YAML skeleton
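name: K8s ingress 5xx spike
match: |
  labels.service == "api" && labels.env == "prod"
steps:
  - title: Check current error rate
    # Matches any 5xx status rather than only the string "500".
    # Assumes the default ingress-nginx access-log format ("$request" $status).
    command: kubectl -n ingress logs deploy/ingress-nginx --since=10m | grep -E 'HTTP/[0-9.]+" 5[0-9]{2} '
  - title: Verify upstream pods
    command: kubectl -n api get pods -o wide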
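
The skeleton above covers only the triage steps. As a rough sketch, the other parts of the recommended structure (summary, safe remediation, escalation) could sit beside `name`, `match`, and `steps` as extra top-level keys. The key names `summary`, `remediation`, `verify`, `rollback`, and `escalation`, the `deploy/api` target, and the placeholder values are all assumptions made for illustration. They do not come from any particular runbook tool's schema.

```yaml
# Hypothetical top-level keys to add beside name, match, and steps.
# None of these key names come from a known schema; adapt them to your tooling.
summary: Use when the prod API ingress returns a sustained burst of 5xx responses.
remediation:
  - title: Restart the API deployment
    # deploy/api is a placeholder; confirm the real deployment name first.
    command: kubectl -n api rollout restart deploy/api
    # Wait for the rollout to finish before declaring the incident mitigated.
    verify: kubectl -n api rollout status deploy/api
    # Revert to the previous revision if the restart makes things worse.
    rollback: kubectl -n api rollout undo deploy/api
escalation:
  contact: "<on-call rotation or channel>"          # placeholder, fill in
  business_impact: "<who is affected and how badly>" # placeholder, fill in
```

Pairing each remediation command with a verification and a rollback command keeps the copy-paste steps safe to run under pressure, and a separate `escalation` block makes the hand-off point explicit.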