Fault Drill Runbook Template

1. Drill Metadata

  • Drill Name:
  • Date:
  • Operator:
  • Environment:
  • Version / Commit:

2. Objective

  • Primary SLO/SLA objective:
  • Hypothesis:
  • Success criteria:

3. Scope and Preconditions

  • Services in scope:
  • Dependencies in scope:
  • Baseline metrics snapshot completed: yes/no
  • Stakeholders notified: yes/no
  • Rollback owner confirmed: yes/no

4. Drill Procedure

  1. Start time:
  2. Trigger action:
  3. Observe key metrics:
  4. Observe logs/traces:
  5. Validate alerting path:
  6. Recovery action:
  7. End time:

5. Evidence

  • Dashboard screenshots:
  • Alert screenshots (firing + resolved):
  • Trace IDs:
  • Key log snippets:
  • k6 summary/report paths:
    • reports/performance/distributed-k6-summary.json
    • reports/performance/k6-report.md
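As a rough aid for filling in the Results table, the sketch below pulls p95 latency and error rate out of a parsed k6 summary. It assumes the summary file follows one of the two common k6 layouts (the `handleSummary` data object with a nested `values` map, or the flatter `--summary-export` shape) and that the standard `http_req_duration` and `http_req_failed` metrics are present. The actual structure of `distributed-k6-summary.json` is not specified in this template, so treat the key names and the helper names `load_summary` and `extract_indicators` as hypothetical.

```python
import json
from typing import Any, Dict, Optional


def load_summary(path: str) -> Dict[str, Any]:
    """Read a k6 summary JSON file into a dict."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def _metric_values(summary: Dict[str, Any], name: str) -> Dict[str, Any]:
    """Return the value map for a metric, tolerating both assumed layouts."""
    metric = summary.get("metrics", {}).get(name, {})
    # handleSummary nests numbers under "values"; --summary-export does not.
    return metric.get("values", metric)


def extract_indicators(summary: Dict[str, Any]) -> Dict[str, Optional[float]]:
    """Extract p95 latency (ms) and error rate (%) if they are available."""
    duration = _metric_values(summary, "http_req_duration")
    failed = _metric_values(summary, "http_req_failed")
    p95 = duration.get("p(95)")
    # Assumption: the failure rate is a 0..1 fraction under "rate" or "value".
    rate = failed.get("rate", failed.get("value"))
    return {
        "p95_latency_ms": p95,
        "error_rate_pct": None if rate is None else rate * 100.0,
    }


# Example (path taken from the list above):
# indicators = extract_indicators(
#     load_summary("reports/performance/distributed-k6-summary.json"))
```

Missing metrics come back as `None` rather than raising, so a partially populated summary still yields whatever indicators it contains.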

6. Results

| Indicator | Baseline | Drill Peak | Target | Result |
| --- | --- | --- | --- | --- |
| p95 latency (ms) | | | | |
| error rate (%) | | | | |
| queue lag | | | | |
| ingestion failure ratio (%) | | | | |
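One possible way to derive the Result column consistently is sketched below. It assumes every indicator in the table is "lower is better", so a row passes when the drill peak stays at or below the target. The `IndicatorRow` class and its fields are illustrative names, not part of any existing tooling, and the pass rule is an assumption to adjust to your own SLO definitions.

```python
from dataclasses import dataclass


@dataclass
class IndicatorRow:
    """One row of the Results table (all values in the indicator's own unit)."""
    name: str
    baseline: float
    drill_peak: float
    target: float

    def delta_vs_baseline(self) -> float:
        """Absolute change of the drill peak relative to the baseline."""
        return self.drill_peak - self.baseline

    def result(self) -> str:
        # Assumption: lower is better, so the peak must not exceed the target.
        return "pass" if self.drill_peak <= self.target else "fail"


# Hypothetical numbers, for illustration only.
rows = [
    IndicatorRow("p95 latency (ms)", baseline=120.0, drill_peak=340.0, target=500.0),
    IndicatorRow("error rate (%)", baseline=0.1, drill_peak=2.5, target=1.0),
]
results = {row.name: row.result() for row in rows}
```

With the sample values above, the latency row passes and the error rate row fails, because 2.5 exceeds the 1.0 target.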

7. Findings and Follow-up

  • What worked:
  • What failed:
  • Root cause:
  • Follow-up actions:
  • Owner and due date: