A canary release exposes a change to a small slice of traffic first, which limits the blast radius and enables fast rollback if metrics regress.
Key Metrics
- Latency
- Error rate
- Conversion
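One way to act on these metrics is to compare the canary against the baseline and flag any guardrail breach. The sketch below is illustrative only: `MetricSnapshot`, `regressions`, and all threshold values are assumptions made for this example, not recommended settings.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MetricSnapshot:
    """Aggregated metrics for one variant over an evaluation window."""
    p95_latency_ms: float
    error_rate: float       # fraction of requests that failed, 0.0 to 1.0
    conversion_rate: float  # fraction of sessions that converted, 0.0 to 1.0


# Hypothetical guardrails; tune these to your own service.
MAX_LATENCY_INCREASE = 0.10      # canary p95 may be at most 10% slower (relative)
MAX_ERROR_RATE_INCREASE = 0.005  # at most +0.5 percentage points (absolute)
MAX_CONVERSION_DROP = 0.02       # at most a 2% relative drop


def regressions(baseline: MetricSnapshot, canary: MetricSnapshot) -> List[str]:
    """Return the names of metrics on which the canary breaches a guardrail."""
    breached = []
    if canary.p95_latency_ms > baseline.p95_latency_ms * (1 + MAX_LATENCY_INCREASE):
        breached.append("latency")
    if canary.error_rate - baseline.error_rate > MAX_ERROR_RATE_INCREASE:
        breached.append("error_rate")
    if canary.conversion_rate < baseline.conversion_rate * (1 - MAX_CONVERSION_DROP):
        breached.append("conversion")
    return breached
```

An empty result means the canary is within all guardrails for that window. Any non-empty result is a signal to pause or roll back.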
Rollout Playbook
- Start at 1%
- Hold at each stage until the key metrics are stable
- Ramp up in progressive stages, rolling back if any metric regresses
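The playbook can be sketched as a loop over traffic stages. This is a minimal sketch under stated assumptions: the stage percentages after 1% are hypothetical, and `set_traffic_percent` and `is_stable` are placeholder callbacks you would wire to your own traffic router and metrics checks.

```python
from typing import Callable, Sequence

# Hypothetical ramp schedule: percent of traffic at each stage.
RAMP_STAGES = (1, 5, 25, 50, 100)


def run_rollout(
    set_traffic_percent: Callable[[int], None],
    is_stable: Callable[[int], bool],
    stages: Sequence[int] = RAMP_STAGES,
) -> bool:
    """Walk the stages in order; roll back to 0% on the first unstable stage.

    Returns True if the rollout reached the final stage, False if rolled back.
    """
    for percent in stages:
        set_traffic_percent(percent)
        # is_stable is expected to wait out the hold period for this stage
        # and then report whether the canary stayed within its guardrails.
        if not is_stable(percent):
            set_traffic_percent(0)
            return False
    return True
```

Keeping the hold logic inside `is_stable` lets each stage use its own hold duration without changing the loop.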
Choosing Cohorts
- Internal users first
- Low-risk geographies
- Single region to reduce noise
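Cohort selection is often implemented with deterministic hashing so that a given user stays in the same variant across requests. The sketch below assumes that approach. The function names, the salt, and the region name `region-a` are hypothetical.

```python
import hashlib
from typing import FrozenSet


def bucket(user_id: str, salt: str) -> int:
    """Map a user to a stable bucket in the range [0, 10000)."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 10_000


def in_canary(
    user_id: str,
    *,
    is_internal: bool,
    region: str,
    percent: float,
    salt: str = "canary-v1",  # hypothetical; change it to reshuffle cohorts
    allowed_regions: FrozenSet[str] = frozenset({"region-a"}),  # hypothetical
) -> bool:
    """Decide whether this user should receive the canary variant."""
    if is_internal:
        return True   # internal users go first, regardless of region
    if region not in allowed_regions:
        return False  # restrict external exposure to a single region
    # Each bucket is 0.01% of users, so percent * 100 buckets = percent %.
    return bucket(user_id, salt) < percent * 100
```

Because the bucket depends only on the salt and the user ID, raising `percent` keeps existing canary users in the canary and only adds new ones.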
Observability Hooks
- Emit exposure events
- Tag traces with variant
- Create canary dashboards
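The first two hooks can be sketched with the standard library alone. Everything here is an assumption for illustration: the event fields, the attribute key format, and the function names are hypothetical, and a real system would send events to its own pipeline and set attributes through its tracing library.

```python
import json
import time
from typing import Callable, Dict, Optional


def exposure_event(user_id: str, rollout: str, variant: str,
                   timestamp: Optional[float] = None) -> Dict[str, object]:
    """Build one exposure record: who saw which variant, and when."""
    return {
        "event": "exposure",
        "rollout": rollout,
        "variant": variant,  # e.g. "canary" or "baseline"
        "user_id": user_id,
        "ts": time.time() if timestamp is None else timestamp,
    }


def emit(event: Dict[str, object], sink: Callable[[str], None] = print) -> None:
    """Serialize the event as one JSON line and hand it to the sink."""
    sink(json.dumps(event, sort_keys=True))


def tag_trace(attributes: Dict[str, str], rollout: str, variant: str) -> Dict[str, str]:
    """Return a copy of the trace attributes with the variant attached."""
    tagged = dict(attributes)
    tagged[f"rollout.{rollout}.variant"] = variant
    return tagged
```

With the variant present on both events and traces, a canary dashboard can split every chart by variant instead of relying on separate instrumentation.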
Case Study: Canary of Search Indexer
At 1% traffic, the error rate rose by 1.2 percentage points (pp). The team rolled back instantly using a kill switch, fixed a null edge case, and resumed the canary successfully.
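The case study does not describe how the kill switch was built. As an illustration only, a minimal in-process version might look like the sketch below, where `KillSwitch` and `choose_variant` are hypothetical names. A production switch would normally read its state from a shared flag or config service so that one change takes effect everywhere.

```python
import threading


class KillSwitch:
    """Thread-safe on/off flag that forces all traffic back to baseline."""

    def __init__(self) -> None:
        self._tripped = threading.Event()

    def trip(self) -> None:
        """Disable the canary immediately."""
        self._tripped.set()

    def reset(self) -> None:
        """Re-enable the canary, for example after a fix ships."""
        self._tripped.clear()

    def allows_canary(self) -> bool:
        return not self._tripped.is_set()


def choose_variant(switch: KillSwitch, in_canary_cohort: bool) -> str:
    """Serve the canary only to cohort members while the switch is untripped."""
    if in_canary_cohort and switch.allows_canary():
        return "canary"
    return "baseline"
```

Checking the switch on every request is what makes the rollback immediate: no redeploy is needed, only a flag change.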
Conclusion
Canaries combined with feature flags enable reversible, measured change, which is your best defense against regressions.
