What This Template Is For
Traditional incident response playbooks were designed for deterministic systems: the server is down, the database is corrupted, the API returns errors. AI incidents are fundamentally different. Your system can be "up" with 200 OK responses while simultaneously generating harmful content, hallucinating facts, or making biased decisions. Standard monitoring may show green dashboards while your AI causes real damage.
This template provides a structured response plan for AI-specific incident categories: hallucinations that spread misinformation, bias patterns that discriminate, safety boundary violations, model performance degradation, and data leakage through generated outputs. Each category has its own detection signals, severity criteria, and response procedures.
The AI PM Handbook covers AI operations and monitoring in depth. For the metrics that trigger incidents, see the hallucination rate metric. Use the AI Ethics Scanner to proactively identify risk areas before incidents occur.
How to Use This Template
- Customize the incident categories for your product. The five categories in this template cover the most common AI failures, but your product may have domain-specific incident types.
- Define severity levels with your team before any incident occurs. The middle of an incident is the worst time to debate whether something is SEV-0 or SEV-1.
- Assign incident commanders for each severity level. AI incidents often require ML expertise that traditional on-call engineers may not have.
- Run a tabletop exercise using the scenarios in the template. Walk through each scenario with your team and verify the response procedures work.
- Review and update quarterly based on actual incidents and near-misses. Every real incident should result in a template update.
The Template
Incident Classification
- ☐ Define what constitutes an AI incident (vs. a general system incident)
- ☐ Customize severity levels for your product's risk profile
- ☐ Assign incident commanders for each severity level
- ☐ Set up on-call rotation that includes ML engineering
## AI Incident Severity Matrix
| Severity | Definition | Response Time | Incident Commander | Communication |
|----------|-----------|---------------|-------------------|---------------|
| SEV-0 | AI causes or risks causing physical harm, legal liability, or widespread reputational damage | < 30 min | VP Engineering | Executive team + Legal + PR within 1 hour |
| SEV-1 | AI generates harmful, biased, or severely wrong content affecting multiple users | < 2 hours | Engineering Manager | Product + ML team within 2 hours |
| SEV-2 | AI quality degrades measurably or model leaks internal information | < 8 hours | ML Engineer on-call | PM + Engineering lead within 24 hours |
| SEV-3 | Individual AI quality report or minor edge case failure | < 48 hours | ML Engineer on-call | Logged in tracking system |
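To make triage consistent across on-call responders, the matrix above can be encoded as a small helper. This is a minimal sketch: the severity labels and response-time targets mirror the table, but the `Incident` triage fields are illustrative assumptions you should replace with your own intake questions.

```python
from dataclasses import dataclass

# Response-time targets from the severity matrix, in minutes.
RESPONSE_TARGET_MIN = {"SEV-0": 30, "SEV-1": 120, "SEV-2": 480, "SEV-3": 2880}

@dataclass
class Incident:
    # Illustrative triage inputs -- adapt to your own incident intake form.
    risks_physical_or_legal_harm: bool
    harmful_or_biased_content: bool
    users_affected: int
    measurable_quality_drop: bool

def triage(incident: Incident) -> str:
    """Map triage answers to a severity level, following the matrix above."""
    if incident.risks_physical_or_legal_harm:
        return "SEV-0"
    if incident.harmful_or_biased_content and incident.users_affected > 1:
        return "SEV-1"
    if incident.measurable_quality_drop:
        return "SEV-2"
    return "SEV-3"
```

For example, `triage(Incident(False, True, 140, True))` returns `"SEV-1"`: harmful content affecting multiple users outranks a quality drop.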
Incident Category 1: Hallucination Events
- ☐ Define hallucination detection signals (user reports, automated checks)
- ☐ Set hallucination rate alert thresholds
- ☐ Document the response procedure for hallucination spikes
- ☐ Define rollback criteria for hallucination incidents
## Hallucination Incident Response
### Detection Signals
- User feedback: spike in "incorrect" or "made up" reports
- Automated: hallucination rate exceeds [X%] over [Y] minute window
- Automated: citation verification failure rate exceeds [X%]
- Manual: QA sample review finds fabricated information
### Response Procedure
1. **Assess scope**: How many users were affected? What content was hallucinated?
2. **Contain**: Enable stricter guardrails or reduce model temperature
3. **Investigate**: Identify the input pattern or prompt that triggered hallucinations
4. **Mitigate**: Add the failing inputs to the evaluation test suite
5. **Resolve**: Deploy prompt fix or model configuration change
6. **Communicate**: Notify affected users if hallucinated content led to actions
7. **Follow up**: Add monitoring for the specific hallucination pattern
Incident Category 2: Bias and Discrimination
- ☐ Define bias detection signals (disparity metrics, user reports, audits)
- ☐ Set maximum acceptable disparity thresholds
- ☐ Document the response procedure for bias incidents
- ☐ Define immediate mitigation actions
## Bias Incident Response
### Detection Signals
- User report: complaint about discriminatory AI behavior
- Automated: output quality variance > [X%] across demographic groups
- Audit: scheduled bias evaluation reveals new disparity
- External: media report or regulatory inquiry about AI bias
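The automated disparity signal above can be checked with a simple gap calculation: score a quality metric per demographic group, then compare the largest gap against your maximum acceptable disparity (the `[X%]` placeholder). The group names and scores below are illustrative; more rigorous tests (e.g., statistical significance across samples) belong in the scheduled bias audit.

```python
def max_disparity(quality_by_group: dict[str, float]) -> float:
    """Largest absolute gap in a quality metric between any two demographic groups."""
    values = quality_by_group.values()
    return max(values) - min(values)

def bias_alert(quality_by_group: dict[str, float], max_allowed_gap: float) -> bool:
    """True when the disparity exceeds the threshold (the [X%] placeholder above)."""
    return max_disparity(quality_by_group) > max_allowed_gap
```

Usage: with `{"group_a": 0.91, "group_b": 0.84, "group_c": 0.89}`, the gap is 0.07, so a 5% threshold fires an alert while a 10% threshold does not.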
### Response Procedure
1. **Assess severity**: Is the bias affecting decisions with real impact (hiring, lending, access)?
2. **Contain**: Disable the AI feature for the affected user segment if SEV-0/SEV-1
3. **Investigate**: Analyze model outputs across demographic groups
4. **Document**: Record the bias pattern, affected groups, and scope
5. **Mitigate**: Adjust prompts, add training data, or implement output constraints
6. **Verify**: Run bias evaluation suite to confirm the fix
7. **Communicate**: Internal report to leadership. External disclosure if users were affected.
8. **Prevent**: Update bias evaluation suite to catch this pattern going forward
Incident Category 3: Safety Boundary Violations
- ☐ Define safety boundaries (content the AI must never generate)
- ☐ Set up automated safety classifiers with alert thresholds
- ☐ Document the response procedure for safety violations
- ☐ Define immediate containment actions
## Safety Violation Response
### Detection Signals
- Automated: safety classifier flags output content
- User report: a user flags harmful, dangerous, or illegal AI output
- Internal: team member discovers safety bypass during testing
- External: content surfaces on social media or in press coverage
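The automated classifier signal above typically feeds a gate in the serving path: block above a high score, serve-but-flag in a borderline band. A minimal sketch, assuming your safety classifier returns an unsafe score in [0, 1]; both thresholds and the fallback message are illustrative and need tuning.

```python
import logging

logger = logging.getLogger("ai_safety")

BLOCK_THRESHOLD = 0.8   # score at or above which the output is withheld
ALERT_THRESHOLD = 0.5   # lower score that still flags the output for review

FALLBACK_MESSAGE = "Sorry, I can't help with that request."

def safety_gate(output: str, unsafe_score: float) -> str:
    """Gate a model output on a safety-classifier score in [0, 1]."""
    if unsafe_score >= BLOCK_THRESHOLD:
        # Withhold the output entirely and page via the error log.
        logger.error("Blocked unsafe output (score=%.2f)", unsafe_score)
        return FALLBACK_MESSAGE
    if unsafe_score >= ALERT_THRESHOLD:
        # Serve the output but flag it for human review.
        logger.warning("Borderline output flagged for review (score=%.2f)", unsafe_score)
    return output
```

The two-threshold design keeps the feature usable while still surfacing borderline cases as detection signals rather than silent passes.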
### Response Procedure
1. **Contain immediately**: If SEV-0, disable the AI feature within 30 minutes
2. **Preserve evidence**: Save the input, output, model version, and prompt version
3. **Assess impact**: How many users received the harmful content?
4. **Investigate root cause**: Prompt injection? Model failure? Missing guardrail?
5. **Implement fix**: Add safety filter, update system prompt, or restrict input
6. **Verify fix**: Run adversarial test suite to confirm boundary holds
7. **Communicate**: Notify legal. Notify affected users if content was actionable.
8. **Post-mortem**: Full incident review within 48 hours
Incident Category 4: Model Performance Degradation
- ☐ Define performance baselines (accuracy, latency, cost)
- ☐ Set degradation alert thresholds for each metric
- ☐ Document the response procedure for performance drops
- ☐ Plan fallback behavior during degradation
## Performance Degradation Response
### Detection Signals
- Automated: accuracy drops > [X%] from baseline over [Y] hour window
- Automated: p99 latency exceeds [X] seconds
- Automated: cost per request exceeds [X] dollars
- Automated: error rate exceeds [X%]
- User feedback: satisfaction rate drops below [X%]
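The baseline-deviation signals above share one shape: compare a current metric against its baseline and alert on movement in the bad direction. A sketch with illustrative metric names and a 10% default standing in for the `[X%]` placeholders; note that accuracy and satisfaction degrade downward while latency, cost, and error rate degrade upward.

```python
# Metrics where a *drop* (not a rise) is the bad direction.
LOWER_IS_WORSE = {"accuracy", "satisfaction_rate"}

def degradation_alerts(current: dict[str, float],
                       baseline: dict[str, float],
                       max_relative_change: float = 0.10) -> list[str]:
    """Return metric names that moved more than max_relative_change
    in the bad direction versus their baseline."""
    alerts = []
    for name, base in baseline.items():
        change = (current[name] - base) / base
        if name in LOWER_IS_WORSE:
            change = -change  # a drop counts as a positive "bad" change
        if change > max_relative_change:
            alerts.append(name)
    return alerts
```

For example, accuracy falling from 0.92 to 0.80 (about a 13% relative drop) fires an alert, while p99 latency creeping from 2.5s to 2.6s (4%) does not.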
### Response Procedure
1. **Assess scope**: Is degradation affecting all users or a segment?
2. **Check external dependencies**: Is the model provider experiencing issues?
3. **Contain**: Activate fallback model or circuit breaker if quality is below minimum
4. **Investigate**: Check recent changes (prompt updates, data pipeline changes, model version updates)
5. **Mitigate**: Roll back to last known good configuration
6. **Monitor**: Verify recovery with production metrics for 24 hours
7. **Document**: Record root cause and preventive measures
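The circuit-breaker containment in step 3 can be sketched as a router that switches to a fallback model after a streak of quality-check failures and stays tripped until an operator resets it. The model names and failure budget are illustrative assumptions.

```python
class ModelCircuitBreaker:
    """Route requests to a fallback model after repeated quality failures.

    "primary-model" / "fallback-model" are placeholder names; the
    failure budget must be tuned to your traffic and check frequency.
    """

    def __init__(self, primary: str = "primary-model",
                 fallback: str = "fallback-model",
                 max_consecutive_failures: int = 5):
        self.primary = primary
        self.fallback = fallback
        self.max_failures = max_consecutive_failures
        self.consecutive_failures = 0
        self.open = False  # open circuit == primary disabled

    def choose_model(self) -> str:
        return self.fallback if self.open else self.primary

    def report(self, success: bool) -> None:
        """Feed quality-check results; trip the breaker on a failure streak."""
        if success:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.max_failures:
                self.open = True  # stays open until an operator resets it

    def reset(self) -> None:
        """Manual reset once the root cause is fixed and recovery is verified."""
        self.open = False
        self.consecutive_failures = 0
```

Requiring a manual `reset()` rather than auto-recovery matches the procedure above: recovery should be confirmed against production metrics before the primary path is restored.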
Incident Category 5: Data Leakage
- ☐ Define what constitutes data leakage (PII, internal data, system prompts)
- ☐ Set up detection mechanisms for data leakage patterns
- ☐ Document the response procedure
- ☐ Plan notification procedures for affected parties
## Data Leakage Response
### Detection Signals
- User report: AI output contains another user's data
- Automated: PII detection in model outputs
- Testing: system prompt extractable via prompt injection
- Automated: internal identifiers or API keys in outputs
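The automated signals above (PII, identifiers, keys in outputs) are often implemented as a pattern scan over model outputs before they reach the user. A minimal sketch; the regexes below are illustrative starting points, not production-grade detectors, and should be extended with your own internal identifier formats.

```python
import re

# Illustrative leakage patterns -- extend with your internal ID and key formats.
LEAKAGE_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "api_key": re.compile(r"\b(?:sk|api)[-_][A-Za-z0-9]{16,}\b"),
}

def scan_for_leakage(output: str) -> list[str]:
    """Return the names of any leakage patterns found in a model output."""
    return [name for name, pattern in LEAKAGE_PATTERNS.items()
            if pattern.search(output)]
```

Any non-empty result should both suppress the output and emit a detection signal, since a single confirmed leak may indicate a broader exposure.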
### Response Procedure
1. **Contain**: Disable the AI feature immediately for SEV-0/SEV-1
2. **Assess scope**: What data was exposed? How many users were affected?
3. **Preserve evidence**: Save all relevant logs and outputs
4. **Notify**: Inform legal and privacy team immediately
5. **Investigate**: Determine whether leakage was via model memorization, RAG retrieval, or prompt injection
6. **Mitigate**: Implement output filtering, remove leaked data from training/retrieval
7. **Notify affected parties**: Comply with data breach notification requirements
8. **Review**: Update data governance and privacy controls
Post-Incident Review
- ☐ Schedule post-mortem within 48 hours of incident resolution
- ☐ Complete the incident report template
- ☐ Identify root cause and contributing factors
- ☐ Define and assign preventive actions
- ☐ Update this incident response template based on learnings
## Post-Incident Report
**Incident ID**: [AI-INC-YYYY-NNN]
**Severity**: [SEV-0/1/2/3]
**Category**: [Hallucination / Bias / Safety / Degradation / Data Leakage]
**Duration**: [Start time to resolution]
**Users Affected**: [Number and segments]
**Incident Commander**: [Name]
### Timeline
| Time | Event |
|------|-------|
| [HH:MM] | [First detection signal] |
| [HH:MM] | [Incident declared, commander assigned] |
| [HH:MM] | [Containment action taken] |
| [HH:MM] | [Root cause identified] |
| [HH:MM] | [Fix deployed] |
| [HH:MM] | [Incident resolved, monitoring confirmed] |
### Root Cause
[What caused the incident]
### Contributing Factors
- [Factor 1]
- [Factor 2]
### Preventive Actions
| Action | Owner | Due Date | Status |
|--------|-------|----------|--------|
| [Action 1] | [Name] | [Date] | [Open / In Progress / Done] |
| [Action 2] | [Name] | [Date] | [Open / In Progress / Done] |
Filled Example
Incident: AI customer service chatbot started recommending a discontinued product as a solution to billing issues, citing a non-existent "billing relief program."
Severity: SEV-1 (harmful misinformation affecting multiple users)
Category: Hallucination
Duration: 3 hours 15 minutes (10:30 AM detection to 1:45 PM resolution)
Users Affected: ~140 users received hallucinated responses
Root Cause: A knowledge base update removed the discontinued product page but left orphaned references in related FAQ documents. The RAG retrieval pulled these orphaned references, and the model confabulated a "billing relief program" from the fragmented context.
Fix: Removed orphaned references from the knowledge base. Added a validation step to the knowledge base update pipeline that checks for broken cross-references. Added "billing relief program" as a known hallucination pattern to the output filter.
Preventive Actions: Automated cross-reference integrity check in knowledge base pipeline. Weekly sample review of chatbot responses mentioning specific programs or products.
