What This Template Is For
When something goes wrong, teams often fix the symptom and move on. Feature adoption is low, so they redesign the UI. A deployment broke, so they revert the commit. A metric dropped, so they run a campaign. These fixes address what happened but not why it happened. Without understanding the root cause, the same problem returns in a different form.
Root cause analysis (RCA) is a structured method for tracing a problem back to its origin. This template provides two methods: the 5 Whys (fast, conversational, good for straightforward issues) and the fishbone diagram (broader, categorical, good for complex problems with multiple contributing factors). Both methods produce actionable findings that prevent recurrence.
Product teams should run an RCA whenever a launch misses its target by more than 30%, a production incident affects customers, or a recurring problem consumes team time. For tracking the metrics that trigger an RCA, see the KPI Dashboard Template. The Product Operations Handbook covers how RCA fits into a broader operational excellence program. If you are deciding which issues to investigate first, the RICE Calculator helps prioritize based on impact.
How to Use This Template
- Define the problem clearly. State what happened, when, and the measurable impact. Avoid vague statements like "the launch did not go well."
- Gather data before the analysis session. Pull metrics, logs, timelines, and user feedback. RCA without data is guesswork.
- Run the analysis with 3-5 people who were directly involved. Include cross-functional participants (PM, Eng, Design, CS) when the issue spans teams.
- Use the 5 Whys for single-thread problems. Use the fishbone diagram for multi-factor problems.
- For each root cause identified, define a corrective action with an owner and a due date.
- Follow up at 30 days to verify the corrective action was implemented and the problem has not recurred.
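The steps above can be sketched as data: a minimal, illustrative Python model of an RCA record with corrective actions (owner, due date) and the 30-day follow-up. All class and field names here are hypothetical, not part of the template itself.

```python
# Illustrative sketch: tracking an RCA record, its corrective actions,
# and the 30-day follow-up programmatically. Names are hypothetical.
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class CorrectiveAction:
    action: str       # specific action to prevent recurrence
    root_cause: str   # root cause it addresses
    owner: str
    due: date
    status: str = "Not started"


@dataclass
class RCARecord:
    problem: str      # specific, measurable description
    detected: date
    actions: list[CorrectiveAction] = field(default_factory=list)

    def follow_up_date(self) -> date:
        # Follow up 30 days after detection to verify the fix held.
        return self.detected + timedelta(days=30)

    def overdue(self, today: date) -> list[CorrectiveAction]:
        # Actions past their due date that are not yet done.
        return [a for a in self.actions
                if a.status != "Done" and a.due < today]


rca = RCARecord(
    problem="Quick Export adoption 1.2% vs 15% target",
    detected=date(2026, 2, 17),
)
rca.actions.append(CorrectiveAction(
    action="Update launch tier criteria to include user impact",
    root_cause="Tier classification based on effort only",
    owner="PM Lead",
    due=date(2026, 3, 1),
))
print(rca.follow_up_date())  # 2026-03-19
```

A sketch like this is mainly useful when RCAs feed a shared tracker; for a one-off analysis, the tables below are enough.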
The Template
Problem Statement
| Field | Details |
|---|---|
| What happened | [Specific, measurable description of the problem] |
| When it happened | [Date/time or time period] |
| Who was affected | [Users, teams, stakeholders] |
| Impact | [Quantified: revenue lost, users affected, hours wasted, SLA missed] |
| How it was detected | [Alert, user report, metric dashboard, manual review] |
| Immediate fix applied | [What was done to stop the bleeding] |
Method 1: 5 Whys
Start with the problem statement and ask "Why?" repeatedly until you reach a cause that, if fixed, would prevent recurrence.
| Level | Question | Answer | Evidence |
|---|---|---|---|
| Problem | [What happened?] | [Problem statement] | [Link to data] |
| Why 1 | Why did [problem] happen? | [Direct cause] | [Link to data] |
| Why 2 | Why did [Why 1 answer] happen? | [Deeper cause] | [Link to data] |
| Why 3 | Why did [Why 2 answer] happen? | [Deeper cause] | [Link to data] |
| Why 4 | Why did [Why 3 answer] happen? | [Deeper cause] | [Link to data] |
| Why 5 | Why did [Why 4 answer] happen? | [Root cause] | [Link to data] |
Root cause statement: [One sentence summary of the root cause]
Stop check. You have found the root cause when: (a) fixing it would prevent the problem from recurring, (b) it is within your team's control, and (c) going deeper would reach organizational or philosophical territory that you cannot change.
Method 2: Fishbone (Ishikawa) Diagram
Sort contributing factors into six categories. Not every category will apply; focus on the ones relevant to your problem.
People
- ☐ [Contributing factor, e.g., "New PM unfamiliar with launch process"]
- ☐ [Contributing factor]
- ☐ [Contributing factor]
Process
- ☐ [Contributing factor, e.g., "No QA sign-off required for Tier 2 launches"]
- ☐ [Contributing factor]
- ☐ [Contributing factor]
Technology
- ☐ [Contributing factor, e.g., "Feature flag system did not support gradual rollout"]
- ☐ [Contributing factor]
- ☐ [Contributing factor]
Data
- ☐ [Contributing factor, e.g., "Success metric was not instrumented before launch"]
- ☐ [Contributing factor]
- ☐ [Contributing factor]
Communication
- ☐ [Contributing factor, e.g., "CS team not informed of launch until after it went live"]
- ☐ [Contributing factor]
- ☐ [Contributing factor]
External
- ☐ [Contributing factor, e.g., "Competitor launched similar feature 2 days earlier"]
- ☐ [Contributing factor]
- ☐ [Contributing factor]
Primary root cause: [Which category and factor is the dominant cause?]
Contributing factors: [Which other factors made it worse?]
Corrective Actions
| # | Action | Root Cause Addressed | Owner | Due Date | Status |
|---|---|---|---|---|---|
| 1 | [Specific action to prevent recurrence] | [Root cause it fixes] | [Name] | [Date] | Not started |
| 2 | [Specific action] | [Root cause] | [Name] | [Date] | Not started |
| 3 | [Specific action] | [Root cause] | [Name] | [Date] | Not started |
Follow-Up
- ☐ 30-day follow-up scheduled for [date]
- ☐ Corrective actions verified as implemented
- ☐ Problem has not recurred since corrective actions
- ☐ RCA findings shared with the broader team
- ☐ Process or documentation updated based on findings
Filled Example: Failed Feature Launch RCA
Problem Statement (Filled)
| Field | Details |
|---|---|
| What happened | "Quick Export" feature launched to 100% of users. Adoption after 2 weeks: 1.2% (target was 15%) |
| When it happened | February 3-17, 2026 |
| Who was affected | All 12,000 active users saw the feature; 144 used it |
| Impact | 6 weeks of engineering time produced minimal value. Opportunity cost: delayed payment workflow improvement |
| How it was detected | PM reviewed adoption dashboard at Day 14 |
| Immediate fix | None applied. Feature is live but underperforming |
5 Whys (Filled)
| Level | Question | Answer | Evidence |
|---|---|---|---|
| Problem | Why is adoption at 1.2%? | Users do not discover the feature | Heatmap shows 0.3% click rate on export button |
| Why 1 | Why do users not discover it? | The button is inside a settings submenu | UX audit: 4 clicks to reach the feature |
| Why 2 | Why is it buried in settings? | Design placed it there to avoid cluttering the main UI | Design spec from sprint planning |
| Why 3 | Why was "avoid clutter" prioritized over discoverability? | No usability testing was done before launch | Sprint timeline did not include testing |
| Why 4 | Why was usability testing skipped? | The team classified this as a Tier 3 (minor) launch | Launch tier doc: "Tier 3 = no user research required" |
| Why 5 | Why was it classified as Tier 3? | Tier classification was based on engineering effort, not user impact | Launch tier criteria doc (outdated since Q3 2025) |
Root cause: Launch tier classification criteria use engineering effort as the primary dimension. User impact is not factored in, causing features with high user value but low engineering effort to skip validation steps.
Corrective Actions (Filled)
| # | Action | Root Cause | Owner | Due Date |
|---|---|---|---|---|
| 1 | Update launch tier criteria to include user impact as a scoring dimension | Tier classification | PM Lead | March 1 |
| 2 | Require discoverability review for any new UI element, regardless of tier | Feature placement | Design Lead | March 1 |
| 3 | Move Quick Export button to the main toolbar and measure adoption change | Low discovery | PM (Alex) | March 15 |
Key Takeaways
- State the problem with specific, measurable impact before analyzing causes
- Use 5 Whys for single-thread problems, fishbone for multi-factor problems
- Stop at a root cause you can actually fix, not at "humans make mistakes"
- Every root cause needs a corrective action with an owner and a due date
- Follow up at 30 days to verify the fix worked and the problem has not returned
About This Template
Created by: Tim Adair
Last Updated: 3/4/2026
Version: 1.0.0
License: Free for personal and commercial use
