What This Template Is For
Most process improvements fail not because the ideas are bad, but because nobody tracks whether they worked. A team identifies a bottleneck, proposes a fix, implements it, and moves on. Six months later, nobody can say whether the fix actually helped. The bottleneck may have returned, or a new one emerged in its place.
Continuous improvement (sometimes called kaizen) is the practice of making small, tracked improvements on a regular cadence. Instead of large transformation projects that take months, you run short improvement cycles (2-4 weeks) with clear hypotheses, measurable outcomes, and honest retrospectives. Over time, these small wins compound.
This template provides a structure for identifying improvement opportunities, running experiments, measuring results, and building an improvement backlog. For the operational context behind improvement programs, the Product Operations Handbook covers process maturity and scaling. Teams using OKRs can tie improvement targets to quarterly objectives. For tracking metrics that trigger improvement cycles, use the KPI Dashboard Template.
How to Use This Template
- At the start of each cycle (bi-weekly or monthly), review the improvement backlog with the team.
- Pick 1-2 improvements to run as experiments. Do not attempt more than two at once.
- Define a clear hypothesis, baseline metric, target metric, and measurement method for each experiment.
- Run the experiment for the defined period (at least 2 weeks; 2-4 weeks is typical).
- At the end of the cycle, measure results and decide: adopt (make permanent), iterate (run another cycle with adjustments), or abandon (did not work, move on).
- Log the results regardless of outcome. Failed experiments are valuable data.
The Template
Improvement Backlog
Capture every improvement idea. Score by impact and effort. Pull from the top of the backlog each cycle.
| # | Improvement Idea | Source | Process Affected | Expected Impact | Effort | Score (Impact x Effort) | Status |
|---|---|---|---|---|---|---|---|
| 1 | [Idea] | [Who proposed it] | [Process name] | [Hours saved / errors reduced] | Low / Med / High | [1-9] | Backlog |
| 2 | [Idea] | [Who proposed it] | [Process name] | [Hours saved / errors reduced] | Low / Med / High | [1-9] | Backlog |
| 3 | [Idea] | [Who proposed it] | [Process name] | [Hours saved / errors reduced] | Low / Med / High | [1-9] | Backlog |
| 4 | [Idea] | [Who proposed it] | [Process name] | [Hours saved / errors reduced] | Low / Med / High | [1-9] | Backlog |
| 5 | [Idea] | [Who proposed it] | [Process name] | [Hours saved / errors reduced] | Low / Med / High | [1-9] | Backlog |
Scoring. Impact: High = 3, Medium = 2, Low = 1. Effort: Low = 3 (easiest to do), Medium = 2, High = 1 (hardest to do). Score = Impact x Effort, so high-impact, low-effort ideas score 9 and low-impact, high-effort ideas score 1. Higher scores are better candidates.
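If you keep the backlog in a spreadsheet export or a script, the scoring and ranking rule above can be sketched in a few lines (the backlog entries below are illustrative, not prescribed field names):

```python
# Map the labels to the 1-3 scale defined above.
IMPACT = {"High": 3, "Med": 2, "Low": 1}
EFFORT = {"Low": 3, "Med": 2, "High": 1}  # low effort scores highest

backlog = [
    {"idea": "Structured intake form", "impact": "High", "effort": "Low"},
    {"idea": "Automate sprint report", "impact": "Med", "effort": "Med"},
    {"idea": "Async roadmap videos", "impact": "Low", "effort": "Med"},
]

# Score = Impact x Effort; then pull from the top of the backlog.
for item in backlog:
    item["score"] = IMPACT[item["impact"]] * EFFORT[item["effort"]]
backlog.sort(key=lambda i: i["score"], reverse=True)

for item in backlog:
    print(item["score"], item["idea"])
```

Sorting descending by score reproduces the "pull from the top" rule: the high-impact, low-effort intake form (score 9) ranks first.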
Active Experiment Card
Use one card per experiment per cycle.
| Field | Details |
|---|---|
| Experiment name | [Descriptive name] |
| Cycle | [e.g., March 2026, Cycle 1] |
| Owner | [Name] |
| Process being improved | [Process name] |
| Current pain point | [What is not working and why] |
Hypothesis. If we [change], then [metric] will improve from [baseline] to [target] within [timeframe].
Baseline measurement.
| Metric | Current Value | Data Source | Measurement Date |
|---|---|---|---|
| [Primary metric] | [Value] | [Where the data comes from] | [Date] |
| [Secondary metric] | [Value] | [Where the data comes from] | [Date] |
What we will change.
- ☐ [Specific change 1]
- ☐ [Specific change 2]
- ☐ [Specific change 3]
What we will NOT change. [Hold these variables constant to isolate the effect of the experiment.]
Duration. [X weeks]
Start date. [Date]
End date. [Date]
Results Log
| Metric | Baseline | Target | Actual | Delta (% or points) | Hit Target? |
|---|---|---|---|---|---|
| [Primary metric] | [Value] | [Value] | [Value] | [+/- %] | Yes / No |
| [Secondary metric] | [Value] | [Value] | [Value] | [+/- %] | Yes / No |
Qualitative observations. [What did the team notice during the experiment? Any unexpected side effects?]
Decision.
- ☐ Adopt (make this change permanent)
- ☐ Iterate (run another cycle with adjustments: [describe adjustments])
- ☐ Abandon (did not work; reason: [why])
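The Delta column is easy to get wrong when some metrics are durations (use relative change) and some are already percentages (use percentage points, as in the filled example below). A small sketch of both calculations, with hypothetical helper names:

```python
def delta_relative(baseline, actual):
    """Relative change for absolute metrics, e.g. minutes or hours.
    25 min -> 11 min is a -56% change."""
    return round((actual - baseline) / baseline * 100)

def delta_points(baseline, actual):
    """Absolute change for metrics already expressed in %.
    72% -> 18% is a -54 point change, not a -75% change."""
    return actual - baseline

print(delta_relative(25, 11), "%")       # triage time, relative change
print(delta_points(72, 18), "points")    # clarification rate, point change
```

Logging which convention each metric uses keeps the Results Log and Cycle Summary comparable across cycles.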
Cycle Summary
Track all experiments across cycles to see trends.
| Cycle | Experiment | Outcome | Key Metric Change | Decision |
|---|---|---|---|---|
| [Month, Cycle #] | [Name] | [Success / Partial / Failed] | [+/- X%] | Adopt / Iterate / Abandon |
| [Month, Cycle #] | [Name] | [Success / Partial / Failed] | [+/- X%] | Adopt / Iterate / Abandon |
| [Month, Cycle #] | [Name] | [Success / Partial / Failed] | [+/- X%] | Adopt / Iterate / Abandon |
| [Month, Cycle #] | [Name] | [Success / Partial / Failed] | [+/- X%] | Adopt / Iterate / Abandon |
Cumulative impact. [Total hours saved, error rate reduction, or other aggregate metric across all adopted improvements.]
Filled Example: PM Team Quarterly Improvement Cycle
Backlog (Filled)
| # | Improvement Idea | Source | Process | Impact | Effort | Score | Status |
|---|---|---|---|---|---|---|---|
| 1 | Add structured intake form for feature requests | CS Lead | Intake | High (3) | Low (3) | 9 | Completed |
| 2 | Automate weekly sprint status report | PM standup | Reporting | Med (2) | Med (2) | 4 | Active |
| 3 | Create PRD peer review checklist | Eng Lead | PRD workflow | Med (2) | Low (3) | 6 | Completed |
| 4 | Move roadmap updates to async video format | VP Product | Communication | Low (1) | Med (2) | 2 | Backlog |
| 5 | Standardize experiment tracking format | Data team | Experimentation | High (3) | High (1) | 3 | Backlog |
Experiment: Structured Intake Form (Filled)
Hypothesis. If we replace unstructured Slack requests with a structured Typeform, then the percentage of requests requiring follow-up clarification will decrease from 72% to under 30% within 4 weeks.
| Metric | Baseline | Target | Actual | Delta | Hit Target? |
|---|---|---|---|---|---|
| Requests needing clarification | 72% | <30% | 18% | -54 points | Yes |
| Avg triage time per request | 25 min | <15 min | 11 min | -56% | Yes |
| Requester satisfaction (1-5) | 2.8 | >3.5 | 4.2 | +50% | Yes |
Decision. Adopt. The structured form exceeded all targets. Made permanent. Added link to the form in the #feature-requests Slack channel topic and in the CS team's escalation playbook.
Cycle Summary (Filled)
| Cycle | Experiment | Outcome | Key Metric Change | Decision |
|---|---|---|---|---|
| Jan 2026, C1 | Structured intake form | Success | -54 pts clarification rate | Adopt |
| Jan 2026, C2 | Async standup format | Partial | -10 min/week per PM | Iterate |
| Feb 2026, C1 | Async standup v2 (video) | Failed | No time savings, lower engagement | Abandon |
| Feb 2026, C2 | PRD peer review checklist | Success | -40% rework after review | Adopt |
Cumulative Q1 impact. Four experiments run across three initiatives: 2 adopted, 1 abandoned after one iteration. Net savings: 8 hours/week across the PM team. Feature request triage time cut by more than half.
Key Takeaways
- Track every improvement as an experiment with a hypothesis, baseline, and target metric
- Run 1-2 experiments per cycle, not five. Focus beats breadth
- Log results for every experiment, including failures. Failed experiments prevent future waste
- Review cumulative impact quarterly to maintain team motivation
- Source improvement ideas from retrospectives, not from management brainstorms
About This Template
Created by: Tim Adair
Last Updated: 3/4/2026
Version: 1.0.0
License: Free for personal and commercial use
