Overview
Every product team says they prioritize. Most do not. They negotiate. The loudest stakeholder wins, the most recent customer request jumps the queue, or the HiPPO (Highest Paid Person's Opinion) overrides the data. The result is a backlog that reflects politics, not strategy.
This workshop fixes that by putting the team's actual backlog through a structured scoring process. Not a theoretical exercise — real features, real scores, real debates about trade-offs. By the end, you will have a stack-ranked backlog that the team built together and can defend to anyone who asks "why is my feature not on the roadmap?"
Who this is for: Product managers facilitating prioritization with their cross-functional team (PM, engineering leads, design, and optionally one senior stakeholder).
Time required: 90 minutes
What participants will walk away with:
- A stack-ranked backlog scored with RICE, built and agreed by the whole team
- A documented rationale and key assumption for every feature above the cut line, with an owner and target quarter
- A shared calibration of what Impact, Confidence, and Effort scores mean for your product
- A defensible answer for any stakeholder who asks why their feature did not make the cut
Pre-Work (Required — Send 72 Hours Before)
The workshop fails without preparation. Send this to all participants at least 3 days before:
For the facilitator (PM):
- Compile the candidate feature list (roughly 15 features fits the 2-minutes-per-feature pace in Part 2), each with a one-line description and a two-sentence user story
- Pull Reach numbers (users per quarter) from analytics and pre-fill them in the spreadsheet template below
- Pick one recent, genuinely high-impact ship to use as the calibration anchor in Part 1
For participants:
- Read the feature list and user stories before the session
- Bring any evidence you have for features in your area: customer requests, support tickets, usage data, or rough effort context
Materials Needed
- A shared Google Sheet set up per the template below
- A printed calibration reference card for every participant (or the card pasted into chat for remote teams)
- A timer visible to the facilitator
- Screen sharing or a projector for the spreadsheet
Spreadsheet template setup:
Create a Google Sheet with these columns before the workshop:
| Column | Pre-filled? | Notes |
|---|---|---|
| Feature Name | Yes | One-line description |
| User Story | Yes | Two sentences max |
| Reach (users/quarter) | Yes | PM fills from analytics |
| Impact (1-5) | No | Team scores in workshop |
| Confidence (%) | No | Team scores in workshop |
| Effort (person-months) | No | Team scores in workshop |
| RICE Score | Formula | =(Reach * Impact * Confidence) / Effort |
| Notes | No | Capture debate flags and rationale |
Share the spreadsheet link with all participants 30 minutes before the workshop so they can familiarize themselves with the feature list.
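If you want to sanity-check the sheet's formula column outside the spreadsheet, here is a minimal Python sketch of the same calculation; the function name and example numbers are illustrative, not part of the template. Note that a Confidence of 80% enters the formula as 0.8.

```python
def rice_score(reach: float, impact: float, confidence: float, effort: float) -> float:
    """RICE = (Reach * Impact * Confidence) / Effort.

    reach      -- users affected per quarter (from analytics)
    impact     -- 1-5 scale from the calibration card
    confidence -- expressed as a fraction, so 80% is 0.8
    effort     -- person-months
    """
    return (reach * impact * confidence) / effort

# Example: 3,200 users/quarter, Impact 4, Confidence 80%, Effort 2 person-months
print(rice_score(reach=3200, impact=4, confidence=0.8, effort=2))  # 5120.0
```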
Part 1: Scoring Calibration (20 minutes)
What this accomplishes
If you skip calibration and jump straight into scoring, you will spend the entire workshop arguing about whether "High Impact" means 2 or 3 on your scale. Calibration gets the team aligned on what the numbers mean before you apply them to real features.
Facilitator instructions
Explain the RICE scoring system (5 minutes):
Walk through each dimension using the RICE framework definitions:
- Reach: how many users the feature touches per quarter (pre-filled from analytics)
- Impact: how much it improves the experience for each of those users, on a 1-5 scale
- Confidence: how sure you are about your Reach and Impact estimates, expressed as a percentage
- Effort: total work required across product, design, and engineering, in person-months
Calibration exercise (15 minutes):
Pick two features from the candidate list — one that everyone intuitively feels is "high priority" and one that feels "low priority."
For each feature, do a simultaneous reveal:
"Everyone, write down your Impact score for Feature A. Do not show anyone. 3, 2, 1 — hold up your number."
If scores vary by more than 1 point, pause and discuss:
"We have a 2 and a 4 on Impact. [Person with 2], what are you seeing? [Person with 4], what makes this high for you?"
Run through Impact, Confidence, and Effort for both calibration features. By the end, the team should have a shared sense of what each number means in the context of your product.
Key facilitation tip: The most common calibration failure is Impact inflation. Everyone wants their feature to be "high impact." Counter this by asking: "Compared to the last feature we shipped that you would call genuinely high impact — say, [name a specific recent win] — how does this feature compare?"
Calibration reference card (post this on the wall or share in chat):
| Score | Impact | Confidence | Effort |
|---|---|---|---|
| 5 | Massive — users cannot do their job without it | High confidence (100%) — validated by data from 50+ users or a controlled experiment | Tiny (0.5 person-months) — one person, one week |
| 4 | High — noticeably improves daily workflow | Some evidence (80%) — 10+ customer requests, qualitative interviews | Small (1 person-month) — one person, one month |
| 3 | Medium — nice to have, solves a real but non-critical pain | Educated guess (50%) — based on market research or analogous products | Medium (2 person-months) — 2 people, one month |
| 2 | Low — marginal improvement most users would not notice | Low confidence — PM intuition only | Large — one quarter of a team |
| 1 | Minimal — affects edge cases only | Speculation — no supporting evidence | Very large — full team for a quarter |
Print this card for every participant. It eliminates 80% of the "what does a 3 mean?" debates.
Part 2: Batch Scoring (30 minutes)
What this accomplishes
This is the core of the workshop: scoring every candidate feature using the calibrated RICE framework. The batch approach prevents the common trap of spending 20 minutes on the first feature and rushing the last five.
Facilitator instructions
Set the pace (2 minutes per feature):
"We have [X] features to score in 30 minutes. That is roughly 2 minutes per feature. I will keep us on time. For each feature, I will read the description, we will do a simultaneous reveal on each dimension, and I will capture the consensus score. If there is a disagreement of more than 1 point, we will flag it for the debate round."
Scoring process for each feature:
1. Read the feature name and user story aloud. Reach is already pre-filled from analytics, so state it rather than scoring it.
2. Run a simultaneous reveal on Impact, then Confidence, then Effort, exactly as in calibration.
3. Capture the consensus score in the spreadsheet in real time; the RICE formula auto-computes the final score.
Flagging disagreements:
When a score has a spread greater than 1, do not resolve it now. Write "DEBATE" next to the feature and move on. You will have a dedicated round for these.
Expect 3-5 features to get flagged. If more than half are flagged, your calibration was insufficient — pause and recalibrate on Impact specifically.
Facilitator tips for this section:
- Hold the 2-minute pace. It is better to flag a feature for debate than to let one item eat the clock.
- Cut off discussion that starts before the reveal; the point of the simultaneous reveal is independent judgment.
- Keep Reach out of the argument. It comes from analytics, not opinion.
Example scoring output after Part 2:
| Feature | Reach (users/qtr) | Impact (1-5) | Confidence (%) | Effort (person-mo) | RICE Score | Notes |
|---|---|---|---|---|---|---|
| Self-serve data import | 3,200 | 4 | 80% | 2 | 5,120 | |
| Dashboard redesign | 8,000 | 3 | 50% | 4 | 3,000 | |
| SSO for enterprise | 400 | 5 | 100% | 3 | 667 | DEBATE (Impact split: 3 vs 5) |
| Mobile notifications | 5,000 | 2 | 50% | 1 | 5,000 | |
| Custom reporting | 1,200 | 3 | 50% | 3 | 600 | DEBATE (Effort split: 2 vs 5) |
Notice that the raw RICE score sometimes produces surprises. Mobile notifications — which nobody championed loudly — scores nearly as high as the self-serve data import because of its combination of high reach and low effort. That is the framework doing its job: revealing priorities the room's intuition would have missed.
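If you want to replay that ranking outside the sheet, here is a quick sketch that recomputes the scores from the table above and sorts them; the numbers are the ones in the example rows, and the script structure is just one way to do it.

```python
# (name, reach users/qtr, impact 1-5, confidence as a fraction, effort in person-months)
features = [
    ("Self-serve data import", 3200, 4, 0.80, 2),
    ("Dashboard redesign",     8000, 3, 0.50, 4),
    ("SSO for enterprise",      400, 5, 1.00, 3),
    ("Mobile notifications",   5000, 2, 0.50, 1),
    ("Custom reporting",       1200, 3, 0.50, 3),
]

# RICE = (Reach * Impact * Confidence) / Effort, sorted highest first
ranked = sorted(
    ((name, reach * impact * confidence / effort)
     for name, reach, impact, confidence, effort in features),
    key=lambda row: row[1],
    reverse=True,
)

for name, score in ranked:
    print(f"{name}: {score:,.0f}")
# Self-serve data import: 5,120
# Mobile notifications: 5,000
# Dashboard redesign: 3,000
# SSO for enterprise: 667
# Custom reporting: 600
```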
Part 3: Debate Round (20 minutes)
What this accomplishes
The flagged features from Part 2 are the interesting ones. They are interesting because reasonable people looked at the same feature and reached different conclusions — which means there is information asymmetry in the room. This round surfaces that information.
Facilitator instructions
Triage the flagged features (2 minutes):
Look at the flagged items. If a flagged feature scores so low overall that resolving the disagreement would not change its rank position, skip it. Focus debate time on features near the cut line where the disagreement matters.
Structured debate format (4 minutes per feature, up to 4-5 features):
For each flagged feature:
1. The lowest scorer explains what they are seeing (uninterrupted).
2. The highest scorer does the same.
3. Brief open discussion, then a silent re-score with a simultaneous reveal.
Take the median of the new scores (for example, re-scores of 3, 4, 4, and 5 give a median of 4). Update the spreadsheet.
What to watch for in debates:
- Arguments that restate opinion instead of adding information; the point of the round is to surface what one side knows and the other does not.
- Impact inflation creeping back in; re-anchor against the calibration card.
- Effort splits that are really scope disagreements (a 2-versus-5 split usually means people are imagining different versions of the feature); agree on the scope before re-scoring.
Part 4: Final Stack Rank and Commitments (20 minutes)
What this accomplishes
The RICE scores produce a rank. But scores alone do not account for dependencies, sequencing, or strategic themes. This final section takes the mathematical output and applies human judgment to produce the actual priority order.
Facilitator instructions
Review the scores (5 minutes):
Sort the spreadsheet by RICE score, highest to lowest. Share the screen.
"Here is what the math says. The top 5 features by RICE score are: [read them]. Before we finalize, I want to flag two things: first, are there any dependency chains where Feature B cannot ship without Feature A? Second, are there any strategic reasons to override the score?"
Adjustment round (5 minutes):
Allow the room to propose adjustments. Each adjustment requires a stated reason:
- A dependency ("B cannot ship until A does, so A moves up")
- A strategic theme the scores do not capture ("this quarter's focus is enterprise readiness")
- New information that changes a score input (in which case change the score itself, not just the rank)
Limit adjustments to 2-3 moves. If the team wants to rearrange the entire list, the scoring was not rigorous enough and you should schedule a follow-up.
Document the final rank (5 minutes):
For each feature in the top 5 (or however many fit in the next quarter), record an owner, a target quarter, and the key assumption behind its score:
| Feature | RICE Score | Owner | Target Quarter | Key Assumption |
|---|---|---|---|---|
| Self-serve data import | 5,120 | [Name] | Q2 2026 | Users will adopt the self-serve import flow |
The "Key Assumption" column is borrowed from the strategy alignment workshop. It keeps the team honest about why something is prioritized.
Close the loop (5 minutes):
"For every feature below the cut line: this does not mean 'never.' It means 'not this quarter, given what we know today.' If new data changes the Reach or Impact score, we will re-evaluate. For stakeholders who asked for specific features that did not make the cut, I will share the RICE score and the rationale. Any questions?"
Get explicit acknowledgment from each participant that they support the final rank.
Adapting for Different Team Sizes and Contexts
For small teams (3-5 people)
Score as a single group with the spreadsheet projected; with 3-5 people there is no need for the parallel-group approach described for large teams below, and no share-out step to manage. You can comfortably score 15 features in 25 minutes with a small group. Cut total workshop time to 75 minutes.
For large teams (10+ people)
Split into two scoring groups of 5, each with their own spreadsheet. Each group scores all features independently. Then merge the spreadsheets by averaging scores. Where the two groups diverge by more than 20% on a feature's total score, that feature goes to the debate round. This parallel approach keeps the scoring phase tight even with large groups.
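A minimal sketch of that merge step, assuming each group's total RICE scores end up in a dict keyed by feature name; the example numbers, and measuring the greater-than-20% divergence against the larger of the two scores, are illustrative choices rather than part of the workshop text.

```python
# Total RICE score per feature from each group's spreadsheet (illustrative numbers).
group_a = {"Self-serve data import": 5120, "Dashboard redesign": 3000, "SSO for enterprise": 667}
group_b = {"Self-serve data import": 4800, "Dashboard redesign": 1900, "SSO for enterprise": 700}

merged, debate = {}, []
for feature in group_a:
    a, b = group_a[feature], group_b[feature]
    merged[feature] = (a + b) / 2              # merge by averaging the two groups' scores
    if abs(a - b) / max(a, b) > 0.20:          # groups diverge by more than 20%
        debate.append(feature)

print(merged)  # averaged scores used for the stack rank
print(debate)  # ['Dashboard redesign'] goes to the debate round
```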
For teams using frameworks other than RICE
The workshop structure works with ICE (Impact, Confidence, Ease), weighted scoring, or MoSCoW with minor modifications. For ICE, replace the Reach dimension with a combined "Impact" score and rename "Effort" to "Ease" (invert the scale — higher is easier). For MoSCoW, replace the scoring phase with a sorting exercise: each participant independently places features into Must/Should/Could/Won't, then the group resolves conflicts category by category. The RICE vs ICE vs MoSCoW comparison covers when each framework works best.
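For the ICE variant, the arithmetic reduces to a straight product. A minimal sketch, assuming Ease is scored 1-5 with 5 meaning easiest to build; the helper name and example values are illustrative, not part of the framework comparison referenced above.

```python
def ice_score(impact: float, confidence: float, ease: float) -> float:
    """ICE = Impact * Confidence * Ease. No Reach term; Ease replaces Effort, so higher means easier."""
    return impact * confidence * ease

# Example: Impact 4, Confidence 80%, Ease 3 (moderate to build)
print(ice_score(impact=4, confidence=0.8, ease=3))  # 9.6
```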
For distributed teams running this quarterly
After the first workshop, subsequent quarterly sessions get faster. Create a "rolling backlog" spreadsheet where new feature requests are pre-scored by the PM as they come in during the quarter. In the quarterly workshop, the team validates the PM's pre-scores and only debates items where the team disagrees with the PM's assessment. This cuts the batch scoring phase from 30 to 15 minutes.
Next Steps for the Facilitator
Within 24 hours of the workshop:
- Share the final spreadsheet, sorted by rank, with all participants and the stakeholders who submitted requests
- Send each stakeholder whose feature fell below the cut line the RICE score and the rationale, as promised in the close
- Transfer the top-ranked features, owners, and target quarters into your roadmap or planning tool of record
For teams considering different scoring approaches, the RICE vs ICE vs MoSCoW comparison can help you decide if RICE remains the right framework for your team's maturity level.
Facilitator Tips
On managing debates:
Spend debate time on features near the cut line, take the median rather than arguing to consensus, and move on. The goal is information exchange, not unanimity.
On effort estimation:
A wide Effort split almost always means the scope is ambiguous, not that someone is bad at estimating. Pin down which version of the feature is being scored before re-scoring.
On remote facilitation:
Use chat for simultaneous reveals (everyone types their score and hits enter on the count of three), keep the spreadsheet on screen the whole time, and paste the calibration card into chat instead of printing it.
On stakeholder management after the workshop:
Lead with the framework, not the verdict: share the scores and the rationale, and remind stakeholders that below the cut line means "not this quarter," not "never," and that new Reach or Impact data reopens the conversation.
Common pitfalls:
Impact inflation during calibration, more than half the features getting flagged for debate (recalibrate before continuing), and a final adjustment round that rearranges the entire list (a sign the scoring was not rigorous and a follow-up session is needed).