AI products face a pricing challenge that traditional SaaS doesn't: marginal costs scale with usage. Every API call, every token, and every inference costs real money. This breaks the SaaS playbook where you can offer unlimited usage because serving more costs nearly nothing.
Three pricing models dominate: usage-based (pay per token/query), subscription (flat monthly fee), and hybrid (base subscription plus overages). Each solves different problems and works for different business models.
Usage-Based Pricing
How it works: Customers pay for what they consume. Tokens processed, API calls made, documents analyzed, or images generated. The bill fluctuates with usage.
Best for:
- Developer tools and APIs (OpenAI, Anthropic, Cohere)
- Products with highly variable usage patterns
- B2B customers who can predict and budget for consumption
- Companies with transparent unit economics who can pass costs through
Advantages:
- Revenue tracks costs perfectly (gross margins stay consistent)
- Low barrier to entry (users try the product without commitment)
- Fair for light users (they don't subsidize heavy users)
- Scales naturally with customer value
Disadvantages:
- Unpredictable revenue makes forecasting difficult
- Users fear bill shock and may limit usage
- High-touch sales required to educate buyers on cost modeling
- Churn risk when bills spike unexpectedly
Real examples:
OpenAI charges $10 per 1M input tokens for GPT-4 Turbo. A developer building a chatbot pays based on conversation volume. Light testing costs $5/month. Production deployment costs $500-5,000/month depending on traffic.
Anthropic uses tiered volume pricing: Claude 3.5 Sonnet costs $3 per 1M input tokens at baseline, dropping to $2.40 above 50M tokens monthly. Heavy users get automatic discounts.
When usage-based pricing fails: Consumer products where users can't predict costs. A writing assistant charging $0.05 per generation confuses non-technical users. They want to know monthly cost upfront, not calculate token consumption.
Subscription Pricing
How it works: Customers pay a fixed monthly or annual fee for defined usage limits or unlimited access. Predictable revenue, predictable costs (from the user's perspective).
Best for:
- B2C products where users value pricing predictability
- Features embedded in existing paid plans
- Products with relatively uniform usage across users
- Companies willing to subsidize power users to reduce friction
Advantages:
- Predictable revenue for forecasting and planning
- Simple to understand and purchase
- Removes usage anxiety (users maximize value without cost fear)
- Works with existing SaaS sales and billing infrastructure
Disadvantages:
- Gross margins vary wildly by user (power users destroy unit economics)
- Must set usage caps or risk unlimited losses
- Light users overpay (may churn to cheaper alternatives)
- Difficult to price correctly without usage data
Real examples:
GitHub Copilot charges $10/month for individuals, unlimited code completions. Their average user costs $15-20/month in API fees. They lose money on individual plans to drive enterprise adoption at $39/user/month with volume discounts.
Notion AI adds $10/month per user on top of base Notion subscriptions. Average users generate 30-50 AI queries monthly at $0.08 per query ($2.40-4.00 cost). Notion nets 60-75% gross margin on the AI tier.
Grammarly Premium includes unlimited AI writing suggestions in the $12/month tier. Power users (writers, marketers) generate $8-15/month in inference costs. Casual users cost $1-2/month. Cross-subsidy makes this work.
When subscription pricing fails: Developer APIs and infrastructure products. Developers need granular usage tracking and want to pay for actual consumption. Unlimited API calls at $50/month invites abuse and destroys margins.
Hybrid Pricing
How it works: Base subscription includes usage credits or limits. Overages charged at metered rates. Combines predictability with cost alignment.
Best for:
- Products transitioning from pure subscription to usage-aligned pricing.
- Mixed user bases (light users want simplicity, power users want fair pricing)
- Features with bimodal usage distributions
- Companies testing pricing models before committing
Advantages:
- Captures both casual users (subscription) and power users (overages)
- Mitigates risk of power users destroying margins
- Allows granular pricing without alienating price-sensitive users
- Flexibility to adjust base tier and overage rates separately
Disadvantages:
- More complex to explain and sell
- Requires sophisticated billing infrastructure
- Users may hit limits and churn instead of upgrading
- Can feel like bait-and-switch if limits are too low
Real examples:
Jasper AI charges $49/month for 50,000 words generated, then $0.005 per additional word. Most users stay under the cap. Power users (agencies, content teams) pay $100-300/month in overages.
Loom's AI features include 25 AI-generated summaries per month in Pro plans ($12.50/user). Additional summaries cost $1 each. This caps exposure while monetizing heavy use.
Claude.ai (Anthropic's consumer product) offers Claude Pro at $20/month with 5x higher usage limits versus free tier. Overages aren't charged. Users hit rate limits and wait. This protects margins without billing complexity.
When hybrid pricing fails: When the base tier limits are too restrictive. If 60% of users hit caps in their first month, they perceive it as a paywall, not a fair use limit. Limits should be set so 70-80% of users stay under them.
Side-by-Side Comparison
| Dimension | Usage-Based | Subscription | Hybrid |
|---|---|---|---|
| Revenue predictability | Low (varies with usage) | High (recurring) | Medium (base recurring + variable) |
| Gross margin stability | High (scales with revenue) | Low (varies by user) | Medium (protected by overages) |
| User price certainty | Low (bill varies) | High (fixed monthly) | Medium (base known, overages variable) |
| Barrier to entry | Low (pay as you go) | Medium (requires commitment) | Low-Medium (low base, scales up) |
| Best for heavy users | Fair (they pay more) | Great (unlimited value) | Good (fair price without cap anxiety) |
| Best for light users | Great (only pay for use) | Poor (overpay vs usage) | Good (stay in base tier) |
| Sales complexity | High (must explain unit economics) | Low (simple monthly price) | Medium (explain tier + overages) |
| Abuse risk | Low (costs passed through) | High (unlimited = exploited) | Medium (caps + overages limit exposure) |
Choosing the Right Model
Start with usage-based if:
- Your customers are developers or technical buyers who understand metered pricing
- Usage varies 10x+ across your user base
- You have transparent unit economics and can pass costs through
- You're selling infrastructure or APIs
Start with subscription if:
- Your customers are non-technical and value pricing simplicity
- Usage is relatively uniform across users (within 3x range)
- You're embedding AI in an existing product with subscription pricing
- You can afford to subsidize power users for growth
Start with hybrid if:
- You're unsure which model fits your usage distribution
- You have both casual and power users
- You want subscription simplicity with overage protection
- You're transitioning from subscription to usage-based
Common Pricing Mistakes
Setting subscription limits too low: Jasper originally capped Pro at 20,000 words/month. 40% of users hit limits in week one and churned. They 10x'd the cap to 200,000 words, tripled retention, and added overage charges.
Usage-based pricing without volume discounts: Linear per-token pricing penalizes growth. Users delay scaling because costs scale linearly. Add tiered pricing: first 1M tokens at $10/M, next 10M at $8/M, above 10M at $6/M.
Hybrid pricing where overages cost more than base tier: If base tier costs $0.10 per query and overages cost $0.15, users feel punished for using the product. Overage rates should match or slightly discount base tier economics.
Changing pricing models post-launch without grandfather clauses: Moving from unlimited to metered pricing requires grandfathering existing users or facing backlash. Give 6-12 months notice and offer migration incentives.
Not instrumenting usage data before launch: You cannot price correctly without knowing usage distribution. Soft-launch with permissive limits and log everything. Analyze for 60-90 days before setting final pricing.
How Top AI Companies Price
OpenAI (API): Pure usage-based. No subscriptions. Volume discounts at scale. Works because customers are developers who can model costs. ChatGPT Plus (consumer) uses subscription ($20/month) because consumers want simplicity.
Anthropic (API): Usage-based with automatic volume tiers. Claude Pro (consumer) is subscription with rate limits instead of hard caps. Protects margins without billing complexity.
Notion AI: Hybrid. $10/month added to Notion subscriptions, soft caps on usage. Most users stay under limits. Power users get notified and upgrade to higher tiers.
GitHub Copilot: Tiered subscription. $10/month individuals (unlimited, subsidized to drive adoption). $39/month enterprise (usage tracking, better margins via volume API discounts).
Grammarly: Pure subscription. Unlimited AI suggestions in $12/month tier. Cross-subsidy works because usage is predictable (professional writers use it daily but within bounded limits).
Pricing Strategy Across Product Lifecycle
Pre-PMF (months 0-6): Use generous free tier or low flat-rate pricing. Optimize for learning, not revenue. You need usage data to set correct pricing later.
Early PMF (months 6-18): Introduce usage-based or hybrid pricing. Instrument everything. Track unit economics by cohort. Adjust limits and rates monthly based on data.
Scale (18+ months): Optimize pricing for margin and growth. Add volume discounts for enterprise. Consider tiered models (Starter/Pro/Enterprise) with different usage caps.
Enterprise motion: Move to annual contracts with committed usage (e.g., $100K annual license with 50M tokens included, $0.08 per token overage). Predictable revenue for you, predictable costs for them.
When to Switch Pricing Models
Subscription → Hybrid: When power users destroy margins and churn is low. Add soft caps with overage charges.
Subscription → Usage-based: When usage distribution is bimodal (half of users barely use the product, 10% use 100x the median). Metered pricing captures value from power users without alienating light users.
Usage-based → Hybrid: When sales cycles lengthen because buyers can't forecast costs. Add base tiers with included usage to give budget certainty.
Hybrid → Subscription: When 90%+ of users stay under caps and billing complexity outweighs margin protection. Simplify to flat pricing.
Pricing and Unit Economics
Your pricing model must align with your unit economics. Calculate:
Break-even usage: How much usage can you support at each price point before gross margin goes negative?
Example: $20/month subscription with $0.05 per query in costs. Break-even is 400 queries/month. If median user generates 800 queries, your gross margin is 50%. If 90th percentile generates 2,000 queries, those users have negative margins.
Target margins by tier:
- Free tier: -50% to +10% (acquisition cost)
- Starter tier: 40-60% (growth engine)
- Pro tier: 60-75% (profit center)
- Enterprise: 70-85% (volume discounts offset by lower support costs)
Price each tier to hit these targets, or cap usage to prevent margin destruction.
Tools and Resources
- AI ROI Calculator - Model pricing scenarios and margins
- LLM Cost Estimator - Calculate inference costs across providers
- AI Unit Economics Framework - Build sustainable cost structure
- AI Product-Market Fit - Find PMF with sustainable pricing
Quick Decision Matrix
Choose Usage-Based if: Developer audience, API/infrastructure product, variable usage, transparent costs → Examples: OpenAI API, Anthropic API
Choose Subscription if: Consumer audience, embedded feature, uniform usage, value simplicity → Examples: Grammarly, Notion AI, ChatGPT Plus
Choose Hybrid if: Mixed audience, bimodal usage, want flexibility, mitigate margin risk → Examples: Jasper, Loom AI, Descript
The best pricing model aligns revenue with costs while matching user expectations. Don't copy competitors. Price based on your unit economics, usage distribution, and customer segment. Test, measure, and iterate until margins and growth both hit targets.