GitHub's internal analysis of 2,500+ agent configuration files found that the quality of the specification explained more variance in agent output quality than the choice of model. The spec was the bottleneck, not the AI.
This should matter to every PM reading this. The rise of AI coding agents (Claude Code, Cursor, GitHub Copilot, Windsurf, Devin) is shifting what "shipping" means. You no longer need to wait three sprints for an engineer to build your prototype. You can describe what you want and get working code in minutes. But only if your description is precise enough for an agent to execute.
The problem: most PMs write specs optimized for human readers. Humans fill in gaps with intuition, ask clarifying questions, and infer intent from context. AI agents do none of this. They take your words literally, make assumptions when information is missing, and build exactly what you described, even when what you described is incomplete.
This guide covers how to write specifications that AI coding agents can reliably turn into working software, whether you are using agents for rapid prototyping, building internal tools, or shipping features faster with your engineering team.
Why Traditional PRDs Fail AI Agents
Traditional PRDs serve a different audience. They align stakeholders, document decisions, and create shared understanding among humans. They include context, rationale, and strategic framing that helps a human reader make good judgment calls during implementation.
AI coding agents need none of that context. What they need is precision. Consider the difference:
Human-readable PRD: "The dashboard should show key metrics with a clean, modern design. Users should be able to filter by date range and export data."
Agent-executable spec: "Render a dashboard at /dashboard with four metric cards in a 2x2 grid: total users (count), active users (count from last 7 days), revenue (USD, formatted with commas), and churn rate (percentage to one decimal). Include a date range picker defaulting to last 30 days that filters all four metrics. Add a CSV export button in the top right that downloads the currently filtered data with columns: metric_name, value, date_range_start, date_range_end."
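To see how literal these requirements are, here is a hedged TypeScript sketch of two of the formatting rules above. The function names are mine, invented for illustration; only the output formats come from the spec.

```typescript
// Hypothetical helpers for two requirements in the spec above.
// The names are mine; only the output formats come from the spec.

// "revenue (USD, formatted with commas)"
function formatRevenue(amountUsd: number): string {
  return "$" + Math.round(amountUsd).toLocaleString("en-US");
}

// "churn rate (percentage to one decimal)", given a 0-1 fraction
function formatChurnRate(fraction: number): string {
  return (fraction * 100).toFixed(1) + "%";
}
```

An agent given "revenue" alone has to guess at currency symbols, rounding, and separators; the spec's parenthetical removes all three guesses.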
The first version gives a human enough to start a conversation. The second gives an agent enough to start building. When you are writing a PRD for AI features, you are defining what the AI product should do for users. When you are writing specs for AI agents, you are defining what the AI builder should create for you.
The SCOPE Method for Agent-Ready Specs
Studying how top teams structure their agent specifications, including GitHub's internal findings and Addy Osmani's research at Google, reveals a clear pattern: effective specs cover five areas. I call it the SCOPE method.
S: Structure and Stack
Tell the agent exactly what it is working with. Vague references to "the project" force the agent to guess.
Include:
- Framework, language, and version (e.g., "Next.js 14 with App Router, TypeScript 5.3, Tailwind CSS 3.4")
- Directory layout ("Components in src/components/, API routes in src/app/api/")
- Key dependencies and their purposes ("Uses Zustand for client state, React Query for server state")
- Naming conventions ("PascalCase for components, camelCase for utilities, kebab-case for files")
Example:
```
Stack: Next.js 14, TypeScript, Tailwind CSS, Prisma ORM with PostgreSQL
Structure: App Router in /app, shared components in /components,
server actions in /app/actions, Prisma schema in /prisma/schema.prisma
Conventions: Server Components by default. Only add 'use client' when
the component needs interactivity. Use Zod for all input validation.
```
C: Constraints and Boundaries
This is where most specs fail. AI agents cannot infer what you do not want. If you skip this section, the agent might add authentication to a public page, install a new dependency you did not ask for, or restructure files you wanted left alone.
Use a three-tier boundary system:
- Always: Actions the agent should take without asking (e.g., "Always use TypeScript strict mode," "Always add error boundaries around async components")
- Ask first: Actions that need human approval (e.g., "Propose new dependencies before installing," "Flag any database schema changes for review")
- Never: Hard stops (e.g., "Never modify .env files," "Never delete existing tests," "Never commit secrets or API keys")
This framework comes from GitHub's research into what makes agent files effective. It gives the AI a decision model instead of a wall of instructions.
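To make the decision model concrete, here is an illustrative sketch (my own, not GitHub's implementation) of the three tiers as data that an agent, or a reviewer, could check proposed actions against. The rule patterns are hypothetical examples:

```typescript
// Illustrative sketch only: the three-tier boundary system expressed as
// data plus a classifier. The rule patterns below are hypothetical.
type Tier = "always" | "ask-first" | "never";

const boundaries: Record<Exclude<Tier, "always">, RegExp[]> = {
  never: [/\.env/i, /delete .*tests?/i, /commit .*(secret|api key)/i],
  "ask-first": [/install .*dependenc/i, /schema change/i],
};

function classifyAction(action: string): Tier {
  if (boundaries.never.some((r) => r.test(action))) return "never";
  if (boundaries["ask-first"].some((r) => r.test(action))) return "ask-first";
  return "always"; // default: proceed without asking
}
```

The point is the shape, not the regexes: "never" rules are checked before "ask first," and anything unmatched falls through to the default tier, which is exactly the decision order you want the agent to follow.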
O: Outcomes and Acceptance Criteria
Define what "done" looks like in terms the agent can verify. Acceptance criteria for AI agents should be testable. Statements like "the UI should feel responsive" are meaningless to an agent. Instead, write: "Page load under 2 seconds on 3G throttle. No layout shift after initial render. All interactive elements respond within 100ms."
For each feature, specify:
- Expected inputs and outputs with concrete examples
- Edge cases to handle (empty states, error states, boundary values)
- Tests that should pass ("Running npm test should show all green. Add tests for the new component that cover: rendering with data, rendering empty state, and click handler behavior.")
The best specs include sample test cases directly. When you provide test expectations upfront, the agent writes code that passes those tests on the first try rather than requiring multiple rounds of correction.
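For example, the dashboard spec from earlier could embed test expectations for its CSV export directly. The helper names (buildCsvHeader, buildCsvRow) are hypothetical; only the column list comes from the spec:

```typescript
// Hypothetical helpers for the spec's CSV export. Column order comes
// from the spec: metric_name, value, date_range_start, date_range_end.
function buildCsvHeader(): string {
  return ["metric_name", "value", "date_range_start", "date_range_end"].join(",");
}

function buildCsvRow(metric: string, value: number, start: string, end: string): string {
  return [metric, String(value), start, end].join(",");
}

// Sample test cases a spec could include verbatim:
// buildCsvHeader() -> "metric_name,value,date_range_start,date_range_end"
// buildCsvRow("total_users", 1280, "2024-01-01", "2024-01-31")
//   -> "total_users,1280,2024-01-01,2024-01-31"
```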
P: Phases and Dependencies
Break the work into ordered phases. This is the most counterintuitive shift for PMs used to writing holistic feature descriptions.
Traditional PRDs present features holistically because humans can mentally reorder work. AI agents work sequentially. If you describe a feature that depends on a database table that does not exist yet, the agent might try to build the UI first and fail.
Structure phases as:
- Foundation: Data models, schemas, types
- Core logic: Business rules, API endpoints, server actions
- Interface: Components, pages, layouts
- Polish: Error handling, loading states, edge cases
Each phase should represent a bounded unit of work (roughly 15 to 30 minutes of agent time). Include explicit dependencies: "Phase 2 requires the Prisma schema from Phase 1 to be migrated."
E: Examples and Evidence
Concrete examples are worth more than paragraphs of description. This applies to both input/output examples and code style examples.
Input/output examples:
```
Given a user with subscription status "trial" and 3 days remaining:
- Show yellow banner: "Your trial expires in 3 days"
- Show "Upgrade" button linking to /pricing
- Do NOT show "Cancel" button during trial
```
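That example translates almost mechanically into code, which is exactly why it works as a spec. A sketch with invented names (trialBanner, BannerState) and deliberate simplifications:

```typescript
// Sketch of the trial-banner example as pure logic. trialBanner and
// BannerState are invented names; pluralization ("1 days") is
// deliberately ignored to keep the sketch short.
interface BannerState {
  banner: string | null;
  showUpgrade: boolean; // links to /pricing in the spec
  showCancel: boolean;  // never shown during trial
}

function trialBanner(status: "trial" | "active", daysRemaining: number): BannerState {
  if (status === "trial") {
    return {
      banner: `Your trial expires in ${daysRemaining} days`,
      showUpgrade: true,
      showCancel: false,
    };
  }
  // Behavior outside the trial state is my assumption, not in the example.
  return { banner: null, showUpgrade: false, showCancel: true };
}
```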
Code style examples:
```typescript
// Follow this pattern for API route handlers:
export async function POST(request: Request) {
  const body = await request.json()
  const validated = schema.parse(body)
  // ... business logic
  return Response.json({ data: result })
}
```
One real code snippet beats a page of style guidelines. The agent pattern-matches against examples far more reliably than it interprets abstract descriptions.
Putting SCOPE Into Practice: A Real Example
Here is how the SCOPE method looks for a real feature: adding a feedback widget to a SaaS product.
```
# Feature: In-App Feedback Widget

## Structure
- Next.js 14 App Router, TypeScript, Tailwind CSS
- New component: /components/FeedbackWidget.tsx (client component)
- New API route: /app/api/feedback/route.ts
- New Prisma model: Feedback in /prisma/schema.prisma
- Store feedback in the existing PostgreSQL database

## Constraints
Always: Use TypeScript strict mode. Use Tailwind for all styling.
Ask first: Any changes to existing components or database indexes.
Never: Modify user authentication. Never add new npm dependencies.

## Outcomes
- Floating button (bottom-right, 48x48px) opens a modal
- Modal contains: rating (1-5 stars), text area (max 500 chars), submit button
- On submit: POST to /api/feedback, close modal, show toast "Thanks for your feedback"
- Error state: show inline error, keep modal open, allow retry
- Tests: component renders, star selection works, form submits,
  error state displays, API route validates input and returns 201

## Phases
1. Prisma schema: add Feedback model (id, rating, comment, userId, createdAt)
2. API route: POST handler with Zod validation, return 201 on success
3. Component: FeedbackWidget with modal, star rating, form submission
4. Integration: add FeedbackWidget to root layout, test end-to-end

## Examples
POST /api/feedback body: { "rating": 4, "comment": "Love the new dashboard" }
Response 201: { "id": "abc123", "status": "received" }

POST /api/feedback body: { "rating": 0 }
Response 400: { "error": "Rating must be between 1 and 5" }
```
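The two request/response examples double as a validation contract. Here is a hedged sketch of the Phase 2 validation behavior as a plain function. validateFeedback is an invented name, and plain checks stand in for the Zod schema the spec mandates, purely to keep the snippet dependency-free:

```typescript
// Sketch of the Phase 2 validation behavior (not the actual route handler).
interface FeedbackBody {
  rating?: unknown;
  comment?: unknown;
}

function validateFeedback(body: FeedbackBody): { status: 201 | 400; error?: string } {
  const { rating, comment } = body;
  // Matches the spec's example: rating 0 yields a 400 with this message.
  if (typeof rating !== "number" || !Number.isInteger(rating) || rating < 1 || rating > 5) {
    return { status: 400, error: "Rating must be between 1 and 5" };
  }
  // The 500-char limit comes from the Outcomes section.
  if (typeof comment === "string" && comment.length > 500) {
    return { status: 400, error: "Comment must be 500 characters or fewer" };
  }
  return { status: 201 };
}
```

Because the spec's examples pin down both the success and failure responses, an agent implementing this route has no room to invent its own error format.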
This spec is roughly 40 lines. An agent can execute it from start to finish without asking a single clarifying question. Compare that to a 5-page PRD that still leaves implementation details ambiguous.
The PM's New Superpower: Specification Architecture
The shift toward AI coding agents is not replacing PMs. It is making specification writing the most valuable skill in the PM toolkit.
Miqdad Jaffer, Product Lead at OpenAI (and formerly at Shopify), frames this well. When his team built Shopify's "Auto Write" feature, the quality of their specifications directly determined the quality of the AI output. The same principle applies when the AI is building your product instead of powering it.
This means PMs who can write precise, testable, well-structured specs will ship faster than PMs who write vague briefs and wait for engineers to fill in the gaps. As Addy Osmani (Engineering Lead at Google) puts it: "A clear spec covering just enough nuance to guide the AI without overwhelming it" is the goal.
For PMs looking to build AI into their products, this skill is doubly important. You need specs that guide both the agents building your product and the agents inside your product.
The AI Product Management Guide covers the strategic layer of building with AI. What follows is the tactical layer: how your specifications need to change.
Five Mistakes PMs Make Writing Agent Specs
1. Including Strategic Context the Agent Cannot Use
Your spec does not need a market analysis section. The agent does not care about your competitive positioning or TAM. It needs to know what to build, not why you decided to build it. Keep strategy in your internal PRD for stakeholders. Keep the agent spec purely tactical.
2. Using Ambiguous Language
Words like "intuitive," "modern," "responsive," and "clean" mean nothing to an agent. Replace every subjective term with a measurable one. "Modern design" becomes "8px border radius, 16px padding, Inter font, gray-50 background." "Responsive" becomes "single column below 768px, two columns above."
3. Describing Features Instead of Behaviors
"The app should have search" is a feature description. "When the user types in the search input, debounce 300ms, then query GET /api/search?q={input}. Display results as a list of cards with title (h3), description (truncated to 120 chars), and a link. Show 'No results found' when the array is empty" is a behavior specification. Agents excel at implementing behaviors. They struggle with features.
4. Skipping Error States and Edge Cases
If your spec does not mention what happens when the API returns a 500, the agent will either ignore error handling entirely or invent its own approach. Neither outcome is what you want. For every happy path, document at least the error state, the empty state, and the loading state.
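One way to make those states impossible to skip (my sketch, not a prescribed pattern) is to spec them as a discriminated union, so the implementing agent must handle every branch:

```typescript
// Every fetch-driven view is in exactly one of four states,
// and the switch must handle all of them.
type FetchState<T> =
  | { status: "loading" }
  | { status: "error"; message: string }
  | { status: "empty" }
  | { status: "ready"; data: T };

function renderHint<T>(state: FetchState<T>): string {
  switch (state.status) {
    case "loading": return "Show skeleton";
    case "error":   return `Show inline error: ${state.message}`;
    case "empty":   return "Show empty state";
    case "ready":   return "Render data";
  }
}
```

TypeScript's exhaustiveness checking then turns "you forgot the error state" from a review comment into a compile error.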
5. Writing One Monolithic Spec
Context windows have limits. A 10,000-word spec will lose coherence as the agent processes it. Break your spec into focused documents: one for data models, one for API endpoints, one for each major component. Reference them explicitly: "Implement the API routes defined in specs/api.md using the Prisma schema from specs/schema.md."
Tools and Workflows for Spec-Driven Development
The spec-driven development workflow follows four phases, as outlined by GitHub's research:
- Specify: Write the SCOPE document covering structure, constraints, outcomes, phases, and examples
- Plan: Have the agent generate an implementation plan from your spec. Review it for correctness before proceeding
- Execute: Let the agent build phase by phase, validating each phase before moving to the next
- Verify: Run the test suite, review the output against acceptance criteria, and update the spec if requirements changed
This workflow maps well to how you might already use prioritization frameworks to decide what to build. The difference is that the "build" step now takes minutes instead of weeks.
Tools That Support This Workflow
Claude Code lets you feed a spec file directly and have the agent implement it while respecting project context. Shopify has reported that 100% of its developers use it for daily work.
Cursor reads your codebase context and applies spec-driven changes across multiple files. Its Composer feature is particularly good at multi-file refactors guided by a spec.
GitHub Copilot Workspace takes an issue description (essentially a spec) and proposes a multi-file implementation plan before writing code.
ChatPRD is purpose-built for PMs who want to generate agent-ready specs from rough product briefs. It translates PM-speak into agent-speak.
For PMs evaluating which tools fit their workflow, the AI tools comparison covers pricing and capability differences across the major platforms.
Measuring Spec Quality
How do you know if your specs are good? Track three metrics:
First-pass success rate: What percentage of agent outputs are usable without major revisions? If you are rewriting more than 20% of the agent's code, your spec needs work. Teams with well-structured specs report 70 to 85% first-pass success rates.
Clarification requests: Count how many times the agent asks for more information or makes incorrect assumptions you need to correct. Zero is the target. Each clarification means a gap in your spec.
Time to working prototype: Measure the time from spec completion to working, tested code. Good specs with capable agents produce working prototypes in 30 to 90 minutes for moderately complex features. If it is taking longer, your spec may be too ambiguous or too large for a single session.
What This Means for the PM Role
The ability to write agent-executable specs does not replace product sense, user empathy, or strategic thinking. It amplifies them. A PM who deeply understands the user problem and can translate that understanding into a precise specification will ship 10x faster than a PM who writes vague briefs.
This is not about learning to code. It is about learning to specify. The mental model is closer to writing acceptance criteria for a QA engineer than writing Python. If you can break down a feature into observable behaviors, define edge cases, and describe expected outcomes, you can direct an AI coding agent.
For PMs building AI-powered features, the specification skill becomes even more critical. You are simultaneously writing specs for agents that build your product and defining requirements for agents that run inside your product.
The PMs who master this skill will not just be faster. They will be the ones who decide what gets built while the agents handle how.
Related Blog Posts
- How to Write a PRD That Engineers Actually Read
- How to Write a PRD for AI and LLM Features
- The AI SDLC: How to Implement Claude Code and Codex in Your Dev Workflow
- Context Engineering for Product Managers
Sources
- How to Write a Good Spec for AI Agents - Addy Osmani, Google Engineering Lead
- How to Write PRDs for AI Coding Agents - David Haberlah
- A Proven AI PRD Template - Miqdad Jaffer, OpenAI Product Lead
- Spec-Driven Development and AI Agents Explained - Augment Code
- Beyond Vibe Coding with Addy Osmani - The Pragmatic Engineer