Quick Answer (TL;DR)
AI agents are systems that take autonomous actions to achieve goals, rather than simply generating text in response to a prompt. For a PM, specifying agent behavior requires a fundamentally different approach than specifying traditional features. You need to define goals (not just tasks), set behavioral constraints (not just UI rules), specify tool-use policies (what the agent can and cannot do), and establish success criteria that account for the probabilistic nature of agent behavior. The spec is not a PRD in the traditional sense. It is closer to a set of operating rules for an autonomous system.
Summary: Specifying AI agents requires defining goals, constraints, tool policies, and success criteria that govern autonomous behavior rather than deterministic feature flows.
Key Steps:
Time Required: 3-5 days for a comprehensive agent spec; ongoing refinement
Best For: PMs building products with autonomous AI capabilities (coding assistants, research agents, workflow automation)
What Makes Agents Different from Prompts
A prompt-based AI feature is reactive: user sends input, model generates output, done. An AI agent is proactive: it receives a goal, plans a sequence of actions, executes those actions using tools, observes results, and iterates until the goal is achieved or it determines it cannot proceed.
This distinction changes everything about how you spec the feature:
From Inputs/Outputs to Goals/Actions
Prompt-based feature: "Given a customer support ticket, generate a draft response."
Agent-based feature: "Resolve customer support tickets by researching the issue in our knowledge base, drafting a response, checking it against our quality standards, and sending it if confidence is high enough, or escalating to a human if not."
From Deterministic to Probabilistic
Traditional features follow predictable paths. An agent might take 3 steps to resolve one ticket and 12 steps for another. It might succeed on the first try or need to backtrack and try a different approach. Your spec must account for this variability.
From User-Triggered to Autonomous
Agents can act without user input at each step. This creates new categories of risk. A traditional feature cannot accidentally delete a customer's data, because destructive actions happen only when a user explicitly triggers them. An agent with file system access could, unless you explicitly constrain it.
The Agent Spec Document
An agent spec is not a traditional PRD. It is a hybrid of product requirements, operating procedures, and safety policies. It needs to answer three fundamental questions:
Who Reads the Agent Spec
Your agent spec serves multiple audiences:
Defining Agent Goals
Goal Structure
Every agent needs a clearly defined goal hierarchy:
Primary goal: The overarching objective the agent is trying to achieve. This should be stated in terms of user or business outcomes, not agent actions.
Good: "Resolve customer support tickets so the customer's issue is fixed and they are satisfied."
Bad: "Generate responses to customer support tickets."
Sub-goals: The intermediate objectives the agent pursues to achieve the primary goal. These represent the steps in the agent's typical workflow.
Example sub-goals for a support agent:
Constraints on goals: Boundaries on how the agent pursues its goals.
"Resolve the issue in the fewest steps possible. Never ask the customer for information that is already available in their account data. Prioritize accuracy over speed."
Goal Clarity Checklist
For each goal, verify:
Setting Autonomy Levels
Not every agent action should have the same level of autonomy. The PM's job is to define which actions the agent can take independently, which require user confirmation, and which are strictly off-limits.
The Autonomy Spectrum
Level 1 - Suggest: Agent recommends an action but takes no action. The user must explicitly approve.
Example: "Suggest a response to this support ticket for the human agent to review and send."
Level 2 - Act then notify: Agent takes the action and informs the user after the fact. The user can undo.
Example: "Categorize and route incoming tickets automatically. Show the agent's categorization decision in the ticket metadata."
Level 3 - Act silently: Agent takes the action without notification. Used only for low-risk, high-confidence actions.
Example: "Update the ticket's internal priority score based on sentiment analysis."
Level 4 - Act irreversibly: Agent takes actions that cannot be undone. This level should be extremely rare and heavily constrained.
Example: Almost never appropriate for customer-facing actions. Might apply to internal cleanup tasks.
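These levels become enforceable once every action the agent can take is mapped to one. A minimal sketch, with hypothetical action names drawn from the support-agent examples above:

```python
from enum import IntEnum

class AutonomyLevel(IntEnum):
    SUGGEST = 1           # recommend only; user must approve
    ACT_THEN_NOTIFY = 2   # act, inform the user, allow undo
    ACT_SILENTLY = 3      # act without notification; low-risk only
    ACT_IRREVERSIBLY = 4  # no undo; extremely rare, heavily constrained

# Hypothetical action names; the mapping itself is the PM's call.
AUTONOMY_MATRIX = {
    "draft_response": AutonomyLevel.SUGGEST,
    "categorize_and_route_ticket": AutonomyLevel.ACT_THEN_NOTIFY,
    "update_priority_score": AutonomyLevel.ACT_SILENTLY,
}

def allowed_without_approval(action: str) -> bool:
    """Anything not in the matrix defaults to suggest-only."""
    level = AUTONOMY_MATRIX.get(action, AutonomyLevel.SUGGEST)
    return level >= AutonomyLevel.ACT_THEN_NOTIFY
```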
Assigning Autonomy Levels
For each action the agent can take, assign an autonomy level based on:
| Factor | Lower Autonomy | Higher Autonomy |
|---|---|---|
| Reversibility | Action is irreversible | Action is easily undone |
| Impact | Affects customer data or experience | Only affects internal systems |
| Confidence | Agent is uncertain | Agent has high confidence |
| Frequency | Rare occurrence | Routine, repetitive task |
| Stakes | Financial or legal implications | No external consequences |
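You can also turn the factor table into a rough first-pass heuristic. The sketch below is one possible reading of those factors, not a formula from this guide; treat the ordering and thresholds as assumptions to review per action.

```python
# Levels: 1 = suggest, 2 = act then notify, 3 = act silently, 4 = act irreversibly.
def suggest_autonomy_level(
    reversible: bool,
    customer_facing: bool,
    high_confidence: bool,
    routine: bool,
    high_stakes: bool,
) -> int:
    """Heuristic first pass at an autonomy level from the factor table.

    Illustrative only; every assignment should still be reviewed per action.
    """
    if high_stakes or not reversible:
        return 1
    if customer_facing or not high_confidence:
        return 2
    if routine:
        return 3
    return 2

# Updating an internal priority score: reversible, internal, high-confidence, routine.
assert suggest_autonomy_level(True, False, True, True, False) == 3
```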
Behavioral Constraints and Guardrails
Constraints are the most important part of your agent spec. Goals tell the agent what to achieve. Constraints tell it how to behave while achieving those goals.
Types of Constraints
Hard constraints (never violate):
Soft constraints (prefer but can flex):
Writing Effective Constraints
Constraints must be specific and testable. For each constraint, you should be able to write a test case that verifies the agent respects it.
Bad constraint: "Be safe."
Good constraint: "Never execute code that deletes, modifies, or overwrites files outside the designated workspace directory. If the user requests a file operation outside the workspace, explain the restriction and suggest an alternative."
Bad constraint: "Do not be biased."
Good constraint: "When comparing products or recommending solutions, present at least two options with pros and cons for each. Never recommend a single option without acknowledging its limitations."
The Constraint Hierarchy
When constraints conflict, the agent needs a priority order:
Tool Use Policies
Agents achieve goals by using tools: APIs, databases, file systems, web browsers, code interpreters. Your spec must define exactly which tools the agent has access to and the rules for using each one.
Tool Inventory
For each tool the agent can use, document:
Tool Use Rules
Least privilege principle: Give the agent access to the minimum set of tools required for its job. A support agent does not need access to the deployment pipeline. A coding assistant does not need access to the billing system.
Read before write: When the agent needs to modify something (database, file, configuration), it should read the current state first, confirm the modification is correct, and then write. This prevents accidental overwrites.
Confirm before destructive actions: Any tool call that deletes data, sends communications to customers, or modifies billing should require explicit confirmation (from the user or from a separate approval system).
Log everything: Every tool call the agent makes should be logged with the input, output, and reasoning. This creates an audit trail for debugging and compliance.
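These rules are easiest to enforce in a single wrapper that every tool call passes through, rather than inside each tool. A minimal sketch, with hypothetical tool names; it applies least privilege, requires confirmation for destructive calls, and writes the audit log:

```python
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("agent.audit")

DESTRUCTIVE_TOOLS = {"delete_record", "send_customer_email", "update_billing"}

def call_tool(tools: dict, name: str, args: dict, reasoning: str, approved: bool = False):
    """Single chokepoint for tool use: least privilege, confirmation, audit trail."""
    if name not in tools:                      # least privilege: unknown tool = no access
        raise PermissionError(f"Tool '{name}' is not in this agent's inventory")
    if name in DESTRUCTIVE_TOOLS and not approved:
        raise PermissionError(f"Tool '{name}' requires explicit confirmation")

    result = tools[name](**args)               # execute the actual tool function

    audit_log.info(json.dumps({                # log input, output, and reasoning
        "ts": datetime.now(timezone.utc).isoformat(),
        "tool": name,
        "args": args,
        "reasoning": reasoning,
        "result": repr(result)[:500],
    }))
    return result
```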
Escalation and Handoff Rules
Every agent must know when to stop acting autonomously and involve a human. Poorly defined escalation rules are the most common cause of agent failures in production.
Escalation Triggers
Define specific conditions that trigger escalation:
Confidence-based: "If the agent's confidence in its response is below 0.7, escalate to a human reviewer."
Content-based: "If the customer mentions legal action, regulatory complaints, or media attention, immediately escalate to a senior support manager."
Complexity-based: "If the agent has taken more than 10 actions without resolving the issue, escalate with a summary of what has been tried."
Error-based: "If a tool call fails twice in succession, escalate with the error details."
User-requested: "If the user asks to speak with a human at any point, immediately comply."
The Handoff Protocol
When escalating, the agent should:
Memory and Context Management
Agents that operate across multiple interactions or sessions need memory management policies.
Short-Term Memory (Within a Session)
Define what the agent retains during a single interaction:
Long-Term Memory (Across Sessions)
Define what persists across interactions:
What the Agent Must Forget
Equally important is defining what the agent must not retain:
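One way to make memory rules enforceable is to express them as configuration that the session layer applies before anything is persisted. The categories in the sketch below are illustrative placeholders, not the specific items your spec should name:

```python
from dataclasses import dataclass, field

@dataclass
class MemoryPolicy:
    """Illustrative memory policy; replace the placeholder categories with
    the specific items named in your spec."""
    session_retention: list[str] = field(default_factory=lambda: [
        "conversation_history", "tool_call_results", "working_notes",
    ])
    long_term_retention: list[str] = field(default_factory=lambda: [
        "resolved_issue_summaries", "user_preferences",
    ])
    never_retain: list[str] = field(default_factory=lambda: [
        "payment_card_numbers", "passwords", "one_time_codes",
    ])

def filter_for_storage(policy: MemoryPolicy, items: dict) -> dict:
    """Drop anything the agent must forget before persisting session memory."""
    return {k: v for k, v in items.items() if k not in policy.never_retain}
```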
Success Criteria and Evaluation
Measuring Agent Performance
Agent evaluation is fundamentally different from prompt evaluation. You are not just scoring individual outputs; you are assessing whether the agent achieved its goal through a sequence of actions.
Goal achievement rate: What percentage of tasks does the agent complete successfully without human intervention?
Efficiency: How many steps does the agent take on average? How long does it take? What is the cost per task?
Constraint compliance: How often does the agent violate its constraints? Zero tolerance for hard constraint violations.
User satisfaction: When the agent resolves an issue, how does the user rate the experience?
Escalation quality: When the agent escalates, does it provide sufficient context for the human to take over smoothly?
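Most of these numbers fall out of the same audit trail described in the tool-use section. A minimal sketch of aggregating per-task records into the headline metrics; the record fields are assumptions:

```python
def agent_scorecard(tasks: list[dict]) -> dict:
    """Aggregate per-task records into the metrics above.

    Each task record is assumed to have: resolved (bool), escalated (bool),
    steps (int), hard_violations (int), csat (float or None).
    """
    if not tasks:
        return {}
    n = len(tasks)
    resolved = [t for t in tasks if t["resolved"] and not t["escalated"]]
    rated = [t["csat"] for t in tasks if t.get("csat") is not None]
    return {
        "goal_achievement_rate": len(resolved) / n,
        "avg_steps_per_task": sum(t["steps"] for t in tasks) / n,
        "hard_constraint_violations": sum(t["hard_violations"] for t in tasks),
        "escalation_rate": sum(t["escalated"] for t in tasks) / n,
        "avg_csat": sum(rated) / len(rated) if rated else None,
    }
```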
The Agent Scorecard
Track these metrics weekly:
| Metric | Target | Current | Trend |
|---|---|---|---|
| Goal achievement rate | > 85% | - | - |
| Avg steps per task | < 8 | - | - |
| Hard constraint violations | 0 | - | - |
| Soft constraint compliance | > 95% | - | - |
| User satisfaction (CSAT) | > 4.2/5 | - | - |
| Escalation rate | 10-20% | - | - |
| Avg resolution time | < 5 min | - | - |
The Agent Spec Template
Use this template as the starting point for any agent spec:
Section 1: Agent Overview
Section 2: Goal Hierarchy
Section 3: Autonomy Matrix
Section 4: Tool Inventory
Section 5: Behavioral Constraints
Section 6: Escalation Protocol
Section 7: Memory Policies
Section 8: Success Criteria
Section 9: Evaluation Plan
Common Mistakes
Mistake 1: Specifying actions instead of goals
Instead: Define what the agent should achieve, not the exact steps it should take. Let the agent reason about the best path to the goal.
Why: Overly prescriptive specs make agents brittle. They cannot adapt to unexpected situations because they are following a script, not pursuing a goal.
Mistake 2: Missing constraint prioritization
Instead: Explicitly rank your constraints so the agent knows which to preserve when they conflict.
Why: Without prioritization, the agent makes arbitrary choices when constraints conflict, leading to inconsistent and sometimes dangerous behavior.
Mistake 3: Setting autonomy too high at launch
Instead: Start with Level 1 (suggest only) for all actions. Increase autonomy gradually based on performance data.
Why: You can always give an agent more autonomy later. Taking autonomy away after a visible failure is a trust-destroying event for users.
Mistake 4: No escalation path
Instead: Define explicit escalation triggers and handoff protocols for every agent.
Why: An agent without an escalation path will either fail silently (bad UX) or keep trying increasingly creative solutions (dangerous).
Mistake 5: Ignoring tool interaction failures
Instead: Define fallback behavior for every tool the agent uses. What happens when the API times out? When the database returns unexpected data?
Why: In production, tools fail regularly. An agent without fallback behavior will either crash or hallucinate responses.
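As a sketch of what such fallback behavior might look like, the helper below retries transient failures and then fails loudly so the escalation rules above can take over; the retry count, backoff, and error types are placeholders to set per tool:

```python
import time

def call_with_fallback(tool, args: dict, retries: int = 2, backoff_s: float = 1.0):
    """Retry a flaky tool call, then fail loudly instead of letting the
    agent improvise around missing data."""
    last_error = None
    for attempt in range(retries + 1):
        try:
            return tool(**args)
        except (TimeoutError, ConnectionError) as exc:
            last_error = exc
            time.sleep(backoff_s * (attempt + 1))
    # Repeated failures: surface the error details so escalation can kick in.
    raise RuntimeError(f"Tool failed after {retries + 1} attempts: {last_error}")
```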
Getting Started Checklist
Week 1: Foundation
Week 2: Constraints and Safety
Week 3: Evaluation
Week 4: Iteration
Key Takeaways
Next Steps:
Related Guides
About This Guide
Last Updated: February 9, 2026
Reading Time: 14 minutes
Expertise Level: Intermediate to Advanced
Citation: Adair, Tim. "Specifying AI Agent Behaviors: A PM's Guide to Agent Design." IdeaPlan, 2026. https://ideaplan.io/guides/specifying-ai-agent-behaviors