Capabilities

Capabilities are the protocol's unit of authorization — server-offered actions that agents request and execute with scoped grants.

A capability is a server-offered action identified by a stable name and a human-readable description. Agents request capabilities at registration, and the server creates grants that track what each agent is allowed to do.

What is a capability?

A capability represents a single action a server offers — checking a balance, sending a message, deploying code, reading a file. Each capability has a name that agents use to request and execute it, and a description that humans read during approval flows.

It's useful to distinguish two related concepts:

Capability — what the server offers
Capability grant — what an agent has been authorized to execute

The same capability name can appear in different parts of the protocol: requested by an agent, granted by the server, included in a host's defaults, or used to restrict a JWT. The name identifies the action; the context determines how it's being used.

Capability fields

Field	Type	Description
`name`	string	Stable agent-facing identifier
`description`	string	Human-readable description of what this capability does
`input`	object	JSON Schema for execution arguments (optional)
`output`	object	JSON Schema for the return value (optional)
`grant_status`	string	Whether the capability has been granted to the requesting agent (optional)

{
  "name": "check_balance",
  "description": "Check the balance of a bank account",
  "input": {
    "type": "object",
    "required": ["account_id"],
    "properties": {
      "account_id": {
        "type": "string",
        "description": "The bank account ID to check"
      }
    }
  },
  "output": {
    "type": "object",
    "properties": {
      "account_id": { "type": "string" },
      "balance": { "type": "number" },
      "currency": { "type": "string" }
    }
  },
  "grant_status": "granted"
}

The input and output schemas let agents reason about what a capability needs and what it returns, enabling multi-step planning and chaining without trial and error.
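As a rough sketch of how an agent might use the input schema before executing, the helper below checks only the required list and the primitive type of each declared property. The name find_argument_problems is hypothetical and not part of the protocol, and a real client would more likely rely on a full JSON Schema validator.

```python
# Hypothetical pre-flight check: covers only "required" and primitive "type",
# not the full JSON Schema vocabulary.
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}

# Input schema copied from the check_balance example.
CHECK_BALANCE_INPUT = {
    "type": "object",
    "required": ["account_id"],
    "properties": {
        "account_id": {"type": "string", "description": "The bank account ID to check"}
    },
}


def find_argument_problems(input_schema: dict, args: dict) -> list[str]:
    """Return human-readable problems with args; an empty list means none found."""
    problems = []
    for name in input_schema.get("required", []):
        if name not in args:
            problems.append(f"missing required field: {name}")
    for name, spec in input_schema.get("properties", {}).items():
        expected = JSON_TYPES.get(spec.get("type"))
        if name not in args or expected is None:
            continue
        value = args[name]
        # bool is a subclass of int in Python, so exclude it from "number".
        bool_as_number = spec["type"] == "number" and isinstance(value, bool)
        if bool_as_number or not isinstance(value, expected):
            problems.append(f"wrong type for {name}: expected {spec['type']}")
    return problems
```

Calling find_argument_problems(CHECK_BALANCE_INPUT, {}) reports the missing account_id before any request is sent, which is the kind of planning without trial and error the schemas are meant to enable.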

Capability grants

When an agent is granted a capability, the server creates an agent capability grant — a record that tracks:

Which capability was granted
The grant status (active, pending, denied, revoked)
Any constraints on the grant (see scoped grants)
When the grant was created and last updated

Grants are separate from agent identity — they can be added, escalated, or revoked independently without affecting the agent's registration.
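One way a server might model the grant record described above is sketched here. The class and field names (CapabilityGrant, created_at, updated_at, set_status) are assumptions for illustration; the protocol text only lists what a grant tracks, not how it is stored.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Status values taken from the list above.
GRANT_STATUSES = {"active", "pending", "denied", "revoked"}


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class CapabilityGrant:
    """Hypothetical grant record, stored separately from the agent's registration."""

    capability: str
    status: str = "pending"
    constraints: dict = field(default_factory=dict)
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)

    def set_status(self, status: str) -> None:
        """Change the grant status and record when the change happened."""
        if status not in GRANT_STATUSES:
            raise ValueError(f"unknown grant status: {status}")
        self.status = status
        self.updated_at = _now()
```

Because the record holds no agent identity fields beyond what the server chooses to link it with, revoking it with set_status("revoked") leaves the agent's registration untouched.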

Default capabilities

Servers define a set of default capabilities — actions that are auto-granted to agents from trusted hosts without requiring user approval each time. When a linked host registers a new agent requesting only default capabilities, the server can approve immediately.

Default capabilities are typically low-risk, read-only operations. Higher-risk capabilities (transfers, deletions, admin actions) should always require explicit approval.
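The auto-approval rule can be sketched as a subset check. The function name and parameters are hypothetical; the sketch assumes the server already knows whether the host is linked and which capability names are defaults.

```python
def can_auto_approve(host_is_linked: bool, requested: set[str], defaults: set[str]) -> bool:
    """True only when a linked host requests nothing beyond the default capabilities."""
    # "requested <= defaults" is Python's subset test for sets.
    return host_is_linked and requested <= defaults
```

Any request that includes even one non-default capability falls through to the explicit approval flow.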

Capability escalation

An active agent can request additional capabilities at runtime via POST /agent/request-capability. This triggers a new approval flow — even from trusted hosts, escalated capabilities always require explicit user consent.

When an agent is reactivated after expiry, its capabilities reset to the host's defaults. Any previously escalated capabilities are dropped and must be requested again. This is a security checkpoint that prevents stale permissions from persisting.
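The reset on reactivation can be illustrated with plain set operations. This is a sketch under the assumption that grants can be represented as sets of capability names; reset_on_reactivation is a hypothetical helper, not a protocol endpoint.

```python
def reset_on_reactivation(
    current: set[str], host_defaults: set[str]
) -> tuple[set[str], set[str]]:
    """Return (capabilities after reactivation, capabilities that were dropped)."""
    granted = set(host_defaults)         # capabilities reset to the host's defaults
    dropped = current - host_defaults    # escalated capabilities must be requested again
    return granted, dropped
```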

Scoped grants (constraints)

A capability grant can carry constraints — restrictions on the input values an agent is authorized to supply. Constraints turn a broad capability into a narrow, specific authorization.

{
  "capability": "transfer_funds",
  "constraints": {
    "to": "acc_456",
    "amount": { "max": 1000 },
    "currency": "USD"
  }
}

Instead of granting "transfer money" with no limits, the server can grant "transfer up to $1,000 in USD to account acc_456." Constraints can be proposed by the agent, imposed by the server, or both — the tightest constraint from either source wins. The server must not widen constraints beyond what the agent requested without a new approval — it can only narrow them.
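A minimal sketch of the "tightest constraint wins" rule follows, using the max, min, in, and not_in operators listed under Constraint operators. It handles operator objects only; the exact-value shorthand is left out to keep the example short, and the function names are hypothetical.

```python
def tighten_field(agent: dict, server: dict) -> dict:
    """Merge two operator objects for one field, keeping the narrower side of each."""
    sources = (agent, server)
    merged = {}
    maxes = [s["max"] for s in sources if "max" in s]
    mins = [s["min"] for s in sources if "min" in s]
    if maxes:
        merged["max"] = min(maxes)  # the lower ceiling is tighter
    if mins:
        merged["min"] = max(mins)   # the higher floor is tighter
    allow_lists = [s["in"] for s in sources if "in" in s]
    if allow_lists:
        # Keep only values that every allow-list permits.
        first, rest = allow_lists[0], allow_lists[1:]
        merged["in"] = [v for v in first if all(v in other for other in rest)]
    denied = [v for s in sources for v in s.get("not_in", [])]
    if denied:
        # Union of both deny-lists, de-duplicated with order preserved.
        merged["not_in"] = list(dict.fromkeys(denied))
    return merged


def tighten(agent: dict, server: dict) -> dict:
    """Merge agent-proposed and server-imposed constraints field by field."""
    fields = list(dict.fromkeys([*agent, *server]))
    return {f: tighten_field(agent.get(f, {}), server.get(f, {})) for f in fields}
```

For example, if the agent proposes { "amount": { "max": 1000 } } and the server imposes { "amount": { "max": 500 } }, the merged grant carries max 500. The result is never wider than either input, which matches the rule that the server can only narrow.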

Constraint operators

Constraint values use one of two forms:

Exact value — "field": value. The agent must supply exactly this value. Shorthand for equality.
Operator object — "field": { "op": value }. The value must satisfy the operator.

Operator	Type	Meaning
`max`	number	Value must be ≤ `max`
`min`	number	Value must be ≥ `min`
`in`	array	Value must be one of the listed values
`not_in`	array	Value must not be one of the listed values

Operators can be combined on a single field: { "min": 0, "max": 1000 } means the value must be between 0 and 1000 inclusive. Exact values and operator objects cannot be combined on the same field.

If a server encounters an unknown constraint operator, it rejects the request with 400 unknown_constraint_operator rather than silently ignoring it — ignoring an unknown constraint could grant broader access than intended.
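The rules in this section can be sketched as a small checker. The names below are hypothetical, and two points are assumptions rather than protocol text: any JSON object is treated as an operator object, and a constrained field that the agent omits counts as a violation.

```python
class UnknownConstraintOperator(Exception):
    """Stands in for the 400 unknown_constraint_operator response."""


OPERATORS = {
    "max": lambda value, bound: value <= bound,
    "min": lambda value, bound: value >= bound,
    "in": lambda value, allowed: value in allowed,
    "not_in": lambda value, denied: value not in denied,
}


def validate_constraints(constraints: dict) -> None:
    """Reject unknown operators up front instead of silently ignoring them."""
    for rule in constraints.values():
        if isinstance(rule, dict):
            for op in rule:
                if op not in OPERATORS:
                    raise UnknownConstraintOperator(op)


def satisfies(constraints: dict, args: dict) -> bool:
    """Check execution arguments against a grant's constraints."""
    validate_constraints(constraints)
    for name, rule in constraints.items():
        if name not in args:
            return False  # assumption: a constrained field must be supplied
        value = args[name]
        if isinstance(rule, dict):
            # Operator object: every operator on the field must hold.
            if not all(OPERATORS[op](value, bound) for op, bound in rule.items()):
                return False
        elif value != rule:
            # Exact-value shorthand: the argument must equal the constraint.
            return False
    return True
```

Validating operators before evaluating any argument means a malformed constraint fails the whole request, rather than being skipped and leaving the grant broader than intended.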

Describe capability

Servers expose two endpoints for capability discovery:

GET /capability/list: lightweight listing with name and description, plus grant_status when authenticated with an Agent JWT. Supports search via the query parameter and cursor-based pagination.
GET /capability/describe?name={name}: full detail for a single capability, including input and output schemas.

Agents call describe when they need to know a capability's input shape before executing, or to understand the shape of the data it returns. Both endpoints support three authentication modes:

Auth	What's returned
No auth	Public capabilities only
Host JWT	Capabilities available to the host's linked user
Agent JWT	All capabilities with per-agent grant status

Pagination

The list endpoint accepts the following query parameters for search and cursor-based pagination:

Parameter	Type	Description
`query`	string	Search query to filter capabilities
`cursor`	string	Opaque pagination cursor from the previous response
`limit`	number	Maximum capabilities to return

The response includes next_cursor (opaque cursor for the next page) and has_more (boolean). Servers with a small capability set may return all capabilities in a single response.
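A client-side loop over the pages might look like the sketch below. The transport is abstracted behind a fetch_page callable so the example stays self-contained. The response key capabilities is an assumption, since the text above names only next_cursor and has_more.

```python
from typing import Callable, Iterator, Optional


def iter_capabilities(
    fetch_page: Callable[[dict], dict],
    query: Optional[str] = None,
    limit: Optional[int] = None,
) -> Iterator[dict]:
    """Yield capabilities from GET /capability/list, following next_cursor.

    fetch_page takes the query parameters and returns the parsed JSON response.
    """
    params: dict = {}
    if query is not None:
        params["query"] = query
    if limit is not None:
        params["limit"] = limit
    while True:
        page = fetch_page(dict(params))  # pass a copy so callers can keep it
        # Assumption: the list of items lives under a "capabilities" key.
        yield from page.get("capabilities", [])
        if not page.get("has_more") or not page.get("next_cursor"):
            return
        params["cursor"] = page["next_cursor"]
```

The loop treats the cursor as opaque and stops as soon as has_more is false, so it also works with servers that return everything in a single response.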

Capability location

Capabilities can specify a custom execution location — a URL where the capability is executed, separate from the server's main execute endpoint. This enables architectures where the auth server manages registration and grants while different backend services handle execution.

{
  "name": "check_balance",
  "description": "Check account balance",
  "location": "https://banking-api.example.com/agent/execute",
  "input": { ... }
}

When a capability has a location, the client sends the execute request to that URL and sets the Agent JWT's aud claim to match. When no location is set, the client uses the server's default_location from discovery. The service at the location validates the JWT — verifying that aud matches its own URL — checks grants (via introspection or local verification), and processes the request.
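The routing rule can be sketched in two small helpers, one for the client and one for the executing service. Both function names are hypothetical, and signature verification and grant checks are deliberately left out. The only JWT detail relied on is that the aud claim may be a single string or a list of strings.

```python
def resolve_execution_target(capability: dict, discovery: dict) -> dict:
    """Client side: pick the execute URL and the matching aud value."""
    url = capability.get("location") or discovery["default_location"]
    # The Agent JWT's aud claim is set to the URL the request is sent to.
    return {"url": url, "aud": url}


def audience_matches(claims: dict, own_url: str) -> bool:
    """Service side: check that the token was minted for this service's URL."""
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    return own_url in audiences
```

A token minted for the default execute endpoint fails audience_matches at a custom location, so it cannot be replayed against a different backend service.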

Capability naming

Capability names are opaque strings at the protocol layer. By convention, they should contain only lowercase ASCII letters, digits, and underscores ([a-z0-9_]+) and follow consistent snake_case within a provider.
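A provider could lint names against the recommended pattern with a check like the one below. The helper name is hypothetical, and because the pattern is a recommendation, this is a style check rather than a protocol-level validation.

```python
import re

# Pattern taken from the naming recommendation: lowercase ASCII letters,
# digits, and underscores, at least one character long.
NAME_PATTERN = re.compile(r"[a-z0-9_]+")


def follows_naming_convention(name: str) -> bool:
    """True when the whole name matches [a-z0-9_]+."""
    return NAME_PATTERN.fullmatch(name) is not None
```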

Servers may remove or rename capabilities at any time. Existing grants for a removed capability become inoperative. When an agent attempts to execute a removed capability, the server returns 403 capability_not_granted.
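The removal behavior can be sketched as a server-side pre-execution check. The function name, the catalog set, and the grants mapping are hypothetical. Returning the same error for a missing or inactive grant is an assumption based on the error code's name.

```python
from typing import Optional


def authorize_execution(
    name: str, catalog: set[str], grant_status: dict[str, str]
) -> tuple[int, Optional[str]]:
    """Return (HTTP status, error code) for an execute attempt.

    catalog holds the capability names the server currently offers;
    grant_status maps capability names to this agent's grant status.
    """
    if name not in catalog:
        # Removed or renamed: any leftover grant is inoperative.
        return 403, "capability_not_granted"
    if grant_status.get(name) != "active":
        # Assumption: a missing or inactive grant yields the same error.
        return 403, "capability_not_granted"
    return 200, None
```

Checking the catalog first means a stale grant for a removed capability never reaches the execution path.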