Policies & budgets
Allowed model lists, blocked tools, rate limits, destructive-action gates, and run-level budget caps. Declared once, enforced server-side, applied to every call in the run.
A policy is a named set of declarative rules that Rungate enforces
on every call in a run. Policies apply at the run boundary: when a
run starts, it picks up the policy of the agent that initiated it
(or the policy named in an explicit x-rungate-policy override).
Every subsequent call in that run is checked against the same policy.
Policies are managed through the dashboard at app.rungate.dev/policies or via the admin API.
What you can enforce
Run budget caps
Set a maximum cumulative spend per run, in USD. When cumulative
cost crosses the cap, the next call returns 402 Payment
Required. In-flight calls complete normally; no new calls
are dispatched on this run.
The 402 response includes a structured context object
with the run ID, cumulative spend, the limit, and the rule that
fired. Your agent should stop issuing calls on this run. A human
(or automation) resets the cap or starts a new run with
x-rungate-new-run: true.
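A minimal sketch of how an agent might detect this condition, assuming the response body carries the error.code and error.context fields documented in this section. The BudgetStop type and the parse_budget_stop function are hypothetical names used for illustration; they are not part of any Rungate SDK.

```python
import json
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class BudgetStop:
    """Details pulled from a 402 budget_exceeded response (hypothetical type)."""
    run_id: str
    spent_usd: Decimal
    limit_usd: Decimal
    rule: str


def parse_budget_stop(status: int, body: str) -> Optional[BudgetStop]:
    """Return a BudgetStop if the run budget is exhausted, otherwise None."""
    if status != 402:
        return None
    error = json.loads(body).get("error", {})
    if error.get("code") != "budget_exceeded":
        return None
    ctx = error.get("context", {})
    # Spend values arrive as strings; Decimal avoids float rounding on money.
    return BudgetStop(
        run_id=ctx["run_id"],
        spent_usd=Decimal(ctx["cumulative_spend_usd"]),
        limit_usd=Decimal(ctx["limit_usd"]),
        rule=ctx["rule"],
    )
```

When the function returns a BudgetStop, the agent would stop issuing calls on that run and surface the run ID and rule to whoever resets the cap.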
HTTP/1.1 402 Payment Required
Content-Type: application/json

{
  "error": {
    "code": "budget_exceeded",
    "message": "Run budget ceiling reached.",
    "context": {
      "run_id": "run_customer_refund_2026_04_26",
      "cumulative_spend_usd": "1.03",
      "limit_usd": "1.00",
      "rule": "stop_on_budget",
      "policy_id": "pol_prod_v4",
      "policy_name": "prod-agents/v4",
      "step_that_tripped": "llm.claude-sonnet-4"
    }
  }
}
Allowed models
Restrict which models a policy permits. This is a cheap,
deterministic check that runs first on every request. Runs that
try a model outside the allowlist get 403 Forbidden with the rule
and the allowed list:
{
  "error": {
    "code": "policy_violation",
    "message": "Model not in policy allowlist.",
    "context": {
      "policy_id": "pol_prod_v4",
      "rule": "allowed_models",
      "field": "model",
      "requested": "gpt-3.5-turbo",
      "allowed": [
        "openai/gpt-4",
        "anthropic/claude-sonnet-4",
        "anthropic/claude-opus-4"
      ]
    }
  }
}
Use this to lock production agents to vetted models, prevent accidental fallback to deprecated models, or enforce cost ceilings by model class.
Tool blocklists / approval gates
Specify tool calls that should either be blocked outright (returning 403) or pause the run for human review (returning 202). See Approval gates for the gate flow.
Rate limits
Rate limits apply per agent and per policy. When a limit is
exceeded, Rungate returns 429 with a Retry-After header. Rate
limits catch runaway loops: an agent calling the same tool 200
times per minute is almost always a bug. The rate limit is a
coarse safety net; the budget cap is the hard cost ceiling.
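A client can use the Retry-After value to decide how long to back off. The sketch below is one way to do that, not an official client. It assumes Retry-After is given in whole seconds and, as a fallback, that X-RateLimit-Reset is a Unix timestamp in seconds; the function name and the 1-second default are choices made for this example.

```python
import time
from typing import Mapping, Optional


def retry_delay_seconds(status: int, headers: Mapping[str, str],
                        now: Optional[float] = None) -> Optional[float]:
    """Return how long to wait before retrying, or None if not rate limited."""
    if status != 429:
        return None
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    retry_after = lowered.get("retry-after")
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after)
    # Fallback (assumption): X-RateLimit-Reset is a Unix timestamp in seconds.
    reset = lowered.get("x-ratelimit-reset")
    if reset is not None and reset.strip().isdigit():
        current = time.time() if now is None else now
        return max(0.0, float(reset) - current)
    # No usable hint: a conservative default chosen for this sketch.
    return 1.0
```

An agent loop would sleep for the returned number of seconds before resending. If 429 responses keep recurring, that is worth treating as a sign of a runaway loop, not something to retry indefinitely.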
HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1737126712
Routing strategy
Configure provider routing per policy: primary, fallback chain, circuit-breaker thresholds, latency-based selection. Cross-provider failover is transparent — if your primary provider errors or trips a circuit breaker, Rungate re-emits the request to the configured fallback (translating format if vendors differ) and the agent sees the response in the format it sent.
Per-run policy overrides
Each agent has a default policy. Override it on a per-run basis
with the x-rungate-policy header (policy id or name).
The override applies for the entire run.
x-rungate-policy: prod-agents/v4-strict
Useful for: temporarily tightening rules during an incident, running an evaluation under a different policy, A/B testing two policies on the same agent.
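As an illustration, a client might assemble the Rungate-specific headers in one place. The run_headers helper below is hypothetical. It assumes the override is sent on the call that starts the run, and it uses only the x-rungate-policy and x-rungate-new-run headers named on this page.

```python
from typing import Dict, Optional


def run_headers(policy: Optional[str] = None, new_run: bool = False) -> Dict[str, str]:
    """Build Rungate-specific headers for the call that starts a run."""
    headers: Dict[str, str] = {}
    if policy:
        # Accepts a policy id or name, e.g. "prod-agents/v4-strict".
        headers["x-rungate-policy"] = policy
    if new_run:
        headers["x-rungate-new-run"] = "true"
    return headers
```

For example, run_headers("prod-agents/v4-strict", new_run=True) produces both headers, which can be merged into the headers of whatever HTTP client the agent already uses.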
Policy versioning
Policies are versioned. Each run records which policy version was active when it ran. Mutating a policy doesn't change history. If you want a hard pin, name a new version and reference it explicitly.
Dry-run mode
Set dry_run: true on a policy (or create a dedicated
dry-run policy) to validate the integration end-to-end without
spending real money on provider calls. Rungate accepts the
request, returns a mock response, and records the run. Use this to
test budget enforcement, approval gates, and webhook delivery
before flipping to a real policy.
Common patterns
Tight production budget, generous dev budget
Define two policies: prod-agents with a $1.00/run hard cap and
dev-agents with a $10.00/run cap. Production agents bind to the
first; the same agent can override to the second during an
evaluation via x-rungate-policy.
Model deprecation
Drop a model from the allowlist. The next call using that model
fails with 403, citing the allowed_models rule and returning the
new allowed list. Your agent's error handler reads the allowed
field and switches to a permitted model on the retry.
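The error handler described here could be sketched as follows, assuming the 403 body has the policy_violation shape shown under Allowed models. The function name and the preferences parameter (the agent's own ranking of models) are hypothetical.

```python
import json
from typing import Optional, Sequence


def pick_fallback_model(status: int, body: str,
                        preferences: Sequence[str]) -> Optional[str]:
    """Choose a permitted model after an allowed_models violation, else None."""
    if status != 403:
        return None
    error = json.loads(body).get("error", {})
    ctx = error.get("context", {})
    if error.get("code") != "policy_violation" or ctx.get("rule") != "allowed_models":
        return None
    allowed = ctx.get("allowed", [])
    # Prefer the agent's own ranking; otherwise take the first allowed model.
    for model in preferences:
        if model in allowed:
            return model
    return allowed[0] if allowed else None
```

Returning None for any other 403 keeps tool-blocklist violations from being retried with a different model by mistake.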
Tool gating without breaking the agent
Block a tool with the approval-gate variant (returns 202) instead of an outright 403. The agent retries the same request after the operator approves; the workflow continues without changing the agent's code path. See Approval gates.
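A rough sketch of the retry behaviour described above. The send callable, the polling interval, and the attempt limit are all assumptions made for this example; the actual gate flow, including how the agent learns that approval was granted, is covered in Approval gates.

```python
import time
from typing import Callable, Tuple

Response = Tuple[int, str]  # (status code, body)


def call_with_approval(send: Callable[[], Response], poll_seconds: float = 5.0,
                       max_attempts: int = 10,
                       sleep: Callable[[float], None] = time.sleep) -> Response:
    """Resend the same request while it is pending approval (HTTP 202)."""
    response = send()
    attempts = 1
    while response[0] == 202 and attempts < max_attempts:
        # Wait before resending; the interval is an arbitrary choice here.
        sleep(poll_seconds)
        response = send()
        attempts += 1
    # Still 202 after max_attempts means the caller must decide what to do.
    return response
```

Because the same request is resent unchanged, the agent's code path is identical whether or not a gate was involved.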