Skip to main content
CONTROL PLANE FOR AI AGENT RUNS

Governance
for the run,
not the call.

Rungate is a control plane for AI agent runs — budgets, policies, and approval gates applied to the complete workflow, from first model call to final tool invocation.

Read the docs

Paste it into Claude Code, Cursor, Codex, or any agent. Let it evaluate Rungate for you.

RUN run_x7k9m2p4q8w1
IN PROGRESS
agentclaude-code started14:11:58 elapsed0s
RUN BUDGET $0.00 / $1.00 0%
$0.00 80% alert threshold $1.00 hard stop
policy prod-agents/v4
rev 217 · enforced at run boundary
BLOCK · HIGH-VALUE REFUND
Any issue_refund over $500 pauses the run and requires human approval in #customer-ops.
ACTIVE
STOP ON BUDGET
Block next call if projected spend exceeds $1.00 run budget.
ACTIVE
ALLOWED MODELS
openai/gpt-4, anthropic/claude-sonnet-4, anthropic/claude-opus-4
ACTIVE
ALERT · 80%
Emit budget.alert event when cumulative spend crosses 80% of run budget.
ACTIVE
customer_history.json
4.2 KB
step 2 · tool.fn
order_records.pdf
1.8 MB
step 5 · tool.fn
resolution_notes.md
6.1 KB
step 7 · llm
trace otel://run/x7k9m2p4q8w1 8 spans · 1m 26s total
+0.0srun.init
120ms
+0.2sllm.gpt-4
2.1s
+2.3stool.search_web
0.8s
+3.1stool.issue_refund
38.0s
+41.1sllm.claude-sonnet-4
3.4s
+44.5stool.fetch_url
0.6s
+45.1sllm.claude-sonnet-4
3.1s
+48.2sllm.claude-sonnet-4
4.2s
+52.4sllm.claude-sonnet-4 (blocked)
POLICY CONTROLS

Stop agents from doing things they shouldn't.

Declarative rules enforced in the proxy, not your application code. Allowed models, tool blocklists, rate limits, destructive-action gates. Attached to a run at its start — every call in the workflow is checked against the policy before it goes out.

For agents
  • Per-run override via x-rungate-policy header — pin a run to a specific ruleset
  • Policy match context in error responses: which rule fired, why, how to recover
For humans
  • Catch recursive loops and unauthorized tool calls before they run up a bill
  • Policy versioning: pin agents to stable rulesets; drift alerts when they try something new
policy prod-agents/v4
rev 217 · enforced at run boundary
BLOCK · DESTRUCTIVE_TOOLS
Any call to drop_table, delete_user, or issue_refund over $500 is blocked or gated.
FIRED · 14:12:18
ALLOWED · MODELS
openai/gpt-4, anthropic/claude-sonnet-4, anthropic/claude-opus-4
ACTIVE
RATE LIMIT · PER RUN
Max 12 tool calls/min — catches runaway loops before they burn budget.
ACTIVE
ALERT · DRIFT
Emit policy.drift webhook when an agent tries a model or tool not on the allowlist.
ACTIVE
RUN-LEVEL BUDGETS

Caps that actually stop the run.

Every step — model call, tool invocation, retry — accrues against the run's budget. When the ceiling looms, Rungate alerts. When it's crossed, the next call is blocked at the workflow boundary. Not advice. Enforcement.

For agents
  • HTTP 402 on hard-stop with full budget context — clean, retryable after cap reset
  • Cross-call cost accrues automatically, no agent-side tracking needed
For humans
  • Set $X per run, per agent, or per team — one cap for the whole workflow
  • Alert at 80%, block at 100%. No post-hoc surprises from looping agents.
Run budget · across all calls
$0.85/ $1.00
Calls7
Tools3
Next call riskWould exceed cap · blocked at boundary
HUMAN APPROVAL GATES

Pause a run. Human approves. Run continues.

When a tool call matches a rule that needs human review, Rungate returns HTTP 202 instead of forwarding the request — the run pauses at the gate. An approver reviews the proposed action. On green, the agent's next retry of the same request succeeds and the run picks up from there. No error-handling gymnastics, no lost context.

For agents
  • 202 semantic: "paused, retry later" — one branch to handle, not a dozen
  • Approval context in the response body. Seamless resume on the next retry.
For humans
  • Approvals via Slack, email, webhook, or dashboard — wherever the approver lives
  • Full audit of who approved, when, and the exact proposed action
POLICY · HIGH_VALUE_REFUND
Agent proposed issue_refund(ord_2H4p..., $1,240.00)
Matched rule refund:over-$500.
Paused run. Asked @maya in Slack #customer-ops.
Approved 38s later. Run resumed.
TRACING

Every run, reconstructable.

The trace is the full reconstruction — every model call, every tool, every retry, every approval, every cost. OpenTelemetry-compatible exports plug into the observability stack you already use.

For agents
  • OTLP exports to the collector you already run — no new stack
  • Artifacts (files, search results, tool outputs) attached to the run
For humans
  • Compliance-ready audit: reconstruct any workflow end-to-end
  • Hand a run ID to legal or the board — they get the whole story, not a log scroll
otel://run/x7k9m2p4q8w1
+0.0srun.init120ms
+0.2sllm.gpt-4 · planning2.1s
+2.3stool.search_web0.8s
+3.1stool.issue_refund · gated38s
+41sllm.claude-sonnet-4 · reasoning3.4s
+45sllm.claude-sonnet-4 · synthesis4.2s
+52sllm.claude-sonnet-4 · blocked
TL;DR — for agents and agent operators
  • Rungate proxies AI agent requests to LLM providers (OpenAI, Anthropic) and enforces governance at the run level — not per-call.
  • A run is the complete unit of agent work: first call, every tool invocation, every retry, approval gates, final output.
  • Budget enforcement, per-run policies, HTTP 202 approval gates, and full audit trails all apply to the workflow, not to individual requests.
  • Point your agent at https://api.rungate.dev/v1 with an rg_agt_* token. Accepts OpenAI and Anthropic request formats unchanged.
  • Apache 2.0 open source. Self-host or managed cloud.