CONTROL PLANE FOR AI AGENT RUNS

Governance
for the run,
not the call.

Rungate is a control plane for AI agent runs — budgets, policies, and approval gates applied to the complete workflow, from first model call to final tool invocation.

Read the docs

Paste it into Claude Code, Cursor, Codex, or any agent. Let it evaluate Rungate for you.

RUN run_x7k9m2p4q8w1

IN PROGRESS

agentclaude-code started14:11:58 elapsed0s

RUN BUDGET $0.00 / $1.00 0%

$0.00 80% alert threshold $1.00 hard stop

policy prod-agents/v4

rev 217 · enforced at run boundary

BLOCK · HIGH-VALUE REFUND

Any issue_refund over $500 pauses the run and requires human approval in #customer-ops.

ACTIVE

STOP ON BUDGET

Block next call if projected spend exceeds $1.00 run budget.

ACTIVE

ALLOWED MODELS

openai/gpt-4, anthropic/claude-sonnet-4, anthropic/claude-opus-4

ACTIVE

ALERT · 80%

Emit budget.alert event when cumulative spend crosses 80% of run budget.

ACTIVE

customer_history.json

4.2 KB

step 2 · tool.fn

order_records.pdf

1.8 MB

step 5 · tool.fn

resolution_notes.md

6.1 KB

step 7 · llm

trace otel://run/x7k9m2p4q8w1 8 spans · 1m 26s total

+0.0srun.init

120ms

+0.2sllm.gpt-4

2.1s

+2.3stool.search_web

0.8s

+3.1stool.issue_refund

38.0s

+41.1sllm.claude-sonnet-4

3.4s

+44.5stool.fetch_url

0.6s

+45.1sllm.claude-sonnet-4

3.1s

+48.2sllm.claude-sonnet-4

4.2s

+52.4sllm.claude-sonnet-4 (blocked)

—

POLICY CONTROLS

Stop agents from doing things they shouldn't.

Declarative rules enforced in the proxy, not your application code. Allowed models, tool blocklists, rate limits, destructive-action gates. Attached to a run at its start — every call in the workflow is checked against the policy before it goes out.

For agents

—Per-run override via x-rungate-policy header — pin a run to a specific ruleset
—Policy match context in error responses: which rule fired, why, how to recover

For humans

—Catch recursive loops and unauthorized tool calls before they run up a bill
—Policy versioning: pin agents to stable rulesets; drift alerts when they try something new

policy prod-agents/v4
rev 217 · enforced at run boundary
BLOCK · DESTRUCTIVE_TOOLS
Any call to drop_table, delete_user, or issue_refund over $500 is blocked or gated.
FIRED · 14:12:18
ALLOWED · MODELS
openai/gpt-4, anthropic/claude-sonnet-4, anthropic/claude-opus-4
ACTIVE
RATE LIMIT · PER RUN
Max 12 tool calls/min — catches runaway loops before they burn budget.
ACTIVE
ALERT · DRIFT
Emit policy.drift webhook when an agent tries a model or tool not on the allowlist.
ACTIVE

RUN-LEVEL BUDGETS

Caps that actually stop the run.

Every step — model call, tool invocation, retry — accrues against the run's budget. When the ceiling looms, Rungate alerts. When it's crossed, the next call is blocked at the workflow boundary. Not advice. Enforcement.

For agents

—HTTP 402 on hard-stop with full budget context — clean, retryable after cap reset
—Cross-call cost accrues automatically, no agent-side tracking needed

For humans

—Set $X per run, per agent, or per team — one cap for the whole workflow
—Alert at 80%, block at 100%. No post-hoc surprises from looping agents.

Run budget · across all calls
$0.85/ $1.00
Calls7
Tools3
Next call riskWould exceed cap · blocked at boundary

HUMAN APPROVAL GATES

Pause a run. Human approves. Run continues.

When a tool call matches a rule that needs human review, Rungate returns HTTP 202 instead of forwarding the request — the run pauses at the gate. An approver reviews the proposed action. On green, the agent's next retry of the same request succeeds and the run picks up from there. No error-handling gymnastics, no lost context.

For agents

—202 semantic: "paused, retry later" — one branch to handle, not a dozen
—Approval context in the response body. Seamless resume on the next retry.

For humans

—Approvals via Slack, email, webhook, or dashboard — wherever the approver lives
—Full audit of who approved, when, and the exact proposed action

POLICY · HIGH_VALUE_REFUND
Agent proposed issue_refund(ord_2H4p..., $1,240.00)
Matched rule refund:over-$500.
Paused run. Asked @maya in Slack #customer-ops.
 Approved 38s later. Run resumed.

TRACING

Every run, reconstructable.

The trace is the full reconstruction — every model call, every tool, every retry, every approval, every cost. OpenTelemetry-compatible exports plug into the observability stack you already use.

For agents

—OTLP exports to the collector you already run — no new stack
—Artifacts (files, search results, tool outputs) attached to the run

For humans

—Compliance-ready audit: reconstruct any workflow end-to-end
—Hand a run ID to legal or the board — they get the whole story, not a log scroll

otel://run/x7k9m2p4q8w1
+0.0srun.init120ms
+0.2sllm.gpt-4 · planning2.1s
+2.3stool.search_web0.8s
+3.1stool.issue_refund · gated38s
+41sllm.claude-sonnet-4 · reasoning3.4s
+45sllm.claude-sonnet-4 · synthesis4.2s
+52sllm.claude-sonnet-4 · blocked—

TL;DR — for agents and agent operators

Rungate proxies AI agent requests to LLM providers (OpenAI, Anthropic) and enforces governance at the run level — not per-call.
A run is the complete unit of agent work: first call, every tool invocation, every retry, approval gates, final output.
Budget enforcement, per-run policies, HTTP 202 approval gates, and full audit trails all apply to the workflow, not to individual requests.
Point your agent at https://api.rungate.dev/v1 with an rg_agt_* token. Accepts OpenAI and Anthropic request formats unchanged.
Apache 2.0 open source. Self-host or managed cloud.

Governance, without slowing the agent down.

Create an account — have your agent do it in one prompt, or use plain email. Both take about a minute.

Just evaluating?

Governance for the run, not the call.

Stop agents from doing things they shouldn't.

Caps that actually stop the run.

Pause a run. Human approves. Run continues.

Every run, reconstructable.

Governance, without slowing the agent down.

Governance
for the run,
not the call.