Introducing Surfit — The Decision Layer for AI Agent Actions

Evals check what the model says. Sandboxes control where agents run. Nobody controls what actually happens when an agent acts on a real system with real credentials. That's the layer Surfit builds.

The Problem Nobody's Solving

AI agents are moving from advisory to autonomous. They're no longer just suggesting — they're posting to Slack, merging PRs, deploying infrastructure, sending emails, and modifying databases. Every one of these actions has real business consequences.

The industry has built tools for every layer of this stack except one:

Guardrails AI validates what the model says — toxicity, PII, hallucinations, schema compliance. It checks the output text. It never sees what the agent does with that text.

CTGT / Mentat controls how the model behaves — activation steering, representation-level policy enforcement. It makes the model more compliant. It doesn't sit between the agent and real systems.

IronClaw / NemoClaw controls what the agent can access — container isolation, network egress, permissions. Once access is granted, it has no opinion on whether THIS SPECIFIC action should happen.

CrowdStrike / Palo Alto / Okta protects infrastructure — endpoints, networks, identity. It operates below the application layer. It has no concept of business context.

Every one of these tools answers a version of the same question: "Is this allowed and safe?"

None of them answer: "Should this specific action actually happen right now for the business?"

That's the gap. And it's where Surfit sits.

What Surfit Does

Surfit is a control layer that sits between AI agents and the systems they act on. Every action the agent proposes passes through Surfit before it reaches any external system.

Surfit evaluates each action in business context — not just "is this safe?" but "is this the right action for the organization right now?" — and either auto-executes it or holds it based on risk.

Low-risk actions execute automatically with full logging. A Slack update to #eng-platform, a Notion log entry, a PR to a dev branch — these flow through instantly. No friction.

High-risk actions are held before execution. A PR merge to main that touches payment code. A post to the company X account. An AWS infrastructure change. These are intercepted and surfaced with full context.

Most actions never need a human. Surfit's risk classification determines what flows and what's caught — based on content, destination, and context, not static rules.
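To make the routing above concrete, here is a minimal sketch of how a proposed action could be scored and then either executed or held. It is an illustration under stated assumptions, not Surfit's actual classifier: the `ProposedAction` fields, the sensitive destinations and terms, and the hold threshold are all hypothetical. The hand-written signals stand in for the context-aware classification described above.

```python
from dataclasses import dataclass, field


@dataclass
class ProposedAction:
    """Hypothetical shape of an action an agent proposes."""
    system: str        # e.g. "slack", "github", "aws"
    operation: str     # e.g. "post_message", "merge_pr"
    destination: str   # e.g. "#eng-platform", "main"
    content: str = ""
    context: dict = field(default_factory=dict)


# Assumed signals, for illustration only.
SENSITIVE_DESTINATIONS = {"main", "production", "company-x-account"}
SENSITIVE_TERMS = ("payment", "billing", "credential")
HOLD_THRESHOLD = 4  # assumed cut-off: scores at or above this are held


def risk_score(action: ProposedAction) -> int:
    """Combine destination, content, and context into a score (higher is riskier)."""
    score = 1
    if action.destination in SENSITIVE_DESTINATIONS:
        score += 2
    if any(term in action.content.lower() for term in SENSITIVE_TERMS):
        score += 1
    # An unreviewed merge is riskier than a reviewed one.
    if action.operation == "merge_pr" and not action.context.get("reviewed", False):
        score += 1
    return score


def decide(action: ProposedAction) -> str:
    """Return "execute" for low-risk actions and "hold" for high-risk ones."""
    return "hold" if risk_score(action) >= HOLD_THRESHOLD else "execute"
```

In this sketch, a status post to #eng-platform scores low and executes, while an unreviewed merge to main that mentions payment code crosses the threshold and is held.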

The Architectural Distinction

There's one thing that separates Surfit from every other tool in this space, and it's not a feature — it's an architectural decision:

Surfit controls the execution path. The agent doesn't execute directly.

In every current agent framework — LangChain, CrewAI, OpenClaw, custom builds — the agent has access to the credentials it needs to operate. It holds the Slack token, the GitHub PAT, the AWS keys. When it decides to act, it acts.

Surfit inverts this. The agent calls Surfit's API instead of calling the external system directly. Surfit evaluates the action in business context, and if approved, executes it — using credentials the agent never touches, whether managed by Surfit or integrated with the customer's own vault.
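The inversion can be sketched as a small gateway object. This is a hypothetical illustration of the pattern, not Surfit's API: the class name `ExecutionGateway`, the `propose` method, and the dictionary-shaped actions are all assumptions. The point it shows is that the credential lives only inside the gateway, and the executor runs only after the evaluation passes.

```python
from typing import Callable, Dict, List


class ExecutionGateway:
    """Hypothetical gateway that holds credentials and executors; the agent holds neither."""

    def __init__(self, credentials: Dict[str, str],
                 evaluate: Callable[[dict], bool]) -> None:
        self._credentials = credentials   # never returned to the caller
        self._evaluate = evaluate         # business-context check (assumed interface)
        self._executors: Dict[str, Callable[[dict, str], str]] = {}
        self.held: List[dict] = []        # actions waiting for a human decision

    def register(self, system: str, executor: Callable[[dict, str], str]) -> None:
        """Attach the function that performs real calls for one system."""
        self._executors[system] = executor

    def propose(self, action: dict) -> dict:
        """Entry point the agent calls instead of the external system."""
        if not self._evaluate(action):
            self.held.append(action)
            return {"status": "held"}
        executor = self._executors[action["system"]]
        token = self._credentials[action["system"]]
        # Only the gateway ever passes the token to the executor.
        return {"status": "executed", "result": executor(action, token)}
```

The agent only ever sees the returned status and result. Because it never receives the token, it has no way to perform the action on its own if the gateway holds it.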

Evals (Inside the Agent)

Evals run inside the agent pipeline. They can flag, warn, or suggest. But the agent still controls execution. It can bypass, remove, or reconfigure the eval. Governance is advisory.

Surfit (Outside the Agent)

The agent does not execute directly. Every action must pass through Surfit. Surfit evaluates in business context and controls execution. Governance is enforceable.

This is the core distinction: if the agent controls execution, governance is advisory. If an external layer controls execution, governance is enforceable.

A Concrete Example

An agent wants to merge a PR to main.

Guardrails → output is fine ✓

CTGT → model is compliant ✓

IronClaw → permissions are safe ✓

So it goes through… into production. Payments break.

Every tool said yes. The merge was technically valid. But it landed right before a release, touched payment code, and had not been reviewed by the engineering lead.

With Surfit:

→ Agent proposes the merge to Surfit's API (not GitHub directly)

→ Surfit evaluates: production branch, payment code, no prior review

→ Surfit classifies it as Wave 5 (high risk) and holds it. Business protected.

The five routine actions the agent attempted earlier (Slack updates, Notion entries, dev branch PRs) all flowed through automatically at Waves 1-3, the lower-risk tiers. No friction. Full audit trail. Only the high-risk action was caught.

Where Surfit Sits in the Stack

The AI control stack has five layers. Each answers a different question:

01 — Output Validation (Guardrails AI): "Is this output safe?"

02 — Model Behavior (CTGT / Mentat): "Is this model compliant?"

03 — Sandbox / Access (IronClaw / NemoClaw): "Can this agent access this?"

04 — Infrastructure (CrowdStrike / Palo Alto): "Is this process allowed?"

05 — Decision Authority (Surfit): "Should this action happen for the business right now?"

These layers don't compete. They complement. An organization could use all five simultaneously and they wouldn't overlap. But without layer 05, every agent action that passes through layers 01–04 executes directly on production with no business-level evaluation.

Why This Can't Be Built Into Existing Tools

The most common question we hear: "Why can't Guardrails / CTGT / OpenClaw just add this?"

They could build basic approval workflows. What they can't easily do: rearchitect their entire product to sit in the execution path and evaluate business context for every action.

Guardrails AI is a Python library that wraps around the agent's pipeline. Asking it to hold credentials and sit at the execution boundary means becoming a different product. CTGT operates at model inference — moving from model internals to system-level execution proxy is a company pivot. OpenClaw's entire architecture assumes the agent authenticates directly to systems.

Surfit was designed from the ground up for this layer. Execution authority isn't a feature — it's the foundation the entire product is built on.

What's Live Today

Surfit currently governs agent actions across ten system categories, including Slack, GitHub, X, Notion, Gmail, Outlook, and AWS. Every action is evaluated by the same risk classification engine, producing a consistent decision model across all systems.

The system includes a real-time operator dashboard, client-facing views with per-customer isolation, hash-chained execution receipts, and context-aware classification that evaluates actions by content, destination, and context — not just type.
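For readers unfamiliar with hash-chained receipts, here is a minimal sketch of the general technique. It is not Surfit's receipt format: the field names, the all-zero genesis value, and the use of SHA-256 over canonical JSON are assumptions chosen for the example. Each receipt's hash covers the previous receipt's hash, so altering any earlier record breaks verification of the chain.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed starting value for an empty chain


def make_receipt(prev_hash: str, record: dict) -> dict:
    """Hash the previous receipt's hash together with this record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
    return {"prev_hash": prev_hash, "record": record, "hash": digest}


def append_receipt(chain: list, record: dict) -> dict:
    """Append a new receipt linked to the current tail of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    receipt = make_receipt(prev_hash, record)
    chain.append(receipt)
    return receipt


def verify_chain(chain: list) -> bool:
    """Recompute every hash in order; any edited record or broken link fails."""
    prev_hash = GENESIS_HASH
    for receipt in chain:
        expected = make_receipt(prev_hash, receipt["record"])["hash"]
        if receipt["prev_hash"] != prev_hash or receipt["hash"] != expected:
            return False
        prev_hash = receipt["hash"]
    return True
```

A chain built with `append_receipt` verifies as long as no record has been edited after the fact. Changing a stored decision from "held" to "executed" would make `verify_chain` return `False`.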

See Surfit in action — a 2-minute demo showing real agent actions being evaluated, classified, and enforced across multiple systems.

Guardrails tests and filters. CTGT shapes models. Surfit decides what actually happens.
