The AI Agent Control Stack

The Stack

The AI agent control stack has four layers. Each answers a different question. Every tool in the ecosystem lives in one of them.

Layer 1 is crowded. Layer 2 is very crowded, and more so every week as major players (NVIDIA, Cisco, Microsoft) build here. Layer 3 is empty. Except for Surfit.

Layer 1: Model — "Is the output safe?"

Smart. These tools make models produce better, safer outputs.

Guardrails AI

Founded 2023 · $7.5M Seed (Feb 2024) · ~$1.1M revenue

Open-source Python framework for validating LLM text inputs and outputs. Checks for toxicity, PII leakage, hallucinations, schema compliance. Includes Guardrails Hub (community marketplace of validators) and Snowglobe (simulation engine for testing guardrails before production).

guardrailsai.com GitHub Snowglobe

Backed by Zetta Venture Partners, Bloomberg Beta, Pear VC, GitHub Fund. Angels: Ian Goodfellow, Logan Kilpatrick.

The gap: Validates what the model says — not what the agent does with it. Runs inside the agent pipeline. The agent still controls execution.
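To make that gap concrete, here is a minimal sketch of output validation running inside an agent pipeline. It is not the Guardrails AI API: `validate_output`, `agent_step`, the email regex, and the length limit are hypothetical names and checks chosen for illustration only.

```python
import re
from dataclasses import dataclass

# Hypothetical validator for illustration; this is NOT the Guardrails AI API.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass
class ValidationResult:
    ok: bool
    reasons: list[str]


def validate_output(text: str, max_len: int = 500) -> ValidationResult:
    """Check model text for a possible email address and for excessive length."""
    reasons = []
    if EMAIL.search(text):
        reasons.append("possible PII: email address")
    if len(text) > max_len:
        reasons.append("output too long")
    return ValidationResult(ok=not reasons, reasons=reasons)


def agent_step(model_output: str, execute) -> str:
    """Validate the text, then hand it back to the agent's own executor."""
    result = validate_output(model_output)
    if not result.ok:
        return "blocked: " + "; ".join(result.reasons)
    # The text passed validation. What happens next is still the agent's call.
    return execute(model_output)
```

The validator only sees text. Once the text passes, `execute` runs whatever the agent chooses, which is the point the gap above makes.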

CTGT / Mentat

Founded 2023 · $7.7M Seed (Feb 2025) · YC W24

Modifies model behavior at the representation level using mechanistic interpretability. Identifies and removes hallucinations, bias, and unwanted behavior without retraining. Mentat provides a runtime policy engine at the inference layer. Won VentureBeat Innovation Showcase 2025.

ctgt.ai Mentat Platform

Backed by Google Gradient, General Catalyst, Y Combinator. Angels: François Chollet, Paul Graham, Michael Seibel. Working with Fortune 10 companies.

The gap: Controls what the model produces — not what the agent does with that output. Operates at inference. The agent still controls execution.

Plano by Katanemo

AI-native proxy between agent and model provider with filter chains for input/output. Built on Envoy, Rust/WASM. Filters model traffic — not system actions.

planoai.dev GitHub

The gap: Sits between agent and model. Not between agent and systems. The agent still controls execution.

NVIDIA NeMo Guardrails

Programmable guardrails for LLM conversations using Colang, a domain-specific language. Controls what the model says and how it responds.

GitHub Docs

The gap: Conversational boundaries. Does not govern agent actions on external systems. The agent still controls execution.

Sleeper-agent research has also shown that model-layer safety cannot be fully relied on: models deliberately trained with hidden behaviors can keep them through standard safety training while appearing safe in evaluation. This is a structural limitation of the entire layer.

Layer 2: Runtime — "Is this allowed?"

Safe. These tools sandbox, isolate, and enforce policy on agent runtime.

This layer exploded in March 2026, with major announcements from NVIDIA (NemoClaw at GTC), Cisco (DefenseClaw at RSA), and Microsoft (Agent 365). The pattern: sandbox the agent, control access, enforce static policy.

IronCurtain

Open source · Creator: Niels Provos (ex-Google security lead)

Personal AI assistant security framework with V8 sandbox, policy engine, credential separation, and allow/deny/escalate decisions. Uses plain-English "constitution" compiled into enforceable policy. MITM proxy swaps fake credentials for real ones — agent never sees actual keys. Supports 128+ Google tools, GitHub (41 tools), Git (27 tools), filesystem, Signal.

ironcurtain.dev GitHub

Closest architectural pattern to Surfit. Does credential separation. Does allow/deny/escalate. But: personal/single-user, static policy, no business context evaluation.

The gap: Policy determines what's permitted. It doesn't evaluate whether THIS SPECIFIC action is correct for the business RIGHT NOW. Static rules can't encode timing, risk context, downstream impact, or organizational state. The agent passes policy. But was it the right action?
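A static allow/deny/escalate check can be sketched in a few lines. This is a generic illustration, not IronCurtain's policy engine or its compiled constitution format; the rule table, action names, and `check` function are assumptions made up for the example.

```python
from enum import Enum
from fnmatch import fnmatchcase


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ESCALATE = "escalate"


# Hypothetical rule table: the first matching pattern wins.
RULES = [
    ("git.push:main", Decision.ESCALATE),
    ("git.*", Decision.ALLOW),
    ("filesystem.delete:*", Decision.DENY),
]


def check(action: str) -> Decision:
    """Return the decision for an action name using static pattern rules."""
    for pattern, decision in RULES:
        if fnmatchcase(action, pattern):
            return decision
    return Decision.DENY  # default-deny when no rule matches
```

Note that `check` takes only the action name. It returns the same answer during a release freeze as on any other day, because timing, risk context, and organizational state are not inputs to a static rule.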

NVIDIA NemoClaw + OpenShell

Alpha: March 16, 2026 · GTC 2026 launch · Open source

Reference stack for running OpenClaw agents inside a secure sandbox. Rust-based container isolation, network egress control, filesystem restrictions, declarative YAML security policies. Includes managed inference with NVIDIA Nemotron models. Partnering with Cisco, CrowdStrike, Google, Microsoft Security.

GitHub nvidia.com/nemoclaw Docs

The gap: Controls where the agent can go — which endpoints, which files. Once access is granted, no opinion on whether this specific action should happen. The agent still controls execution.

Cisco DefenseClaw

Announced RSA 2026 · March 23, 2026 · Open source (expected)

Open-source secure agent framework with Skills Scanner, MCP Scanner, AI BoM, and CodeGuard. Part of Cisco's broader agentic security stack: Duo Agentic Identity (agent IAM), AI Defense Explorer Edition (red-teaming), Secure Access SSE (MCP policy enforcement), Splunk AI SOC agents. Integrates with NVIDIA OpenShell for runtime sandboxing.

Cisco Newsroom

The gap: Scanning, verification, and sandboxing. Security infrastructure — not business decision-making. The agent still controls execution.

Microsoft Agent 365

Available May 1, 2026 · $15/user/mo standalone · $99/user/mo in M365 E7

Control plane for AI agents within Microsoft 365. Agent Registry, Agent ID in Entra, monitoring, governance. Extends Defender, Entra, and Purview to non-human identities. Provides agent identity, observability, and access control within the M365 ecosystem.

microsoft.com/agent-365

The gap: M365-only. Identity and observability — not cross-system business decisions. Answers "which agents exist and what can they access?" Not "should this action happen right now?" The agent still controls execution.

CrowdStrike · Palo Alto Networks · Okta

Infrastructure, endpoint, and identity security. Expanding into agentic AI security at RSA 2026. Secure the infrastructure below the application layer. No concept of business context for agent actions.

crowdstrike.com paloaltonetworks.com okta.com

Blaxel

MicroVM sandboxing as a service for AI agents. Kata Containers-based, <25ms resume times. Isolates agent execution environments.

blaxel.ai

AccuKnox

AI-SPM + runtime enforcement via KubeArmor/eBPF. ModelArmor sandboxing, Zero Trust CNAPP. Kernel-level agent runtime security.

accuknox.com

Composio

MCP gateway + hosted tool integrations + agent management. Managed auth, RBAC, tool execution, human-in-loop approval flows. Has approval workflows but no cross-system business context evaluation.

composio.dev GitHub

MintMCP by Lutra

Hosted MCP servers for email/calendar + MCP Gateway for telemetry and config management. Capability layer — connectors, not control.

mintmcp.com GitHub

Gravitee MCP Proxy

Centralized governance proxy for MCP traffic. Tool discovery, execution, access control, observability for MCP connections.

gravitee.io

Salesforce Agentforce

Enterprise MCP governance within Salesforce ecosystem. MCP allowlisting, trusted gateway, agent builder. Salesforce-only.

salesforce.com/agentforce

Entro Security

Identity governance for AI agents and non-human identities. AGA module, MCP activity monitoring, policy enforcement. Announced RSA 2026.

entrosecurity.com

Opal Security

AI-native access governance. Paladin (AI agent for access), OpalScript, OpalQuery. Announced RSA 2026.

opal.dev

Every tool in Layer 2 answers the same question: "Is this allowed?" They check permissions, enforce policy, sandbox execution, manage identity. These are important. But they all end the same way: the agent passes the check, and executes on its own.

Layer 3: Decision — "Should this happen right now?"

Correct. This is where business decisions about agent actions are made.

Layer 3 — The Decision Layer

Surfit

NOBODY ELSE IS HERE

Every tool in Layer 1 makes models smarter. Every tool in Layer 2 makes agents safer to run. Surfit is where those agents go when they need to actually do something.

The agent proposes an action. Surfit evaluates business context — not just whether it's permitted by policy, but whether it's the right action for the organization right now. Timing. Risk. Downstream impact. Organizational state. These are not things a policy file can encode.

Low-risk actions execute instantly. High-risk actions are held and routed for context. Everything produces a receipt. The agent never executes directly — Surfit controls the execution path.

This is not another policy engine. This is not another sandbox. This is not another access control tool. This is the layer where business decisions about agent actions are actually made — and enforced.

Business context evaluation — not static rules

Cross-system consistency — one decision model

Every action receipted and auditable
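The propose, evaluate, execute, and receipt flow described above might look roughly like the sketch below. This page does not describe Surfit's actual evaluation logic, so everything here is an assumption: the `Gateway` class, the numeric `risk` score, the threshold, and the `change_freeze` context flag are hypothetical stand-ins for "business context".

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    system: str
    name: str
    risk: float  # hypothetical 0..1 score; how it is computed is out of scope


@dataclass
class Receipt:
    receipt_id: str
    action: Action
    outcome: str  # "executed" or "held"


@dataclass
class Gateway:
    # One executor per target system; the agent never holds these.
    executors: dict[str, Callable[[Action], None]]
    risk_threshold: float = 0.5
    held: list[Action] = field(default_factory=list)
    receipts: list[Receipt] = field(default_factory=list)

    def propose(self, action: Action, context: dict) -> Receipt:
        """Evaluate a proposed action in context, then execute or hold it."""
        # Assumed rule: a change freeze raises the effective risk of any action.
        risk = action.risk + (0.5 if context.get("change_freeze") else 0.0)
        if risk < self.risk_threshold:
            self.executors[action.system](action)  # the gateway executes
            outcome = "executed"
        else:
            self.held.append(action)  # held for human or contextual review
            outcome = "held"
        receipt = Receipt(uuid.uuid4().hex, action, outcome)
        self.receipts.append(receipt)
        return receipt
```

In this sketch the same action can execute on one day and be held on another, because the decision depends on the context passed in, and both outcomes leave a receipt.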

The Pattern

Every tool in this landscape is valuable. Every one solves a real problem. And every one ends the same way:

Guardrails validates the output. The agent still executes on its own.

CTGT constrains the model. The agent still executes on its own.

IronCurtain enforces policy. The agent still executes on its own.

NemoClaw sandboxes the runtime. The agent still executes on its own.

Cisco DefenseClaw scans and verifies. The agent still executes on its own.

Microsoft Agent 365 tracks identity. The agent still executes on its own.

CrowdStrike secures infrastructure. The agent still executes on its own.

Surfit is where that changes. The agent proposes. Surfit evaluates the business context. Surfit decides. Surfit executes. The agent never touches the system without Surfit in the path.

Why This Matters Now

Every new tool in Layer 1 and Layer 2 makes agents more capable and safer to run. More capable agents touching more systems means more need for Layer 3.

Every competitor in this landscape is building the case for Surfit without knowing it.

The more agents can do, the more important it becomes to have a layer that evaluates whether they should. That's not safety. That's not compliance. That's business judgment at the point of execution.

Smart. Safe. Correct.
Everyone is building the first two. Surfit is the third.

See how Surfit operates across multiple systems — evaluating, routing, and receipting agent actions in real time.
