Introducing ToolMesh — The Missing Control Layer for AI Agents

AI agents are reaching production systems faster than security and governance can keep up. Teams connect agents to CRMs, cloud APIs, internal databases, ticketing systems, and IoT devices — but the infrastructure between the agent and those systems is still held together with environment variables and good intentions.

The typical setup looks like this: an MCP server has direct access to API keys, every connected agent gets full access to every tool, and there is no record of what was called, when, or by whom. That works for a prototype. It does not work when the agent talks to Stripe, your HR system, and your cloud provider in the same session.

The problem compounds in enterprises. Organizations run dozens of backend systems across teams and departments. Each one needs its own integration, its own credentials, its own access policy. Without a central control layer, every new agent connection is a new unaudited trust relationship — and the attack surface grows with each one.

ToolMesh is our answer to that gap.

ToolMesh is the missing control layer between AI agents and enterprise backend systems. It turns uncontrolled tool calls into a governed, auditable process, and it connects any REST API or MCP server in minutes, not months.

Every tool call flows through a strict pipeline: Auth → AuthZ → Credential Injection → Execution → Output Gate → Audit. If any step does not explicitly allow the request, nothing happens. The model never sees raw secrets. Every call is logged. Output can be filtered before it reaches the agent.
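The fail-closed shape of that pipeline can be sketched in a few lines. This is an illustration only, not ToolMesh's actual code: every function below is a hypothetical stand-in, and only the stage names come from the post.

```javascript
// Hypothetical stand-ins for the six stages. In the real system each of
// these is a full subsystem; here they only show the control flow.
function authenticate(ctx)      { return ctx.apiKey === "valid-key"; }
function authorize(ctx)         { return ctx.role === "admin"; }
function injectCredentials(ctx) { ctx.credential = "resolved-server-side"; return true; }
function execute(ctx)           { ctx.result = { ok: true }; return true; }
function gateOutput(ctx)        { return true; }
function audit(ctx)             { return true; } // a real audit stage also records denials

const stages = [authenticate, authorize, injectCredentials, execute, gateOutput, audit];

// Fail-closed: the first stage that does not explicitly return true
// stops the pipeline. The default answer is "deny", not "allow".
function handleToolCall(ctx) {
  for (const stage of stages) {
    if (stage(ctx) !== true) return { denied: true, stage: stage.name };
  }
  return { denied: false, result: ctx.result };
}
```

The point of the sketch is the `!== true` check: anything short of an explicit allow, at any stage, means nothing executes.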

It runs as a single binary or Docker container. It is Apache 2.0 licensed. There is no SaaS dependency.

ToolMesh is built around six concerns that most agent setups handle inconsistently or not at all.

ToolMesh supports OAuth 2.1 with PKCE for interactive clients like Claude Desktop, API keys for programmatic access, and multi-user setups via a users.yaml configuration. Every request is authenticated before it enters the pipeline.

For quick starts, a single environment variable is enough. For production, you get per-user identity with roles, plans, and company context — without changing the architecture.
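A multi-user setup in `users.yaml` might look roughly like this. The field names below are assumptions inferred from the roles, plans, and company context described above, not ToolMesh's verified schema:

```yaml
# Illustrative sketch only; field names are assumptions, not the
# verified users.yaml schema.
users:
  - id: alice
    roles: [admin]
    plan: enterprise
    company: acme-corp
  - id: ci-bot
    roles: [readonly]
    plan: internal
```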

Access control is per-tool and per-user, powered by OpenFGA. DADL tools declare an access level — read, write, admin, or dangerous — and policy files map those levels to roles. If the caller is not authorized for that specific tool, the request is denied before execution.

ToolMesh also tracks CallerClass: which AI client triggered the request. A local Claude Code session can have different permissions than a hosted agent or a CI bot hitting the same API. Same backend, different trust levels, enforced automatically.
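A policy file mapping access levels to roles, and trust levels to CallerClass, might be sketched like this. The shape and key names here are illustrative assumptions, not ToolMesh's verified policy format:

```yaml
# Illustrative sketch; keys and structure are assumptions.
access_levels:
  read:      [viewer, developer, admin]
  write:     [developer, admin]
  admin:     [admin]
  dangerous: []          # deny by default; grant explicitly if ever needed
caller_classes:
  claude-code-local: [read, write]
  hosted-agent:      [read]
  ci-bot:            [read]
```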

This is the concern most setups quietly ignore. In a typical MCP configuration, API keys sit in client configs or environment variables — visible to the model, scattered across machines, impossible to rotate centrally.

ToolMesh injects credentials at runtime, server-side. The model sees the tool interface and the filtered result. It never sees the token, the key, or the session cookie. Credentials are referenced by name in the DADL file and resolved from the credential store at execution time.

The default store reads from environment variables. The architecture supports pluggable backends for centralized secret management (planned: Infisical, HashiCorp Vault).

Not every API response should reach the model unchanged. Customer records may contain PII. Internal systems may return metadata that is irrelevant or sensitive. Error messages may leak infrastructure details.

The Output Gate runs JavaScript policies (via the goja engine) that can validate inputs before execution and filter outputs afterward. Use cases range from PII redaction to compliance filtering to rejecting dangerous inputs entirely. Policies are reviewable code, not black boxes.
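A minimal policy in that style might redact PII before the result reaches the agent. This is a sketch: the hook name `filterOutput` is an assumption for illustration, not ToolMesh's actual policy API, though the code itself is plain JavaScript of the kind goja runs.

```javascript
// Hypothetical output-gate hook: receives the backend result, returns
// the filtered version the agent is allowed to see.
function filterOutput(result) {
  // Redact email addresses anywhere in the serialized result. The
  // character classes deliberately exclude quotes and colons, so the
  // match cannot cross JSON string boundaries.
  const redacted = JSON.stringify(result)
    .replace(/[\w.+-]+@[\w-]+\.[\w.-]+/g, "[redacted-email]");
  return JSON.parse(redacted);
}
```

Because the policy is a reviewable function, a compliance team can read exactly what leaves the gate and what does not.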

ToolMesh connects two types of backends:

DADL backends describe REST APIs declaratively in YAML. A .dadl file defines endpoints, parameters, authentication, error handling, pagination, and retry logic. ToolMesh turns that description into MCP tools at runtime — no custom server code needed. The public DADL registry currently includes definitions for 14 APIs covering 1,100+ tools across providers like GitHub, Cloudflare, GitLab, Stripe, DeepL, Hetzner Cloud, and others.

MCP server backends connect existing MCP servers via HTTP or STDIO transport. If you already have a working MCP server, ToolMesh wraps it with the same authorization, credential, and audit pipeline.

Both types go through the same fail-closed pipeline. The backend type is an implementation detail — the security guarantees are the same.

Every tool call is recorded: who called it, which tool, with what parameters, what the result was, how long it took, and whether it succeeded. ToolMesh ships with two audit backends — structured logging via slog for simple setups, and an append-only SQLite store for queryable compliance audits.

When someone asks “What did that agent do last Tuesday?”, you can answer with a SQL query instead of a shrug.

The integration problem in agent tooling is not the protocol. MCP solved that. The problem is that every new API still requires a new MCP server — a new codebase, a new runtime, a new maintenance burden. Most teams stop after a handful of integrations because the per-API cost is too high.

DADL (Declarative API Description Language) takes a different approach. Instead of writing a custom server for each API, you describe the API in YAML:

backend:
  name: deepl
  type: rest
  base_url: https://api.deepl.com/v2
  auth:
    type: bearer
    credential: deepl_auth_key

tools:
  translate:
    method: POST
    path: /translate
    access: write
    description: "Translate text into a target language"
    params:
      text: { type: array, in: body, required: true }
      target_lang: { type: string, in: body, required: true }

That is a complete tool definition. ToolMesh handles authentication, parameter mapping, error handling, retries, and credential injection at runtime.

Because DADL is compact and declarative, LLMs can generate working definitions from existing API documentation. Most of the 14 definitions in the registry were produced by Claude from API docs in under a minute, then reviewed and tuned. The format is designed to be AI-native: easy for machines to produce, easy for humans to review.

We wrote a separate deep-dive on this: Stop Rebuilding REST API Wrappers for MCP.

Connect 15 MCP servers to a single AI agent? Without ToolMesh, that simply does not work — the context window fills up, the client chokes. With ToolMesh, it is not a problem.

Instead of exposing hundreds of individual MCP tools, ToolMesh exposes two meta-tools: list_tools and execute_code. The model gets a compact TypeScript interface description (~1,000 tokens instead of 50,000+) and writes JavaScript against it. One code block can chain multiple API calls in a single round-trip.

That is the difference between “doesn’t work” and “just runs.”
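What such an execute_code block looks like can be sketched as follows. The real `tools` interface is generated by ToolMesh from the connected backends; the stub below — its namespaces, function names, and response shapes — is purely illustrative so the chaining pattern is runnable on its own.

```javascript
// Stubbed stand-in for the interface ToolMesh would describe to the
// model. Names and shapes are assumptions, not the generated API.
const tools = {
  deepl: {
    translate: async ({ text, target_lang }) => ({
      translations: text.map((t) => ({ text: `[${target_lang}] ${t}` })),
    }),
  },
  github: {
    createIssue: async ({ repo, title }) => ({ repo, title, number: 42 }),
  },
};

// One code block, two chained API calls, a single round-trip to the
// model: translate a title, then open an issue with the result.
async function run() {
  const { translations } = await tools.deepl.translate({
    text: ["Release notes"],
    target_lang: "DE",
  });
  return tools.github.createIssue({
    repo: "acme/docs",
    title: translations[0].text,
  });
}
```

Without the meta-tool layer, the same workflow costs two model round-trips plus the token weight of every individual tool definition in context.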

ToolMesh is Apache 2.0 licensed. The binary is self-contained — no external dependencies for the core feature set. docker compose up gets you a running instance with OAuth, audit logging, and a set of DADL backends.

There is no cloud dependency. Your credentials, your audit logs, your policies — all on your infrastructure. The core is and will remain open source.

The fastest path from zero to a working setup:

  1. Clone the repo and run docker compose up
  2. Configure your MCP client to point at ToolMesh
  3. Make your first tool call

The Getting Started guide walks through this in detail. The Architecture overview explains how the pipeline works. The DADL registry has ready-to-use definitions for 14 APIs.

ToolMesh is on GitHub. Issues, feedback, and DADL contributions are welcome.