How to Run Claude Managed Agents with Vercel Sandbox
Learn how to securely run Claude managed agents with Vercel Sandbox. Build a production-ready, sandboxed execution environment for LLM-generated code using TypeScript.
Giving an LLM a terminal and telling it to solve a problem is incredibly powerful, and terrifying. If you run LLM-generated code directly on your host machine or inside a shared container, a single hallucinated rm -rf / or a malicious package dependency can destroy your application.
To build production-grade agentic workflows, you need isolated, ephemeral execution environments that spin up in milliseconds and tear down automatically.
This tutorial shows how to build a fully autonomous, secure code-execution agent. We will pair Anthropic's Claude 3.5 Sonnet tool-calling capabilities with Vercel's secure, isolated Sandbox runtime.
Here is the end result we are building: an agent that can safely write, execute, and debug arbitrary TypeScript or Python scripts in an isolated micro-VM, then return the execution output to the user.
import { ClaudeSandboxAgent } from "./agent";
const agent = new ClaudeSandboxAgent();
const result = await agent.run(
  "Fetch the current price of Bitcoin from the Coingecko API, write a Node.js script to calculate the percentage change over the last 24 hours, execute it, and format the output."
);
console.log(result);
// Output: "The current price of Bitcoin is $96,420. The 24h change is +3.42%..."

Why You Should Run Claude Managed Agents with Vercel Sandbox
When designing autonomous systems, the primary bottleneck is secure runtime isolation. Vercel Sandbox provides lightweight, micro-VM execution spaces (built on Firecracker technology) designed specifically for executing untrusted AI-generated code.
By offloading code execution to a secure Vercel Sandbox, you gain several architectural advantages:
- Hardware-level Isolation: Code executes inside an ephemeral micro-VM, which blocks directory traversal attacks and access to your host system's environment variables.
- Pre-configured Runtimes: The sandbox comes pre-loaded with Node.js, Python, package managers (npm, pip), and common system utilities.
- Low-latency Boot Times: Sandboxes spin up in less than 150 milliseconds, making them suitable for interactive user experiences.
If you are building complex agent architectures, you may also want to integrate these sandboxes with standardized tool APIs. Read our guide on Model Context Protocol Next.js integration to see how to expose these runtimes as reusable tools across different LLM clients.
Architecture Overview
The system operates via a continuous feedback loop. Claude acts as the brain (determining what code needs to run), while Vercel Sandbox acts as the hands (executing the code and returning stdout/stderr).
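The feedback loop can be sketched without either SDK. The types and stub functions below are hypothetical simplifications for illustration only: a synchronous `Model` stands in for the Claude API call and an `Executor` stands in for the sandbox, so the real message and result shapes will differ.

```typescript
// Hypothetical, simplified shapes; the real Anthropic and Vercel SDK types differ.
type ToolCall = { id: string; name: string; input: Record<string, unknown> };
type ModelTurn =
  | { kind: "tool_calls"; calls: ToolCall[] }
  | { kind: "final"; text: string };
type HistoryEntry =
  | { role: "user"; text: string }
  | { role: "assistant"; turn: ModelTurn }
  | { role: "tool"; callId: string; output: string };

type Model = (history: readonly HistoryEntry[]) => ModelTurn; // the "brain"
type Executor = (call: ToolCall) => string; // the "hands"

function runLoop(prompt: string, model: Model, execute: Executor, maxTurns = 10): string {
  const history: HistoryEntry[] = [{ role: "user", text: prompt }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const next = model(history);
    history.push({ role: "assistant", turn: next });
    // A final answer ends the loop; tool calls keep it going.
    if (next.kind === "final") return next.text;
    for (const call of next.calls) {
      history.push({ role: "tool", callId: call.id, output: execute(call) });
    }
  }
  return `Stopped after ${maxTurns} turns without a final answer.`;
}

// Stub model: requests one command, then answers using the tool output it sees.
const stubModel: Model = (history) => {
  const last = history[history.length - 1];
  if (last !== undefined && last.role === "tool") {
    return { kind: "final", text: `The script printed: ${last.output}` };
  }
  return {
    kind: "tool_calls",
    calls: [{ id: "call_1", name: "execute_command", input: { command: "node index.js" } }],
  };
};

// Stub executor: pretends to run the command inside a sandbox.
const stubExecutor: Executor = (call) =>
  call.name === "execute_command" ? "42" : "unsupported tool";

console.log(runLoop("Run index.js and report the output.", stubModel, stubExecutor));
// prints "The script printed: 42"
```

The turn cap matters as much as the loop itself: without it, a model that keeps requesting tools would never return control to the caller.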
Step-by-Step Implementation
Step 1: Project Setup and Dependencies
Initialize a clean TypeScript project. You will need the Anthropic SDK, the Vercel Sandbox SDK, and auxiliary types.
npm init -y
npm install @anthropic-ai/sdk @vercel/sandbox dotenv
npm install --save-dev typescript @types/node tsx
npx tsc --init

Ensure your tsconfig.json is configured for modern Node environments:
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "esModuleInterop": true,
    "strict": true,
    "skipLibCheck": true
  }
}

Create a .env file with your API credentials:
ANTHROPIC_API_KEY="your-anthropic-key"
VERCEL_SANDBOX_API_KEY="your-vercel-sandbox-key"

Step 2: Defining the Sandbox Tool Interfaces
To allow Claude to interact with the Vercel Sandbox, we must expose specific tools:
- write_file: Writes code or configuration files to the sandbox workspace.
- execute_command: Executes bash commands (e.g., running scripts, installing npm packages).
- read_file: Reads execution outputs or logs generated by the scripts.

Let's define the schemas for these tools using the Anthropic SDK format in a new file named tools.ts (the agent imports it in Step 4). When writing these schemas, keep in mind how LLMs parse instructions. If you need a refresher on crafting optimal instructions, check out our guide on optimizing system prompts for TypeScript.
import type Anthropic from "@anthropic-ai/sdk";

// Tool definitions passed to the Messages API; each input_schema is plain JSON Schema.
export const sandboxTools: Anthropic.Tool[] = [
  {
    name: "write_file",
    description: "Writes content to a file in the isolated sandbox environment.",
    input_schema: {
      type: "object",
      properties: {
        path: {
          type: "string",
          description: "The absolute or relative file path inside the workspace (e.g., 'index.js' or 'src/utils.py')."
        },
        content: {
          type: "string",
          description: "The raw file content to write."
        }
      },
      required: ["path", "content"]
    }
  },
  {
    name: "execute_command",
    description: "Executes a shell command inside the sandbox and returns stdout, stderr, and the exit code.",
    input_schema: {
      type: "object",
      properties: {
        command: {
          type: "string",
          description: "The bash command to run (e.g., 'node index.js' or 'pip install pandas')."
        },
        timeoutMs: {
          type: "number",
          description: "Optional execution timeout in milliseconds. Defaults to 10000."
        }
      },
      required: ["command"]
    }
  },
  {
    name: "read_file",
    description: "Reads the content of a file from the sandbox workspace.",
    input_schema: {
      type: "object",
      properties: {
        path: {
          type: "string",
          description: "The path of the file to read."
        }
      },
      required: ["path"]
    }
  }
];

Step 3: Implementing the Sandbox Execution
We need to build a wrapper class that manages the lifecycle of our Vercel Sandbox. This class will handle initializing the micro-VM, writing code assets, executing commands, and cleaning up resources once the agent completes its run.
Create a new file named sandbox.ts:
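import { Sandbox } from "@vercel/sandbox";

export class SandboxManager {
  private sandbox: Sandbox | null = null;

  /**
   * Initializes a fresh, isolated micro-VM.
   */
  async initialize(): Promise<void> {
    // Token-based auth in the published SDK takes token, teamId, and projectId.
    // VERCEL_TEAM_ID and VERCEL_PROJECT_ID are assumed additions to your .env;
    // omit all three if you authenticate with an OIDC token from `vercel env pull`.
    this.sandbox = await Sandbox.create({
      token: process.env.VERCEL_SANDBOX_API_KEY!,
      teamId: process.env.VERCEL_TEAM_ID!,
      projectId: process.env.VERCEL_PROJECT_ID!,
      runtime: "node22",
    });
  }

  /**
   * Writes a file to the active sandbox workspace.
   */
  async writeFile(path: string, content: string): Promise<string> {
    if (!this.sandbox) throw new Error("Sandbox not initialized");
    const bytes = Buffer.from(content, "utf8");
    await this.sandbox.writeFiles([{ path, content: bytes }]);
    return `Successfully wrote ${bytes.length} bytes to ${path}`;
  }

  /**
   * Reads the contents of a file from the sandbox workspace.
   */
  async readFile(path: string): Promise<string> {
    if (!this.sandbox) throw new Error("Sandbox not initialized");
    // Read through `cat` so the contents arrive as a plain string.
    const result = await this.sandbox.runCommand({ cmd: "cat", args: [path] });
    if (result.exitCode !== 0) {
      throw new Error(`Could not read ${path}: ${await result.stderr()}`);
    }
    return await result.stdout();
  }

  /**
   * Executes a shell command inside the micro-VM.
   */
  async executeCommand(
    command: string,
    timeoutMs: number = 10000
  ): Promise<{ stdout: string; stderr: string; exitCode: number }> {
    if (!this.sandbox) throw new Error("Sandbox not initialized");
    // Host-side timeout: it stops waiting for the result but does not kill
    // the process inside the sandbox, which is cleaned up by destroy().
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(
        () => reject(new Error(`Command timed out after ${timeoutMs} ms`)),
        timeoutMs
      );
    });
    try {
      const result = await Promise.race([
        this.sandbox.runCommand({ cmd: "bash", args: ["-c", command] }),
        timeout,
      ]);
      const [stdout, stderr] = await Promise.all([result.stdout(), result.stderr()]);
      return { stdout, stderr, exitCode: result.exitCode };
    } finally {
      clearTimeout(timer);
    }
  }

  /**
   * Destroys the micro-VM instance, freeing up resources.
   */
  async destroy(): Promise<void> {
    if (this.sandbox) {
      await this.sandbox.stop();
      this.sandbox = null;
    }
  }
}

Step 4: Building the Claude Agent Loop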
With our tools defined and our sandbox wrapper ready, we can implement the core agent loop. The agent must continuously process Claude's reasoning steps, execute any requested tools inside the sandbox, and feed the outputs back to Claude until the model arrives at its final answer.
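Tool inputs arrive from the model as untyped JSON, so it can help to narrow them before they reach the sandbox. The following is a minimal sketch of one way to do that; `SandboxToolCall`, `requireString`, and `parseToolCall` are hypothetical names introduced here for illustration and are not part of either SDK.

```typescript
// Hypothetical union covering the three tools defined in this tutorial.
type SandboxToolCall =
  | { name: "write_file"; path: string; content: string }
  | { name: "read_file"; path: string }
  | { name: "execute_command"; command: string; timeoutMs?: number };

// Reads a required string field or throws a descriptive error.
function requireString(fields: Record<string, unknown>, key: string): string {
  const value = fields[key];
  if (typeof value !== "string") {
    throw new Error(`Tool input "${key}" must be a string`);
  }
  return value;
}

// Validates the raw tool input and returns a typed call, or throws.
function parseToolCall(name: string, input: unknown): SandboxToolCall {
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    throw new Error("Tool input must be a JSON object");
  }
  const fields = input as Record<string, unknown>;
  switch (name) {
    case "write_file":
      return {
        name: "write_file",
        path: requireString(fields, "path"),
        content: requireString(fields, "content"),
      };
    case "read_file":
      return { name: "read_file", path: requireString(fields, "path") };
    case "execute_command": {
      const command = requireString(fields, "command");
      const raw = fields["timeoutMs"];
      if (raw === undefined) return { name: "execute_command", command };
      if (typeof raw !== "number" || !Number.isFinite(raw) || raw <= 0) {
        throw new Error('Tool input "timeoutMs" must be a positive number');
      }
      return { name: "execute_command", command, timeoutMs: raw };
    }
    default:
      throw new Error(`Unknown tool ${name}`);
  }
}

const parsed = parseToolCall("execute_command", { command: "node index.js", timeoutMs: 5000 });
console.log(parsed.name); // prints "execute_command"
```

A thrown error here can be returned to the model as the tool result, which gives it a chance to correct the call on the next turn instead of crashing the loop.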
Create a file named agent.ts:
import Anthropic from "@anthropic-ai/sdk";
import { sandboxTools } from "./tools.js";
import { SandboxManager } from "./sandbox.js";
import dotenv from "dotenv";

dotenv.config();

const MAX_ITERATIONS = 10; // Prevent infinite loops

export class ClaudeSandboxAgent {
  private anthropic: Anthropic;

  constructor() {
    this.anthropic = new Anthropic({
      apiKey: process.env.ANTHROPIC_API_KEY,
    });
  }

  async run(prompt: string): Promise<string> {
    const sandbox = new SandboxManager();
    await sandbox.initialize();

    // Maintain conversation history
    const messages: Anthropic.MessageParam[] = [
      { role: "user", content: prompt }
    ];

    try {
      for (let iterations = 0; iterations < MAX_ITERATIONS; iterations++) {
        const response = await this.anthropic.messages.create({
          model: "claude-3-5-sonnet-20241022",
          max_tokens: 4000,
          system: `You are an autonomous, secure software agent.
You have access to an isolated Vercel Sandbox bash environment.
When asked to solve a problem, write the code to a file, execute it, inspect the output, and fix any errors you encounter.
Always verify your code runs successfully before presenting the final answer.`,
          tools: sandboxTools,
          messages: messages,
        });

        // Append the assistant's response to the history
        messages.push({ role: "assistant", content: response.content });

        if (response.stop_reason !== "tool_use") {
          // Claude requested no further tools: return the text of this final response
          const text = response.content
            .filter((block): block is Anthropic.TextBlock => block.type === "text")
            .map((block) => block.text)
            .join("\n");
          return text || "Task completed.";
        }

        const toolResults: Anthropic.ToolResultBlockParam[] = [];
        for (const block of response.content) {
          if (block.type !== "tool_use") continue;
          const { name, input, id } = block;
          let executionResult: string;
          let isError = false;
          try {
            if (name === "write_file") {
              const { path, content } = input as { path: string; content: string };
              executionResult = await sandbox.writeFile(path, content);
            } else if (name === "read_file") {
              const { path } = input as { path: string };
              executionResult = await sandbox.readFile(path);
            } else if (name === "execute_command") {
              const { command, timeoutMs } = input as { command: string; timeoutMs?: number };
              const res = await sandbox.executeCommand(command, timeoutMs);
              executionResult = JSON.stringify(res);
            } else {
              executionResult = `Error: Unknown tool ${name}`;
              isError = true;
            }
          } catch (err) {
            const message = err instanceof Error ? err.message : String(err);
            executionResult = `Execution Error: ${message}`;
            isError = true;
          }
          toolResults.push({
            type: "tool_result",
            tool_use_id: id,
            content: executionResult,
            is_error: isError,
          });
        }

        // Feed the execution results back to Claude
        messages.push({ role: "user", content: toolResults });
      }

      // The loop ended because the iteration cap was reached, not because Claude finished
      return `Stopped after ${MAX_ITERATIONS} iterations without a final answer.`;
    } finally {
      // Ensure the sandbox is always torn down, even if execution crashes
      await sandbox.destroy();
    }
  }
}

Step 5: Running the Agent
Create an entrypoint file named index.ts to test your new autonomous sandbox agent:
import { ClaudeSandboxAgent } from "./agent.js";

async function main() {
  const agent = new ClaudeSandboxAgent();
  console.log("Starting agent...");
  const result = await agent.run(
    "Generate a script that fetches the current price of Bitcoin from the Coingecko API, calculates the percentage change over the last 24 hours, and writes the output to a file named 'report.txt'. Read the file and output the final text."
  );
  console.log("\n--- Agent Final Output ---");
  console.log(result);
}

main().catch(console.error);

Run the agent using tsx:
npx tsx index.ts

You will see the agent automatically write the script, run it, capture the API response, handle any required package installations, and deliver the parsed report directly to your terminal.
Next Steps and Production Considerations
You now have a secure, sandboxed execution runtime for your Claude-managed agents. As you scale this architecture to production, keep these optimization strategies in mind:
- Implement Strict Timeouts: Ensure your sandbox commands have hard execution limits to prevent infinite loops from draining your API credits.
- Persistent vs. Ephemeral Sessions: For multi-turn user sessions, persist the sandbox instance across requests rather than calling sandbox.destroy() immediately.
- Network Access Control: Depending on your security posture, you may want to restrict outbound network access within the Vercel Sandbox to prevent agents from making unauthorized external requests.
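For the persistent-session case, one possible approach is a small registry that reuses a resource per session and hands back idle ones for cleanup. The `SessionRegistry` class below is a hypothetical sketch, not part of any SDK; in this tutorial the stored resource would be a SandboxManager, and the caller would call destroy() on whatever evictIdle() returns.

```typescript
// Hypothetical session registry; T would be SandboxManager in this tutorial.
class SessionRegistry<T> {
  private readonly sessions = new Map<string, { resource: T; lastUsed: number }>();
  private readonly create: (sessionId: string) => T;
  private readonly idleMs: number;
  private readonly now: () => number;

  constructor(
    create: (sessionId: string) => T,
    idleMs: number,
    now: () => number = () => Date.now(), // injectable clock, handy for tests
  ) {
    this.create = create;
    this.idleMs = idleMs;
    this.now = now;
  }

  // Returns the session's resource, creating it on first use, and refreshes its idle timer.
  acquire(sessionId: string): T {
    let entry = this.sessions.get(sessionId);
    if (entry === undefined) {
      entry = { resource: this.create(sessionId), lastUsed: this.now() };
      this.sessions.set(sessionId, entry);
    } else {
      entry.lastUsed = this.now();
    }
    return entry.resource;
  }

  // Removes sessions idle for longer than idleMs and returns their
  // resources so the caller can tear them down.
  evictIdle(): T[] {
    const cutoff = this.now() - this.idleMs;
    const expired: T[] = [];
    for (const [id, entry] of this.sessions) {
      if (entry.lastUsed < cutoff) {
        this.sessions.delete(id);
        expired.push(entry.resource);
      }
    }
    return expired;
  }
}

// Usage with a fake clock and a placeholder resource.
let clock = 0;
const registry = new SessionRegistry((id) => ({ id }), 1000, () => clock);
const first = registry.acquire("user-1");
clock = 500;
console.log(registry.acquire("user-1") === first); // prints true (same instance reused)
clock = 2000;
console.log(registry.evictIdle().length); // prints 1 (idle for 1500 ms, limit is 1000 ms)
```

Running evictIdle() on a timer keeps abandoned sessions from holding micro-VMs open indefinitely, which is the main cost of moving away from per-request teardown.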