Running Docker Agent Inside a Sandbox

Run Docker Agent inside a microVM with one command. Hard VM isolation, workspace-only mount, and API keys that never cross the boundary.

When you give Docker Agent a filesystem toolset and a shell, you want isolation that you can actually trust. Docker Sandboxes give you exactly that. A microVM wraps the agent, your workspace is the only thing mounted in, and API credentials get injected by a proxy on the host rather than living on the agent's filesystem.

This post walks through running Docker Agent inside a sandbox end to end.

What Docker Agent Is

Docker Agent is an open-source framework for building teams of specialized AI agents. Instead of one generalist model trying to do everything, you define agents with specific roles in YAML and let them delegate to each other.

A simple two-agent debugger team:

agents:
  root:
    model: openai/gpt-5-mini
    description: Bug investigator
    instruction: |
      Analyze error messages, stack traces, and code to find bug root causes.
      Explain what's wrong and why it's happening.
      Delegate fix implementation to the fixer agent.
    sub_agents: [fixer]
    toolsets:
      - type: filesystem
      - type: mcp
        ref: docker:duckduckgo

  fixer:
    model: anthropic/claude-sonnet-4-5
    description: Fix implementer
    instruction: |
      Write fixes for bugs diagnosed by the investigator.
      Make minimal, targeted changes and add tests to prevent regression.
    toolsets:
      - type: filesystem
      - type: shell

Each agent has its own model, its own context, its own toolset. The framework handles coordination.

That filesystem toolset and that shell toolset are exactly what you want inside a sandbox.

Why microVM

Standard containers share the host kernel. For an agent that runs arbitrary commands and edits arbitrary files, you want a stronger boundary.

Docker Sandboxes wrap the agent in a microVM. Real virtualization between the agent and your machine. Credentials never sit on the agent's filesystem. The only thing mounted in is the workspace you choose.

That combination of VM-level isolation, host-side credentials, and a workspace-only mount is what you get with one sbx command.

What You Need

Docker Desktop 4.58 or later, with the sandbox feature enabled
An API key for your chosen model provider, exported in ~/.bashrc or ~/.zshrc

Setting Up Credentials

Docker Sandboxes run under a daemon process that doesn't inherit environment variables from your current shell. To make your API keys available to sandboxes, export them in your shell config (~/.zshrc or ~/.bashrc):

export OPENAI_API_KEY=sk-proj-xxxxx
export ANTHROPIC_API_KEY=sk-ant-xxxxx

Then source the config and restart Docker Desktop so the daemon picks up the new variables:

$ source ~/.zshrc

When the sandbox starts, you will see confirmation of which credentials it found:

Using stored credentials for services: openai, anthropic

The proxy injects these into outbound API calls at request time.
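Before restarting Docker Desktop, it can help to confirm that the keys are actually exported. The sketch below is a hypothetical helper (check_keys is not part of sbx or Docker Agent). It only reports what is exported in the shell where you run it, so run it in a fresh terminal that has loaded your config; it cannot see what the daemon itself picked up.

```shell
#!/bin/sh
# Hypothetical preflight helper: report which provider keys are exported.
# Prints one line per variable; returns non-zero if any are missing or empty.
check_keys() {
    missing=0
    for name in "$@"; do
        # printenv only sees exported variables, which is what matters here.
        if [ -n "$(printenv "$name")" ]; then
            echo "ok: $name"
        else
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (commented out so sourcing this file has no side effects):
# check_keys OPENAI_API_KEY ANTHROPIC_API_KEY
```

The helper never prints the key values themselves, only the variable names, so its output is safe to paste into a bug report.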

Running Docker Agent in a Sandbox

The simplest invocation:

$ sbx run docker-agent ~/my-project

If the workspace doesn't exist yet, sbx will offer to create it. Then it pulls the template image, sets up the microVM, and launches the agent's TUI:

Creating new sandbox 'docker-agent-my-project'...
Using stored credentials for services: openai
Status: Downloaded newer image for docker/sandbox-templates:docker-agent-docker
✓ Created sandbox 'docker-agent-my-project'
  Workspace: /Users/ajeetraina/my-project (direct mount)
  Agent: docker-agent

Starting docker-agent agent in sandbox 'docker-agent-my-project'...
Workspace: /Users/ajeetraina/my-project

The sandbox name is auto-generated by combining the template name and the workspace basename. The template image includes a Docker daemon inside the microVM, so the agent can build and run containers without touching your host daemon.

The workspace is mounted directly into the VM. API keys live on the host and never cross the boundary; the proxy injects them into outbound calls at request time.

On subsequent runs, reconnect with:

$ sbx run docker-agent-my-project
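If you script the reconnect step, you need the sandbox name up front. A minimal sketch of the naming rule described above, assuming the template name and the workspace basename are joined with a single hyphen as in the example output. The function sandbox_name is hypothetical, and how sbx treats basenames with unusual characters is not covered here; `sbx ls` remains the authoritative source for the real name.

```shell
#!/bin/sh
# Hypothetical helper: predict the auto-generated sandbox name.
# Assumes the rule is <template>-<workspace basename>.
sandbox_name() {
    template="$1"
    workspace="$2"
    # basename drops leading directories and any trailing slash.
    printf '%s-%s\n' "$template" "$(basename "$workspace")"
}

# Example call (commented out):
# sandbox_name docker-agent "$HOME/my-project"
```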

The TUI

When the agent launches, you get a split-pane terminal UI. The left pane is the conversation; the right pane is a session sidebar showing the workspace path, token usage, the active agent, and the tool count.

The default agent ships ready to use with 15 tools available out of the box.

Hotkeys are shown along the bottom: Ctrl+c to quit, Tab to switch focus, Ctrl+t/Ctrl+w for new/close tab, Ctrl+p/Ctrl+n for prev/next tab.

Choosing Your Model

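To use a specific model, pass --model after a -- separator. Everything after the separator goes to the agent CLI rather than to sbx. The example also passes --yolo, which lets the agent run tool calls without asking for approval first: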

$ sbx run docker-agent-my-project -- --yolo --model openai/gpt-5-mini

The provider prefix is explicit: openai/, anthropic/, google/, or dmr/ for local models through Docker Model Runner. The sidebar will reflect the active provider and model:

Agent
▶ root
  A helpful AI assistant
  Provider: openai
  Model: gpt-5-mini
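The provider/model convention is easy to check before launching. The sketch below splits a reference at the first slash and accepts only the four prefixes listed in this post. The function split_model is hypothetical, and Docker Agent may well accept prefixes beyond these four, so treat the allow-list as an assumption.

```shell
#!/bin/sh
# Hypothetical helper: split "provider/model" and sanity-check the prefix.
# The allowed prefixes are the ones named in this post, not an official list.
split_model() {
    ref="$1"
    case "$ref" in
        */*) ;;
        *) echo "expected provider/model, got: $ref" >&2; return 1 ;;
    esac
    provider="${ref%%/*}"   # text before the first slash
    model="${ref#*/}"       # text after the first slash (may contain slashes)
    case "$provider" in
        openai|anthropic|google|dmr) ;;
        *) echo "unknown provider prefix: $provider" >&2; return 1 ;;
    esac
    [ -n "$model" ] || { echo "empty model name in: $ref" >&2; return 1; }
    printf 'Provider: %s\nModel: %s\n' "$provider" "$model"
}

# Example call (commented out):
# split_model openai/gpt-5-mini
```

Splitting at the first slash only means a model name that itself contains a slash stays intact after the provider prefix.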

You can also bring your own YAML. Drop an agent.yaml in your workspace and pass it through:

$ sbx run docker-agent-my-project -- agent.yaml --yolo

This is how you run the multi-agent debugger team from earlier. The YAML lives in the workspace mount, so the agent inside the microVM picks it up directly.
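As a starting point, here is a sketch that drops a single-agent agent.yaml into a workspace directory. It uses only keys that appear in the debugger example earlier in this post; the function write_agent_yaml and the refusal to overwrite an existing file are illustrative choices, not Docker Agent behavior.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal agent.yaml into a workspace directory.
# Refuses to overwrite an existing file.
write_agent_yaml() {
    dir="$1"
    mkdir -p "$dir"
    if [ -e "$dir/agent.yaml" ]; then
        echo "refusing to overwrite $dir/agent.yaml" >&2
        return 1
    fi
    # Quoted 'EOF' keeps the YAML literal (no variable expansion).
    cat > "$dir/agent.yaml" <<'EOF'
agents:
  root:
    model: openai/gpt-5-mini
    description: Bug investigator
    instruction: |
      Analyze error messages, stack traces, and code to find bug root causes.
      Explain what's wrong and why it's happening.
    toolsets:
      - type: filesystem
EOF
}

# Example call (commented out):
# write_agent_yaml "$HOME/my-project"
```

Because the file lands inside the workspace, it is visible inside the microVM through the same mount as the rest of the project.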

A Practical Workflow

Putting it all together:

# One-time setup in ~/.zshrc
export OPENAI_API_KEY=sk-proj-...

# After editing the shell config, restart Docker Desktop once.

# Create and connect
$ sbx run docker-agent ~/my-project -- --yolo --model openai/gpt-5-mini

When you are done, the conversation persists with the sandbox. To wipe it and start fresh:

$ sbx rm docker-agent-my-project
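If you run this workflow often, a small wrapper keeps the command line in one place. The sketch below builds the same sbx invocation shown in the workflow above and either prints it or runs it. The function launch_agent and the DRY_RUN variable are hypothetical conveniences, not sbx features.

```shell
#!/bin/sh
# Hypothetical wrapper around the create-and-connect command from this post.
# With DRY_RUN=1 it prints the command instead of running it.
launch_agent() {
    workspace="$1"
    model="$2"
    # Everything after "--" is passed to the agent CLI, not to sbx.
    set -- sbx run docker-agent "$workspace" -- --yolo --model "$model"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example call (commented out):
# DRY_RUN=1 launch_agent "$HOME/my-project" openai/gpt-5-mini
```

Printing the command first is a cheap way to review it before a real run, since --yolo removes the approval prompts.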

Sandbox CLI Reference

Command	What it does
`sbx run <template> <workspace>`	Create a new sandbox and connect to it
`sbx run <sandbox-name>`	Reconnect to an existing sandbox
`sbx run <sandbox-name> -- <flags>`	Pass flags directly to the agent CLI
`sbx exec <sandbox-name> -- <cmd>`	Run a command in the sandbox without launching the agent
`sbx ls`	List existing sandboxes
`sbx rm <sandbox-name>`	Tear down a sandbox

Where to Go From Here

A few directions worth exploring:

Write your own agent team in YAML and run it inside a sandbox
Build custom sandbox templates with extra tools baked in
Wire the agent up to an MCP server through the MCP Gateway so it can reach external tools without credentials living in the VM
Package agent configs as OCI artifacts with docker agent push and pull them down on another machine to run inside a sandbox there

The combination of "I can define a team of agents in YAML" with "I can run that team inside a hard isolation boundary with one command" is what turns experimental agent work into something you can leave running.