Docker Sandboxes Tutorial and Cheatsheet

Docker Sandboxes lets AI coding agents like Claude Code run safely in isolated containers. Get full autonomy without compromising your localhost security. Requires Docker Desktop 4.50+.

I've been running Claude Code for a few months now, and honestly? It's brilliant. But every time it runs npm install or modifies files outside my project, there's that moment of "wait, what did you just do?" Last week, I saw a Reddit thread where someone's home directory got wiped by an AI agent. That's the nightmare scenario.

AI coding agents have become incredibly powerful. Tools like Claude Code, GitHub Copilot, and Devin AI can write code, debug issues, and even manage entire development workflows. But there's a critical problem: running them locally introduces significant risks.

Let's look at the risks.

  1. Environment Pollution

AI agents can install packages and dependencies globally, creating conflicts with other projects. Imagine your agent installing a different version of Python or Node.js that breaks your existing applications.

# Agent installs globally
npm install -g some-package@beta

# Now your other projects using stable versions are broken

  2. Unintended File System Changes

An agent could mistakenly modify, move, or delete critical files outside the project workspace. One wrong command and your ~/.ssh keys, environment configs, or system files could be compromised.

Recent research from NVIDIA's AI Red Team (CVE-2024-12366) demonstrated how AI-generated code can escalate into remote code execution (RCE) when executed without proper isolation.

  3. Security Vulnerabilities

Giving an agent unrestricted network and file access could expose sensitive data or create security holes. According to a survey published in ACM Computing Surveys, insufficient isolation between agents and the host system is one of the most significant security challenges in agentic AI systems.

The uncomfortable truth: Most LLM tools have full access to your machine by default, with only imperfect attempts at blocking risky behavior.

Docker Sandboxes fixes this

Docker Sandboxes solves these problems by isolating AI agents from your local machine while preserving a familiar development experience.

It is an experimental feature in Docker Desktop 4.50+. Your project directory is mounted at the same path, your Git identity (user.name and user.email) is injected automatically, and your localhost stays protected. The agent gets a container that looks like your local environment, with the same paths and the same Git config, but it can't touch anything outside the project folder. Let me show you how it works.

What actually happens when you run it

First, make sure you have Docker Desktop 4.50 or later. Then:

cd ~/my-project
docker sandbox run claude

That's it. First time, it'll ask you to authenticate with Claude. After that, credentials get stored in a Docker volume so you don't have to log in again.

Here's what Docker does behind the scenes:

  1. Spins up a container from docker/sandbox-templates:claude-code
  2. Mounts your current directory at the exact same path (so /Users/ajeet/my-project on your Mac is also /Users/ajeet/my-project inside the container)
  3. Injects your Git username and email so commits still show your name
  4. Stores the API key in a volume called docker-claude-sandbox-data

The path thing matters more than you'd think. When Claude gives you an error message with a file path, you can copy-paste it directly. No mental translation needed.

How Docker Sandboxes Differ from Regular Containers

You might be thinking: "Can't I just use docker run and mount my project?" Yes, but Docker Sandboxes handles several things automatically that you'd otherwise have to configure manually.

Think of a normal container like a rental car: you get the basic vehicle, but you have to adjust the mirrors, set the GPS, and bring your own charging cables every time you get a new one. A Docker Sandbox is like a dedicated personal car parked in a specific garage (your workspace); it already knows your seat settings, remembers your home address, and has your favorite sunglasses in the glovebox every time you step inside.

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   RENTAL CAR (docker run)           PERSONAL CAR (sandbox)      │
│                                                                 │
│   ┌───────────────────────┐         ┌───────────────────────┐   │
│   │  🚗                   │         │  🚙                   │   │
│   │                       │         │                       │   │
│   │  • Adjust mirrors     │         │  ✓ Mirrors set        │   │
│   │  • Set GPS address    │         │  ✓ GPS knows home     │   │
│   │  • Bring your cables  │         │  ✓ Cables in glovebox │   │
│   │  • Configure Bluetooth│         │  ✓ Phone auto-connects│   │
│   │  • Reset every trip   │         │  ✓ Same setup always  │   │
│   │                       │         │                       │   │
│   └───────────────────────┘         └───────────────────────┘   │
│                                                                 │
│   Different car every time          Your car, your garage       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

State Persistence

Rental car: The trunk gets emptied after every rental. Your umbrella, gym bag, and phone charger? Gone. Next time you rent, you're starting from scratch.

Personal car: Your stuff stays in the trunk. The umbrella you threw in last month is still there when it rains today.

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   RENTAL CAR TRUNK                  PERSONAL CAR TRUNK          │
│                                                                 │
│   After Trip 1:                     After Trip 1:               │
│   ┌───────────────────┐             ┌───────────────────┐       │
│   │ 🎒 gym bag         │            │ 🎒 gym bag         │       │
│   │ ☂️  umbrella       │            │ ☂️  umbrella       │       │
│   │ 🔌 charger         │            │ 🔌 charger         │       │
│   └───────────────────┘             └───────────────────┘       │
│            │                                 │                  │
│            ▼                                 ▼                  │
│      [return car]                      [park at home]           │
│            │                                 │                  │
│            ▼                                 ▼                  │
│   Before Trip 2:                    Before Trip 2:              │
│   ┌───────────────────┐             ┌───────────────────┐       │
│   │                   │             │ 🎒 gym bag         │      │
│   │      EMPTY        │             │ ☂️  umbrella       │      │
│   │                   │             │ 🔌 charger         │      │
│   └───────────────────┘             └───────────────────┘       │
│                                                                 │
│   "Where's my charger?!"            "Right where I left it"     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Regular docker run: Container disappears when you exit. All installed packages, configs, and temp files are gone.

SESSION 1:
$ docker run -it node:20 bash
$ npm install express mongoose dotenv    # 200+ packages installed
$ exit

SESSION 2:
$ docker run -it node:20 bash
$ npm list
└── (empty)                              # 😭 Everything gone, start over

Docker Sandbox: State persists automatically per workspace.

SESSION 1:
$ docker sandbox run claude
> npm install express mongoose dotenv    # 200+ packages installed
> exit

SESSION 2:
$ docker sandbox run claude              # Same directory = same sandbox
> npm list
├── express@4.18.2                       # ✓ Still here!
├── mongoose@8.0.0                       # ✓ Still here!
└── dotenv@16.3.1                        # ✓ Still here!

Unlike a regular docker run that disappears when you exit, sandboxes persist. Run docker sandbox run claude in the same directory tomorrow, and you get the same container with all the packages Claude installed yesterday still there.

This is intentional. You want continuity—if Claude spent 10 minutes setting up your Python environment, you don't want to repeat that every session.
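For comparison, here is a rough sketch of how you might approximate "same directory, same container" with plain Docker commands. It is not how Docker Sandboxes is implemented. The function names and the ws- naming scheme are my own, hypothetical choices: the container name is derived from the workspace path, and the container is restarted if it already exists.

```shell
# Derive a stable container name from a workspace path (cksum is POSIX).
workspace_container_name() {
  printf 'ws-%s\n' "$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)"
}

# Reuse this directory's container if it exists, otherwise create it.
# "$@" is the image and command, for example: node:20 bash
run_persistent() {
  local name
  name="$(workspace_container_name "$PWD")"
  if docker container inspect "$name" >/dev/null 2>&1; then
    # Container from an earlier session: installed packages are still there.
    docker start -ai "$name"
  else
    docker run -it --name "$name" -v "$PWD:$PWD" -w "$PWD" "$@"
  fi
}

# Usage (from inside your project directory):
#   run_persistent node:20 bash
```

This only covers persistence. Credentials and Git identity would still need their own handling, which is the point of the comparison.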

Path Matching

Rental car GPS: Addresses are generic. "Navigate to 123 Main St" works, but your mental shortcuts don't. When your friend texts "meet me at the usual spot," you can't just tap a button—you have to translate.

Personal car GPS: It knows "Home," "Work," "Mom's house," and "the usual spot." Same names you use in real life.

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   RENTAL CAR GPS                    PERSONAL CAR GPS            │
│                                                                 │
│   Friend texts: "Meet at the        Friend texts: "Meet at the  │
│   usual coffee shop"                usual coffee shop"          │
│                                                                 │
│   ┌───────────────────┐             ┌───────────────────────┐   │
│   │                   │             │                       │   │
│   │  📍 Enter address │             │  📍 "Usual coffee"    │   │
│   │                   │             │     [TAP TO NAVIGATE] │   │
│   │  Street: ________ │             │                       │   │
│   │  City: __________ │             │  ✓ Same name you use  │   │
│   │                   │             │                       │   │
│   └───────────────────┘             └───────────────────────┘   │
│                                                                 │
│   🤔 "Wait, what's the actual       ✓ No translation needed     │
│       address again?"                                           │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Regular docker run: You mount to an arbitrary path like /workspace. Error messages reference container paths that don't match your host.

HOST:       /Users/ajeet/projects/myapp/src/index.js
CONTAINER:  /workspace/src/index.js

ERROR: "Cannot find module '/workspace/src/utils.js'"

🤔 "Where is /workspace? Oh right, that's /Users/ajeet/projects/myapp..."

Docker Sandbox: Mounts at the exact same absolute path.

HOST:       /Users/ajeet/projects/myapp/src/index.js
SANDBOX:    /Users/ajeet/projects/myapp/src/index.js   ← SAME!

ERROR: "Cannot find module '/Users/ajeet/projects/myapp/src/utils.js'"

✓ Copy-paste the path directly. No mental translation.
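To make the "mental translation" concrete, here is a small sketch of the helper you would end up writing for the regular docker run setup. The function name to_host_path is hypothetical; it rewrites a container path back to the host path given the mount point and the host directory.

```shell
# Translate a path seen inside the container back to the host path.
#   $1 = path reported inside the container
#   $2 = mount point inside the container (e.g. /workspace)
#   $3 = host directory that was mounted there
to_host_path() {
  case "$1" in
    "$2"/*) printf '%s%s\n' "$3" "${1#"$2"}" ;;
    *)      printf '%s\n' "$1" ;;   # not under the mount: leave unchanged
  esac
}

# Usage:
#   to_host_path /workspace/src/utils.js /workspace /Users/ajeet/projects/myapp
```

With a sandbox the mount point and the host directory are the same string, so this helper is simply not needed.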

Git Configuration Injection

Rental car: The toll transponder isn't linked to your account. You drive through the toll booth and get a bill addressed to "UNKNOWN DRIVER" or the rental company charges you a $50 admin fee to figure out who was driving.

Personal car: Toll transponder is linked to your account. Charges automatically go to the right person, correctly attributed, no questions asked.

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   RENTAL CAR TOLLS                  PERSONAL CAR TOLLS          │
│                                                                 │
│   ┌───────────────────┐             ┌───────────────────────┐   │
│   │                   │             │                       │   │
│   │   TOLL INVOICE    │             │   TOLL INVOICE        │   │
│   │                   │             │                       │   │
│   │   Driver: ???     │             │   Driver: Ajeet Raina │   │
│   │   Vehicle: ???    │             │   Account: *****1234  │   │
│   │                   │             │                       │   │
│   │   ⚠️  UNIDENTIFIED│             │   ✓ AUTO-CHARGED      │   │
│   │                   │             │                       │   │
│   └───────────────────┘             └───────────────────────┘   │
│                                                                 │
│   "Please call to verify           "Thanks for using FastTag"  │
│    your identity"                                               │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Regular docker run: Git doesn't know who you are. Commits show up as root@container-id or Git refuses to commit entirely.

$ docker run -it -v $(pwd):/workspace node bash
$ git commit -m "Fix bug"

⚠️  Author identity unknown
Please tell me who you are:
  git config --global user.email "you@example.com"
  git config --global user.name "Your Name"

Docker Sandbox: Automatically reads your host Git config and injects it.

$ docker sandbox run claude
> git commit -m "Fix bug"
[main abc1234] Fix bug
 Author: Ajeet Raina <ajeet@docker.com>   ← ✓ Correct attribution!
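If you wanted the same attribution with a regular container, one option is to pass your identity through Git's standard environment variables (GIT_AUTHOR_NAME and friends). This is a sketch of the manual approach, not a description of what Docker Sandboxes does internally, and the wrapper name run_with_git_identity is made up for illustration.

```shell
# Start a regular container that commits under your host Git identity.
# "$@" is the image and command, for example: node:20 bash
run_with_git_identity() {
  local name email
  name="$(git config --get user.name || true)"
  email="$(git config --get user.email || true)"
  if [ -z "$name" ] || [ -z "$email" ]; then
    echo "Set user.name and user.email in your host Git config first." >&2
    return 1
  fi
  docker run -it --rm \
    -e GIT_AUTHOR_NAME="$name" \
    -e GIT_AUTHOR_EMAIL="$email" \
    -e GIT_COMMITTER_NAME="$name" \
    -e GIT_COMMITTER_EMAIL="$email" \
    -v "$PWD:$PWD" -w "$PWD" \
    "$@"
}

# Usage:
#   run_with_git_identity node:20 bash
```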

Credential Storage

Rental car: You leave your garage door opener in the cupholder. When you return the car, either (a) you forget it and the next renter gets access to your garage, or (b) you have to remember to take it out every single time.

Personal car: Garage door opener lives in your car permanently. It's secure because the car is in your locked garage. You never have to think about it.

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   RENTAL CAR                        PERSONAL CAR                │
│                                                                 │
│   ┌───────────────────┐             ┌───────────────────────┐   │
│   │    Cupholder      │             │    Visor clip         │   │
│   │                   │             │                       │   │
│   │  🔘 Garage opener  │            │  🔘 Garage opener     │   │
│   │                   │             │                       │   │
│   │  ⚠️  Don't forget  │            │  ✓ Always here        │   │
│   │     to remove!    │             │  ✓ Car is in garage   │   │
│   │                   │             │  ✓ Both are secure    │   │
│   └───────────────────┘             └───────────────────────┘   │
│                                                                 │
│   Risk: Next renter finds          Secure: Only you have        │
│   your garage opener               access to car + garage       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Regular docker run: Credentials end up in your project directory or home folder. Risk of committing to git, exposed to other apps.

~/my-project/
├── src/
├── .env                    ← API keys here? 😬
└── .claude_credentials     ← Might accidentally commit!

~/.config/claude/
└── credentials.json        ← Readable by any process on host

Docker Sandbox: Credentials stored in an isolated Docker volume, separate from your filesystem.

~/my-project/
├── src/
└── (no credentials here!)

DOCKER VOLUME: docker-claude-sandbox-data
└── credentials.json        ← Isolated, managed by Docker

✓ Can't accidentally commit to git
✓ Not exposed to other apps on host  
✓ Persists across sandbox rebuilds
✓ Easy cleanup: docker volume rm ...
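A quick way to see whether credentials are currently stored is to check for the volume. The sketch below uses only standard docker volume commands; the volume name comes from this article, and the function names are my own.

```shell
# Volume name as described in this article; adjust if your setup differs.
CRED_VOLUME="docker-claude-sandbox-data"

# Print "stored" if the credentials volume exists, "absent" otherwise.
credentials_status() {
  if docker volume inspect "$CRED_VOLUME" >/dev/null 2>&1; then
    echo "stored"
  else
    echo "absent"
  fi
}

# Remove the stored credentials. Docker refuses to remove a volume that is
# still attached to a container, so remove the sandboxes using it first.
forget_credentials() {
  docker volume rm "$CRED_VOLUME"
}
```

After forget_credentials, expect to authenticate again the next time you start a sandbox.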

One Sandbox Per Workspace

Rental car: Different car every time. After a week, you're in a parking garage thinking "Was it a silver Toyota on level 3? Or the white Honda on level 5?" You have three key fobs in your pocket and none of them work.

Personal car: One car, one garage, one address. Your car is at home. Always. You never wonder where it is.

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   RENTAL CAR CHAOS                  PERSONAL CAR CLARITY        │
│                                                                 │
│   "Which rental was mine again?"    "My car is in my garage"    │
│                                                                 │
│   ┌─────────────────────────┐       ┌─────────────────────┐     │
│   │  PARKING GARAGE         │       │  YOUR HOME          │     │
│   │                         │       │                     │     │
│   │  🚗 Level 2, Spot 34?   │       │  ┌───────────────┐  │     │
│   │  🚙 Level 3, Spot 12?   │       │  │   🚙 YOUR     │  │     │
│   │  🚕 Level 5, Spot 8?    │       │  │      CAR      │  │     │
│   │                         │       │  └───────────────┘  │     │
│   │  🤔 🔑🔑🔑 ???           │       │                     │     │
│   │                         │       │  ✓ 123 Main St      │     │
│   └─────────────────────────┘       └─────────────────────┘     │
│                                                                 │
│   "Try all the key fobs"           "I know exactly where it is" │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Regular docker run: Creates a new container every time. Easy to end up with dozens of orphaned containers.

$ docker run -it node bash    # Creates container #1
$ docker run -it node bash    # Creates container #2  
$ docker run -it node bash    # Creates container #3

$ docker ps -a
CONTAINER ID   IMAGE   NAMES
a1b2c3d4e5f6   node    eager_tesla
b2c3d4e5f6a1   node    angry_curie
c3d4e5f6a1b2   node    zen_hopper
d4e5f6a1b2c3   node    modest_darwin
...47 more...

🤔 "Which one had my packages? Let me check each one..."

Docker Sandbox: One sandbox per directory. Docker tracks it for you.

$ cd ~/project-a
$ docker sandbox run claude     # Creates sandbox for project-a
$ docker sandbox run claude     # Reuses same sandbox
$ docker sandbox run claude     # Reuses same sandbox

$ cd ~/project-b
$ docker sandbox run claude     # Creates sandbox for project-b

$ docker sandbox ls             # illustrative output, columns simplified
ID        WORKSPACE           STATUS     AGENT
sb-a1b2   ~/project-a         running    claude
sb-c3d4   ~/project-b         running    claude

✓ Clear 1:1 mapping
✓ No duplicates
✓ Always know which is which

Future: MicroVM Isolation (Roadmap)

Current (containers): All cars share the same road. If one car spills oil, others might slip on it. There are lane dividers, but it's still one shared surface.

Future (microVMs): Each car gets its own private tunnel. What happens in your tunnel stays in your tunnel. Complete physical separation.

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   SHARED ROAD (containers)          PRIVATE TUNNELS (microVMs)  │
│                                                                 │
│   ┌─────────────────────────┐       ┌─────────────────────────┐ │
│   │  ═══════════════════    │       │  ┌─────────────────┐    │ │
│   │    🚗   │   🚙   │   🚕 │       │  │ 🚗 Tunnel A     │    │ │
│   │  ═══════════════════    │       │  └─────────────────┘    │ │
│   │                         │       │  ┌─────────────────┐    │ │
│   │  Same surface           │       │  │ 🚙 Tunnel B     │    │ │
│   │  Lane dividers only     │       │  └─────────────────┘    │ │
│   │                         │       │  ┌─────────────────┐    │ │
│   │  🛢️  Oil spill affects   │       │  │ 🚕 Tunnel C     │    │ │
│   │     everyone            │       │  └─────────────────┘    │ │
│   │                         │       │                         │ │
│   └─────────────────────────┘       │  Physical separation    │ │
│                                     └─────────────────────────┘ │
│                                                                 │
│   Isolation: painted lines          Isolation: concrete walls   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Current: Sandboxes run as containers inside Docker Desktop's VM. They share a kernel.

┌───────────────────────────────────────────────────────────┐
│  Docker Desktop VM                                        │
│                                                           │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐              │
│  │ Sandbox A │  │ Sandbox B │  │ Container │              │
│  │ (claude)  │  │ (gemini)  │  │           │              │
│  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘              │
│        └──────────────┼──────────────┘                    │
│                       │                                   │
│                SHARED KERNEL                              │
│                                                           │
└───────────────────────────────────────────────────────────┘

Isolation: namespaces + cgroups (software boundaries)

Planned: Each sandbox gets its own microVM with its own kernel.

┌───────────────────────────────────────────────────────────┐
│  Docker Desktop                                           │
│                                                           │
│  ┌─────────────────────┐    ┌─────────────────────┐       │
│  │  MicroVM A          │    │  MicroVM B          │       │
│  │  ┌───────────────┐  │    │  ┌───────────────┐  │       │
│  │  │  Sandbox A    │  │    │  │  Sandbox B    │  │       │
│  │  │  (claude)     │  │    │  │  (gemini)     │  │       │
│  │  └───────────────┘  │    │  └───────────────┘  │       │
│  │    OWN KERNEL       │    │    OWN KERNEL       │       │
│  └─────────────────────┘    └─────────────────────┘       │
│                                                           │
└───────────────────────────────────────────────────────────┘

Isolation: hardware-level (separate virtual machines)

✓ Stronger security boundary
✓ Kernel-level separation between agents
✓ Safer for running Docker inside sandbox

Let's summarise the analogy

Feature         Rental Car (docker run)              Personal Car (sandbox)
--------------  -----------------------------------  ------------------------------
State           Trunk emptied every return           Your stuff stays in the trunk
Navigation      Generic addresses only               Knows "Home," "Work," "Mom's"
Tolls           Unknown driver, manual billing       Auto-linked to your account
Garage opener   Risk leaving it for next renter      Secure in your car + garage
Finding it      Which car? Which lot? Which level?   Your garage, your address
Road safety     Shared road, lane dividers           Private tunnel (coming soon)

Getting Started in 60 Seconds

Prerequisites

  • Docker Desktop 4.50 or later
  • A Claude Code subscription (or other supported AI agent)
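If you want to script the version check, here is a minimal sketch. It assumes the server platform string looks like "Docker Desktop 4.50.0 (123456)", which is what I see from docker version on Docker Desktop; treat that format, and both function names, as assumptions.

```shell
# Extract "4.50.0" from a string such as "Docker Desktop 4.50.0 (123456)".
desktop_version() {
  printf '%s\n' "$1" | sed -n 's/^Docker Desktop \([0-9][0-9.]*\).*/\1/p'
}

# Succeed when the major.minor of $1 is at least the major.minor of $2.
version_at_least() {
  local have_major have_minor want_major want_minor rest
  [ -n "$1" ] || return 1
  have_major="${1%%.*}"; rest="${1#*.}"; have_minor="${rest%%.*}"
  want_major="${2%%.*}"; rest="${2#*.}"; want_minor="${rest%%.*}"
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Usage:
#   platform="$(docker version --format '{{.Server.Platform.Name}}')"
#   version_at_least "$(desktop_version "$platform")" 4.50 && echo "OK"
```

The comparison is numeric per component, so 4.9 correctly counts as older than 4.50.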

Run Your First Sandboxed Agent

# Navigate to your project
cd ~/sandbox-testing

# Start the sandbox
docker sandbox run claude

That's it!

On first run, Claude prompts you to enter your Anthropic API key. The credentials are stored in a persistent Docker volume named docker-claude-sandbox-data. All future Claude sandboxes automatically use these stored credentials, and they persist across sandbox restarts and deletion.

What Just Happened?

The docker sandbox run command automated several key steps:

  1. Container Creation: Created from docker/sandbox-templates:claude-code
  2. Workspace Mounting: Your current directory mounted at the exact same path
  3. Git Configuration: Your host's Git user.name and user.email injected automatically
  4. Persistent Credentials: API key stored in docker-claude-sandbox-data volume

Listing the Sandboxes

docker sandbox ls
SANDBOX ID     TEMPLATE                               NAME                               WORKSPACE                            STATUS    CREATED
275d94b417bf   docker/sandbox-templates:claude-code   claude-sandbox-2026-01-11-004116   /Users/ajeetsraina/sandbox-testing   running   2026-01-10 19:12:10
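Since each workspace maps to one sandbox, you can look up the sandbox ID for a directory by scanning that listing. This is a sketch that parses the text output shown above; the column layout may change in future releases, and it will not work for workspace paths containing spaces. The function name is hypothetical.

```shell
# Read `docker sandbox ls` output on stdin and print the ID of the sandbox
# whose WORKSPACE column equals the path given as $1.
sandbox_id_for_workspace() {
  awk -v ws="$1" '
    NR > 1 {                                  # skip the header row
      for (i = 2; i <= NF; i++)
        if ($i == ws) { print $1; exit }
    }'
}

# Usage:
#   docker sandbox ls | sandbox_id_for_workspace "$PWD"
```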

Under the Hood: How It Works

The Anatomy of a Sandbox

The docker/sandbox-templates:claude-code image includes Claude Code with automatic credential management, plus development tools (Docker CLI, GitHub CLI, Node.js, Go, Python 3, Git, ripgrep, jq). It runs as a non-root agent user with sudo access and launches Claude with --dangerously-skip-permissions by default.

┌─────────────────────────────────────────┐
│         Host Machine (Protected)        │
│                                         │
│  ┌────────────────────────────────────┐ │
│  │   Sandbox Container (Isolated)     │ │
│  │                                    │ │
│  │  ┌──────────────────────────────┐  │ │
│  │  │   AI Agent (Claude Code)     │  │ │
│  │  └──────────────────────────────┘  │ │
│  │            ↕ Mounted               │ │
│  │  ┌──────────────────────────────┐  │ │
│  │  │   Project Workspace          │←─┼─┼─┐
│  │  │   /Users/dev/project         │  │ │ │
│  │  └──────────────────────────────┘  │ │ │
│  └────────────────────────────────────┘ │ │
│                                         │ │
│  ┌────────────────────────────────────┐ │ │
│  │   Your Actual Project Files        │ │ │
│  │   /Users/dev/project               │◄┘ │
│  └────────────────────────────────────┘   │
└───────────────────────────────────────────┘

One Sandbox Per Workspace

Docker enforces one sandbox per workspace. Running docker sandbox run again in the same directory reuses the existing container. This means:

  • Installed packages persist across sessions
  • Environment changes are maintained
  • Temporary files remain between runs

Important: To modify a sandbox's configuration, you must remove and recreate it.

Recreating Sandboxes

Since Docker enforces one sandbox per workspace, the same sandbox is reused each time you run docker sandbox run in a given directory. To create a fresh sandbox, you need to remove the existing one first:

docker sandbox ls  # Find the sandbox ID
docker sandbox rm <sandbox-id>
docker sandbox run <agent>  # Creates a new sandbox
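If you find yourself doing this often, the remove-then-run sequence fits in a small wrapper. This is just a convenience sketch around the two commands above; recreate_sandbox is a name I made up, and any extra arguments are passed through to docker sandbox run ahead of the agent name.

```shell
# Remove an existing sandbox and start a fresh one.
#   $1 = sandbox ID, $2 = agent, remaining args = flags for `docker sandbox run`
recreate_sandbox() {
  local id agent
  if [ "$#" -lt 2 ]; then
    echo "usage: recreate_sandbox <sandbox-id> <agent> [run flags...]" >&2
    return 2
  fi
  id="$1"; agent="$2"; shift 2
  docker sandbox rm "$id" || return 1       # stop here if removal fails
  docker sandbox run "$@" "$agent"
}

# Usage:
#   recreate_sandbox 275d94b417bf claude -e DEBUG=true
```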

When to Recreate a Sandbox

Sandboxes remember their initial configuration and don't pick up changes from subsequent docker sandbox run commands. You must recreate the sandbox to modify:

  • Environment variables (the -e flag)
  • Volume mounts (the -v flag)
  • Docker socket access (the --mount-docker-socket flag)
  • Credentials mode (the --credentials flag)

Advanced Configuration

Managing Your Sandboxes

# Inspect a sandbox's configuration (JSON output)
docker sandbox inspect 275d94b417bf
[
  {
    "id": "275d94b417bf8f4c29f6f3c7317f20f6b9636b3f3121d303149a066d8330428e",
    "name": "claude-sandbox-2026-01-11-004116",
    "workspace": "/Users/ajeetsraina/sandbox-testing",
    "created_at": "2026-01-10T19:12:10.888151834Z",
    "status": "running",
    "template": "docker/sandbox-templates:claude-code",
    "labels": {
      "com.docker.sandbox.agent": "claude",
      "com.docker.sandbox.credentials": "sandbox",
      "com.docker.sandbox.workingDirectory": "/Users/ajeetsraina/sandbox-testing",
      "com.docker.sandbox.workingDirectoryInode": "186434127",
      "com.docker.sandboxes": "templates",
      "com.docker.sandboxes.base": "ubuntu:questing",
      "com.docker.sandboxes.flavor": "claude-code",
      "com.docker.sdk": "true",
      "com.docker.sdk.client": "0.1.0-alpha011",
      "com.docker.sdk.container": "0.1.0-alpha012",
      "com.docker.sdk.lang": "go",
      "docker/sandbox": "true",
      "org.opencontainers.image.ref.name": "ubuntu",
      "org.opencontainers.image.version": "25.10"
    }
  }
]

The output includes the sandbox's ID, name, workspace path, creation time, status, template, and labels (such as the agent and the credentials mode).
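To pull a single top-level value out of that JSON in a script, a proper JSON tool such as jq is the better choice if you have it installed. As a dependency-free sketch, the helper below relies on the pretty-printed layout shown above (one "key": "value" pair per line), which is an assumption about the output format. The function name is hypothetical.

```shell
# Read pretty-printed inspect JSON on stdin and print the first string value
# found for the key given as $1. Plain text matching, not a real JSON parser.
inspect_field() {
  sed -n "s|^ *\"$1\": \"\(.*\)\",\{0,1\}\$|\1|p" | head -n 1
}

# Usage:
#   docker sandbox inspect 275d94b417bf | inspect_field workspace
```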


# Remove a specific sandbox
docker sandbox rm <sandbox-id>

# Pro Tip: Remove all sandboxes at once
docker sandbox rm $(docker sandbox ls -q)

Environment Variables

Use the -e flag to pass environment variables directly into the sandbox.

Example: Full Development Environment Setup

# Note: inside the sandbox, localhost refers to the container itself, not your host
docker sandbox run \
  -e NODE_ENV=development \
  -e DATABASE_URL=postgresql://localhost/myapp_dev \
  -e DEBUG=true \
  claude

Example: API Keys for Testing

docker sandbox run -e STRIPE_TEST_KEY=sk_test_xxx claude

⚠️ Caution: Only use test or development API keys in sandboxes. Never expose production keys.
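One way to enforce that caution is a guard that refuses keys that look like live credentials. The sketch below relies on Stripe's key prefixes (sk_live_ and rk_live_ for live secret and restricted keys); other providers use different conventions, so treat the pattern list as an example. Both function names are hypothetical.

```shell
# Succeed if the value looks like a live (production) Stripe key.
looks_like_live_key() {
  case "$1" in
    sk_live_*|rk_live_*) return 0 ;;
    *)                   return 1 ;;
  esac
}

# Start a Claude sandbox with a Stripe test key, refusing live keys.
run_with_stripe_key() {
  if looks_like_live_key "$1"; then
    echo "Refusing to pass a live key into the sandbox." >&2
    return 1
  fi
  docker sandbox run -e STRIPE_TEST_KEY="$1" claude
}

# Usage:
#   run_with_stripe_key sk_test_xxx
```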

Volume Mounts

Use the -v flag to mount host directories into the sandbox. Syntax: host-path:container-path[:ro]

Example: Machine Learning Workflow

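# The sandbox runs as the non-root agent user, so the pip cache is mounted into
# that user's home (assumed here to be /home/agent) rather than /root
docker sandbox run \
  -v ~/datasets:/data:ro \
  -v ~/models:/models \
  -v ~/.cache/pip:/home/agent/.cache/pip \
  claude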

This provides:

  • Read-only access to datasets (prevents accidental modifications)
  • Read-write access to save trained models
  • Persistent pip cache for faster package installs
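A typo in a host path is easy to miss, so I like to build the -v argument with a small helper that checks the directory exists first. This is a sketch using the host-path:container-path[:ro] syntax described above; mount_spec is a hypothetical name.

```shell
# Print a mount spec for `-v`, failing if the host directory is missing.
#   $1 = host directory, $2 = path inside the sandbox, $3 = optional mode (ro)
mount_spec() {
  local host target mode abs
  host="$1"; target="$2"; mode="${3:-}"
  if [ ! -d "$host" ]; then
    echo "host directory not found: $host" >&2
    return 1
  fi
  # Resolve to an absolute path so the result does not depend on where you run it.
  abs="$(cd "$host" && pwd)"
  if [ -n "$mode" ]; then
    printf '%s:%s:%s\n' "$abs" "$target" "$mode"
  else
    printf '%s:%s\n' "$abs" "$target"
  fi
}

# Usage:
#   docker sandbox run -v "$(mount_spec ~/datasets /data ro)" claude
```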

Custom Templates

Instead of installing tools every time, build a custom Docker image with everything pre-installed.

Step 1: Create a Dockerfile

# syntax=docker/dockerfile:1
FROM docker/sandbox-templates:claude-code

# Install the 'ruff' linter using 'uv'
RUN curl -LsSf https://astral.sh/uv/install.sh | sh && \
    . ~/.local/bin/env && \
    uv tool install ruff@latest

Step 2: Build and Run

# Build your custom template image
docker build -t my-python-env .

# Run the agent using your new template
docker sandbox run --template my-python-env claude

Security Considerations

Docker Socket Access (Use With Extreme Caution)

The --mount-docker-socket flag gives the agent full access to your Docker daemon.

docker sandbox run --mount-docker-socket claude

⚠️ SECURITY WARNING

Mounting the Docker socket grants the agent root-level privileges on your system.

  • Can start/stop any container
  • Access volumes and networks
  • Potentially escape the sandbox

Only use this option when you fully trust the code the agent is working with.

When It's Useful

  • Building images from a Dockerfile
  • Running multi-container applications with Docker Compose
  • Testing and validating containerized applications
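Because this flag is easy to leave in a shell alias and forget about, one option is to require an explicit opt-in each time. In this sketch, ALLOW_DOCKER_SOCKET is a hypothetical environment variable of my own, not something Docker reads, and sandbox_with_socket is a made-up wrapper name.

```shell
# Run a sandbox with Docker socket access only after an explicit opt-in.
# "$@" is passed through to `docker sandbox run`, for example: claude
sandbox_with_socket() {
  if [ "${ALLOW_DOCKER_SOCKET:-0}" != "1" ]; then
    echo "Set ALLOW_DOCKER_SOCKET=1 to confirm you trust this workspace." >&2
    return 1
  fi
  docker sandbox run --mount-docker-socket "$@"
}

# Usage:
#   ALLOW_DOCKER_SOCKET=1 sandbox_with_socket claude
```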

Authentication Strategies

--credentials=sandbox (Default)

Securely stores your API key in a managed Docker volume for reuse across sandboxes.

docker sandbox run claude  # Uses sandbox mode by default

--credentials=none

No automatic credential management. You must authenticate manually inside the container for each new sandbox.

docker sandbox run --credentials=none claude

Best Practices

Based on research from Martin Fowler's team and NVIDIA's AI security guidelines:

  1. Apply least privilege: start with read-only access for AI agents
  2. Never store production credentials in files accessible to agents
  3. Use temporary tokens with limited scopes
  4. Review all AI-generated code before committing
  5. Limit Docker socket access to trusted workflows only
  6. Monitor resource usage to detect anomalies

The Future of AI Agent Security

Docker Sandboxes represents a critical step forward in making AI agents both powerful and safe. As recent vulnerabilities in tools like OpenAI Codex CLI (CVE-2025-61260) demonstrate, the security of AI coding assistants is an evolving challenge.


Conclusion

Docker Sandboxes addresses the fundamental tension between AI agent autonomy and system security. By combining container-based isolation with a low-friction development experience, it lets developers use AI coding assistants with far less risk to their machines.

The three principles that make it work:

  1. Security through isolation - Containers protect your host
  2. Familiarity through path mounting - Same paths, same workflows
  3. Power through customization - Adapt to any use case

As AI agents become more sophisticated and autonomous, proper sandboxing isn't optional—it's essential. Docker Sandboxes makes it practical.

References

  1. Docker Sandboxes Official Documentation
  2. How Code Execution Drives Key Risks in Agentic AI Systems - NVIDIA
  3. AI Agents Under Threat: A Survey - ACM Computing Surveys
  4. Agentic AI and Security - Martin Fowler
  5. Security of AI Agents - arXiv
  6. The Hidden Security Risks of SWE Agents - Pillar Security