Docker Sandboxes Tutorial and Cheatsheet

Docker Sandboxes lets AI coding agents like Claude Code run safely in isolated containers. Get full autonomy without compromising your localhost security. Requires Docker Desktop 4.50+.

AI Coding Agents and Docker Sandboxes

I've been running Claude Code for a few months now, and honestly? It's brilliant. But every time it runs npm install or modifies files outside my project, there's that moment of "wait, what did you just do?" Last week, I saw a Reddit thread where someone's home directory got wiped by an AI agent. That's the nightmare scenario.

AI coding agents have become incredibly powerful. Tools like Claude Code, GitHub Copilot, and Devin AI can write code, debug issues, and even manage entire development workflows. But there's a critical problem: running them locally introduces significant risks.

Let's look at the risks.

  1. Environment Pollution

AI agents can install packages and dependencies globally, creating conflicts with other projects. Imagine your agent installing a different version of Python or Node.js that breaks your existing applications.

# Agent installs globally
npm install -g some-package@beta

# Now your other projects using stable versions are broken

  2. Unintended File System Changes

An agent could mistakenly modify, move, or delete critical files outside the project workspace. One wrong command and your ~/.ssh keys, environment configs, or system files could be compromised.

Recent research from NVIDIA's AI Red Team (CVE-2024-12366) demonstrated how AI-generated code can escalate into remote code execution (RCE) when executed without proper isolation.

  3. Security Vulnerabilities

Giving an agent unrestricted network and file access could expose sensitive data or create security holes. According to a comprehensive survey by ACM Computing Surveys, insufficient isolation between agents and the host system poses one of the most significant security challenges in agentic AI systems.

The uncomfortable truth: Most LLM tools have full access to your machine by default, with only imperfect attempts at blocking risky behavior.

Docker Sandboxes fixes this

Docker Sandboxes solves these problems by isolating AI agents from your local machine while preserving a familiar development experience.

Docker Sandboxes is an experimental feature in Docker Desktop 4.50+ that lets AI coding agents like Claude Code run safely in isolated containers while maintaining a seamless development experience. Your project directory is mounted at the same path, Git credentials are configured automatically, and your localhost stays protected. Your agent gets a container that looks exactly like your local environment (same paths, same Git config), but it can't touch anything outside the project folder. Let me show you how it works.

What actually happens when you run it

First, make sure you have Docker Desktop 4.50 or later. Then:

cd ~/my-project
docker sandbox run claude

That's it. First time, it'll ask you to authenticate with Claude. After that, credentials get stored in a Docker volume so you don't have to log in again.

Here's what Docker does behind the scenes:

  1. Spins up a container from docker/sandbox-templates:claude-code
  2. Mounts your current directory at the exact same path (so /Users/ajeet/my-project on your Mac is also /Users/ajeet/my-project inside the container)
  3. Injects your Git username and email so commits still show your name
  4. Stores the API key in a volume called docker-claude-sandbox-data

The path thing matters more than you'd think. When Claude gives you an error message with a file path, you can copy-paste it directly. No mental translation needed.

How Docker Sandboxes Differ from Regular Containers

You might be thinking: "Can't I just use docker run and mount my project?" Yes, but Docker Sandboxes handles several things automatically that you'd otherwise have to configure manually.

Think of a normal container like a rental car: you get the basic vehicle, but you have to adjust the mirrors, set the GPS, and bring your own charging cables every time you get a new one. A Docker Sandbox is like a dedicated personal car parked in a specific garage (your workspace); it already knows your seat settings, remembers your home address, and has your favorite sunglasses in the glovebox every time you step inside.

State Persistence

💡
Rental car: The trunk gets emptied after every rental. Your umbrella, gym bag, and phone charger? Gone. Next time you rent, you're starting from scratch.

Personal car: Your stuff stays in the trunk. The umbrella you threw in last month is still there when it rains today.

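Regular docker run: Every run starts a fresh container from the image. Installed packages, configs, and temp files from the previous session are not carried over (the old container is left behind stopped, or deleted if you used --rm).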

SESSION 1:
$ docker run -it node:20 bash
$ npm install express mongoose dotenv    # 200+ packages installed
$ exit

SESSION 2:
$ docker run -it node:20 bash
$ npm list
└── (empty)                              # 😭 Everything gone, start over

Docker Sandbox: State persists automatically per workspace.

SESSION 1:
$ docker sandbox run claude
> npm install express mongoose dotenv    # 200+ packages installed
> exit

SESSION 2:
$ docker sandbox run claude              # Same directory = same sandbox
> npm list
├── express@4.18.2                       # ✓ Still here!
├── mongoose@8.0.0                       # ✓ Still here!
└── dotenv@16.3.1                        # ✓ Still here!

Unlike a regular docker run, which starts from a fresh container every time, sandboxes persist. Run docker sandbox run claude in the same directory tomorrow, and you get the same container with all the packages Claude installed yesterday still there.

This is intentional. You want continuity: if Claude spent 10 minutes setting up your Python environment, you don't want to repeat that every session.
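
For comparison, here is a rough sketch of how you might approximate per-directory persistence with plain docker: derive a stable container name from the directory and restart that container if it already exists. The `ws-<checksum>` naming scheme and both helper functions are hypothetical. This is not a description of how Docker Sandboxes tracks workspaces internally.

```shell
# Hypothetical helper: map a directory path to a stable container name.
workspace_container_name() {
  local dir="${1:-$PWD}"
  local crc
  # cksum prints "<crc> <byte-count>"; keep only the CRC.
  crc=$(printf '%s' "$dir" | cksum | cut -d ' ' -f 1)
  printf 'ws-%s\n' "$crc"
}

# Reuse the container for the current directory if it exists,
# otherwise create it. Requires a running Docker daemon.
run_workspace_container() {
  local name
  name=$(workspace_container_name "$PWD")
  if docker container inspect "$name" >/dev/null 2>&1; then
    docker start -ai "$name"      # same container, packages still installed
  else
    docker run -it --name "$name" node:20 bash
  fi
}
```

Calling `run_workspace_container` twice from the same directory would attach to the same container the second time, which is the behavior the sandbox gives you without any of this bookkeeping.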

Path Matching

Rental car GPS: Addresses are generic. "Navigate to 123 Main St" works, but your mental shortcuts don't. When your friend texts "meet me at the usual spot," you can't just tap a button; you have to translate.

Personal car GPS: It knows "Home," "Work," "Mom's house," and "the usual spot." Same names you use in real life.

Regular docker run: You mount to an arbitrary path like /workspace. Error messages reference container paths that don't match your host.

HOST:       /Users/ajeet/projects/myapp/src/index.js
CONTAINER:  /workspace/src/index.js

ERROR: "Cannot find module '/workspace/src/utils.js'"

🤔 "Where is /workspace? Oh right, that's /Users/ajeet/projects/myapp..."

Docker Sandbox: Mounts at the exact same absolute path.

HOST:       /Users/ajeet/projects/myapp/src/index.js
SANDBOX:    /Users/ajeet/projects/myapp/src/index.js   ← SAME!

ERROR: "Cannot find module '/Users/ajeet/projects/myapp/src/utils.js'"

✓ Copy-paste the path directly. No mental translation.
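
If you wanted the same effect with a plain container, the trick is to use the host path on both sides of the `-v` mount and as the working directory. The sketch below only prints the command so you can inspect it first. The helper name `same_path_run_cmd` and the default `node:20` image are illustrative assumptions.

```shell
# Print a docker run command that mounts a directory at its own absolute path.
same_path_run_cmd() {
  local dir="${1:-$PWD}"
  local image="${2:-node:20}"
  printf 'docker run --rm -it -v "%s:%s" -w "%s" %s bash\n' \
    "$dir" "$dir" "$dir" "$image"
}

# Example:
#   same_path_run_cmd /Users/ajeet/projects/myapp
```

With this mount, an error that mentions `/Users/ajeet/projects/myapp/src/utils.js` refers to the same file on both sides.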

Git Configuration Injection

Rental car: The toll transponder isn't linked to your account. You drive through the toll booth and get a bill addressed to "UNKNOWN DRIVER" or the rental company charges you a $50 admin fee to figure out who was driving.

Personal car: Toll transponder is linked to your account. Charges automatically go to the right person, correctly attributed, no questions asked.

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   RENTAL CAR TOLLS                  PERSONAL CAR TOLLS          │
│                                                                 │
│   ┌───────────────────┐             ┌───────────────────────┐   │
│   │                   │             │                       │   │
│   │   TOLL INVOICE    │             │   TOLL INVOICE        │   │
│   │                   │             │                       │   │
│   │   Driver: ???     │             │   Driver: Ajeet Raina │   │
│   │   Vehicle: ???    │             │   Account: *****1234  │   │
│   │                   │             │                       │   │
│   │   ⚠️  UNIDENTIFIED│             │   ✓ AUTO-CHARGED      │   │
│   │                   │             │                       │   │
│   └───────────────────┘             └───────────────────────┘   │
│                                                                 │
│   "Please call to verify            "Thanks for using FastTag"  │
│    your identity"                                               │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Regular docker run: Git doesn't know who you are. Commits show up as root@container-id or Git refuses to commit entirely.

$ docker run -it -v $(pwd):/workspace node bash
$ git commit -m "Fix bug"

⚠️  Author identity unknown
Please tell me who you are:
  git config --global user.email "you@example.com"
  git config --global user.name "Your Name"

Docker Sandbox: Automatically reads your host Git config and injects it.

$ docker sandbox run claude
> git commit -m "Fix bug"
[main abc1234] Fix bug
 Author: Ajeet Raina <ajeet@docker.com>   ← ✓ Correct attribution!
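
To get correct attribution in a plain container you would have to carry the identity across yourself. One way, sketched below, is to read the host's Git config and pass it through Git's standard `GIT_AUTHOR_*` and `GIT_COMMITTER_*` environment variables. The function name and the default image are assumptions, and this is not necessarily the mechanism Docker Sandboxes uses.

```shell
# Start a container whose commits carry the host's Git identity.
# Requires git on the host and a running Docker daemon.
run_with_git_identity() {
  local image="${1:-node:20}"
  local name email
  name=$(git config --get user.name) || return 1
  email=$(git config --get user.email) || return 1
  docker run -it \
    -e GIT_AUTHOR_NAME="$name" -e GIT_AUTHOR_EMAIL="$email" \
    -e GIT_COMMITTER_NAME="$name" -e GIT_COMMITTER_EMAIL="$email" \
    -v "$PWD:$PWD" -w "$PWD" "$image" bash
}
```

The function returns early if either value is missing on the host, so you never start a container that would commit under an unknown identity.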

Credential Storage

Rental car: You leave your garage door opener in the cupholder. When you return the car, either (a) you forget it and the next renter gets access to your garage, or (b) you have to remember to take it out every single time.

Personal car: Garage door opener lives in your car permanently. It's secure because the car is in your locked garage. You never have to think about it.

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   RENTAL CAR                        PERSONAL CAR                │
│                                                                 │
│   ┌───────────────────┐             ┌───────────────────────┐   │
│   │    Cupholder      │             │    Visor clip         │   │
│   │                   │             │                       │   │
│   │  🔘 Garage opener │             │  🔘 Garage opener     │   │
│   │                   │             │                       │   │
│   │  ⚠️  Don't forget │             │  ✓ Always here        │   │
│   │     to remove!    │             │  ✓ Car is in garage   │   │
│   │                   │             │  ✓ Both are secure    │   │
│   └───────────────────┘             └───────────────────────┘   │
│                                                                 │
│   Risk: Next renter finds          Secure: Only you have        │
│   your garage opener               access to car + garage       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Regular docker run: Credentials end up in your project directory or home folder. Risk of committing to git, exposed to other apps.

~/my-project/
├── src/
├── .env                    ← API keys here? 😬
└── .claude_credentials     ← Might accidentally commit!

~/.config/claude/
└── credentials.json        ← Readable by any process on host

Docker Sandbox: Credentials stored in an isolated Docker volume, separate from your filesystem.

~/my-project/
├── src/
└── (no credentials here!)

DOCKER VOLUME: docker-claude-sandbox-data
└── credentials.json        ← Isolated, managed by Docker

✓ Can't accidentally commit to git
✓ Not exposed to other apps on host
✓ Persists across sandbox rebuilds
✓ Easy cleanup: docker volume rm ...
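
If you have been running agents outside a sandbox, it is worth checking whether credential files are already sitting in your project. A small sketch, assuming the two file names from the example tree above (`.env` and `.claude_credentials`); the function name and the two-level search depth are arbitrary choices you can adjust.

```shell
# List files in a project that commonly hold secrets.
# File names are taken from the example tree above; extend the list as needed.
find_secret_files() {
  local dir="${1:-.}"
  find "$dir" -maxdepth 2 -type f \
    \( -name '.env' -o -name '.claude_credentials' \) | sort
}

# Example:
#   find_secret_files ~/my-project
```

An empty result means none of the listed files were found at the top two levels of the project.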

One Sandbox Per Workspace

Rental car: Different car every time. After a week, you're in a parking garage thinking "Was it a silver Toyota on level 3? Or the white Honda on level 5?" You have three key fobs in your pocket and none of them work.

Personal car: One car, one garage, one address. Your car is at home. Always. You never wonder where it is.

Regular docker run: Creates a new container every time. Easy to end up with dozens of orphaned containers.

$ docker run -it node bash    # Creates container #1
$ docker run -it node bash    # Creates container #2  
$ docker run -it node bash    # Creates container #3

$ docker ps -a
CONTAINER ID   IMAGE   NAMES
a1b2c3d4e5f6   node    eager_tesla
b2c3d4e5f6a1   node    angry_curie
c3d4e5f6a1b2   node    zen_hopper
d4e5f6a1b2c3   node    modest_darwin
...47 more...

🤔 "Which one had my packages? Let me check each one..."

Docker Sandbox: One sandbox per directory. Docker tracks it for you.

$ cd ~/project-a
$ docker sandbox run claude     # Creates sandbox for project-a
$ docker sandbox run claude     # Reuses same sandbox
$ docker sandbox run claude     # Reuses same sandbox

$ cd ~/project-b
$ docker sandbox run claude     # Creates sandbox for project-b

$ docker sandbox ls
ID        WORKSPACE           STATUS     AGENT
sb-a1b2   ~/project-a         running    claude
sb-c3d4   ~/project-b         running    claude

✓ Clear 1:1 mapping
✓ No duplicates
✓ Always know which is which

Future: MicroVM Isolation (Roadmap)

Current (containers): All cars share the same road. If one car spills oil, others might slip on it. There are lane dividers, but it's still one shared surface.

Future (microVMs): Each car gets its own private tunnel. What happens in your tunnel stays in your tunnel. Complete physical separation.

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   SHARED ROAD (containers)          PRIVATE TUNNELS (microVMs)  │
│                                                                 │
│   ┌─────────────────────────┐       ┌─────────────────────────┐ │
│   │  ═══════════════════    │       │  ┌─────────────────┐    │ │
│   │    🚗   │   🚙   │   🚕 │       │  │ 🚗 Tunnel A     │    │ │
│   │  ═══════════════════    │       │  └─────────────────┘    │ │
│   │                         │       │  ┌─────────────────┐    │ │
│   │  Same surface           │       │  │ 🚙 Tunnel B     │    │ │
│   │  Lane dividers only     │       │  └─────────────────┘    │ │
│   │                         │       │  ┌─────────────────┐    │ │
│   │  🛢️  Oil spill affects  │       │  │ 🚕 Tunnel C     │    │ │
│   │     everyone            │       │  └─────────────────┘    │ │
│   │                         │       │                         │ │
│   └─────────────────────────┘       │  Physical separation    │ │
│                                     └─────────────────────────┘ │
│                                                                 │
│   Isolation: painted lines          Isolation: concrete walls   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Current: Sandboxes run as containers inside Docker Desktop's VM. They share a kernel.

┌───────────────────────────────────────────────────────────┐
│  Docker Desktop VM                                        │
│                                                           │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐              │
│  │ Sandbox A │  │ Sandbox B │  │ Container │              │
│  │ (claude)  │  │ (gemini)  │  │           │              │
│  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘              │
│        └──────────────┼──────────────┘                    │
│                       │                                   │
│                SHARED KERNEL                              │
│                                                           │
└───────────────────────────────────────────────────────────┘

Isolation: namespaces + cgroups (software boundaries)

Planned: Each sandbox gets its own microVM with its own kernel.

┌───────────────────────────────────────────────────────────┐
│  Docker Desktop                                           │
│                                                           │
│  ┌─────────────────────┐    ┌─────────────────────┐       │
│  │  MicroVM A          │    │  MicroVM B          │       │
│  │  ┌───────────────┐  │    │  ┌───────────────┐  │       │
│  │  │  Sandbox A    │  │    │  │  Sandbox B    │  │       │
│  │  │  (claude)     │  │    │  │  (gemini)     │  │       │
│  │  └───────────────┘  │    │  └───────────────┘  │       │
│  │    OWN KERNEL       │    │    OWN KERNEL       │       │
│  └─────────────────────┘    └─────────────────────┘       │
│                                                           │
└───────────────────────────────────────────────────────────┘

Isolation: hardware-level (separate virtual machines)

✓ Stronger security boundary
✓ Kernel-level separation between agents
✓ Safer for running Docker inside sandbox

Let's summarise the analogy

Feature         Rental Car (docker run)              Personal Car (sandbox)
State           Trunk emptied every return           Your stuff stays in the trunk
Navigation      Generic addresses only               Knows "Home," "Work," "Mom's"
Tolls           Unknown driver, manual billing       Auto-linked to your account
Garage opener   Risk leaving it for next renter      Secure in your car + garage
Finding it      Which car? Which lot? Which level?   Your garage, your address
Road safety     Shared road, lane dividers           Private tunnel (coming soon)

Getting Started in 60 Seconds

Prerequisites

  • Docker Desktop 4.50 or later
  • A Claude Code subscription (or other supported AI agent)

Run Your First Sandboxed Agent

# Navigate to your project
cd ~/sandbox-testing

# Start the sandbox
docker sandbox run claude

That's it!

On first run, Claude prompts you to enter your Anthropic API key. The credentials are stored in a persistent Docker volume named docker-claude-sandbox-data. All future Claude sandboxes automatically use these stored credentials, and they persist across sandbox restarts and deletion.

What Just Happened?

The docker sandbox run command automated several key steps:

  1. Container Creation: Created from docker/sandbox-templates:claude-code
  2. Workspace Mounting: Your current directory mounted at the exact same path
  3. Git Configuration: Your host's Git user.name and user.email injected automatically
  4. Persistent Credentials: API key stored in docker-claude-sandbox-data volume

Listing the Sandboxes

docker sandbox ls
SANDBOX ID     TEMPLATE                               NAME                               WORKSPACE                            STATUS    CREATED
275d94b417bf   docker/sandbox-templates:claude-code   claude-sandbox-2026-01-11-004116   /Users/ajeetsraina/sandbox-testing   running   2026-01-10 19:12:10

Under the Hood: How It Works

The Anatomy of a Sandbox

The docker/sandbox-templates:claude-code image includes Claude Code with automatic credential management, plus development tools (Docker CLI, GitHub CLI, Node.js, Go, Python 3, Git, ripgrep, jq). It runs as a non-root agent user with sudo access and launches Claude with --dangerously-skip-permissions by default.

┌─────────────────────────────────────────┐
│         Host Machine (Protected)        │
│                                         │
│  ┌────────────────────────────────────┐ │
│  │   Sandbox Container (Isolated)     │ │
│  │                                    │ │
│  │  ┌──────────────────────────────┐  │ │
│  │  │   AI Agent (Claude Code)     │  │ │
│  │  └──────────────────────────────┘  │ │
│  │            ↕ Mounted               │ │
│  │  ┌──────────────────────────────┐  │ │
│  │  │   Project Workspace          │←─┼─┼─┐
│  │  │   /Users/dev/project         │  │ │ │
│  │  └──────────────────────────────┘  │ │ │
│  └────────────────────────────────────┘ │ │
│                                         │ │
│  ┌────────────────────────────────────┐ │ │
│  │   Your Actual Project Files        │ │ │
│  │   /Users/dev/project               │◄┘ │
│  └────────────────────────────────────┘   │
└───────────────────────────────────────────┘

One Sandbox Per Workspace

Docker enforces one sandbox per workspace. Running docker sandbox run again in the same directory reuses the existing container. This means:

  • Installed packages persist across sessions
  • Environment changes are maintained
  • Temporary files remain between runs

Important: To modify a sandbox's configuration, you must remove and recreate it.

Recreating Sandboxes

Since Docker enforces one sandbox per workspace, the same sandbox is reused each time you run docker sandbox run in a given directory. To create a fresh sandbox, you need to remove the existing one first:

docker sandbox ls  # Find the sandbox ID
docker sandbox rm <sandbox-id>
docker sandbox run <agent>  # Creates a new sandbox
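
Looking up the ID by eye gets tedious, so here is a small sketch that extracts it from the `docker sandbox ls` output for a given workspace path. It assumes the column layout shown in this article (fields separated by two or more spaces, WORKSPACE in the fourth column). Since the feature is experimental, that layout may change. The helper name `sandbox_id_for` is made up for this example.

```shell
# Read `docker sandbox ls` output on stdin and print the ID of the sandbox
# whose WORKSPACE column equals the given path.
# Assumes columns are separated by 2+ spaces, as in the sample output.
sandbox_id_for() {
  awk -F '  +' -v ws="$1" 'NR > 1 && $4 == ws { print $1 }'
}

# Usage (needs Docker Desktop with the sandbox feature enabled):
#   id=$(docker sandbox ls | sandbox_id_for "$PWD")
#   [ -n "$id" ] && docker sandbox rm "$id"
#   docker sandbox run claude
```

Matching on the full workspace path avoids removing the wrong sandbox when several projects have similar names.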

When to recreate Sandboxes?

Sandboxes remember their initial configuration and don't pick up changes from subsequent docker sandbox run commands. You must recreate the sandbox to modify:

  • Environment variables (the -e flag)
  • Volume mounts (the -v flag)
  • Docker socket access (the --mount-docker-socket flag)
  • Credentials mode (the --credentials flag)

Advanced Configuration

Managing Your Sandboxes

  • Inspect a sandbox's configuration (JSON output)
docker sandbox inspect 275d94b417bf
[
  {
    "id": "275d94b417bf8f4c29f6f3c7317f20f6b9636b3f3121d303149a066d8330428e",
    "name": "claude-sandbox-2026-01-11-004116",
    "workspace": "/Users/ajeetsraina/sandbox-testing",
    "created_at": "2026-01-10T19:12:10.888151834Z",
    "status": "running",
    "template": "docker/sandbox-templates:claude-code",
    "labels": {
      "com.docker.sandbox.agent": "claude",
      "com.docker.sandbox.credentials": "sandbox",
      "com.docker.sandbox.workingDirectory": "/Users/ajeetsraina/sandbox-testing",
      "com.docker.sandbox.workingDirectoryInode": "186434127",
      "com.docker.sandboxes": "templates",
      "com.docker.sandboxes.base": "ubuntu:questing",
      "com.docker.sandboxes.flavor": "claude-code",
      "com.docker.sdk": "true",
      "com.docker.sdk.client": "0.1.0-alpha011",
      "com.docker.sdk.container": "0.1.0-alpha012",
      "com.docker.sdk.lang": "go",
      "docker/sandbox": "true",
      "org.opencontainers.image.ref.name": "ubuntu",
      "org.opencontainers.image.version": "25.10"
    }
  }
]

This shows the sandbox's configuration, including its workspace path, template, status, creation time, and labels such as the agent and credentials mode.


# Remove a specific sandbox
docker sandbox rm <sandbox-id>

# Pro Tip: Remove all sandboxes at once
docker sandbox rm $(docker sandbox ls -q)
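
One caveat with the one-liner: when there are no sandboxes, the command substitution expands to nothing and `docker sandbox rm` is called without arguments, which will most likely end in a usage error. A slightly more defensive sketch follows; the function name is hypothetical.

```shell
# Remove every sandbox, but do nothing when the list is empty.
remove_all_sandboxes() {
  local ids
  ids=$(docker sandbox ls -q)
  if [ -z "$ids" ]; then
    echo "No sandboxes to remove."
    return 0
  fi
  # Intentionally unquoted so each ID becomes its own argument.
  docker sandbox rm $ids
}
```

Keep in mind that this removes sandboxes for every workspace, including any packages the agent installed in them.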

Environment Variables

Use the -e flag to pass environment variables directly into the sandbox.

Example: Full Development Environment Setup

docker sandbox run \
  -e NODE_ENV=development \
  -e DATABASE_URL=postgresql://localhost/myapp_dev \
  -e DEBUG=true \
  claude

Example: API Keys for Testing

docker sandbox run -e STRIPE_TEST_KEY=sk_test_xxx claude

⚠️ Caution: Only use test or development API keys in sandboxes. Never expose production keys.
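
A cheap safeguard is to check the key's prefix before handing it to the sandbox. Stripe's live-mode secret and restricted keys start with `sk_live_` and `rk_live_`, so a sketch like the one below can refuse them. The helper name is an assumption, and other providers use different conventions, so treat this as a pattern rather than a complete check.

```shell
# Succeed only if the value does not look like a Stripe live-mode key.
is_test_key() {
  case "$1" in
    sk_live_*|rk_live_*) return 1 ;;
    *) return 0 ;;
  esac
}

# Usage:
#   is_test_key "$STRIPE_TEST_KEY" \
#     && docker sandbox run -e STRIPE_TEST_KEY="$STRIPE_TEST_KEY" claude
```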

Volume Mounts

Use the -v flag to mount host directories into the sandbox. Syntax: host-path:container-path[:ro]

Example: Machine Learning Workflow

docker sandbox run \
  -v ~/datasets:/data:ro \
  -v ~/models:/models \
  -v ~/.cache/pip:/home/agent/.cache/pip \
  claude

This provides:

  • Read-only access to datasets (prevents accidental modifications)
  • Read-write access to save trained models
  • Persistent pip cache for faster package installs
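
Because a sandbox keeps its initial mounts until you recreate it, it pays to confirm the host directories exist before the first run. The sketch below is a simple pre-flight check. The function name is hypothetical, and how `docker sandbox run` itself reacts to a missing host path is not something covered here.

```shell
# Return non-zero and name the first host directory that is missing.
check_mount_sources() {
  local path
  for path in "$@"; do
    if [ ! -d "$path" ]; then
      echo "missing host directory: $path" >&2
      return 1
    fi
  done
}

# Usage:
#   check_mount_sources ~/datasets ~/models ~/.cache/pip \
#     && docker sandbox run -v ~/datasets:/data:ro -v ~/models:/models claude
```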

Custom Templates

Instead of installing tools every time, build a custom Docker image with everything pre-installed.

Step 1: Create a Dockerfile

# syntax=docker/dockerfile:1
FROM docker/sandbox-templates:claude-code

# Install the 'ruff' linter using 'uv'
RUN curl -LsSf https://astral.sh/uv/install.sh | sh && \
    . ~/.local/bin/env && \
    uv tool install ruff@latest

Step 2: Build and Run

# Build your custom template image
docker build -t my-python-env .

# Run the agent using your new template
docker sandbox run --template my-python-env claude

Security Considerations

Docker Socket Access (Use With Extreme Caution)

The --mount-docker-socket flag gives the agent full access to your Docker daemon.

docker sandbox run --mount-docker-socket claude

⚠️ SECURITY WARNING

Mounting the Docker socket grants the agent root-level privileges on your system.

  • Can start/stop any container
  • Access volumes and networks
  • Potentially escape the sandbox

Only use this option when you fully trust the code the agent is working with.

When It's Useful

  • Building images from a Dockerfile
  • Running multi-container applications with Docker Compose
  • Testing and validating containerized applications

Authentication Strategies

--credentials=sandbox (Default)

Securely stores your API key in a managed Docker volume for reuse across sandboxes.

docker sandbox run claude  # Uses sandbox mode by default

--credentials=none

No automatic credential management. You must authenticate manually inside the container for each new sandbox.

docker sandbox run --credentials=none claude

Best Practices

Based on research from Martin Fowler's team and NVIDIA's AI security guidelines:

  1. Least Privilege: Start with read-only access for AI agents
  2. Never store production credentials in files accessible to agents
  3. Use temporary tokens with limited scopes
  4. Review all AI-generated code before committing
  5. Limit Docker socket access to trusted workflows only
  6. Monitor resource usage to detect anomalies

Docker Sandboxes Labs and Tutorials for Beginners - a Step by Step Guide

1. Create a Directory

mkdir -p /Users/ajeetsraina/sandbox-testing
cd /Users/ajeetsraina/sandbox-testing

2. Run the Sandbox

docker sandbox run

docker: 'docker sandbox run' requires at least 1 argument

Usage:  docker sandbox run [options] <agent> [agent-options]

See 'docker sandbox run --help' for more information

Available Agents:
  claude          Run Claude AI agent inside a sandbox
  gemini          Run Gemini AI agent inside a sandbox

The command needs an agent name, so pass claude:

docker sandbox run claude

3. List and Inspect Sandboxes

docker sandbox ls

SANDBOX ID     TEMPLATE                               NAME                               WORKSPACE                            STATUS    CREATED
275d94b417bf   docker/sandbox-templates:claude-code   claude-sandbox-2026-01-11-004116   /Users/ajeetsraina/sandbox-testing   running   2026-01-10 19:12:10

docker sandbox inspect 275d94b417bf

[
  {
    "id": "275d94b417bf8f4c29f6f3c7317f20f6b9636b3f3121d303149a066d8330428e",
    "name": "claude-sandbox-2026-01-11-004116",
    "workspace": "/Users/ajeetsraina/sandbox-testing",
    "created_at": "2026-01-10T19:12:10.888151834Z",
    "status": "running",
    "template": "docker/sandbox-templates:claude-code",
    "labels": {
      "com.docker.sandbox.agent": "claude",
      "com.docker.sandbox.credentials": "sandbox",
      "com.docker.sandbox.workingDirectory": "/Users/ajeetsraina/sandbox-testing",
      "com.docker.sandbox.workingDirectoryInode": "186434127",
      "com.docker.sandboxes": "templates",
      "com.docker.sandboxes.base": "ubuntu:questing",
      "com.docker.sandboxes.flavor": "claude-code",
      "com.docker.sdk": "true",
      "com.docker.sdk.client": "0.1.0-alpha011",
      "com.docker.sdk.container": "0.1.0-alpha012",
      "com.docker.sdk.lang": "go",
      "docker/sandbox": "true",
      "org.opencontainers.image.ref.name": "ubuntu",
      "org.opencontainers.image.version": "25.10"
    }
  }
]
Note: The docker/sandbox-templates:claude-code image includes Claude Code with automatic credential management, plus development tools (Docker CLI, GitHub CLI, Node.js, Go, Python 3, Git, ripgrep, jq). It runs as a non-root agent user with sudo access and launches Claude with --dangerously-skip-permissions by default.

4. Managing Sandboxes

Since Docker enforces one sandbox per workspace, the same sandbox is reused each time you run docker sandbox run <agent> in a given directory. To create a fresh sandbox, you need to remove the existing one first:

docker sandbox ls           # Find the sandbox ID
docker sandbox rm <sandbox-id>
docker sandbox run <agent>  # Creates a new sandbox
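The ID lookup in the first step can be scripted. The sketch below is a minimal helper, assuming the `docker sandbox ls` column layout shown earlier (ID first, workspace fourth) and a workspace path without spaces. The function name `sandbox_id_for_workspace` is made up for illustration and is not part of the Docker CLI.

```shell
# Print the sandbox ID whose WORKSPACE column equals the given path.
# Reads `docker sandbox ls` output on stdin.
# Assumption: columns are ID, TEMPLATE, NAME, WORKSPACE, ... as in the
# listing above, and the workspace path contains no spaces.
sandbox_id_for_workspace() {
  awk -v ws="$1" 'NR > 1 && $4 == ws { print $1 }'
}

# Hypothetical usage on the host (not run here):
#   id=$(docker sandbox ls | sandbox_id_for_workspace "$PWD")
#   [ -n "$id" ] && docker sandbox rm "$id"
#   docker sandbox run claude
```

Matching on the workspace column rather than the name avoids depending on the auto-generated sandbox name.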

Verify the Isolation

Test 1: Check if SSH Directory Exists

ls -la ~/.ssh/

Result:

Bash(ls -la ~/.ssh/)
  ⎿  Error: Exit code 2
     ls: cannot access '/home/agent/.ssh/': No such file or directory

✅ That's the sandbox working!

Notice the path: /home/agent/.ssh/. The sandbox can't see your host's SSH keys at all. They simply don't exist inside the container.


Test 2: Try to Access AWS Credentials

ls -la ~/.aws/

Result:

Bash(ls -la ~/.aws/)
  ⎿  Error: Exit code 2
     ls: cannot access '/home/agent/.aws/': No such file or directory

✅ AWS credentials protected!


Test 3: Try to Access Documents Folder

ls ~/Documents/

Result:

● The ~/Documents/ directory doesn't exist in this environment.

  This is a sandboxed environment with a minimal filesystem. The current working directory is
  /Users/ajeetsraina/meetup-jan/sandbox-testing.
  
  Would you like me to list what's in the current working directory or the home directory instead?

✅ Documents folder blocked!

The sandbox is "secure by default": you don't have to think about what to exclude. It only sees your project folder.
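Tests 1 to 3 can be bundled into one repeatable check. This is a minimal sketch, assuming the three directories tested above are the ones you care about. The function name `check_isolation` is illustrative.

```shell
# Report which sensitive directories are visible under a given home directory.
# Prints "visible: <path>" or "absent: <path>" for each one and returns
# non-zero if any of them exists.
check_isolation() {
  ci_home=$1
  ci_status=0
  for ci_dir in .ssh .aws Documents; do
    if [ -e "$ci_home/$ci_dir" ]; then
      echo "visible: $ci_home/$ci_dir"
      ci_status=1
    else
      echo "absent: $ci_home/$ci_dir"
    fi
  done
  return "$ci_status"
}

# Hypothetical usage inside the sandbox:
#   check_isolation "$HOME" && echo "host secrets are not visible"
```

Run on the host, the same function would normally report the directories as visible, which makes the contrast easy to demonstrate.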


Test 4: Confirm Project Folder Access

ls -la /Users/ajeetsraina/meetup-jan/sandbox-testing/

Result:

Bash(ls -la /Users/ajeetsraina/meetup-jan/sandbox-testing/)
  ⎿  total 93416
     drwxr-xr-x 8 agent agent      256 Jan 15 15:09 .
     drwxr-xr-x 3 root  root      4096 Jan 15 15:52 ..
     … +7 lines (ctrl+o to expand)

✅ It works!

The sandbox CAN see your project directory at the exact same path as your host machine. That's path matching in action.


Test 5: The Nuclear Option

Can it delete your home directory?

rm -rf ~/

Result:

I won't run that command. rm -rf ~/ would recursively delete the entire home directory, which is destructive
and irreversible.

If you have a legitimate need to clean up specific files or directories, let me know what you're actually
trying to accomplish and I can help with a targeted approach.

Claude's safety guardrails kicked in! But even if it ran, see Test 5b for why it wouldn't matter.


Test 5b: Sandbox Home Directory Contents

ls -la ~/

Result:

Bash(ls -la ~/)
  ⎿  total 64
     drwxr-xr-x 1 agent agent 4096 Jan 15 16:23 .
     drwxr-xr-x 1 root  root  4096 Jan 10 00:23 ..
     … +9 lines (ctrl+o to expand)

● The home directory contains:
  ┌──────────────┬──────┬─────────────────────┐
  │     Item     │ Type │     Description     │
  ├──────────────┼──────┼─────────────────────┤
  │ .cache/      │ dir  │ Cache files         │
  ├──────────────┼──────┼─────────────────────┤
  │ .claude/     │ dir  │ Claude Code config  │
  ├──────────────┼──────┼─────────────────────┤
  │ .claude.json │ file │ Claude settings     │
  ├──────────────┼──────┼─────────────────────┤
  │ .docker/     │ dir  │ Docker config       │
  ├──────────────┼──────┼─────────────────────┤
  │ .gitconfig   │ file │ Git configuration   │
  ├──────────────┼──────┼─────────────────────┤
  │ .local/      │ dir  │ Local user data     │
  ├──────────────┼──────┼─────────────────────┤
  │ .npm/        │ dir  │ npm cache           │
  ├──────────────┼──────┼─────────────────────┤
  │ workspace/   │ dir  │ Workspace directory │
  └──────────────┴──────┴─────────────────────┘

🎯 This is the key insight!

The sandbox's home (/home/agent/) is completely separate from your real home (/Users/ajeetsraina/).

| If rm -rf ~/ runs... | Without Sandbox | With Sandbox |
| --- | --- | --- |
| What gets deleted | Your real home: Documents, SSH keys, years of work | Just /home/agent/: cache, npm, empty sandbox stuff |
| Impact | Catastrophic 💀 | Nothing important ✅ |

Test 6: Verify Git Identity Injection

cat ~/.gitconfig

Result:

Read(~/.gitconfig)
  ⎿  Read 3 lines

● [safe]
      directory = /Users/ajeetsraina/meetup-jan/sandbox-testing

  The gitconfig just has a safe.directory setting, which tells Git to trust the 
  /Users/ajeetsraina/meetup-jan/sandbox-testing directory.

Test 6b: Check full Git config

git config --global --list

Result:

safe.directory=/Users/ajeetsraina/meetup-jan/sandbox-testing

⚠️ Note: Git identity (user.name, user.email) was not auto-injected despite being configured on the host. I will work with the Sandboxes team to get this fixed.


Test 7: Path Matching

Path matching ensures that file paths are identical inside and outside the sandbox. This is crucial for:

  • Error messages that make sense
  • Copy-paste paths that work
  • IDE integration
  • Git path consistency

Why Path Matching Matters

| Without Path Matching | With Path Matching (Docker Sandboxes) |
| --- | --- |
| Host: /Users/ajeet/project/src/Button.tsx | Host: /Users/ajeet/project/src/Button.tsx |
| Container: /workspace/src/Button.tsx | Container: /Users/ajeet/project/src/Button.tsx ✅ |
| Error messages show /workspace/... (confusing!) | Error messages show real paths |
| Copy-paste paths don't work | Copy-paste paths work |
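For comparison, the same-path idea can be approximated with a plain docker run by bind-mounting the project at its own absolute path and using that path as the working directory. This is only a sketch of the idea, not a description of how Docker Sandboxes implements path matching. The helper name `path_matched_run` and the ubuntu:25.10 image are illustrative choices.

```shell
# Print a docker run command that mounts a project directory at the same
# absolute path inside the container and starts in that directory.
#   $1 = absolute project path on the host
#   $2 = image name
path_matched_run() {
  printf 'docker run --rm -it -v "%s:%s" -w "%s" %s\n' "$1" "$1" "$1" "$2"
}

path_matched_run /Users/ajeet/project ubuntu:25.10
# prints: docker run --rm -it -v "/Users/ajeet/project:/Users/ajeet/project" -w "/Users/ajeet/project" ubuntu:25.10
```

The function only prints the command so you can inspect it before running it. A regular container mounted to /workspace would differ only in the target side of -v and in -w.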

Step 1: Create a File on HOST

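# On your host terminal

mkdir -p ~/meetup-jan/sandbox-testing/src/components
echo 'export const Button = () => <button>Click me</button>' > ~/meetup-jan/sandbox-testing/src/components/Button.tsx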

Verify it exists:

cat ~/meetup-jan/sandbox-testing/src/components/Button.tsx

Result:

export const Button = () => <button>Click me</button>

Step 2: Start the Sandbox

cd ~/meetup-jan/sandbox-testing
docker sandbox run claude

Step 3: Access File Using FULL PATH Inside Sandbox

Inside the sandbox, use the exact same path as your host:

cat /Users/ajeetsraina/meetup-jan/sandbox-testing/src/components/Button.tsx

Result:

● Bash(cat /Users/ajeetsraina/meetup-jan/sandbox-testing/src/components/Button.tsx)
  ⎿  export const Button = () => <button>Click me</button>

✅ Same path works inside the sandbox!

Step 4: Verify Working Directory

pwd

Result:

● Bash(pwd)
  ⎿  /Users/ajeetsraina/meetup-jan/sandbox-testing

✅ Working directory matches your host path!

Step 5: Access with Relative Path

cat src/components/Button.tsx

Result:

● Bash(cat src/components/Button.tsx)
  ⎿  export const Button = () => <button>Click me</button>

✅ Relative paths work too!

Step 6: Create a File INSIDE Sandbox

Create a new file using the full path:

echo "console.log('created inside sandbox')" > /Users/ajeetsraina/meetup-jan/sandbox-testing/src/utils.js

Verify inside sandbox:

cat /Users/ajeetsraina/meetup-jan/sandbox-testing/src/utils.js

Result:

● Bash(cat /Users/ajeetsraina/meetup-jan/sandbox-testing/src/utils.js)
  ⎿  console.log('created inside sandbox')

Step 7: Verify File Exists on HOST

Exit the sandbox:

exit

Check on your host:

cat ~/meetup-jan/sandbox-testing/src/utils.js

Result:

console.log('created inside sandbox')

✅ File created inside sandbox appears on host at the same path!

Visual Comparison

┌─────────────────────────────────────────────────────────────────────────┐
│                    REGULAR DOCKER CONTAINER                             │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  HOST                              CONTAINER                            │
│  /Users/ajeet/project/             /workspace/                          │
│  ├── src/                          ├── src/                             │
│  │   └── app.js                    │   └── app.js                       │
│  └── package.json                  └── package.json                     │
│                                                                         │
│  ❌ Paths are DIFFERENT                                                 │
│  ❌ Error: "File not found at /workspace/src/app.js"                    │
│  ❌ You think: "Where is /workspace? That's not my path!"              │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────┐
│                      DOCKER SANDBOXES                                   │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  HOST                              SANDBOX                              │
│  /Users/ajeet/project/             /Users/ajeet/project/                │
│  ├── src/                          ├── src/                             │
│  │   └── app.js                    │   └── app.js                       │
│  └── package.json                  └── package.json                     │
│                                                                         │
│  ✅ Paths are IDENTICAL                                                 │
│  ✅ Error: "File not found at /Users/ajeet/project/src/app.js"          │
│  ✅ You think: "I know exactly where that is!"                          │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

Path Matching Summary

| Test | Result |
| --- | --- |
| Full path access from sandbox | ✅ Working |
| Working directory matches host | ✅ Working |
| Relative paths work | ✅ Working |
| Files created in sandbox appear on host | ✅ Working |
| Files created on host appear in sandbox | ✅ Working |

Test 8: State Persistence

Step 1: Install a Package

npm install -g cowsay

Then test it works:

cowsay "Hello from sandbox"

Result:

● Bash(cowsay "hello from sandbox")
  ⎿   ____________________
     < hello from sandbox >
      --------------------
            \   ^__^
             \  (oo)\_______
                (__)\       )\/\
                    ||----w |
                    ||     ||

Step 2: Exit the Sandbox

exit

Or type /exit in Claude Code.

Step 3: Re-enter and Verify

docker sandbox run claude

Then test if cowsay is still there:

cowsay "I persisted!"

Result:

● Done! The cow has spoken.

✅ State persistence confirmed!

Unlike a fresh docker run, which starts a new container from the image every time, Docker Sandbox reused the same sandbox and remembered the installed package.


Test 9: Environment Variables

Environment variables must be set at sandbox creation time.

Step 1: Remove Existing Sandbox

# On your host terminal
docker sandbox ls
docker sandbox rm <sandbox-id>

Step 2: Create Sandbox with Environment Variables

docker sandbox run -e MY_SECRET=supersecret123 -e APP_ENV=development claude

Step 3: Verify Inside Sandbox

echo $MY_SECRET
echo $APP_ENV

Result:

● Bash(echo $MY_SECRET)
  ⎿  supersecret123

● Bash(echo $APP_ENV)
  ⎿  development

Step 4: Confirm Full Environment Access

printenv | grep -E "MY_SECRET|APP_ENV"

Result:

● Bash(printenv | grep -E "MY_SECRET|APP_ENV")
  ⎿  MY_SECRET=supersecret123
     APP_ENV=development

✅ Environment variables working!

⚠️ Important Limitation: You cannot hot-reload environment variables. To change them, you must remove and recreate the sandbox (which loses installed packages).
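Because the variables have to be supplied again each time the sandbox is recreated, it can help to keep them in a file and generate the -e flags from it. This is a minimal sketch, assuming a hypothetical file named sandbox.env with one KEY=VALUE per line and values that contain no spaces or quotes. The helper name `env_flags` is illustrative.

```shell
# Turn KEY=VALUE lines read from stdin into "-e KEY=VALUE" arguments on one
# line, skipping blank lines and lines that start with '#'.
# Assumption: values contain no spaces or quotes.
env_flags() {
  awk '!/^[ \t]*(#|$)/ { printf "%s-e %s", sep, $0; sep = " " }
       END { if (sep) print "" }'
}

# Hypothetical usage on the host (sandbox.env is an assumed file name;
# the command substitution is left unquoted so each flag is its own word):
#   docker sandbox run $(env_flags < sandbox.env) claude
```

For the two variables used in this test, the helper would produce `-e MY_SECRET=supersecret123 -e APP_ENV=development`, matching the command in Step 2.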

Test 10: Docker Socket Access

This allows the agent to run Docker commands inside the sandbox.

⚠️ Security Warning: Mounting the Docker socket grants the agent full access to your Docker daemon, which has root-level privileges. Only use this when necessary.

Step 1: Remove Existing Sandbox

# Inside the sandbox
exit

# On your host terminal
docker sandbox rm <sandbox-id>

Step 2: Create Sandbox with Docker Socket

docker sandbox run --mount-docker-socket claude

Step 3: Test Docker Access

docker ps

Result:

● Bash(docker ps)
  ⎿  Error: Exit code 1
     permission denied while trying to connect to the docker API at unix:///var/run/docker.sock

Docker socket requires sudo inside the sandbox:

sudo docker ps

Result:

● Bash(sudo docker ps)
  ⎿  CONTAINER ID   IMAGE                                  COMMAND                  CREATED              STATUS
     dbab95b2ae42   docker/sandbox-templates:claude-code   "sh -c 'sleep 5; if …"   About a minute ago   Up About a minute
     … +9 lines

✅ Docker socket access working!

The agent can now:

  • List and manage containers
  • Build Docker images
  • Run docker compose commands
  • Execute integration tests with Testcontainers
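Since plain `docker ps` failed with a permission error and `sudo docker ps` worked, a small wrapper can pick the right form automatically. This is a sketch under the assumption that `docker info` fails whenever the current user cannot reach the socket. The wrapper name `dockerx` is made up for illustration.

```shell
# Run a docker command directly if the daemon is reachable as the current
# user, otherwise retry the same command through sudo.
dockerx() {
  if docker info >/dev/null 2>&1; then
    docker "$@"
  else
    sudo docker "$@"
  fi
}

# Hypothetical usage inside the sandbox:
#   dockerx ps
#   dockerx compose up -d
```

The security warning above still applies: the wrapper only removes a typing step and does not reduce what the agent can do with the socket.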

Test 11: Real-World Demo - Playwright Browser Testing

This demonstrates a practical use case: running browser tests in the sandbox without polluting your host system.

Step 1: Create Project Directory

# On host
mkdir -p ~/meetup-jan/playwright-demo
cd ~/meetup-jan/playwright-demo

# Remove any existing sandbox
docker sandbox ls
docker sandbox rm <sandbox-id-if-exists>

# Start fresh sandbox
docker sandbox run claude

Step 2: Create and Run Playwright Test

Inside the sandbox, ask Claude to create a Playwright test:

Create a simple Playwright test that:
1. Goes to https://collabnix.com
2. Takes a screenshot
3. Verifies the page title contains "Collabnix"

Claude will:

  • Initialize the project with npm init
  • Install Playwright and browsers (~500MB)
  • Create the test file
  • Run the test

Result:

● Bash(npx playwright test) timeout: 1m 0s
  ⎿  Running 1 test using 1 worker
       ✓  1 tests/collabnix.spec.ts:3:5 › Collabnix homepage (5.2s)

● Bash(ls -la collabnix-screenshot.png)
  ⎿  -rw-r--r-- 1 agent agent 666902 Jan 15 18:03 collabnix-screenshot.png

● Done. The Playwright test has been created and executed successfully.

  playwright-demo/
  ├── package.json
  ├── playwright.config.ts
  ├── tests/
  │   └── collabnix.spec.ts
  └── collabnix-screenshot.png  (generated)

  Result: 1 test passed in 6.3s

Step 3: Verify Isolation on Host

Exit the sandbox and check your host:

exit

Check what's on your host:

# Screenshot IS in your project (shared via mount) ✅
ls -la ~/meetup-jan/playwright-demo/collabnix-screenshot.png

# Playwright browsers are NOT on your host ✅
ls ~/.cache/ms-playwright/          # default cache location on Linux
ls ~/Library/Caches/ms-playwright/  # default cache location on macOS

Result:

| Location | On Host? | Why? |
| --- | --- | --- |
| collabnix-screenshot.png | ✅ Yes | Project folder is mounted |
| node_modules/ | ✅ Yes | Project folder is mounted |
| ~/.cache/ms-playwright/ (500MB browsers) | ❌ No | Isolated in sandbox |
| ~/.npm/ cache | ❌ No | Isolated in sandbox |

✅ This is the power of Docker Sandboxes!

  • Your project files are accessible and shared
  • Heavy dependencies (browsers, caches) stay in the sandbox
  • Your host system stays clean
  • Re-enter the sandbox later and Playwright is still installed

Test Summary

| Feature | Expected | Result |
| --- | --- | --- |
| 🔒 SSH keys blocked | Blocked | ✅ Working |
| 🔒 AWS credentials blocked | Blocked | ✅ Working |
| 🔒 Documents blocked | Blocked | ✅ Working |
| 📁 Project folder accessible | Accessible | ✅ Working |
| 🎯 Path matching | Same paths | ✅ Working |
| 💾 State persistence | Persists | ✅ Working |
| 🔧 Environment variables | Available | ✅ Working |
| 🐳 Docker socket access | With sudo | ✅ Working |
| 🎭 Playwright isolation | Browsers isolated | ✅ Working |
| 🪪 Git identity injection | Auto-injected | ⚠️ Not working |

Key Takeaways

| Regular Container | Docker Sandbox |
| --- | --- |
| You manually decide what to mount | Auto-mounts only project directory |
| Could accidentally mount ~/.ssh, ~/.aws | Automatically excludes sensitive dirs |
| Different paths inside vs outside | Same paths (path matching) |
| No Git identity | Should auto-inject Git config |
| State lost on exit | State persists per workspace |

Docker Sandboxes = Secure by Default 🛡️

The Future of AI Agent Security

Docker Sandboxes represents a critical step forward in making AI agents both powerful and safe. As recent vulnerabilities in tools like OpenAI Codex CLI (CVE-2025-61260) demonstrate, the security of AI coding assistants is an evolving challenge.


Conclusion

Docker Sandboxes solves the fundamental tension between AI agent autonomy and system security. By combining container isolation with a development experience that keeps your paths and workflows unchanged, it lets developers harness the full power of AI coding assistants without compromising their machines.

The three principles that make it work:

  1. Security through isolation - Containers protect your host
  2. Familiarity through path matching - Same paths, same workflows
  3. Power through customization - Adapt to any use case

As AI agents become more sophisticated and autonomous, proper sandboxing isn't optional; it's essential. Docker Sandboxes makes it practical.

References

  1. Docker Sandboxes Official Documentation
  2. How Code Execution Drives Key Risks in Agentic AI Systems - NVIDIA
  3. AI Agents Under Threat: A Survey - ACM Computing Surveys
  4. Agentic AI and Security - Martin Fowler
  5. Security of AI Agents - arXiv
  6. The Hidden Security Risks of SWE Agents - Pillar Security