Claude Code in a Docker Sandbox: How Kits Make It Shareable and Secure
Claude Code sandboxed in a microVM, with egress filtered and credentials never entering the VM, packaged as a shareable kit. Here's how it works and how to build your own.
Docker Sandboxes has been out for a while now, giving developers a way to run AI coding agents inside isolated microVMs with egress filtering and no risk to the host machine. What's interesting is watching the community start to build on top of it. A recent example: a developer named Tobby Lie published claude-sbx, a focused shell script that wires up Claude Code inside an sbx microVM with an egress allowlist and a selective config seed. The README is short, the script is readable, and the architecture diagram explains the whole setup at a glance.
It's a good illustration of both what sbx makes possible and where Docker Kits take things further. The shell script gets the security fundamentals right; kits are what make those fundamentals composable, shareable, and enforceable across a team. More on that in a moment. First, let's walk through what the approach actually does and why it matters.
The Problem: --dangerously-skip-permissions Is Useful, But Scary
When you run Claude Code with --dangerously-skip-permissions, the agent stops asking for approval on every tool call and just acts. That's genuinely useful. As the repo's README puts it, without the flag, alert fatigue kicks in and you end up approving everything anyway. Clicking "yes" seventeen times in a row is not a security posture.
The problem is that without isolation, that flag hands the agent fairly free rein on your machine. It can read files it shouldn't, reach hosts it shouldn't, and accumulate side effects that are hard to unwind. Running it in a microVM changes that calculus entirely: the agent has an isolated, disposable environment to work in, and a hostname allowlist limits what it can reach on the network. If it wrecks something, you rebuild. Your host is untouched.

How claude-sbx Works
The architecture is clean enough to explain in a paragraph. A microVM is created via sbx create with your workspace directory bind-mounted at the same path inside the VM. The agent edits files in that workspace and they land on the host in real time through the bind-mount, so no GitHub credentials ever need to exist inside the sandbox. git push and gh pr create happen from the host afterward. Egress is filtered at Layer 7 via sbx policy, with a per-workflow allowlist in a plain text file. Claude's config is seeded selectively: CLAUDE.md, skills, agents, commands, but not settings.json by default, because hooks in settings files can be exploited by the sandboxed agent itself.
The diagram captures everything worth understanding. The host owns git credentials; the VM never sees them. The bind-mount means edits are real-time without any explicit sync step. The egress proxy is the chokepoint for every outbound request the agent makes. And the selective config seed means the agent gets your skills and CLAUDE.md, but not the hooks in settings.json that could be turned against it.
Why the Shell Script Is Already Straining
claude-sbx.sh is clean, readable shell, and for a single developer on macOS it works fine. But look at what it's actually managing: provisioning the VM, seeding config, applying the egress policy, handling a staging directory workaround because sbx has no native cp subcommand, exposing lifecycle commands (start, stop, shell, sync-config, reload-allowlist), and letting users override defaults via config.local.sh. That's a lot of imperative orchestration for something that is, at its core, a declarative idea: "give me an isolated Claude Code environment with these constraints."
#!/usr/bin/env bash
# claude-sbx -- run Claude Code inside a Docker sbx microVM with an egress allowlist.
set -euo pipefail

REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck source=config/config.sh
source "${REPO_ROOT}/config/config.sh"
mkdir -p "$RUNTIME_DIR"

usage() {
  cat <<EOF
Usage: $(basename "$0") <command>
Commands:
  start             Provision Claude-ready sandbox (policy + VM + Claude Code + seed ~/.claude)
  stop              Destroy sandbox
  shell             Drop into bash inside the sandbox (launch claude yourself)
  sync-config       Re-copy selected paths from ~/.claude into the sandbox
  reload-allowlist  Re-apply config/allowlist.txt to sbx policy
  status            Show sandbox + policy state
Config: config/config.sh (override in config/config.local.sh)
Allowlist: config/allowlist.txt
EOF
}

log() { echo "[claude-sbx] $*"; }
warn() { echo "[claude-sbx] WARN: $*" >&2; }
die() { echo "[claude-sbx] ERROR: $*" >&2; exit 1; }

require_tools() {
  command -v sbx >/dev/null || die "sbx not installed. brew install docker/tap/sbx"
  command -v npm >/dev/null || die "npm not installed on host (needed to pack Claude Code tarball)"
}

sandbox_exists() {
  sbx ls 2>/dev/null | awk 'NR>1 {print $1}' | grep -Fxq "$SANDBOX_NAME"
}

# sbx has no `cp` subcommand. sbx bind-mounts the workspace at the SAME path
# inside the VM as on host, so copying into ${WORKSPACE_HOST}/.claude-sbx-staging/
# makes the file visible inside the VM at the identical path.
STAGING="${WORKSPACE_HOST}/.claude-sbx-staging"

_stage_file() {
  local src="$1"
  mkdir -p "$STAGING"
  local base
  base="$(basename "$src")"
  cp "$src" "$STAGING/$base"
  echo "$STAGING/$base"
}

_stage_cleanup() {
  [ -n "${STAGING:-}" ] && [ -d "$STAGING" ] || return 0
  find "$STAGING" -maxdepth 1 -type f -delete
  rmdir "$STAGING" 2>/dev/null || true
}

#############################################
# Egress policy
#############################################
apply_policy() {
  log "applying egress policy from $ALLOWLIST_FILE"
  # Set default to deny-all. Errors "default policy is already set" on
  # repeat runs -- safe to ignore; we only need deny-all to be the baseline.
  # NOTE: we intentionally do NOT run `sbx policy reset --force` here -- it
  # restarts the daemon and unsets the default, then every sbx command hangs
  # on an interactive TUI.
  sbx policy set-default deny-all 2>/dev/null || true

  # Reload semantics: drop existing local network-allow rules so the policy
  # reflects the file exactly. Without this, reload-allowlist would only
  # append -- stale rules from earlier runs (or ad-hoc additions) would
  # stay active and quietly widen the allowlist.
  # `sbx policy ls` NAME column is "local:<uuid>"; `sbx policy rm network --id`
  # wants the bare uuid.
  local stale_names
  stale_names="$(sbx policy ls 2>/dev/null | awk '$2 == "network" && $3 == "local" && $4 == "allow" {print $1}' || true)"
  if [ -n "$stale_names" ]; then
    local count
    count=$(echo "$stale_names" | wc -l | tr -d ' ')
    log "clearing ${count} existing network rule(s)..."
    while IFS= read -r name; do
      [ -z "$name" ] && continue
      local id="${name#local:}"
      sbx policy rm network --id "$id" >/dev/null || warn "failed to remove rule $id"
    done <<< "$stale_names"
  fi

  local domains=()
  while IFS= read -r line; do
    line="${line%%#*}"
    line="$(echo "$line" | tr -d '[:space:]')"
    [ -z "$line" ] && continue
    domains+=("${line}")
  done < "$ALLOWLIST_FILE"
  if [ ${#domains[@]} -gt 0 ]; then
    local joined
    joined="$(IFS=,; echo "${domains[*]}")"
    sbx policy allow network "$joined"
  fi
  log " ${#domains[@]} domains allowed"
}

#############################################
# Workspace
#############################################
ensure_workspace() {
  if [ ! -d "$WORKSPACE_HOST" ]; then
    log "creating workspace dir: $WORKSPACE_HOST"
    mkdir -p "$WORKSPACE_HOST"
  fi
}

#############################################
# Claude Code install
#############################################
install_claude_code() {
  local version="${CLAUDE_CODE_VERSION}"
  if [ "$version" = "latest" ]; then
    log "resolving latest claude-code version from npm..."
    version=$(npm view @anthropic-ai/claude-code version 2>/dev/null)
    [ -z "$version" ] && die "couldn't resolve latest claude-code version"
    log " → ${version}"
  fi
  local tarball="${RUNTIME_DIR}/anthropic-ai-claude-code-${version}.tgz"
  if [ ! -f "$tarball" ]; then
    log "packing claude-code@${version} on host..."
    (cd "$RUNTIME_DIR" && npm pack "@anthropic-ai/claude-code@${version}" >/dev/null)
  fi
  log "installing claude-code@${version} in sandbox..."
  local guest_tgz
  guest_tgz=$(_stage_file "$tarball")
  sbx exec -u root "$SANDBOX_NAME" -- bash -lc "
    set -e
    npm install -g '${guest_tgz}'
    mkdir -p /home/agent/.local/bin
    ln -sf /usr/local/share/npm-global/bin/claude /home/agent/.local/bin/claude 2>/dev/null || true
    chown -h agent:agent /home/agent/.local/bin/claude 2>/dev/null || true
    claude --version
  "
}

#############################################
# Persistent env (non-secret only)
#############################################
# /etc/profile.d/*.sh is the canonical spot for bash -l login env on Linux --
# guaranteed to be sourced by `sbx run ... bash -lc` in cmd_shell.
install_persistent_env() {
  log "writing /etc/profile.d/claude-sbx.sh (non-secret)..."
  sbx exec -u root "$SANDBOX_NAME" -- bash -c "cat > /etc/profile.d/claude-sbx.sh <<'EOF'
# non-secret sandbox env -- no tokens, no keys
unset ANTHROPIC_API_KEY
export NO_PROXY=\"localhost,127.0.0.1,::1,api.anthropic.com,auth.anthropic.com,claude.ai,statsig.anthropic.com,sentry.io\"
export no_proxy=\$NO_PROXY
# Override TERM if the host propagated a value the sandbox's terminfo
# database doesn't know (Kitty, WezTerm, Alacritty direct modes, etc.).
# xterm-256color is almost universally recognized and renders fine.
if ! infocmp \"\${TERM:-}\" >/dev/null 2>&1; then
  export TERM=xterm-256color
fi
EOF
chmod +x /etc/profile.d/claude-sbx.sh"
}

#############################################
# Seed Claude config into sandbox
#############################################
# Copy-in, not bind-mount -- agent can mutate its copy, host originals untouched.
# Selective: only the paths in CLAUDE_COPY_PATHS get copied (durable config,
# not ephemeral state). See config/config.sh for the default list.
seed_claude_config() {
  local src="${REAL_HOME}/.claude"
  [ -d "$src" ] || die "host ~/.claude not found at $src"
  local paths=()
  for p in "${CLAUDE_COPY_PATHS[@]}"; do
    if [ -e "${src}/${p}" ]; then
      paths+=("$p")
    else
      warn "skip ${p} -- not present at ${src}/${p}"
    fi
  done
  if [ ${#paths[@]} -eq 0 ]; then
    warn "nothing to copy, skipping"
    return 0
  fi
  log "packing ~/.claude (${paths[*]})..."
  local tgz="${RUNTIME_DIR}/claude-config.tgz"
  tar -czf "$tgz" -C "$src" "${paths[@]}"
  local guest_tgz
  guest_tgz=$(_stage_file "$tgz")
  log "extracting into /home/agent/.claude..."
  sbx exec -u root "$SANDBOX_NAME" -- bash -c "
    set -e
    rm -rf /home/agent/.claude
    mkdir -p /home/agent/.claude
    tar -xzf '${guest_tgz}' -C /home/agent/.claude
    chown -R agent:agent /home/agent/.claude
  "
  rm -f "$tgz"
}

#############################################
# Subcommands
#############################################
cmd_start() {
  require_tools
  ensure_workspace
  log "--- [1/5] egress policy"
  apply_policy
  log "--- [2/5] create sandbox (workspace bind-mounted: ${WORKSPACE_HOST})"
  if sandbox_exists; then
    log "sandbox '${SANDBOX_NAME}' already exists, skipping create"
  else
    sbx create shell "$WORKSPACE_HOST" --name "$SANDBOX_NAME"
  fi
  log "--- [3/5] install Claude Code (${CLAUDE_CODE_VERSION})"
  install_claude_code
  log "--- [4/5] persistent env"
  install_persistent_env
  log "--- [5/5] seed Claude config"
  seed_claude_config
  _stage_cleanup
  cat <<EOF
Sandbox '${SANDBOX_NAME}' ready.
Enter the sandbox:
  $(basename "$0") shell
Inside, launch Claude (copy-paste):
  cd ${WORKSPACE_HOST} && claude --model '${CLAUDE_MODEL}' ${CLAUDE_FLAGS}
Workspace (bind-mounted, live on host): ${WORKSPACE_HOST} <-> ${WORKSPACE_HOST} (same path inside VM)
Other ops:
  $(basename "$0") sync-config # refresh ~/.claude inside the VM
  $(basename "$0") reload-allowlist # re-apply egress policy
  $(basename "$0") status / stop
EOF
}

cmd_stop() {
  require_tools
  if sandbox_exists; then
    log "destroying sandbox '${SANDBOX_NAME}'..."
    sbx rm "$SANDBOX_NAME"
  else
    log "sandbox '${SANDBOX_NAME}' not present"
  fi
}

cmd_shell() {
  require_tools
  sandbox_exists || die "sandbox '${SANDBOX_NAME}' not running. Run: $(basename "$0") start"
  exec sbx run "$SANDBOX_NAME"
}

cmd_reload_allowlist() {
  require_tools
  apply_policy
}

cmd_sync_config() {
  require_tools
  sandbox_exists || die "sandbox '${SANDBOX_NAME}' not running. Run: $(basename "$0") start"
  seed_claude_config
  _stage_cleanup
}

cmd_status() {
  require_tools
  echo "--- sandboxes ---"
  sbx ls 2>&1 || true
  echo ""
  echo "--- policy ---"
  sbx policy ls 2>&1 || true
}

main() {
  local cmd="${1:-}"
  [ -n "${1:-}" ] && shift
  case "$cmd" in
    start) cmd_start "$@" ;;
    stop) cmd_stop "$@" ;;
    shell) cmd_shell "$@" ;;
    reload-allowlist) cmd_reload_allowlist "$@" ;;
    sync-config) cmd_sync_config "$@" ;;
    status) cmd_status "$@" ;;
    ""|-h|--help) usage ;;
    *) echo "Unknown command: $cmd" >&2; usage; exit 1 ;;
  esac
}

main "$@"
Shell scripts are honest and portable, but they don't compose. If a teammate wants to add npm to the allowlist for their workflow and you want to add apt for yours, you're editing the same file with no principled way to layer those differences. The config.local.sh approach handles this today with overrides, but it's a convention, not a mechanism.
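To make the layering problem concrete, here is a minimal sketch of what a shell-side workaround might look like: a base allowlist plus per-workflow overlay files, merged into one deduplicated list. The function name, the overlay file names, and the npm and apt hostnames are illustrative assumptions; nothing like this exists in claude-sbx itself. The file format follows config/allowlist.txt as the script parses it (one host per line, `#` comments).

```shell
# Sketch only: merge_allowlists is a hypothetical helper, not part of claude-sbx.
# Prints each host once, in first-seen order, across all files given.
merge_allowlists() {
  local file line
  for file in "$@"; do
    [ -f "$file" ] || continue                      # skip layers that don't exist
    while IFS= read -r line || [ -n "$line" ]; do
      line="${line%%#*}"                            # drop trailing comments
      line="$(printf '%s' "$line" | tr -d '[:space:]')"
      if [ -n "$line" ]; then
        printf '%s\n' "$line"
      fi
    done < "$file"
  done | awk '!seen[$0]++'                          # dedupe, keep order
}

# Usage (file names are assumptions):
#   joined="$(merge_allowlists config/allowlist.txt config/allowlist.apt.txt | paste -sd, -)"
#   sbx policy allow network "$joined"
```

Even with a helper like this, the layering lives in a naming convention that every teammate has to know about, which is the gap the next section addresses.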
What a Kit Version Would Look Like
Docker Kits express environment declarations structurally rather than procedurally. The core question shifts from "what commands do I run to set this up?" to "what does this environment need to be?" Each kit lives in a directory with a spec.yaml and an optional files/ tree for static files to inject. The community kits repo at docker/sbx-kits-contrib shows exactly what this looks like in practice.
Take the config seed that claude-sbx manages via a staging directory workaround. In a kit, CLAUDE.md and skills simply live under files/home/.claude/ and get injected natively at sandbox creation. No shell gymnastics required. A kit that ships a Claude Code skill (say, a Dockerfile reviewer) looks like this:
docker-review/
├── spec.yaml
└── files/
    └── workspace/
        └── .claude/
            └── skills/
                └── docker-review/
                    └── SKILL.md

# docker-review/spec.yaml
schemaVersion: "1"
kind: mixin
name: docker-review
displayName: Dockerfile review skill
description: Ships a Claude Code skill that reviews Dockerfiles
That's the entire spec. The skill file lands in the workspace automatically. No sync-config command, no staging dir, no shell function to update when the paths change.
For the egress policy, the allowlist that claude-sbx manages via config/allowlist.txt and reload-allowlist becomes network.allowedDomains in the spec. And if you want to go further and fork the built-in Claude agent to remove --dangerously-skip-permissions entirely, the official docs ship an example for that too:
# claude-safe/spec.yaml (from docs.docker.com/ai/sandboxes/customize/kit-examples)
schemaVersion: "1"
kind: agent
name: claude-safe
displayName: Claude Code (with approval prompts)
description: Claude Code without --dangerously-skip-permissions
agent:
  image: "docker/sandbox-templates:claude-code-docker"
  aiFilename: CLAUDE.md
  persistence: persistent
  entrypoint:
    run: [claude] # no --dangerously-skip-permissions
network:
  serviceDomains:
    api.anthropic.com: anthropic
    console.anthropic.com: anthropic
  serviceAuth:
    anthropic:
      headerName: x-api-key
      valueFormat: "%s"
  allowedDomains:
    - "claude.com:443"
credentials:
  sources:
    anthropic:
      env:
        - ANTHROPIC_API_KEY

$ sbx run claude-safe --kit ./claude-safe/
Notice what's happening with credentials here: the API key never enters the VM. The credentials.sources block tells the proxy where to find it on the host, and serviceAuth injects it into outbound requests to api.anthropic.com transparently. This is more rigorous than the shell script approach, where the key has to be present in the environment when the sandbox is created.
We tested the docker-review mixin against a real workspace:
$ sbx run claude --kit ./docker-review/ --name test-blog
The skill loaded immediately, Claude found every Dockerfile in the bind-mounted workspace, and asked which one to review. One thing worth understanding from that result: the agent could see expenseflow/Dockerfile, NemoClaw/Dockerfile, and the others because ~/work/ is bind-mounted at the same path inside the VM. The sandbox doesn't wall off the workspace; it walls off everything outside it. The agent has full read/write access to your project; what it can't do is reach your SSH keys, other directories on the host, or arbitrary hosts on the network. Controlled blast radius, not a walled garden.
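Because everything under the workspace is visible to the agent, it can be worth checking what is in there before bind-mounting it. The following is a rough sketch of such a preflight check. The function names and the list of file-name patterns are assumptions chosen for illustration, not an exhaustive secret scanner and not something sbx or claude-sbx provides.

```shell
# Sketch only: list secret-looking files under a workspace directory.
# The patterns below are illustrative assumptions; extend them for your setup.
find_workspace_secrets() {
  local workspace="$1"
  find "$workspace" -type f \( \
      -name '.env' -o -name '.env.*' -o \
      -name 'id_rsa' -o -name 'id_ed25519' -o \
      -name '*.pem' -o -name '.git-credentials' \
    \) 2>/dev/null | sort
}

# Returns non-zero and prints a warning if anything matched.
check_workspace() {
  local hits
  hits="$(find_workspace_secrets "$1")"
  if [ -n "$hits" ]; then
    printf 'WARN: visible inside the sandbox via bind-mount:\n%s\n' "$hits" >&2
    return 1
  fi
}

# Usage: check_workspace "$HOME/work" || echo "review the files above first"
```

A name-based scan like this only catches the obvious cases, but it matches the mental model: the boundary is the workspace directory, so anything sensitive inside it is inside the blast radius.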
Stacking kits is where the composability really pays off. Run the Dockerfile review mixin alongside the isolated Claude agent with two --kit flags:
$ sbx run claude-safe --kit ./claude-safe/ --kit ./docker-review/
Or skip writing a kit entirely and pull one directly from the community repo. The code-server kit installs a full VS Code web IDE on port 8080 against your sandbox workspace, pre-loaded with the Claude Code extension:
$ sbx run claude --kit "git+https://github.com/docker/sbx-kits-contrib.git#dir=code-server"
That's a browser-based IDE, Claude Code, egress filtering, and VM isolation, in one command with nothing to install or configure locally.
The key difference: with the shell script, you edit config/allowlist.txt and run ./claude-sbx.sh reload-allowlist to change egress rules. With kits, different workflows are different kit directories that stack via --kit, so the base constraint set is never mutated. And because kits load from a Git URL or OCI registry, sharing the exact same environment across a team requires no clone, no setup script, no config dance.
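As a rough illustration of that migration path, the sketch below renders an existing allowlist file as a mixin-style spec.yaml. It makes several assumptions that should be checked against the kit documentation: that a mixin may carry a network.allowedDomains block, that entries take the "host:port" form seen in the claude-safe example, and that 443 is the right default port. The function name and the apt-egress kit name are hypothetical.

```shell
# Sketch only: allowlist_to_spec is a hypothetical helper.
# Assumes mixins accept network.allowedDomains and "host:port" entries.
allowlist_to_spec() {
  local name="$1" allowlist="$2" line
  printf 'schemaVersion: "1"\n'
  printf 'kind: mixin\n'
  printf 'name: %s\n' "$name"
  printf 'network:\n'
  printf '  allowedDomains:\n'
  while IFS= read -r line || [ -n "$line" ]; do
    line="${line%%#*}"                              # drop trailing comments
    line="$(printf '%s' "$line" | tr -d '[:space:]')"
    if [ -n "$line" ]; then
      case "$line" in
        *:*) printf '    - "%s"\n' "$line" ;;       # entry already names a port
        *)   printf '    - "%s:443"\n' "$line" ;;   # assume HTTPS otherwise
      esac
    fi
  done < "$allowlist"
}

# Usage (paths are assumptions):
#   mkdir -p apt-egress
#   allowlist_to_spec apt-egress config/allowlist.txt > apt-egress/spec.yaml
#   sbx run claude --kit ./apt-egress/
```

The generated file omits displayName and description, which the examples above include; add them by hand if the schema requires them.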
The Properties That Actually Matter
It's worth being precise about what the project is getting right, because these aren't just nice-to-haves:
| Property | Why it matters | Now (shell) | With kit |
|---|---|---|---|
| VM isolation | Host files untouched except explicit bind-mount | ✓ | ✓ |
| Egress filtering | Reduces exfiltration surface | ✓ | ✓ |
| No agent credentials | Git push stays on host; sandbox is credential-free | ✓ | ✓ |
| Composable egress rules | Per-workflow allowlists without mutating the base | manual | kit extends |
| Shareable environment | Teammates get identical constraints, no setup dance | manual | native |
| Lifecycle hooks | Install once vs. run on every start | case stmts | install / startup |
A Note on settings.json
One of the subtler decisions in the repo deserves attention. By default, settings.json is excluded from what gets copied into the sandbox. The README explains why: settings files can contain hooks, and any hook whose output is valid JSON can be read by the sandboxed agent as a new instruction. A host Stop hook that fires inside the sandbox can cause an infinite stop-restart loop. That's a real attack surface and an honest tradeoff. The project documents it clearly and gives you a one-line override in config.local.sh if you've verified your settings file is safe.
This is exactly the kind of reasoning that should be encoded at the kit level, not buried in a README. When a kit excludes a path by default and requires an explicit opt-in to include it, that constraint is enforced at the mechanism level, not just the documentation level. Documentation rots; mechanisms don't.
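One way to picture "enforced by mechanism" is a gate that refuses to seed settings.json unless the caller opts in and the file defines no hooks. The sketch below is only an illustration of that idea in shell, not how kits or claude-sbx implement it. It assumes hooks sit under a "hooks" key in settings.json and uses a plain grep, which is a coarse text check rather than a JSON parser. The function names are hypothetical; the default path list mirrors the one described earlier (CLAUDE.md, skills, agents, commands).

```shell
# Sketch only: coarse check for a "hooks" key in a settings file.
settings_has_hooks() {
  local settings="$1"
  [ -f "$settings" ] && grep -q '"hooks"[[:space:]]*:' "$settings"
}

# Prints the paths to copy, one per line. settings.json is included only
# when the caller passes "yes" AND the file exists AND it defines no hooks.
select_copy_paths() {
  local claude_dir="$1" opt_in="${2:-no}"
  local settings="$claude_dir/settings.json"
  printf '%s\n' CLAUDE.md skills agents commands
  if [ "$opt_in" = "yes" ] && [ -f "$settings" ] && ! settings_has_hooks "$settings"; then
    printf '%s\n' settings.json
  fi
}

# Usage: select_copy_paths "$HOME/.claude" yes
```

The point of the design is that the opt-in alone is not sufficient: a settings file with hooks stays out even when someone flips the override.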
The Bigger Picture
Claude Code with --dangerously-skip-permissions and no isolation is, frankly, how most people are running it today. The approval dialogs get dismissed, the agent gets access to everything, and it works until it doesn't. Projects like claude-sbx are pointing at the right answer: not more approval dialogs, but real isolation with real egress constraints, so that the agent can operate autonomously within a bounded environment rather than operating autonomously with no boundaries at all.
The architecture diagram in that README is doing a lot of work in very little space. If you're building tooling for AI coding agents, it's worth studying. The bind-mount-not-copy approach for real-time edits, the credential separation, and the selective config seed with an explicit exclusion rationale: these are the right defaults. The shell script is the right implementation for today. Kits are where it goes next.
Check out the project at github.com/tobby-lie/claude-sbx, and if you're interested in what the kit-native version of this would look like in practice, Docker Sandboxes are exactly the surface this is heading toward.