Claude Code in a Docker Sandbox: How Kits Make It Shareable and Secure

Claude Code sandboxed in a microVM, with egress filtered and credentials kept out of the VM, packaged as a shareable kit. Here's how it works and how to build your own.

Docker Sandboxes has been out for a while now, giving developers a way to run AI coding agents inside isolated microVMs with egress filtering and no risk to the host machine. What's interesting is watching the community start to build on top of it. A recent example: a developer named Tobby Lie published claude-sbx, a focused shell script that wires up Claude Code inside an sbx microVM with an egress allowlist and a selective config seed. The README is short, the script is readable, and the architecture diagram explains the whole setup at a glance.

It's a good illustration of both what sbx makes possible and where Docker Kits take things further. The shell script gets the security fundamentals right; kits are what make those fundamentals composable, shareable, and enforceable across a team. More on that in a moment. First, let's walk through what the approach actually does and why it matters.

The Problem: --dangerously-skip-permissions Is Useful, But Scary

When you run Claude Code with --dangerously-skip-permissions, the agent stops asking for approval on every tool call and just acts. That's genuinely useful. As the repo README puts it, without the flag alert fatigue kicks in and you end up approving everything anyway. Clicking "yes" seventeen times in a row is not a security posture.

The problem is that without isolation, that flag hands the agent fairly free rein on your machine. It can read files it shouldn't, reach hosts it shouldn't, and accumulate side effects that are hard to unwind. Running it in a microVM changes that calculus entirely: the agent has an isolated, disposable environment to work in, and a hostname allowlist limits what it can reach on the network. If it wrecks something, you rebuild. Your host is untouched.

[Figure: claude-sbx architecture diagram. Source: https://github.com/tobby-lie/claude-sbx/blob/main/docs/architecture.png]

How claude-sbx Works

The architecture is clean enough to explain in a paragraph. A microVM is created via sbx create with your workspace directory bind-mounted at the same path inside the VM. The agent edits files in that workspace and they land on the host in real time through the bind-mount, so no GitHub credentials ever need to exist inside the sandbox. git push and gh pr create happen from the host afterward. Egress is filtered at Layer 7 via sbx policy, with a per-workflow allowlist in a plain text file. Claude's config is seeded selectively: CLAUDE.md, skills, agents, commands, but not settings.json by default, because hooks in settings files can be exploited by the sandboxed agent itself.

The diagram captures everything worth understanding. The host owns git credentials; the VM never sees them. The bind-mount means edits are real-time without any explicit sync step. The egress proxy is the chokepoint for every outbound request the agent makes. And the selective config seed means the agent gets your skills and CLAUDE.md, but not the hooks in settings.json that could be turned against it.

Why the Shell Script Is Already Straining

claude-sbx.sh is clean, readable shell, and for a single developer on macOS it works fine. But look at what it's actually managing: provisioning the VM, seeding config, applying the egress policy, handling a staging directory workaround because sbx has no native cp subcommand, exposing lifecycle commands (start, stop, shell, sync-config, reload-allowlist), and letting users override defaults via config.local.sh. That's a lot of imperative orchestration for something that is, at its core, a declarative idea: "give me an isolated Claude Code environment with these constraints."

#!/usr/bin/env bash
# claude-sbx -- run Claude Code inside a Docker sbx microVM with an egress allowlist.
set -euo pipefail

REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck source=config/config.sh
source "${REPO_ROOT}/config/config.sh"

mkdir -p "$RUNTIME_DIR"

usage() {
  cat <<EOF
Usage: $(basename "$0") <command>

Commands:
  start              Provision Claude-ready sandbox (policy + VM + Claude Code + seed ~/.claude)
  stop               Destroy sandbox
  shell              Drop into bash inside the sandbox (launch claude yourself)
  sync-config        Re-copy selected paths from ~/.claude into the sandbox
  reload-allowlist   Re-apply config/allowlist.txt to sbx policy
  status             Show sandbox + policy state

Config:      config/config.sh  (override in config/config.local.sh)
Allowlist:   config/allowlist.txt
EOF
}

log()  { echo "[claude-sbx] $*"; }
warn() { echo "[claude-sbx] WARN: $*" >&2; }
die()  { echo "[claude-sbx] ERROR: $*" >&2; exit 1; }

require_tools() {
  command -v sbx >/dev/null || die "sbx not installed. brew install docker/tap/sbx"
  command -v npm >/dev/null || die "npm not installed on host (needed to pack Claude Code tarball)"
}

sandbox_exists() {
  sbx ls 2>/dev/null | awk 'NR>1 {print $1}' | grep -Fxq "$SANDBOX_NAME"
}

# sbx has no `cp` subcommand. sbx bind-mounts the workspace at the SAME path
# inside the VM as on host, so copying into ${WORKSPACE_HOST}/.claude-sbx-staging/
# makes the file visible inside the VM at the identical path.
STAGING="${WORKSPACE_HOST}/.claude-sbx-staging"

_stage_file() {
  local src="$1"
  mkdir -p "$STAGING"
  local base
  base="$(basename "$src")"
  cp "$src" "$STAGING/$base"
  echo "$STAGING/$base"
}

_stage_cleanup() {
  [ -n "${STAGING:-}" ] && [ -d "$STAGING" ] || return 0
  find "$STAGING" -maxdepth 1 -type f -delete
  rmdir "$STAGING" 2>/dev/null || true
}

#############################################
# Egress policy
#############################################
apply_policy() {
  log "applying egress policy from $ALLOWLIST_FILE"
  # Set default to deny-all. Errors "default policy is already set" on
  # repeat runs -- safe to ignore; we only need deny-all to be the baseline.
  # NOTE: we intentionally do NOT run `sbx policy reset --force` here -- it
  # restarts the daemon and unsets the default, then every sbx command hangs
  # on an interactive TUI.
  sbx policy set-default deny-all 2>/dev/null || true

  # Reload semantics: drop existing local network-allow rules so the policy
  # reflects the file exactly. Without this, reload-allowlist would only
  # append -- stale rules from earlier runs (or ad-hoc additions) would
  # stay active and quietly widen the allowlist.
  # `sbx policy ls` NAME column is "local:<uuid>"; `sbx policy rm network --id`
  # wants the bare uuid.
  local stale_names
  stale_names="$(sbx policy ls 2>/dev/null | awk '$2 == "network" && $3 == "local" && $4 == "allow" {print $1}' || true)"
  if [ -n "$stale_names" ]; then
    local count
    count=$(echo "$stale_names" | wc -l | tr -d ' ')
    log "clearing ${count} existing network rule(s)..."
    while IFS= read -r name; do
      [ -z "$name" ] && continue
      local id="${name#local:}"
      sbx policy rm network --id "$id" >/dev/null || warn "failed to remove rule $id"
    done <<< "$stale_names"
  fi

  local domains=()
  while IFS= read -r line; do
    line="${line%%#*}"
    line="$(echo "$line" | tr -d '[:space:]')"
    [ -z "$line" ] && continue
    domains+=("${line}")
  done < "$ALLOWLIST_FILE"

  if [ ${#domains[@]} -gt 0 ]; then
    local joined
    joined="$(IFS=,; echo "${domains[*]}")"
    sbx policy allow network "$joined"
  fi
  log "  ${#domains[@]} domains allowed"
}

#############################################
# Workspace
#############################################
ensure_workspace() {
  if [ ! -d "$WORKSPACE_HOST" ]; then
    log "creating workspace dir: $WORKSPACE_HOST"
    mkdir -p "$WORKSPACE_HOST"
  fi
}

#############################################
# Claude Code install
#############################################
install_claude_code() {
  local version="${CLAUDE_CODE_VERSION}"
  if [ "$version" = "latest" ]; then
    log "resolving latest claude-code version from npm..."
    version=$(npm view @anthropic-ai/claude-code version 2>/dev/null || true)
    [ -z "$version" ] && die "couldn't resolve latest claude-code version"
    log "  → ${version}"
  fi

  local tarball="${RUNTIME_DIR}/anthropic-ai-claude-code-${version}.tgz"

  if [ ! -f "$tarball" ]; then
    log "packing claude-code@${version} on host..."
    (cd "$RUNTIME_DIR" && npm pack "@anthropic-ai/claude-code@${version}" >/dev/null)
  fi

  log "installing claude-code@${version} in sandbox..."
  local guest_tgz
  guest_tgz=$(_stage_file "$tarball")
  sbx exec -u root "$SANDBOX_NAME" -- bash -lc "
    set -e
    npm install -g '${guest_tgz}'
    mkdir -p /home/agent/.local/bin
    ln -sf /usr/local/share/npm-global/bin/claude /home/agent/.local/bin/claude 2>/dev/null || true
    chown -h agent:agent /home/agent/.local/bin/claude 2>/dev/null || true
    claude --version
  "
}

#############################################
# Persistent env (non-secret only)
#############################################
# /etc/profile.d/*.sh is the canonical spot for bash -l login env on Linux --
# guaranteed to be sourced by `sbx run ... bash -lc` in cmd_shell.
install_persistent_env() {
  log "writing /etc/profile.d/claude-sbx.sh (non-secret)..."
  sbx exec -u root "$SANDBOX_NAME" -- bash -c "cat > /etc/profile.d/claude-sbx.sh <<'EOF'
# non-secret sandbox env -- no tokens, no keys
unset ANTHROPIC_API_KEY
export NO_PROXY=\"localhost,127.0.0.1,::1,api.anthropic.com,auth.anthropic.com,claude.ai,statsig.anthropic.com,sentry.io\"
export no_proxy=\$NO_PROXY

# Override TERM if the host propagated a value the sandbox's terminfo
# database doesn't know (Kitty, WezTerm, Alacritty direct modes, etc.).
# xterm-256color is almost universally recognized and renders fine.
if ! infocmp \"\${TERM:-}\" >/dev/null 2>&1; then
  export TERM=xterm-256color
fi
EOF
chmod +x /etc/profile.d/claude-sbx.sh"
}

#############################################
# Seed Claude config into sandbox
#############################################
# Copy-in, not bind-mount -- agent can mutate its copy, host originals untouched.
# Selective: only the paths in CLAUDE_COPY_PATHS get copied (durable config,
# not ephemeral state). See config/config.sh for the default list.
seed_claude_config() {
  local src="${REAL_HOME}/.claude"
  [ -d "$src" ] || die "host ~/.claude not found at $src"

  local paths=()
  for p in "${CLAUDE_COPY_PATHS[@]}"; do
    if [ -e "${src}/${p}" ]; then
      paths+=("$p")
    else
      warn "skip ${p} -- not present at ${src}/${p}"
    fi
  done

  if [ ${#paths[@]} -eq 0 ]; then
    warn "nothing to copy, skipping"
    return 0
  fi

  log "packing ~/.claude (${paths[*]})..."
  local tgz="${RUNTIME_DIR}/claude-config.tgz"
  tar -czf "$tgz" -C "$src" "${paths[@]}"

  local guest_tgz
  guest_tgz=$(_stage_file "$tgz")

  log "extracting into /home/agent/.claude..."
  sbx exec -u root "$SANDBOX_NAME" -- bash -c "
    set -e
    rm -rf /home/agent/.claude
    mkdir -p /home/agent/.claude
    tar -xzf '${guest_tgz}' -C /home/agent/.claude
    chown -R agent:agent /home/agent/.claude
  "
  rm -f "$tgz"
}

#############################################
# Subcommands
#############################################
cmd_start() {
  require_tools
  ensure_workspace

  log "--- [1/5] egress policy"
  apply_policy

  log "--- [2/5] create sandbox (workspace bind-mounted: ${WORKSPACE_HOST})"
  if sandbox_exists; then
    log "sandbox '${SANDBOX_NAME}' already exists, skipping create"
  else
    sbx create shell "$WORKSPACE_HOST" --name "$SANDBOX_NAME"
  fi

  log "--- [3/5] install Claude Code (${CLAUDE_CODE_VERSION})"
  install_claude_code

  log "--- [4/5] persistent env"
  install_persistent_env

  log "--- [5/5] seed Claude config"
  seed_claude_config

  _stage_cleanup

  cat <<EOF

Sandbox '${SANDBOX_NAME}' ready.

Enter the sandbox:
  $(basename "$0") shell

Inside, launch Claude (copy-paste):
  cd ${WORKSPACE_HOST} && claude --model '${CLAUDE_MODEL}' ${CLAUDE_FLAGS}

Workspace (bind-mounted, live on host): ${WORKSPACE_HOST}  <->  ${WORKSPACE_HOST} (same path inside VM)

Other ops:
  $(basename "$0") sync-config       # refresh ~/.claude inside the VM
  $(basename "$0") reload-allowlist  # re-apply egress policy
  $(basename "$0") status / stop
EOF
}

cmd_stop() {
  require_tools
  if sandbox_exists; then
    log "destroying sandbox '${SANDBOX_NAME}'..."
    sbx rm "$SANDBOX_NAME"
  else
    log "sandbox '${SANDBOX_NAME}' not present"
  fi
}

cmd_shell() {
  require_tools
  sandbox_exists || die "sandbox '${SANDBOX_NAME}' not running. Run: $(basename "$0") start"
  exec sbx run "$SANDBOX_NAME"
}

cmd_reload_allowlist() {
  require_tools
  apply_policy
}

cmd_sync_config() {
  require_tools
  sandbox_exists || die "sandbox '${SANDBOX_NAME}' not running. Run: $(basename "$0") start"
  seed_claude_config
  _stage_cleanup
}

cmd_status() {
  require_tools
  echo "--- sandboxes ---"
  sbx ls 2>&1 || true
  echo ""
  echo "--- policy ---"
  sbx policy ls 2>&1 || true
}

main() {
  local cmd="${1:-}"
  [ -n "${1:-}" ] && shift
  case "$cmd" in
    start)             cmd_start "$@" ;;
    stop)              cmd_stop "$@" ;;
    shell)             cmd_shell "$@" ;;
    reload-allowlist)  cmd_reload_allowlist "$@" ;;
    sync-config)       cmd_sync_config "$@" ;;
    status)            cmd_status "$@" ;;
    ""|-h|--help)      usage ;;
    *)                 echo "Unknown command: $cmd" >&2; usage; exit 1 ;;
  esac
}

main "$@"

Shell scripts are honest and portable, but they don't compose. If a teammate wants to add npm to the allowlist for their workflow and you want to add apt for yours, you're editing the same file with no principled way to layer those differences. The config.local.sh approach handles this today with overrides, but it's a convention, not a mechanism.
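To make the layering problem concrete, here is a minimal sketch of what a hand-rolled "mechanism" would have to look like in shell. It is not part of claude-sbx: the function names and the idea of separate per-workflow overlay files are assumptions made for illustration. It reuses the same comment and whitespace stripping rules as apply_policy above.

```shell
# Hypothetical helpers, not part of claude-sbx: merge a base allowlist with
# any number of per-workflow overlay files into one de-duplicated list.

# Print the domains from one allowlist file, one per line, with "#" comments
# and all whitespace stripped (the same parsing rules apply_policy uses).
read_allowlist() {
  local file="$1" line
  while IFS= read -r line || [ -n "$line" ]; do
    line="${line%%#*}"
    line="$(printf '%s' "$line" | tr -d '[:space:]')"
    [ -z "$line" ] && continue
    printf '%s\n' "$line"
  done < "$file"
}

# Merge every file passed as an argument. LC_ALL=C keeps the sort order
# stable across locales; sort -u drops domains listed in more than one file.
merge_allowlists() {
  local file
  for file in "$@"; do
    read_allowlist "$file"
  done | LC_ALL=C sort -u
}
```

A wrapper could then pass the merged list to the same command the script already uses, for example sbx policy allow network "$(merge_allowlists base.txt npm.txt | paste -sd, -)", where base.txt and npm.txt are placeholder file names. Even so, every teammate has to agree on the file layout by convention, which is the gap kits close.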

What a Kit Version Would Look Like

Docker Kits express environment declarations structurally rather than procedurally. The core question shifts from "what commands do I run to set this up?" to "what does this environment need to be?" Each kit lives in a directory with a spec.yaml and an optional files/ tree for static files to inject. The community kits repo at docker/sbx-kits-contrib shows exactly what this looks like in practice.

Take the config seed that claude-sbx manages via a staging directory workaround. In a kit, CLAUDE.md and skills simply live under files/home/.claude/ and get injected natively at sandbox creation. No shell gymnastics required. A kit that ships a Claude Code skill (say, a Dockerfile reviewer) looks like this:

docker-review/
├── spec.yaml
└── files/
    └── workspace/
        └── .claude/
            └── skills/
                └── docker-review/
                    └── SKILL.md

# docker-review/spec.yaml
schemaVersion: "1"
kind: mixin
name: docker-review
displayName: Dockerfile review skill
description: Ships a Claude Code skill that reviews Dockerfiles

That's the entire spec. The skill file lands in the workspace automatically. No sync-config command, no staging dir, no shell function to update when the paths change.

For the egress policy, the allowlist that claude-sbx manages via config/allowlist.txt and reload-allowlist becomes network.allowedDomains in the spec. And if you want to go further and fork the built-in Claude agent to remove --dangerously-skip-permissions entirely, the official docs ship an example for that too:

# claude-safe/spec.yaml  (from docs.docker.com/ai/sandboxes/customize/kit-examples)
schemaVersion: "1"
kind: agent
name: claude-safe
displayName: Claude Code (with approval prompts)
description: Claude Code without --dangerously-skip-permissions

agent:
  image: "docker/sandbox-templates:claude-code-docker"
  aiFilename: CLAUDE.md
  persistence: persistent
  entrypoint:
    run: [claude]  # no --dangerously-skip-permissions

network:
  serviceDomains:
    api.anthropic.com: anthropic
    console.anthropic.com: anthropic
  serviceAuth:
    anthropic:
      headerName: x-api-key
      valueFormat: "%s"
  allowedDomains:
    - "claude.com:443"

credentials:
  sources:
    anthropic:
      env:
        - ANTHROPIC_API_KEY

$ sbx run claude-safe --kit ./claude-safe/

Notice what's happening with credentials here: the API key never enters the VM. The credentials.sources block tells the proxy where to find it on the host, and serviceAuth injects it into outbound requests to api.anthropic.com transparently. This is more rigorous than the shell script approach, where the key has to be present in the environment when the sandbox is created.

We tested the docker-review mixin against a real workspace:

$ sbx run claude --kit ./docker-review/ --name test-blog

The skill loaded immediately, Claude found every Dockerfile in the bind-mounted workspace, and asked which one to review. One thing worth understanding from that result: the agent could see expenseflow/Dockerfile, NemoClaw/Dockerfile, and the others because ~/work/ is bind-mounted at the same path inside the VM. The sandbox doesn't wall off the workspace; it walls off everything outside it. The agent has full read/write access to your project; what it can't do is reach your SSH keys, other directories on the host, or arbitrary hosts on the network. Controlled blast radius, not a walled garden.

Stacking kits is where the composability really pays off. Run the Dockerfile review mixin alongside the isolated Claude agent with two --kit flags:

$ sbx run claude-safe --kit ./claude-safe/ --kit ./docker-review/

Or skip writing a kit entirely and pull one directly from the community repo. The code-server kit installs a full VS Code web IDE on port 8080 against your sandbox workspace, pre-loaded with the Claude Code extension:

$ sbx run claude --kit "git+https://github.com/docker/sbx-kits-contrib.git#dir=code-server"

That's a browser-based IDE, Claude Code, egress filtering, and VM isolation, in one command with nothing to install or configure locally.

The key difference: with the shell script, you edit config/allowlist.txt and run ./claude-sbx.sh reload-allowlist to change egress rules. With kits, different workflows are different kit directories that stack via --kit, so the base constraint set is never mutated. And because kits load from a Git URL or OCI registry, sharing the exact same environment across a team requires no clone, no setup script, no config dance.
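For anyone migrating, the translation from an allowlist file to a kit can be sketched as a small converter. This is an illustrative sketch, not an official tool. Two assumptions are baked in and not confirmed by the examples above: that a mixin kit may carry a network block, and that each entry takes the host:443 form seen in the claude-safe spec. The function name and kit name are hypothetical.

```shell
# Hypothetical converter: print a kit spec.yaml whose network.allowedDomains
# mirrors a claude-sbx style allowlist file.
# ASSUMPTIONS (unverified): mixin kits accept a network block, and entries
# use the "host:443" form shown in the claude-safe example.
allowlist_to_spec() {
  local name="$1" file="$2" line
  printf 'schemaVersion: "1"\n'
  printf 'kind: mixin\n'
  printf 'name: %s\n' "$name"
  printf 'network:\n'
  printf '  allowedDomains:\n'
  # Same parsing as apply_policy: drop "#" comments, whitespace, blank lines.
  while IFS= read -r line || [ -n "$line" ]; do
    line="${line%%#*}"
    line="$(printf '%s' "$line" | tr -d '[:space:]')"
    [ -z "$line" ] && continue
    printf '    - "%s:443"\n' "$line"
  done < "$file"
}
```

Running allowlist_to_spec npm-egress config/allowlist.txt > npm-egress/spec.yaml would produce one kit directory per workflow, which is the unit that stacks via --kit. Check the generated file against the kit documentation before relying on it.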

The Properties That Actually Matter

It's worth being precise about what the project is getting right, because these aren't just nice-to-haves:

Property | Why it matters | Now (shell) | With kit
VM isolation | Host files untouched except explicit bind-mount | ✓ | ✓
Egress filtering | Reduces exfiltration surface | ✓ | ✓
No agent credentials | Git push stays on host; sandbox is credential-free | ✓ | ✓
Composable egress rules | Per-workflow allowlists without mutating the base | manual | kit extends
Shareable environment | Teammates get identical constraints, no setup dance | manual | native
Lifecycle hooks | Install once vs. run on every start | case stmts | install / startup

A Note on settings.json

One of the subtler decisions in the repo deserves attention. By default, settings.json is excluded from what gets copied into the sandbox. The README explains why: settings files can contain hooks, and any hook whose output is valid JSON can be read by the sandboxed agent as a new instruction. A host Stop hook that fires inside the sandbox can cause an infinite stop-restart loop. That's a real attack surface and an honest tradeoff. The project documents it clearly and gives you a one-line override in config.local.sh if you've verified your settings file is safe.

This is exactly the kind of reasoning that should be encoded at the kit level, not buried in a README. When a kit excludes a path by default and requires an explicit opt-in to include it, that constraint is enforced at the mechanism level, not just the documentation level. Documentation rots; mechanisms don't.
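Short of kit-level enforcement, a wrapper script could at least refuse the opt-in when the settings file visibly defines hooks. The sketch below is hypothetical and not part of claude-sbx. It assumes hooks sit under a "hooks" key in settings.json and uses a crude text match rather than a JSON parser, so it can produce false positives (for example, the word appearing as a nested key) and should be read as a guard rail, not a guarantee.

```shell
# Hypothetical pre-flight check, not part of claude-sbx.
# ASSUMPTION: hooks are declared under a "hooks" key in settings.json.
# This is a text match, not a JSON parse, so it errs toward refusing.

# Succeeds (exit 0) if the file exists and appears to declare hooks.
settings_has_hooks() {
  local file="$1"
  [ -f "$file" ] || return 1
  grep -Eq '"hooks"[[:space:]]*:' "$file"
}

# Succeeds only when the file is absent or shows no hooks key;
# otherwise prints a reason to stderr and fails.
safe_to_seed_settings() {
  local file="$1"
  if settings_has_hooks "$file"; then
    echo "refusing to seed $file: it appears to define hooks" >&2
    return 1
  fi
  return 0
}
```

A caller would gate the copy on it, for example safe_to_seed_settings "$HOME/.claude/settings.json" before adding settings.json to the list of paths to seed.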

The Bigger Picture

Claude Code with --dangerously-skip-permissions and no isolation is, frankly, how most people are running it today. The approval dialogs get dismissed, the agent gets access to everything, and it works until it doesn't. Projects like claude-sbx are pointing at the right answer: not more approval dialogs, but real isolation with real egress constraints, so that the agent can operate autonomously within a bounded environment rather than operating autonomously with no boundaries at all.

The architecture diagram in that README is doing a lot of work in very little space. If you're building tooling for AI coding agents, it's worth studying. The bind-mount-not-copy approach for real-time edits, the credential separation, and the selective config seed with an explicit exclusion rationale are the right defaults. The shell script is the right implementation for today. Kits are where it goes next.

Check out the project at github.com/tobby-lie/claude-sbx, and if you're interested in what the kit-native version of this would look like in practice, Docker Sandboxes are exactly the surface this is heading toward.