Building Complete AI Automation Workflows: n8n + Docker Model Runner + MCP Toolkit

Stop sending your sensitive data to the cloud. Learn how to set up a complete local AI automation platform using n8n, Docker Model Runner, and the Docker MCP Toolkit.

In today's AI-driven world, creating intelligent automation workflows requires more than just language models. You need a complete ecosystem that combines workflow orchestration, local AI inference, and secure tool integration. This comprehensive tutorial introduces a powerful three-component stack: n8n, Docker Model Runner, and the MCP Toolkit. Together they deliver enterprise-grade AI automation while maintaining complete data privacy and control.

This tutorial will guide you through building a complete local AI automation platform that can analyze code, process documents, interact with GitHub repositories, and orchestrate complex multi-step workflows, all running entirely on your machine.

What is n8n?

n8n (pronounced "n-eight-n") is a fair-code workflow automation platform that combines the flexibility of code with the speed of no-code solutions. With over 400 integrations and native AI capabilities, n8n allows you to:

  • Build visual workflows with a drag-and-drop interface
  • Write custom JavaScript/Python when needed
  • Self-host with complete control over your data
  • Integrate with virtually any service or API
  • Create AI-powered automations

What is Docker Model Runner?

Docker Model Runner is an experimental feature in Docker Desktop 4.40+ that runs large language models (LLMs) directly on your host system, not in containers. On a Mac, models run through a host-installed inference server (llama.cpp) for optimal GPU acceleration. Key features include:

  • Host-native execution: Models run directly on your host machine, not in containers
  • Apple Silicon optimization: Direct access to Metal API for GPU acceleration
  • OCI artifact storage: Models stored as standardized artifacts in Docker Hub
  • OpenAI-compatible API: Familiar endpoints for easy integration
  • No containerization overhead: Faster inference without container layers
  • macOS-focused: Optimized for Apple Silicon (M1/M2/M3/M4); Windows support with NVIDIA GPUs arrived in early April 2025

What is Docker MCP Toolkit?

The Docker MCP Toolkit is a powerful addition to Docker Desktop that enables AI agents to securely interact with external systems and data sources via the Model Context Protocol (MCP). It provides:

  • Standardized protocols for AI-tool interactions
  • Secure sandboxed execution of external tools
  • GitHub integration for repository operations
  • File system access for document processing
  • API connectivity to various services
  • Tool chaining for complex automation workflows

Why Combine n8n + Docker Model Runner + MCP Toolkit?

This three-component stack creates a complete AI automation ecosystem:

  1. Complete Data Privacy: All processing happens locally on your host machine
  2. Native Performance: Direct GPU access without containerization overhead
  3. Advanced Tool Integration: MCP enables AI agents to interact with GitHub, file systems, and APIs
  4. Cost Efficiency: No per-token charges from cloud providers
  5. Offline Capability: Work without internet connectivity after initial setup
  6. Standardized Distribution: Models and tools stored as OCI artifacts
  7. Powerful Workflow Engine: n8n orchestrates complex multi-step automations
  8. Apple Silicon Optimization: Leverages Metal API for maximum performance
  9. Secure Tool Execution: MCP provides sandboxed access to external systems

Prerequisites

Before starting, ensure you have:

  • Mac with Apple Silicon (M1/M2/M3/M4); this tutorial assumes macOS (see the note below for Windows)
  • Docker Desktop 4.40+ with Model Runner support
  • 8GB+ RAM (16GB+ recommended for larger models)
  • 10GB+ free disk space (for models and containers)
  • Basic Docker and workflow automation knowledge

Note: Docker Model Runner is currently optimized for macOS with Apple Silicon; Windows support with NVIDIA GPUs arrived in early April 2025.

How Docker Model Runner Works

Unlike traditional containerized AI solutions, Docker Model Runner takes a unique approach:

Host-Native Execution

  • No containers for models: AI models run directly on your host machine using llama.cpp
  • Direct GPU access: Leverages Apple's Metal API without containerization overhead
  • Host-level process: Docker Desktop runs the inference server natively on your Mac

Model Storage & Distribution

  • OCI artifacts: Models stored as standardized artifacts in Docker Hub
  • No compression layers: Faster downloads, since model weights barely compress anyway
  • Efficient storage: No need for both compressed and uncompressed versions
  • Registry compatibility: Works with any Docker-compatible registry

Connection Methods

Docker Model Runner provides multiple access patterns:

  1. From containers: http://model-runner.docker.internal/
  2. Host via Docker socket: /var/run/docker.sock
  3. Host via TCP: When enabled, direct port access (default: 12434)

This architecture provides the performance benefits of native execution while maintaining Docker's standardized distribution and management capabilities.
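
Once Model Runner is enabled (Step 1 below), you can sanity-check both endpoints. A minimal sketch; the curlimages/curl image is simply a disposable container for exercising the container-facing DNS name:

# From inside a container: Model Runner's internal DNS name
docker run --rm curlimages/curl -s \
  http://model-runner.docker.internal/engines/llama.cpp/v1/models

# From the host, if TCP support is enabled (default port 12434)
curl -s http://localhost:12434/engines/llama.cpp/v1/models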

Step 1: Enable Docker Model Runner

First, enable the Model Runner feature in Docker Desktop:

Via Docker Desktop UI

  1. Open Docker Desktop Settings
  2. Navigate to Features in development → Beta
  3. Enable "Enable Docker Model Runner"
  4. Optionally enable "Enable host-side TCP support" (port 12434)
  5. Click Apply & restart

Via Command Line

# Enable Model Runner
docker desktop enable model-runner

# Enable with TCP support (optional)
docker desktop enable model-runner --tcp 12434
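
After Docker Desktop restarts, confirm the feature is active:

# Should report that Docker Model Runner is running
docker model status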

Step 2: Download AI Models

Pull your preferred models using the new docker model CLI (these run natively on your host, not in containers):

# Lightweight model (1.2GB) - fast, good for testing
docker model pull ai/llama3.2:1B-Q8_0

# Balanced model (3GB) - recommended for most use cases
docker model pull ai/llama3.2:3B

# Compact model (2GB) - good balance of size and capability
docker model pull ai/gemma2:2B

# List downloaded models
docker model ls

# Test a model directly
docker model run ai/llama3.2:1B-Q8_0 "Hello, how are you?"

Important: These models run directly on your host using llama.cpp for optimal Apple Silicon GPU acceleration, not in Docker containers.

Model Comparison

| Model               | Size  | Use Case              | Speed     |
|---------------------|-------|-----------------------|-----------|
| ai/llama3.2:1B-Q8_0 | 1.2GB | Testing, simple tasks | Very Fast |
| ai/llama3.2:3B      | 3GB   | General purpose       | Fast      |
| ai/gemma2:2B        | 2GB   | Balanced performance  | Fast      |
| ai/qwen2.5:7B       | 7GB   | Complex reasoning     | Slower    |
| ai/mistral:7B       | 7GB   | Code & analysis       | Slower    |

Step 2.5: Install Docker MCP Toolkit

The MCP Toolkit enables AI agents to interact with external systems securely. Install it through Docker Desktop:

Via Docker Desktop Extensions

  1. Open Docker Desktop
  2. Go to Extensions in the left sidebar
  3. Search for "MCP Toolkit" in the marketplace
  4. Click Install on the Docker MCP Toolkit extension
  5. Wait for installation to complete

Verify MCP Installation

# Check if MCP CLI is available
docker mcp --help

# List available MCP servers
docker mcp server ls

# Install GitHub MCP server (for repository operations)
docker mcp server install github-official

# Verify GitHub MCP server
docker mcp server ls | grep github

Configure GitHub MCP

  1. Select Docker MCP Toolkit and explore the MCP Catalog
  2. Search for GitHub Official

Add your Personal Access Token (PAT). If you haven't created one yet, follow the steps below.

  1. Create a GitHub Personal Access Token (PAT):
  • Go to GitHub.com and sign in to your account
  • Click your profile picture in the top-right corner
  • Select "Settings"
  • Scroll down to "Developer settings" in the left sidebar
  • Click "Personal access tokens" → "Tokens (classic)"
  • Click "Generate new token" → "Generate new token (classic)"
  • Give your token a descriptive name like "Docker MCP GitHub Access"
  • Select the scopes (permissions) you need: repo (full control of private repositories), workflow (if you need workflow actions), and read:org (if you need organization access)
  • Click "Generate token"
  2. Configure the GitHub MCP server in Docker:
  • Open Docker Desktop
  • Navigate to the MCP Toolkit
  • Find the GitHub (official) card, click it to expand the details, and add your PAT

If you prefer the terminal, you can use the docker mcp secret command to store the GitHub token as a secret:

docker mcp secret set GITHUB.PERSONAL_ACCESS_TOKEN=github_pat_YOUR_TOKEN_HERE

For example:

docker mcp secret set GITHUB.PERSONAL_ACCESS_TOKEN=github_pat_11AACMRCAXXXXXXxEp_QRZW43Wo1k6KYWwDXXXXXXXXGPXLZ7EGEnse82YM
Info: No policy specified, using default policy

For GitHub repository operations, configure authentication:

# Set GitHub token (get from https://github.com/settings/tokens)
docker mcp server configure github-official --github-token YOUR_GITHUB_TOKEN

# Test GitHub MCP tools
docker mcp tools list github-official

# Example tools available:
# - get_repository_info
# - list_repositories  
# - create_issue
# - list_pull_requests
# - search_repositories
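
To expose the servers you enable to MCP clients through a single endpoint, recent Toolkit versions also ship an MCP gateway. Treat the following as a sketch, since flag names vary across Toolkit versions; check docker mcp gateway run --help first:

# Aggregate all enabled MCP servers behind one SSE endpoint
# (assumes your Toolkit version supports these flags)
docker mcp gateway run --transport sse --port 8811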

Step 3: Project Setup

Create your project directory and required files:

mkdir n8n-ai-setup
cd n8n-ai-setup
mkdir shared  # This will be mounted to /data/shared in n8n

Step 4: Docker Compose Configuration

Create a docker-compose.yml file:


services:
  postgres:
    image: postgres:13
    restart: unless-stopped
    environment:
      POSTGRES_DB: ${POSTGRES_DB}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
      interval: 30s
      timeout: 10s
      retries: 5

  redis:
    image: redis:7-alpine
    restart: unless-stopped
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 30s
      timeout: 10s
      retries: 5

  n8n:
    image: docker.n8n.io/n8nio/n8n:latest
    restart: unless-stopped
    environment:
      # Database
      DB_TYPE: postgresdb
      DB_POSTGRESDB_HOST: postgres
      DB_POSTGRESDB_PORT: 5432
      DB_POSTGRESDB_DATABASE: ${POSTGRES_DB}
      DB_POSTGRESDB_USER: ${POSTGRES_USER}
      DB_POSTGRESDB_PASSWORD: ${POSTGRES_PASSWORD}
      
      # Redis
      QUEUE_BULL_REDIS_HOST: redis
      QUEUE_BULL_REDIS_PORT: 6379
      
      # n8n Configuration
      N8N_ENCRYPTION_KEY: ${N8N_ENCRYPTION_KEY}
      N8N_HOST: ${N8N_HOST}
      N8N_PORT: 5678
      N8N_PROTOCOL: http
      WEBHOOK_URL: http://localhost:5678/
      GENERIC_TIMEZONE: ${GENERIC_TIMEZONE}
      
      # AI Configuration
      N8N_AI_ENABLED: "true"
      N8N_AI_OPENAI_DEFAULT_BASE_URL: http://model-runner.docker.internal/engines/llama.cpp/v1
      N8N_AI_DEFAULT_MODEL: ${N8N_AI_DEFAULT_MODEL}
      
      # Execution Mode
      EXECUTIONS_MODE: main
      
      # File system
      N8N_DEFAULT_BINARY_DATA_MODE: filesystem
      
    ports:
      - "5678:5678"
    volumes:
      - n8n_data:/home/node/.n8n
      - ./shared:/data/shared
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy

  # Uncomment for queue mode (production)
  # n8n-worker:
  #   image: docker.n8n.io/n8nio/n8n:latest
  #   restart: unless-stopped
  #   environment:
  #     # Same environment as n8n main
  #     DB_TYPE: postgresdb
  #     DB_POSTGRESDB_HOST: postgres
  #     DB_POSTGRESDB_PORT: 5432
  #     DB_POSTGRESDB_DATABASE: ${POSTGRES_DB}
  #     DB_POSTGRESDB_USER: ${POSTGRES_USER}
  #     DB_POSTGRESDB_PASSWORD: ${POSTGRES_PASSWORD}
  #     QUEUE_BULL_REDIS_HOST: redis
  #     QUEUE_BULL_REDIS_PORT: 6379
  #     N8N_ENCRYPTION_KEY: ${N8N_ENCRYPTION_KEY}
  #     GENERIC_TIMEZONE: ${GENERIC_TIMEZONE}
  #     N8N_AI_ENABLED: "true"
  #     N8N_AI_OPENAI_DEFAULT_BASE_URL: http://model-runner.docker.internal/engines/llama.cpp/v1
  #     N8N_AI_DEFAULT_MODEL: ${N8N_AI_DEFAULT_MODEL}
  #     EXECUTIONS_MODE: queue
  #   volumes:
  #     - n8n_data:/home/node/.n8n
  #     - ./shared:/data/shared
  #   depends_on:
  #     postgres:
  #       condition: service_healthy
  #     redis:
  #       condition: service_healthy
  #   command: n8n worker

volumes:
  postgres_data:
  redis_data:
  n8n_data:
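
You can validate this file at any point with docker compose config; run it after creating the .env file in Step 5 so variable substitution succeeds:

# Exits non-zero and prints errors if the Compose file is invalid
docker compose config --quiet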

Step 5: Environment Configuration

Create a .env file with your configuration. First, generate a secure encryption key:

openssl rand -hex 32

Then fill in the .env file:

# Database Configuration
POSTGRES_DB=n8n
POSTGRES_USER=n8n
POSTGRES_PASSWORD=n8n_password_change_me

# n8n Configuration
N8N_ENCRYPTION_KEY=your_generated_32_char_hex_key_here
N8N_HOST=localhost
GENERIC_TIMEZONE=America/New_York

# AI Model Configuration
N8N_AI_DEFAULT_MODEL=ai/llama3.2:3B

# Security Note: Change these default passwords!
# For production: Use strong, unique passwords

Important: Always change the default passwords and generate a unique encryption key!
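
As a convenience, you can generate the key and append it in one step (assuming you first remove the placeholder N8N_ENCRYPTION_KEY line):

# Generate a 32-byte hex key and write it into .env
echo "N8N_ENCRYPTION_KEY=$(openssl rand -hex 32)" >> .env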

Step 6: Launch the Stack

Start all services:

# Start all services in detached mode
docker compose up -d

# Check service status
docker compose ps

# View logs
docker compose logs -f n8n

# Check specific service logs
docker compose logs postgres
docker compose logs redis

Step 7: Initial Setup

  1. Open your browser and navigate to http://localhost:5678
  2. Create your admin account when prompted
  3. Complete the initial setup wizard

Step 8: Test Your AI Setup

Let's create a simple workflow to test the AI integration:

Basic AI Test Workflow

  1. Create a new workflow in n8n
  2. Add a Manual Trigger node
  3. Add an HTTP Request node with these settings:
    • Method: POST
    • URL: http://model-runner.docker.internal/engines/llama.cpp/v1/chat/completions
    • Headers: Content-Type: application/json
    • Body (JSON): the payload below
  4. Execute the workflow and verify you receive an AI response

{
  "model": "ai/llama3.2:3B",
  "messages": [
    {
      "role": "user",
      "content": "Hello! Please introduce yourself and explain what you can do."
    }
  ],
  "max_tokens": 150,
  "temperature": 0.7
}
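
You can reproduce the same call from a terminal before wiring it into n8n. This assumes host-side TCP support on port 12434 from Step 1; from inside the n8n container, use the model-runner.docker.internal URL instead:

curl -s http://localhost:12434/engines/llama.cpp/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ai/llama3.2:3B",
    "messages": [{"role": "user", "content": "Hello! Please introduce yourself."}],
    "max_tokens": 150,
    "temperature": 0.7
  }'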

Using n8n AI Nodes

For more advanced scenarios, you can use n8n's built-in AI nodes:

  1. OpenAI Chat Model node (configured with the local endpoint: set the base URL to http://model-runner.docker.internal/engines/llama.cpp/v1; a placeholder API key is fine, since the local endpoint doesn't enforce one)
  2. AI Agent node for complex reasoning
  3. Text Classifier node for categorization
  4. Information Extractor node for data extraction

Step 9: Complete Stack Testing - All Three Components

Now let's verify that all three components (n8n + Model Runner + Docker MCP Toolkit) work together harmoniously. Download and run the comprehensive testing script, performance-test-mcp.sh:

# Download and run the complete stack test
curl -O https://raw.githubusercontent.com/ajeetraina/n8n-model-runner/main/performance-test-mcp.sh
chmod +x performance-test-mcp.sh
./performance-test-mcp.sh

This script will:

  • ✅ Test n8n health and API connectivity
  • ✅ Verify Model Runner API and AI inference
  • ✅ Check MCP Toolkit and GitHub integration
  • ✅ Create an integration workflow using all three components
  • ✅ Performance-test the complete stack
  • ✅ Generate a comprehensive report of capabilities

GitHub AI Workflow Testing

The script creates a complete integration workflow (complete-integration-workflow.json) that demonstrates:

Workflow Architecture:

Webhook Trigger → AI Analysis → GitHub MCP → Process Results

What it does:

  1. Receives repository name via webhook
  2. Analyzes repository using local AI model
  3. Fetches GitHub data via MCP toolkit
  4. Processes and combines results
  5. Returns intelligent insights about the repository

Import the workflow:

  1. Open n8n at http://localhost:5678
  2. Click Import from File
  3. Select complete-integration-workflow.json
  4. Activate the workflow

Test the workflow:

# Test the complete GitHub AI workflow
curl -X POST http://localhost:5678/webhook/complete-demo \
  -H "Content-Type: application/json" \
  -d '{"repo_name": "ajeetraina/n8n-model-runner"}'

This triggers the workflow and returns an AI-powered analysis of the repository!

Advanced Three-Component Workflow Examples

1. AI-Powered GitHub Code Review System

GitHub Webhook → Fetch PR Data (MCP) → AI Code Analysis → Post Review Comments (MCP)

Workflow Capabilities:

  • Automatically analyzes pull requests using local AI
  • Fetches code changes via GitHub MCP
  • Generates intelligent code review comments
  • Posts feedback directly to GitHub
  • Maintains complete privacy - code never leaves your system

2. Smart Repository Documentation Generator

Schedule Trigger → List Repositories (MCP) → Fetch README → AI Enhancement → Update Documentation (MCP)

Features:

  • Automatically scans your GitHub repositories
  • Analyzes existing documentation with AI
  • Generates improved README files
  • Commits updates back to repositories
  • Uses local AI for content generation

3. Intelligent Issue Triaging System

GitHub Issue Webhook → AI Classification → Route to Team → Create Project Cards (MCP)

Automation includes:

  • Categorizes issues using local AI models
  • Assigns priority levels based on content analysis
  • Routes to appropriate team members
  • Creates project board cards automatically
  • Sends notifications to relevant stakeholders

4. AI-Enhanced Document Processing Pipeline

File Upload → Extract Text → AI Analysis → GitHub Wiki Update (MCP) → Team Notification

Capabilities:

  • Process uploaded documents (PDF, DOCX, etc.)
  • Extract and analyze content with local AI
  • Generate summaries and insights
  • Update team wikis via GitHub MCP
  • Notify team members of new documentation

5. Automated Security Analysis Workflow

Code Push → Fetch Changes (MCP) → AI Security Scan → Create Security Issues (MCP) → Alert Team

Security features:

  • Monitors code changes in real-time
  • Performs AI-powered security analysis
  • Identifies potential vulnerabilities
  • Creates GitHub security issues automatically
  • Maintains audit trail of security reviews

Production Considerations

Queue Mode for Scalability

For high-volume workflows, enable queue mode:

  1. Uncomment the n8n-worker service in docker-compose.yml
  2. Set EXECUTIONS_MODE=queue for both main and worker services
  3. Scale workers as needed:
# Scale to 3 workers
docker compose up -d --scale n8n-worker=3
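
Confirm the workers came up and are picking up executions:

# All worker replicas should show as running
docker compose ps n8n-worker

# Watch the workers process queued jobs
docker compose logs -f n8n-worker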

Security Best Practices

  1. Use strong passwords for all services
  2. Enable HTTPS with a reverse proxy (nginx, Traefik)
  3. Set up firewall rules to restrict access
  4. Regular backups of volumes and configurations
  5. Monitor resource usage and set alerts

Backup and Recovery

# Backup n8n data
docker run --rm \
  -v n8n-ai-setup_n8n_data:/data \
  -v $(pwd):/backup \
  alpine tar czf /backup/n8n-backup.tar.gz /data

# Restore n8n data
docker run --rm \
  -v n8n-ai-setup_n8n_data:/data \
  -v $(pwd):/backup \
  alpine tar xzf /backup/n8n-backup.tar.gz -C /
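
The n8n volume holds the encryption key and binary data, but workflows, executions, and credentials live in Postgres, so back up the database as well. A minimal sketch using pg_dump with the default credentials from the .env above (adjust if you changed them):

# Dump the n8n database to a SQL file on the host
docker compose exec -T postgres pg_dump -U n8n n8n > n8n-db-backup.sql

# Restore the dump into a running postgres service
docker compose exec -T postgres psql -U n8n -d n8n < n8n-db-backup.sql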

Monitoring and Troubleshooting

Common Issues and Solutions

TCP connection issues:

# Verify TCP support is enabled
docker desktop status

# Test TCP endpoint (if enabled on port 12434)
curl http://localhost:12434/engines/llama.cpp/v1/models

Models not loading:

# Verify downloaded models
docker model ls

# Test model directly
docker model run ai/llama3.2:3B "test message"

Database connection issues:

# Check PostgreSQL health
docker compose logs postgres

# Restart if needed
docker compose restart postgres

Check Model Runner logs (Mac):

# View inference logs in real-time
tail -f ~/Library/Containers/com.docker.docker/Data/log/host/inference-llama.cpp.log

Model Runner not accessible:

# Check Model Runner status
docker model status

# List available models
docker model ls

# Test direct connection
curl http://model-runner.docker.internal/engines/llama.cpp/v1/models

Performance Monitoring

# Monitor Docker containers (n8n, postgres, redis)
docker stats

# Check Model Runner status and performance
docker model status

# Monitor GPU usage in real-time (Mac)
# Open Activity Monitor and view GPU History

# Check n8n container health
docker compose ps
docker compose logs --tail=50 n8n

# Monitor Model Runner inference logs
tail -f ~/Library/Containers/com.docker.docker/Data/log/host/inference-llama.cpp.log

GPU Monitoring on Mac:

  • Open Activity Monitor (Cmd + Space, type "Activity Monitor")
  • Go to Window > GPU History
  • You'll see GPU activity spikes when AI inference is running

Advanced Model Configurations

Custom Model Templates

For different types of workflows, create custom HTTP request templates:

Code Assistant Template:

{
  "model": "ai/gemma2:2B",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful coding assistant specialized in Python and JavaScript."
    },
    {
      "role": "user",
      "content": "{{ $json.user_prompt }}"
    }
  ],
  "temperature": 0.1,
  "max_tokens": 500
}

Content Creator Template:

{
  "model": "ai/llama3.2:3B",
  "messages": [
    {
      "role": "system",
      "content": "You are a creative writing assistant that helps with content creation."
    },
    {
      "role": "user",
      "content": "{{ $json.user_prompt }}"
    }
  ],
  "temperature": 0.8,
  "max_tokens": 800
}
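
The temperature difference between these templates is deliberate: a low value (0.1) keeps code suggestions deterministic and repeatable, while a higher value (0.8) gives the content template more creative variation.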

Integration Examples

Slack Bot with AI

Connect n8n to Slack and create an AI-powered bot:

  1. Slack Trigger - Listen for mentions
  2. AI Agent - Process the question
  3. Slack Response - Send intelligent replies

GitHub Issue Analyzer

Automate issue analysis and labeling:

  1. GitHub Trigger - New issue created
  2. AI Text Classifier - Categorize the issue
  3. GitHub Action - Apply appropriate labels

Email Newsletter Generator

Automate content creation:

  1. Schedule Trigger - Weekly execution
  2. RSS Feed Reader - Gather latest news
  3. AI Content Generator - Create newsletter
  4. Email Service - Send to subscribers

Next Steps and Resources

Explore Further

  1. n8n AI Nodes: Experiment with built-in AI Agent and Text Classifier nodes
  2. Custom Nodes: Develop your own nodes that integrate with Model Runner
  3. Workflow Templates: Import templates from n8n's community gallery
  4. API Integration: Connect with your existing tools and services

Conclusion

You now have a complete local AI automation platform that leverages the best of both worlds: n8n's powerful workflow capabilities running in containers, and AI models running natively on your host for optimal performance! This setup provides:

  • Privacy-first AI workflows with no data leaving your environment
  • Native performance with direct GPU access via Apple's Metal API
  • Cost-effective automation without per-token charges
  • Unlimited experimentation with different models from Docker Hub
  • Production-ready scalability with queue mode and worker scaling
  • Standardized distribution using OCI artifacts

The combination of n8n's containerized workflow engine and Docker Model Runner's host-native AI inference creates a powerful platform for building intelligent automation workflows while maintaining complete control over your data and infrastructure. Whether you're processing customer emails, analyzing documents, or creating content, this local AI stack provides the power and flexibility you need.

Start building your first AI-powered workflow today, and discover the potential of local AI automation with the performance benefits of native execution!