Docker cagent: A Command Line Tool For Running AI Agents

Discover how Docker's revolutionary cagent framework is transforming AI agent development with simple YAML configurations, multi-agent orchestration, and seamless tool integration.

The Rise of Agentic AI

The AI landscape is rapidly shifting from simple chatbots to sophisticated AI agents that can reason, plan, and execute complex tasks autonomously. At the forefront of this revolution is Docker cagent – a powerful, easy-to-use multi-agent runtime that's democratizing AI agent development for developers worldwide.

Unlike traditional AI applications that simply respond to queries, agentic AI systems built with cagent can:

  • Break down complex problems into manageable tasks
  • Delegate work to specialized agents
  • Use external tools and APIs through the Model Context Protocol (MCP)
  • Collaborate intelligently to achieve goals
  • Scale from prototype to production seamlessly

What is Docker cagent?

Source code: https://github.com/docker/cagent (Agent Builder and Runtime by Docker Engineering)

Docker cagent is an open-source multi-agent runtime developed by Docker Engineering that allows developers to create, orchestrate, and deploy teams of AI agents with specialized capabilities. Think of it as building a virtual workforce where each agent is an expert in their domain, working together to solve complex problems.

Key Features That Set cagent Apart

🏗️ Multi-Agent Architecture: Create specialized agents for different domains, from code analysis to content creation to data processing. Each agent can have unique skills, knowledge bases, and tools.

🔧 Rich Tool Ecosystem: Agents can access external tools, APIs, databases, and services through the standardized Model Context Protocol (MCP), dramatically expanding their capabilities.

🔄 Smart Delegation: Agents automatically determine which tasks they can handle and which should be delegated to more suitable specialists in the team.

📝 YAML Configuration: Define your entire agent ecosystem using simple, declarative YAML files, with no complex coding required.

💭 Advanced Reasoning: Built-in "think", "todo", and "memory" tools enable sophisticated problem-solving and context retention.

🌐 Multiple AI Providers: Support for OpenAI, Anthropic, Google Gemini, and Docker Model Runner means you're not locked into a single provider.

Getting Started: Your First AI Agent in Minutes

cagent is a command-line tool for running AI agents.

Installation

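Setting up cagent takes only a few commands. The steps below install the v1.0.3 binary for macOS on Apple Silicon (darwin-arm64); on another platform, pick the matching asset from the releases page: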


# Download the v1.0.3 binary (this asset is for macOS on Apple Silicon)
curl -L -o cagent https://github.com/docker/cagent/releases/download/v1.0.3/cagent-darwin-arm64

# Remove the macOS quarantine attribute, if present
/usr/bin/xattr -rd com.apple.quarantine cagent

# Make executable
chmod +x cagent

# Move to PATH
sudo mv cagent /usr/local/bin/

# Test
cagent --help

If the help text prints, cagent is installed.
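The download URL above is specific to macOS on Apple Silicon. As a rough sketch, the hypothetical helpers below (cagent_asset_name and cagent_download_url are illustrative names, not part of cagent) pick an asset name from the output of uname. They assume release assets follow the cagent-<os>-<arch> pattern seen in that URL, which you should confirm on the releases page before relying on it.

```shell
# Map `uname -s` / `uname -m` values to a release asset name.
# Assumption: assets are named cagent-<os>-<arch>, as in the URL above.
cagent_asset_name() {
  os=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')   # Darwin -> darwin
  case "$2" in
    x86_64|amd64)  arch=amd64 ;;
    arm64|aarch64) arch=arm64 ;;
    *) echo "unsupported architecture: $2" >&2; return 1 ;;
  esac
  printf 'cagent-%s-%s\n' "$os" "$arch"
}

# Print the v1.0.3 download URL for the current machine.
cagent_download_url() {
  asset=$(cagent_asset_name "$(uname -s)" "$(uname -m)") || return 1
  printf 'https://github.com/docker/cagent/releases/download/v1.0.3/%s\n' "$asset"
}
```

With these defined, curl -L -o cagent "$(cagent_download_url)" would replace the hard-coded download line; the xattr step only applies on macOS.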

Next, export an API key for each provider you plan to use:

# For OpenAI models
export OPENAI_API_KEY=your_api_key_here
# For Anthropic models  
export ANTHROPIC_API_KEY=your_api_key_here
# For Gemini models
export GOOGLE_API_KEY=your_api_key_here
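A quick preflight check can catch a missing key before an agent run fails partway through. The sketch below uses a hypothetical helper, missing_env, which is not part of cagent; it simply prints the name of every listed variable that is unset or empty.

```shell
# Print the name of each listed environment variable that is unset or empty.
# Only pass trusted variable names: the lookup uses eval.
missing_env() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}
```

For example, missing_env OPENAI_API_KEY ANTHROPIC_API_KEY GOOGLE_API_KEY lists the keys still to be exported; no output means all three are set.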

Creating Your First Agent

Here's a basic agent configuration, saved as basic_agent.yaml. It creates a simple AI assistant named "root" using OpenAI's GPT-4o mini model (a faster, more cost-effective variant of GPT-4o). The description field summarizes what the agent does, while the instruction field contains the system prompt that defines the agent's personality and behavior; in this case, it tells the agent to be a helpful, accurate, and concise assistant for various tasks:

agents:
  root:
    model: openai/gpt-4o-mini
    description: A helpful AI assistant
    instruction: |
      You are a knowledgeable assistant that helps users with various tasks.
      Be helpful, accurate, and concise in your responses.

This is a "vanilla" agent without any special tools or capabilities. Notice there's no toolsets section, which means it can't search the web, read files, or access external services. It can only respond based on its training data and the conversation context. This type of basic configuration suits general Q&A, explanations, writing help, or simple problem-solving where you don't need real-time information or external tool access.

Run it with:

cagent run basic_agent.yaml

That's it! You now have a functioning AI agent.
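Because the model is a single provider/model string, switching providers is a one-line change. As a sketch, the hypothetical helper write_basic_agent below (an illustrative name, not a cagent command) writes the same configuration for whichever model string you pass. It reuses only fields and model names that already appear in this article.

```shell
# Write the basic agent configuration to a file for a given model string.
# $1 = provider/model string, $2 = output path
write_basic_agent() {
  cat > "$2" <<EOF
agents:
  root:
    model: $1
    description: A helpful AI assistant
    instruction: |
      You are a knowledgeable assistant that helps users with various tasks.
      Be helpful, accurate, and concise in your responses.
EOF
}
```

Calling write_basic_agent anthropic/claude-sonnet-4-20250514 basic_agent.yaml and then cagent run basic_agent.yaml would run the same assistant on an Anthropic model, provided ANTHROPIC_API_KEY is set.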

Multi-Agent Coordination System

Here's a more sophisticated setup with a coordinator agent and a specialist agent:

agents:
  root:
    model: anthropic/claude-sonnet-4-20250514  # Claude Sonnet 4
    description: "Main coordinator agent that delegates tasks and manages workflow"
    instruction: |
      You are the root coordinator agent. Your job is to:
      1. Understand user requests and break them down into manageable tasks
      2. Delegate appropriate tasks to your helper agent
      3. Coordinate responses and ensure tasks are completed properly
      4. Provide final responses to the user
    sub_agents: ["helper"]
    
  helper:
    model: anthropic/claude-opus-4-1-20250805   # Claude Opus 4.1
    description: "Assistant agent that helps with various tasks as directed by the root agent"
    instruction: |
      You are a helpful assistant agent. Your role is to:
      1. Complete specific tasks assigned by the root agent
      2. Provide detailed and accurate responses
      3. Ask for clarification if tasks are unclear
      4. Report back to the root agent with your results
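A typo in sub_agents is easy to miss, since the coordinator would then point at an agent that does not exist. The following is a minimal text-based check, not a YAML parser and not a cagent feature: the hypothetical helper undefined_sub_agents assumes agent names are keys indented by two spaces under agents:, that sub_agents is written as a one-line list as in the example above, and that sub-agents are expected to be defined in the same file.

```shell
# Print every name listed in sub_agents that is not defined under "agents:".
# $1 = path to the agent YAML file
undefined_sub_agents() {
  awk '
    # Track whether we are inside the top-level "agents:" section.
    /^[^ #]/ { in_agents = ($0 ~ /^agents:/) }

    # Agent names are the keys indented by exactly two spaces.
    in_agents && /^  [A-Za-z0-9_-]+:/ {
      name = $1
      sub(/:.*/, "", name)
      defined[name] = 1
    }

    # Collect names from a one-line list such as: sub_agents: ["helper"]
    /^ +sub_agents:.*\[/ {
      line = $0
      sub(/^[^[]*\[/, "", line)   # drop everything up to the opening bracket
      sub(/\].*/, "", line)       # drop the closing bracket and anything after
      gsub(/[" ]/, "", line)      # drop double quotes and spaces
      n = split(line, parts, ",")
      for (i = 1; i <= n; i++) if (parts[i] != "") wanted[parts[i]] = 1
    }

    END { for (w in wanted) if (!(w in defined)) print w }
  ' "$1"
}
```

Running undefined_sub_agents on the file above should print nothing, because "helper" is defined; any name it does print is referenced but missing.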

Research Agent with Web Search Capabilities

This YAML file defines a research agent named "root" that uses OpenAI's GPT-4o model. The description provides a brief summary of what the agent does, while the instruction section contains the detailed system prompt that tells the agent how to behave: here, as an expert research analyst that searches for current information, verifies facts, and provides structured summaries with citations. The | symbol after instruction: starts a YAML block scalar, so the indented lines that follow are kept as multi-line text.

agents:
  root:
    model: gpt4o  # refers to the gpt4o entry under models: below
    description: Advanced research agent with multiple tools
    instruction: |
      You are an expert research analyst with access to web search tools.
      Your capabilities include:
      - Real-time web searching and information gathering
      - Fact verification across multiple sources
      - Trend analysis and competitive intelligence
      - Academic and scientific research
      - Market research and business intelligence

      Always:
      1. Search for the most current information available
      2. Cross-reference multiple sources for accuracy
      3. Provide clear source attribution
      4. Distinguish between verified facts and speculation
      5. Offer analysis and insights based on findings
      6. Structure your responses clearly with key takeaways
    toolsets:
      - type: mcp
        command: docker
        args: ["mcp", "gateway", "run", "--servers=duckduckgo"]
      # Add more tools as needed
      # - type: mcp
      #   command: docker
      #   args: ["mcp", "gateway", "run", "--servers=brave"]

models:
  gpt4o:
    provider: openai
    model: gpt-4o
    max_tokens: 4000
    temperature: 0.1  # Lower temperature for more factual responses

The toolsets section is what gives the agent web search: it connects to Docker's MCP Gateway running a DuckDuckGo search server, enabling real-time web searching instead of relying only on training data. The models section at the bottom defines a named model configuration, gpt4o, with a token limit (max_tokens: 4000) and a low temperature (0.1, for more factual, less creative responses); these settings apply to an agent whose model field refers to that name. The result is an agent that can search the web and report current information rather than give generic advice.

Development Assistant with File Operations

This YAML defines a coding assistant powered by Anthropic's Claude Sonnet 4 that combines AI reasoning with practical development tools.

The agent is designed as an "expert coding assistant" with dual capabilities: it can read and write files in your local directory using the rust-mcp-filesystem tool, and search the web for documentation and solutions using DuckDuckGo via Docker's MCP Gateway.

This combination allows the agent to understand your existing codebase by reading files, research best practices and solutions online, and then implement changes directly to your files.

agents:
  root:
    model: anthropic/claude-sonnet-4-20250514  # Claude Sonnet 4
    description: A development assistant with file system access and web search
    instruction: |
      You are an expert coding assistant with access to file operations and web search.
      Your capabilities include:
      - Reading and writing files in the current directory
      - Searching the web for documentation, solutions, and current information
      - Code review, refactoring, and development assistance
      - Debugging and troubleshooting
      
      Always:
      1. Read existing files to understand the codebase structure
      2. Search for best practices and current solutions when needed
      3. Write clean, well-documented code
      4. Explain your changes and reasoning
      5. Follow the existing code style and conventions
    toolsets:
      - type: mcp
        command: docker
        args: ["mcp", "gateway", "run", "--servers=duckduckgo"]
      - type: mcp
        command: rust-mcp-filesystem
        args: ["--allow-write", "."]
        tools: ["read_file", "write_file"]

The instruction section establishes a methodical workflow where the agent first examines existing code to understand the project structure, researches current best practices when needed, and then writes clean, well-documented code while explaining its reasoning.

The toolsets configuration is what makes this powerful: the MCP (Model Context Protocol) tools give the agent real-world capabilities beyond text generation. Unlike a basic AI assistant that can only provide advice, this agent can read your code files, search for current documentation or solutions, and write code changes back to your filesystem, making it a development partner rather than just a chatbot.
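This configuration launches two external programs, docker and rust-mcp-filesystem, so both need to be on your PATH before the agent starts. A small sketch of that check follows; missing_commands is a hypothetical helper, not part of cagent.

```shell
# Print each named command that cannot be found on PATH.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}
```

Running missing_commands docker rust-mcp-filesystem prints whichever of the two is not installed. Note that this only confirms the docker binary exists; it does not verify that the docker mcp subcommand is available in your Docker installation.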

Integration with MCP

One of cagent's most powerful features is its integration with the Model Context Protocol (MCP). This enables agents to:

  • Search the web using DuckDuckGo
  • Access GitHub repositories for code analysis
  • Read and write files on the local system
  • Query databases for data retrieval
  • Integrate with APIs for external services
  • Send emails and notifications

Agent Sharing and Distribution

cagent includes built-in capabilities for sharing agents:

# Push your agent to Docker Hub
cagent push ./my_agent.yaml namespace/agent-name

# Pull and run someone else's agent
cagent pull creek/pirate
cagent run creek/pirate

For example, I pushed the research agent to my Docker Hub account:

cagent push ./research_agent.yaml ajeetraina777/researchagent



Pushing agent ./research_agent.yaml to ajeetraina777/researchagent
Successfully pushed artifact to ajeetraina777/researchagent

Sharing through Docker Hub gives developers a way to publish and reuse specialized agents for different use cases.
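If you keep several agent files in one directory, the push command shown earlier can be looped over them. The sketch below is one possible convention, not something cagent prescribes: the hypothetical helpers agent_ref and push_agents name each pushed agent after its file (lowercased, without the .yaml extension), and a DRY_RUN variable, also an assumption of this sketch, prints the commands instead of running them.

```shell
# Build a reference of the form <namespace>/<file name without .yaml>, lowercased.
# $1 = Docker Hub namespace, $2 = path to the agent file
agent_ref() {
  name=$(basename "$2" .yaml | tr '[:upper:]' '[:lower:]')
  printf '%s/%s\n' "$1" "$name"
}

# Push every .yaml file in a directory; set DRY_RUN=1 to only print the commands.
# $1 = Docker Hub namespace, $2 = directory containing agent files
push_agents() {
  for file in "$2"/*.yaml; do
    [ -e "$file" ] || continue            # the directory had no .yaml files
    ref=$(agent_ref "$1" "$file")
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "cagent push $file $ref"
    else
      cagent push "$file" "$ref" || return 1
    fi
  done
}
```

Running DRY_RUN=1 push_agents your-namespace ./agents first shows exactly which cagent push commands would be issued, so you can confirm the references before publishing.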

Use Cases: Where cagent Excels

A Futuristic cagent Marketplace at https://cagent.vercel.app

1. Content Creation and Marketing

  • Blog writing teams: Research agent → Content creator → Editor → SEO optimizer
  • Social media management: Trend analyzer → Content generator → Scheduler
  • Marketing campaigns: Market researcher → Copywriter → Designer coordinator

2. Software Development

  • Code review systems: Static analyzer → Security checker → Performance optimizer
  • DevOps automation: Monitoring agent → Issue detector → Fix implementer
  • Documentation generation: Code analyzer → Writer → Formatter

3. Research and Analysis

  • Financial analysis: Data collector → Trend analyzer → Report generator
  • Scientific research: Literature reviewer → Data analyzer → Hypothesis tester
  • Competitive intelligence: Web scraper → Analyzer → Report compiler

4. Customer Support

  • Ticket routing: Classifier → Specialist router → Response generator
  • Knowledge base management: Content updater → Accuracy checker → Publisher
  • Escalation handling: Issue detector → Priority assessor → Human handoff

5. Business Process Automation

  • Invoice processing: Document reader → Data extractor → System updater
  • HR workflows: Resume screener → Interview scheduler → Decision maker
  • Inventory management: Stock monitor → Reorder trigger → Supplier coordinator

Conclusion: The Agent-Driven Future

Docker cagent represents a shift in how we think about AI applications. By making multi-agent development as simple as writing a YAML file, cagent lowers the barrier to building multi-agent systems that previously required substantial custom orchestration code.

Whether you're a startup looking to automate business processes, a developer wanting to build AI-powered applications, or an enterprise seeking to scale intelligent automation, cagent provides the tools and framework you need to succeed.

The future belongs to intelligent agents working together, and with cagent, that future is available today.


Resources and Further Reading