DeerFlow + Ollama: Exploring the Power of the Super Agent Harness


    Most AI tools today feel like smart autocomplete. You ask a question, you get text back, and you still have to do the actual work. DeerFlow from ByteDance tries to change that by doing the work.

    It spins up an isolated Docker environment with a real file system, a bash terminal, and the ability to read and write actual files. You give it a task, a lead agent breaks it into steps, sub-agents handle each step in parallel, and you get a finished result. That could be a report, a website, or a slide deck.

    It still has bugs, and you need to install the prerequisites carefully. The idea is sound, and if it matures, it could be a big deal.

    I am installing it with Ollama-based models on Ubuntu. I have one Nvidia RTX 6000 with 48 GB VRAM. If you run a smaller Ollama model, DeerFlow may not work well for serious tasks.

    For production or important jobs, go with a hosted API model. I will also show a switch to a Minimax API model after the local setup. The flow will be fully local first and can run offline.


    What makes this stand out is that it does not just tell you what to do. It actually does it inside a sandboxed container. That is different from a regular chatbot.

    The tool uses an OpenAI-compatible API, so you can point it at a local Ollama server. It can call tools for web search, fetch full page content, and run bash in the sandbox. It supports memory and checkpointing so your sessions can persist across restarts.
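Before wiring DeerFlow to Ollama, it helps to confirm the OpenAI-compatible endpoint responds. A minimal sketch; the model tag here is an assumption, so substitute whatever you pulled with `ollama pull`:

```shell
# Build a minimal chat-completions payload for Ollama's OpenAI-compatible API.
BASE_URL="http://localhost:11434/v1"   # host-side address; containers need host.docker.internal
MODEL="glm-4.7-flash"                  # assumed tag; substitute your pulled model

PAYLOAD=$(cat <<EOF
{"model": "$MODEL", "messages": [{"role": "user", "content": "ping"}]}
EOF
)
echo "$PAYLOAD"

# Send it once `ollama serve` is running:
# curl -s "$BASE_URL/chat/completions" -H "Content-Type: application/json" -d "$PAYLOAD"
```

If the commented `curl` returns a JSON completion, DeerFlow can use the same base URL and model name.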

    If you are comparing options and patterns for orchestrating tasks with agents, browse our agent write-ups. They will give you context as you plan your stack.

    Local setup on Ubuntu with Ollama

    I am using a local GLM 4.7 Flash model via Ollama. The machine has enough VRAM to respond quickly under load. Smaller models may struggle.

    Install Docker, Node.js, npm, and pnpm. If you do not have pnpm, install it globally.

    npm install -g pnpm

    Create a dedicated environment with Python 3.12. I am using conda.

    conda create -n deerflow python=3.12 -y
    conda activate deerflow

    Clone the DeerFlow repo and the DeerFlow installer repo. From the root of the main repo, generate the config files. Open the config in your editor and adjust the model section for Ollama.


    Model configuration for local Ollama

    Point DeerFlow at your local Ollama with the OpenAI-compatible endpoint. Because DeerFlow runs inside Docker, use the host address that resolves correctly from inside containers.
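On plain Linux, `host.docker.internal` does not resolve inside containers unless it is mapped explicitly (Docker Desktop adds the mapping for you). A sketch for checking this, which degrades gracefully when Docker is not available:

```shell
# Check whether host.docker.internal resolves from inside a container.
# On Linux you usually need the explicit host-gateway mapping shown here.
if command -v docker >/dev/null 2>&1; then
  docker run --rm --add-host=host.docker.internal:host-gateway alpine \
    getent hosts host.docker.internal || echo "mapping failed"
else
  echo "docker not installed"
fi
```

If the lookup prints an address, the `base_url` in the config below will reach your host-side Ollama from inside the sandbox.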


    models:
      default:
        provider: openai
        base_url: http://host.docker.internal:11434/v1
        api_key: "ollama"
        model: "glm-4.7-flash"
        temperature: 0.7
    tools:
      web_search:
        provider: duckduckgo
      web_reader:
        provider: jina
      file_bash:
        enabled: true
    sandbox:
      docker_aio: true
    skills:
      path: "../deerflow-installer/skills"
    memory:
      file: "memory.json"
    checkpoints:
      sqlite: "state.db"


    This tells DeerFlow to use your local model through an OpenAI-compatible API. The sandbox setting spins up an isolated container per task with its own file system and terminal. Skills point at prebuilt workflows for research, slide generation, and data analysis.

    If you are exploring Ollama setups and model choices, see the Ollama collection for more context.

    Initialize the Docker sandbox

    From the root of the repo, pull the sandbox image and start the stack.

    make docker init
    make docker start


    This starts multiple containers for the frontend, LangGraph gateway, and Nginx. Access the UI at http://localhost:2026.
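A quick health check after `make docker start`; the `check_ui` helper is just a convenience written for this post, and the container-listing line assumes Docker is on your PATH:

```shell
# Probe the DeerFlow UI port; prints a status line whether or not it answers.
check_ui() {
  curl -sf -o /dev/null "http://localhost:$1" \
    && echo "UI is up on port $1" \
    || echo "UI not reachable on port $1"
}

# List running containers if Docker is available (expect frontend, gateway, nginx).
{ command -v docker >/dev/null 2>&1 && docker ps --format '{{.Names}}\t{{.Status}}'; } || true

check_ui 2026
```

If the UI probe fails but the containers are listed, give Nginx a few seconds and retry before digging into logs.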


    If your stack fails to start or a job aborts mid-run, review your logs and compare symptoms with this troubleshooting guide: Agent execution terminated error.

    Working in the UI

    Create an agent and give it a grounding prompt for your use case. You can keep chatting with that agent and it will stay focused. I noticed the UI can be slow at times.

    DeerFlow can search the web and fetch full page content if you provide API keys for the configured tools. The bash tool runs inside the sandbox and can read and write files. Memory persists preferences in memory.json and you can keep conversation state with SQLite.
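The persisted files are plain formats you can inspect directly. A sketch with an illustrative `memory.json` (the keys are made up, not DeerFlow's actual schema), plus the SQLite command for the checkpoint database:

```shell
# Stand-in memory file with illustrative keys; DeerFlow writes its own schema.
cat > /tmp/memory.json <<'EOF'
{"preferences": {"report_style": "structured", "language": "en"}}
EOF

# memory.json is plain JSON, so standard tools work for inspection:
python3 -m json.tool /tmp/memory.json

# Checkpoints live in SQLite; once the stack has written state, list its tables:
# sqlite3 state.db '.tables'
```

Being able to open both files outside the app is handy when debugging a stuck session.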


    Switching to a hosted API model

    If you want better performance and reach, use a hosted API model. I switched to Minimax for this purpose. Here is how I adjusted the configuration.

    Minimax configuration

    Edit your model section to point to Minimax. Set temperature to 1 or Minimax will return an error.


    models:
      default:
        provider: minimax
        base_url: https://api.minimax.chat/v1
        api_key: ${MINIMAX_API_KEY}
        model: "M2.7"
        temperature: 1

    Add your API key to the environment file in the repo root.


    MINIMAX_API_KEY=your_real_key_here

    Restart the stack so the changes take effect.

    make docker stop
    make docker start

    Open the UI again at http://localhost:2026, create a new agent, and pick the Minimax model. You can name it, pick a purpose like a structured report writer, and start chatting.


    How it executes work

    DeerFlow is not a single agent. It is a harness: infrastructure around an agentic core that coordinates multiple sub-agents.

    It can orchestrate steps in parallel and persist context. It can read local files inside its sandbox, call tools, and write outputs to a workspace directory.
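After a run, the easiest way to review what it produced is to list the workspace. A sketch; the workspace path and the stand-in artifact are assumptions, so check where your sandbox actually mounts outputs:

```shell
# Assumed output location; adjust to where your sandbox mounts artifacts.
WORKDIR="${DEERFLOW_WORKSPACE:-./workspace}"
mkdir -p "$WORKDIR"

# Stand-in artifact so the listing shows something; a real run writes these itself.
printf '# Draft report\n' > "$WORKDIR/report.md"

ls -l "$WORKDIR"
```
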

    If multi-agent patterns are on your roadmap, compare with our notes on Hermes style multi-agent flows on Ollama. It will help you understand tradeoffs across local and hosted approaches.

    Use cases

    Give it a research topic and ask for a clear, structured report with citations. It will plan the steps, search, read, and assemble a document in your workspace.


    Ask it to generate a simple static site from a brief and assets in the workspace. It will write files, run shell tasks, and produce an artifact you can serve.

    Point it at a dataset and request an exploratory analysis. It can script, run commands, and save a notebook or summary.

    If you are mapping broader agent design patterns for your team, browse the AI agent references. They will give you a wider view of orchestration strategies.

    Limitations and notes

    You will likely need multiple API keys to unlock meaningful web and tool coverage. Running only local models limits reach compared to stronger hosted models.

    I keep local Ollama models for private work and testing. For production-grade tasks, I recommend switching to a robust API model inside the same setup.

    Final thoughts

    DeerFlow sets up a real sandbox, breaks tasks into steps, calls tools, and produces artifacts. That is a material step up from a plain chatbot.

    It still needs polish and careful setup of prerequisites and API keys. With the right model and configuration, it can research, write, and execute tasks on your behalf.


    Sonu Sahani

    AI Engineer & Full Stack Developer. Passionate about building AI-powered solutions.
