
DeerFlow + Ollama: Exploring the Power of the Super Agent Harness
Most AI tools today feel like smart autocomplete. You ask a question, you get text back, and you still have to do the actual work. DeerFlow from ByteDance tries to change that by doing the work.
It spins up an isolated Docker environment with a real file system, a bash terminal, and the ability to read and write actual files. You give it a task, a lead agent breaks it into steps, sub-agents handle each step in parallel, and you get a finished result. That could be a report, a website, or a slide deck.
There are still bugs, and you need to install the prerequisites carefully. The core idea is solid, though, and if the rough edges get smoothed out, it could be a big deal.
I am installing it with Ollama-based models on Ubuntu. I have one Nvidia RTX 6000 with 48 GB VRAM. If you run a smaller Ollama model, DeerFlow may not work well for serious tasks.
For production or important jobs, go with a hosted API model. I will also show a switch to a Minimax API model after the local setup. The flow will be fully local first and can run offline.
What makes this stand out is that it does not just tell you what to do. It actually does it inside a sandboxed container. That is different from a regular chatbot.
The tool uses an OpenAI-compatible API, so you can point it at a local Ollama server. It can call tools for web search, fetch full page content, and run bash in the sandbox. It supports memory and checkpointing so your sessions can persist across restarts.
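Because the surface is OpenAI-compatible, any standard client can talk to the local Ollama server. As a minimal sketch using only the standard library (the model name and port match the config shown later; `host.docker.internal` applies when calling from inside a container, `localhost` from the host):

```python
import json
import urllib.request

OLLAMA_BASE_URL = "http://localhost:11434/v1"  # use host.docker.internal:11434 from inside Docker


def build_chat_request(model: str, prompt: str, temperature: float = 0.7) -> dict:
    """Build an OpenAI-compatible /chat/completions payload for Ollama."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }


def send_chat_request(payload: dict) -> dict:
    """POST the payload to the local Ollama server (requires Ollama to be running)."""
    req = urllib.request.Request(
        f"{OLLAMA_BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


payload = build_chat_request("glm-4.7-flash", "Summarize what DeerFlow does.")
```

This is the same request shape DeerFlow sends under the hood, which is why pointing its `base_url` at Ollama just works.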
If you are comparing options and patterns for orchestrating tasks with agents, browse our agent write-ups. They will give you context as you plan your stack.
Local setup on Ubuntu with Ollama
I am using a local GLM 4.7 Flash model via Ollama. The machine has enough VRAM to respond quickly under load. Smaller models may struggle.
Install Docker, Node.js, npm, and pnpm. If you do not have pnpm, install it globally.
npm install -g pnpm

Create a dedicated environment with Python 3.12. I am using conda.
conda create -n deerflow python=3.12 -y
conda activate deerflow

Clone the DeerFlow repo and the DeerFlow installer repo. From the root of the main repo, generate the config files. Open the config in your editor and adjust the model section for Ollama.

Model configuration for local Ollama
Point DeerFlow at your local Ollama with the OpenAI-compatible endpoint. Because DeerFlow runs inside Docker, use the host address that resolves correctly from inside containers.

models:
  default:
    provider: openai
    base_url: http://host.docker.internal:11434/v1
    api_key: "ollama"
    model: "glm-4.7-flash"
    temperature: 0.7
tools:
  web_search:
    provider: duckduckgo
  web_reader:
    provider: jina
  file_bash:
    enabled: true
sandbox:
  docker_aio: true
skills:
  path: "../deerflow-installer/skills"
memory:
  file: "memory.json"
checkpoints:
  sqlite: "state.db"
This tells DeerFlow to use your local model through an OpenAI-compatible API. The sandbox setting spins up an isolated container per task with its own file system and terminal. Skills point at prebuilt workflows for research, slide generation, and data analysis.
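A quick sanity check on the config can save a restart cycle. Here is an illustrative validator for a config loaded into a dict; the required keys mirror the sections above, but DeerFlow's own validation may differ:

```python
def validate_config(cfg: dict) -> list[str]:
    """Return a list of problems found in a DeerFlow-style config dict."""
    problems = []
    model = cfg.get("models", {}).get("default", {})
    for key in ("provider", "base_url", "model"):
        if key not in model:
            problems.append(f"models.default missing '{key}'")
    if "sandbox" not in cfg:
        problems.append("missing 'sandbox' section")
    return problems


cfg = {
    "models": {"default": {"provider": "openai",
                           "base_url": "http://host.docker.internal:11434/v1",
                           "model": "glm-4.7-flash"}},
    "sandbox": {"docker_aio": True},
}
print(validate_config(cfg))  # an empty list means the required keys are present
```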
If you are exploring Ollama setups and model choices, see the Ollama collection for more context.
Initialize the Docker sandbox
From the root of the repo, pull the sandbox image and start the stack.
make docker init
make docker start
This starts multiple containers for the frontend, LangGraph gateway, and Nginx. Access the UI at http://localhost:2026.
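The containers can take a moment to come up, so instead of refreshing the browser blindly you can poll the UI until it answers. A small sketch with exponential backoff (the URL and timings are illustrative):

```python
import time
import urllib.error
import urllib.request


def backoff_schedule(base: float = 1.0, factor: float = 2.0, retries: int = 5) -> list[float]:
    """Exponential backoff delays: 1, 2, 4, 8, 16 seconds by default."""
    return [base * factor**i for i in range(retries)]


def wait_for_ui(url: str = "http://localhost:2026") -> bool:
    """Poll the DeerFlow UI until it responds or the schedule is exhausted."""
    for delay in backoff_schedule():
        try:
            with urllib.request.urlopen(url, timeout=2):
                return True
        except (urllib.error.URLError, OSError):
            time.sleep(delay)
    return False
```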

If your stack fails to start or a job aborts mid-run, review your logs and compare symptoms with this troubleshooting guide: Agent execution terminated error.
Working in the UI
Create an agent and give it a grounding prompt for your use case. You can keep chatting with that agent and it will stay focused. I noticed the UI can be slow at times.
DeerFlow can search the web and fetch full page content if you provide API keys for the configured tools. The bash tool runs inside the sandbox and can read and write files. Memory persists preferences in memory.json and you can keep conversation state with SQLite.
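Since memory is just a JSON file, it is easy to inspect or seed by hand. An illustrative read-modify-write helper (the file layout here is an assumption, not DeerFlow's actual schema):

```python
import json
import tempfile
from pathlib import Path


def update_memory(path: Path, key: str, value: str) -> dict:
    """Merge one preference into a memory.json-style file and return the result."""
    memory = json.loads(path.read_text()) if path.exists() else {}
    memory[key] = value
    path.write_text(json.dumps(memory, indent=2))
    return memory


# Example: record a report-style preference in a scratch file.
scratch = Path(tempfile.mkdtemp()) / "memory.json"
update_memory(scratch, "report_style", "structured with citations")
print(json.loads(scratch.read_text()))
```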

Switching to a hosted API model
If you want better performance and reach, use an API model. I switched to Minimax for this purpose. Here is how I adjusted configuration.
Minimax configuration
Edit your model section to point to Minimax. Set temperature to 1 or Minimax will return an error.

models:
  default:
    provider: minimax
    base_url: https://api.minimax.chat/v1
    api_key: ${MINIMAX_API_KEY}
    model: "M2.7"
    temperature: 1

Add your API key to the environment file in the repo root.

MINIMAX_API_KEY=your_real_key_here

Restart the stack so the changes take effect.
make docker stop
make docker start

Open the UI again at http://localhost:2026, create a new agent, and pick the Minimax model. You can name it, pick a purpose like a structured report writer, and start chatting.
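Because the config reads `${MINIMAX_API_KEY}` from the environment, it is worth confirming the variable actually resolves before restarting the stack. A hypothetical check (the Bearer header shape follows the common convention; Minimax's exact auth requirements may differ):

```python
import os


def auth_headers(env_var: str = "MINIMAX_API_KEY") -> dict:
    """Build Bearer auth headers from an environment variable, failing loudly if unset."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; the ${{{env_var}}} placeholder will not resolve")
    return {"Authorization": f"Bearer {key}", "Content-Type": "application/json"}


os.environ.setdefault("MINIMAX_API_KEY", "test_key")  # stand-in for the real key
headers = auth_headers()
```

An unset key fails here with a clear message instead of surfacing later as an opaque 401 from the gateway.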

How it executes work
DeerFlow is not a single agent. It is an orchestration layer built around an agentic harness that coordinates multiple sub-agents.
It can orchestrate steps in parallel and persist context. It can read local files inside its sandbox, call tools, and write outputs to a workspace directory.
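The fan-out pattern is easy to picture: a lead step splits the work, sub-agent steps run concurrently, and results are merged. A toy sketch with a thread pool, where the step functions stand in for real sub-agents calling tools in the sandbox:

```python
from concurrent.futures import ThreadPoolExecutor


def plan(task: str) -> list[str]:
    """Lead agent: break a task into independent steps (toy planner)."""
    return [f"{task}: research", f"{task}: draft", f"{task}: review"]


def run_step(step: str) -> str:
    """Sub-agent: execute one step (stand-in for tool calls in the sandbox)."""
    return f"done[{step}]"


def orchestrate(task: str) -> list[str]:
    """Fan the planned steps out in parallel and collect results in order."""
    steps = plan(task)
    with ThreadPoolExecutor(max_workers=len(steps)) as pool:
        return list(pool.map(run_step, steps))


print(orchestrate("report"))
# → ['done[report: research]', 'done[report: draft]', 'done[report: review]']
```

`pool.map` preserves step order even though execution is concurrent, which is what lets a lead agent assemble sub-agent outputs deterministically.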
If multi-agent patterns are on your roadmap, compare with our notes on Hermes style multi-agent flows on Ollama. It will help you understand tradeoffs across local and hosted approaches.
Use cases
Give it a research topic and ask for a clear, structured report with citations. It will plan the steps, search, read, and assemble a document in your workspace.

Ask it to generate a simple static site from a brief and assets in the workspace. It will write files, run shell tasks, and produce an artifact you can serve.
Point it at a dataset and request an exploratory analysis. It can script, run commands, and save a notebook or summary.
If you are mapping broader agent design patterns for your team, browse the AI agent references. They will give you a wider view of orchestration strategies.
Limitations and notes
You will likely need multiple API keys to unlock meaningful web and tool coverage. Running only local models limits reach compared to stronger hosted models.
I keep local Ollama models for private work and testing. For production-grade tasks, I recommend switching to a robust API model inside the same setup.
Final thoughts
DeerFlow sets up a real sandbox, breaks tasks into steps, calls tools, and produces artifacts. That is a material step up from a plain chatbot.
It still needs polish and careful setup of prerequisites and API keys. With the right model and configuration, it can research, write, and execute tasks on your behalf.