Claude Code + Ollama: Run Powerful AI Coding with Free Models in 2026
Ollama now lets you launch Claude Code backed by free and cloud-hosted open-source models — including MiniMax M2.5, Kimi K2.5, and Ministral 3 — with a single terminal command. No Anthropic API key required.
The Big Shift: Ollama Meets Claude Code
In January 2026, Ollama released v0.14.0 with full support for Anthropic's Messages API, meaning Claude Code — Anthropic's powerful agentic coding tool — can now route its model calls to any Ollama-served model instead of being locked to Anthropic's paid cloud. This decouples Claude Code's best-in-class developer experience (plan mode, subagents, file editing, terminal commands) from the underlying model.
Ollama also introduced a new ollama launch command that sets up and runs Claude Code, OpenCode, and Codex with local or cloud models automatically — no environment variables or config files needed.
How It Works
Claude Code communicates with models using Anthropic's Messages API format. Since Ollama now speaks that same format, Claude Code connects to Ollama's local server as if it were Anthropic's cloud — but all inference happens on your machine or via Ollama's partner cloud. The ollama launch claude command handles all the wiring automatically.
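As a rough sketch of what travels over the wire, the request Claude Code sends is a standard Anthropic Messages API payload, just pointed at the local server. The endpoint path and Ollama's default port below are assumptions for illustration, not verified against v0.14.0:

```python
import json

# Ollama's default local port; the Messages-API path is an assumption.
OLLAMA_MESSAGES_URL = "http://localhost:11434/v1/messages"

# An Anthropic Messages API-style request body, as Claude Code would
# build it, targeting an Ollama-served model tag instead of a Claude model.
payload = {
    "model": "minimax-m2.5:cloud",  # any Ollama model tag works here
    "max_tokens": 1024,
    "messages": [
        {"role": "user", "content": "Write a function that reverses a string."}
    ],
}

# Claude Code serializes this to JSON and POSTs it with the usual
# Anthropic headers; the actual HTTP call is omitted to keep this offline.
body = json.dumps(payload)
print(body[:40])
```

Because the format is identical on both sides, swapping backends means changing only the base URL and model name, never the request shape.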
The Three Models You Can Use
1. MiniMax M2.5 — Best for Vibe Coding
ollama launch claude --model minimax-m2.5:cloud
MiniMax M2.5 is a cloud-hosted model available to Ollama users, offered free during promotional periods through Ollama's partnership with MiniMax AI. It is a sparse mixture-of-experts (MoE) model with 10B activated parameters, which keeps inference fast and inexpensive relative to dense models of comparable quality. It excels at UI/UX generation, native Android and iOS app development, advanced code review, and complex tool-calling workflows, and has been explicitly benchmarked against Claude Code, Cline, Kilo Code, and Roo Code.
Key highlights:
- Advanced Interleaved Thinking (first open-source model with this feature)
- Superior multilingual support, outperforming Claude Sonnet 4.5 in multilingual benchmarks
- Optimized for agentic scaffolding, with support for CLAUDE.md, .cursorrules, and slash commands
- Cloud model: no local GPU required
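Since the model reads the same project-steering files Claude Code already uses, you can shape its behavior per repository. A minimal, entirely hypothetical CLAUDE.md might look like:

```markdown
# CLAUDE.md (hypothetical example)

## Project conventions
- Language: TypeScript, strict mode enabled
- Run `npm test` after every change
- Never edit files under `vendor/`
```

The same file works unchanged whether the backend is Anthropic's cloud or an Ollama-served model.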
2. Kimi K2.5 — Best for Large Agentic Tasks
ollama launch claude --model kimi-k2.5:cloud
Kimi K2.5 is Moonshot AI's open-source native multimodal model built on approximately 15 trillion mixed visual and text tokens. It features 1 trillion total parameters (32B active) with a 256K context window, released under the MIT license. It's purpose-built for agentic workflows — making it ideal for long-running Claude Code sessions that involve tool calling, file navigation, and iterative code generation.
Key highlights:
- Native multimodal (vision + text pre-training)
- Can generate code directly from UI designs or video workflows
- 256K context window — larger than Claude Opus 4.5's 200K
- Cloud-hosted via Ollama — no heavy local hardware needed
- ~$0.81 per million tokens on Ollama Cloud
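At the quoted ~$0.81 per million tokens, even long agentic sessions stay cheap. The session size below is a made-up illustration, not a measured figure:

```python
# Back-of-the-envelope cost at Kimi K2.5's quoted Ollama Cloud rate.
PRICE_PER_MILLION = 0.81  # USD per 1M tokens, from the rate above

def session_cost(tokens: int) -> float:
    """Estimated cost in USD for a session consuming `tokens` tokens."""
    return tokens / 1_000_000 * PRICE_PER_MILLION

# A hypothetical heavy refactoring session: 2M tokens, input and output combined.
print(f"${session_cost(2_000_000):.2f}")  # → $1.62
```

For comparison, the same 2M-token session against a premium proprietary API would typically cost an order of magnitude more.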
3. Ministral 3 — Best for Local / Edge Deployment
ollama launch claude --model ministral-3
Ministral 3 is Mistral's open-source family (3B, 8B, 14B) designed specifically for edge deployment — capable of running on a wide range of hardware, including older machines. Released under the Apache 2.0 license, it supports vision, multilingual input, native function calling, structured JSON output, and even runs fully in-browser via WebGPU. With a 256K token context window and pricing around $0.10 per million tokens via Mistral's API, it's practically free to use.
Key highlights:
- Smallest variant (3B) runs locally, even on low-end hardware
- Fully multimodal with vision input support
- Native function/tool calling — critical for Claude Code's agentic features
- Apache 2.0 license — free to use and fine-tune commercially
- Runs 100% offline with no data leaving your machine
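Native tool calling is what lets a local model drive Claude Code's file and terminal operations: the agent registers tool schemas, and the model replies with structured calls against them. The tool below illustrates the general Anthropic Messages API shape; the name and fields are hypothetical, not Claude Code's actual internal schema:

```python
import json

# Hypothetical tool definition in the Anthropic Messages API shape;
# the name and fields are illustrative only.
read_file_tool = {
    "name": "read_file",
    "description": "Read a file from the workspace and return its contents.",
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Workspace-relative path"},
        },
        "required": ["path"],
    },
}

# A model with native function calling replies with structured JSON the
# agent can dispatch mechanically, rather than free text it must parse:
model_call = json.loads('{"tool": "read_file", "input": {"path": "src/main.py"}}')
print(model_call["tool"])  # → read_file
```

Models without this capability tend to break agentic loops by emitting tool calls as prose, which is why it's listed as critical above.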
Quick Setup Guide
Step 1: Install Ollama
# macOS
brew install ollama
# Linux
curl -fsSL https://ollama.com/install.sh | sh
Step 2: Install Claude Code
npm install -g @anthropic-ai/claude-code
Step 3: Sign in to Ollama (for cloud models)
ollama signin
Step 4: Launch with your chosen model
# Option A – MiniMax M2.5 (cloud, free/promotional)
ollama launch claude --model minimax-m2.5:cloud
# Option B – Kimi K2.5 (cloud, low cost)
ollama launch claude --model kimi-k2.5:cloud
# Option C – Ministral 3 (fully local)
ollama launch claude --model ministral-3
For larger context windows with cloud models, you can configure Ollama before launching:
OLLAMA_CONTEXT_LENGTH=64000 ollama serve
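To get a feel for what a 64K-token window holds, here is a rough capacity estimate. The 4-characters-per-token ratio is a common heuristic that varies by tokenizer and content, and the average line length is an assumption:

```python
# Rough capacity estimate for a 64,000-token context window.
CONTEXT_TOKENS = 64_000
CHARS_PER_TOKEN = 4   # heuristic; varies by tokenizer and language
CHARS_PER_LINE = 40   # assumed average line length of source code

approx_chars = CONTEXT_TOKENS * CHARS_PER_TOKEN
approx_lines = approx_chars // CHARS_PER_LINE
print(approx_lines)  # → 6400
```

By this estimate, a 64K window fits several thousand lines of code plus conversation history, which is usually enough for single-repo agentic work; raise the value for larger codebases if your hardware allows.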
Model Comparison
| Feature | MiniMax M2.5 | Kimi K2.5 | Ministral 3 |
|---|---|---|---|
| Hosting | Cloud ☁️ | Cloud ☁️ | Local 💻 |
| Parameters (active) | 10B | 32B | 3B / 8B / 14B |
| Context Window | Large | 256K | 256K |
| Vision | ✅ | ✅ | ✅ |
| License | Proprietary | MIT | Apache 2.0 |
| Cost | Free (promo) / paid | ~$0.81/M tokens | Free (local) |
| Best For | Vibe coding, UI gen | Long agentic sessions | Offline / edge use |
Why This Matters for Developers
This integration represents a genuine shift in AI coding economics. Previously, heavy Claude Code usage could rack up significant API costs. Now, developers can:
- Save costs by routing to free or near-free open models
- Protect privacy by keeping sensitive code off Anthropic's servers
- Work offline with local models like Ministral 3
- Switch models instantly without changing Claude Code's interface or workflow
Importantly, this setup keeps Claude Code itself (the agent, tooling, and UX) intact — only the model backend changes. You still get plan mode, subagents, parallel workflows, and file editing — just powered by a different brain.