Claude Code + Ollama: Free AI Coding with Open Models

Ollama now lets you launch Claude Code backed by free and cloud-hosted open-source models — including MiniMax M2.5, Kimi K2.5, and Ministral 3 — with a single terminal command. No Anthropic API key required.

The Big Shift: Ollama Meets Claude Code

In January 2026, Ollama released v0.14.0 with support for Anthropic's Messages API. That means Claude Code, Anthropic's agentic coding tool, can now route its model calls to any Ollama-served model instead of being locked to Anthropic's paid cloud. This decouples Claude Code's best-in-class developer experience (plan mode, subagents, file editing, terminal commands) from the underlying model.

Ollama also introduced a new ollama launch command that sets up and runs Claude Code, OpenCode, and Codex with local or cloud models automatically — no environment variables or config files needed.
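For the curious, the effect of `ollama launch claude` can probably be reproduced by hand by pointing Claude Code's Anthropic client at the local Ollama server. The sketch below uses `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN`, which are real Claude Code settings; using them with Ollama's endpoint is an assumption about what the launch command automates, not something Ollama documents.

```python
import os
import subprocess

# Hedged sketch of the wiring `ollama launch claude` is described as
# automating: point Claude Code's Anthropic client at the local Ollama
# server instead of Anthropic's cloud. The specific values are assumptions.
env = dict(os.environ)
env["ANTHROPIC_BASE_URL"] = "http://localhost:11434"  # Ollama's default port
env["ANTHROPIC_AUTH_TOKEN"] = "ollama"  # placeholder; a local server needs no real key

# Uncomment to start Claude Code against the local server:
# subprocess.run(["claude"], env=env, check=True)
print(env["ANTHROPIC_BASE_URL"])
```

The advantage of the one-command `ollama launch` path is that none of this leaks into your shell profile; the environment is scoped to the launched process.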

How It Works

Claude Code communicates with models using Anthropic's Messages API format. Since Ollama now speaks that same format, Claude Code connects to Ollama's local server as if it were Anthropic's cloud, while inference actually happens on your machine or in Ollama's partner cloud. The ollama launch claude command handles all the wiring automatically.
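To make "speaks the same format" concrete, here is a sketch of the Messages API request shape Claude Code emits. With Ollama v0.14.0+ the same JSON should be accepted by the local server; the endpoint path below is an assumption for illustration, not taken from Ollama's docs.

```python
import json

# Hypothetical local endpoint; Anthropic's cloud uses /v1/messages, and we
# assume Ollama mirrors that path on its default port.
OLLAMA_MESSAGES_URL = "http://localhost:11434/v1/messages"

payload = {
    "model": "ministral-3",  # any Ollama-served model tag works here
    "max_tokens": 1024,
    "messages": [
        {"role": "user", "content": "Write a Python function that reverses a string."},
    ],
}

# Claude Code serializes this and POSTs it with the usual Anthropic headers;
# Ollama answers in the same Messages response format.
body = json.dumps(payload)
print(body)
```

Because the request and response shapes are identical, Claude Code's agent loop never needs to know which backend it is talking to.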

The Three Models You Can Use

1. MiniMax M2.5 — Best for Vibe Coding

ollama launch claude --model minimax-m2.5:cloud

MiniMax M2.5 is a cloud-hosted model available to Ollama users, offered free during promotional periods through Ollama's partnership with MiniMax AI. It is a sparse mixture-of-experts model with 10B activated parameters, delivering state-of-the-art performance while remaining cheap to serve. It excels at UI/UX generation, native Android and iOS app development, advanced code review, and complex tool-calling workflows, and has been benchmarked in coding agents including Claude Code, Cline, Kilo Code, and Roo Code.

Key highlights:

  • Advanced Interleaved Thinking (first open-source model with this feature)
  • Superior multilingual support, outperforming Claude Sonnet 4.5 in multilingual benchmarks
  • Optimized for agentic scaffolding with support for CLAUDE.md, .cursorrules, and slash commands
  • Cloud model — no local GPU required

2. Kimi K2.5 — Best for Large Agentic Tasks

ollama launch claude --model kimi-k2.5:cloud

Kimi K2.5 is Moonshot AI's open-source, natively multimodal model, trained on roughly 15 trillion mixed vision and text tokens. It has 1 trillion total parameters (32B active) and a 256K context window, and is released under the MIT license. It's purpose-built for agentic workflows, making it ideal for long-running Claude Code sessions that involve tool calling, file navigation, and iterative code generation.

Key highlights:

  • Native multimodal (vision + text pre-training)
  • Can generate code directly from UI designs or video workflows
  • 256K context window — larger than Claude Opus 4.5's 200K
  • Cloud-hosted via Ollama — no heavy local hardware needed
  • ~$0.81 per million tokens on Ollama Cloud

3. Ministral 3 — Best for Local / Edge Deployment

ollama launch claude --model ministral-3

Ministral 3 is Mistral's open-source model family (3B, 8B, and 14B) designed specifically for edge deployment, capable of running on a wide range of hardware, including older machines. Released under the Apache 2.0 license, it supports vision, multilingual input, native function calling, and structured JSON output, and can even run fully in-browser via WebGPU. With a 256K token context window and pricing around $0.10 per million tokens via Mistral's API, it's practically free to use.

Key highlights:

  • Smallest variant (3B) runs locally, even on low-end hardware
  • Fully multimodal with vision input support
  • Native function/tool calling — critical for Claude Code's agentic features
  • Apache 2.0 license — free to use and fine-tune commercially
  • Runs 100% offline with no data leaving your machine

Quick Setup Guide

Step 1: Install Ollama

# macOS
brew install ollama

# Linux
curl -fsSL https://ollama.com/install.sh | sh

Step 2: Install Claude Code

npm install -g @anthropic-ai/claude-code

Step 3: Sign in to Ollama (for cloud models)

ollama signin

Step 4: Launch with your chosen model

# Option A – MiniMax M2.5 (cloud, free/promotional)
ollama launch claude --model minimax-m2.5:cloud

# Option B – Kimi K2.5 (cloud, low cost)
ollama launch claude --model kimi-k2.5:cloud

# Option C – Ministral 3 (fully local)
ollama launch claude --model ministral-3

For larger context windows with cloud models, you can configure Ollama before launching:

OLLAMA_CONTEXT_LENGTH=64000 ollama serve

Model Comparison

| Feature             | MiniMax M2.5        | Kimi K2.5             | Ministral 3        |
|---------------------|---------------------|-----------------------|--------------------|
| Hosting             | Cloud ☁️            | Cloud ☁️              | Local 💻           |
| Parameters (active) | 10B                 | 32B                   | 3B / 8B / 14B      |
| Context window      | Large               | 256K                  | 256K               |
| Vision              | —                   | ✓                     | ✓                  |
| License             | Proprietary         | MIT                   | Apache 2.0         |
| Cost                | Free (promo) / paid | ~$0.81/M tokens       | Free (local)       |
| Best for            | Vibe coding, UI gen | Long agentic sessions | Offline / edge use |

Why This Matters for Developers

This integration represents a genuine shift in AI coding economics. Previously, heavy Claude Code usage could rack up significant API costs. Now, developers can:

  • Save costs by routing to free or near-free open models
  • Protect privacy by keeping sensitive code off Anthropic's servers
  • Work offline with local models like Ministral 3
  • Switch models instantly without changing Claude Code's interface or workflow

Importantly, this setup keeps Claude Code itself (the agent, tooling, and UX) intact — only the model backend changes. You still get plan mode, subagents, parallel workflows, and file editing — just powered by a different brain.
