I used to ignore every “free AI coding” post on sight.

Most of them are coupon math. Free for three days. Free until you breathe wrong. Free if you bring five API keys and pretend the invoice is someone else’s problem.

This one is different. Not because free models are suddenly magic, but because the tooling around them finally stopped feeling cursed.

You can get real code done now. Small patches, bug hunts, reviews, docs, test fixes, boring repo cleanup. The kind of work that actually stacks up over a week.

The catch is quota. Always quota.

A normal chat burns one request when you ask one thing. A coding agent burns requests while reading files, searching the repo, editing, running tests, getting confused, recovering, and asking the model again.

So the question is not “is it free?”

The question is: how long does the free day last before the agent hits the wall?

Start Here

If I had to set this up from zero today, I would keep it simple:

  1. Install free-coding-models.
  2. Use OpenCode as the main coding agent.
  3. Keep Freebuff around for quick second opinions.
  4. Add OpenRouter free models as spare fuel.

That is the whole stack.

Not glamorous. Not a giant spreadsheet. Just the stuff I would actually open in a real repo without feeling like I am debugging my tools instead of my code.

Everything else I checked either needs your own paid key, has a quota too tiny for agent work, or has a quota story I do not trust enough to recommend here.

Quotas In Human Units

Requests are not tasks. This is where people get baited.

A “50 requests/day” free tier sounds decent until you remember a coding agent can spend ten requests just figuring out where the bug lives.

My rough mental math:

  • 1-3 requests: one coding question.
  • 10-25 requests: one small patch.
  • 30-80 requests: one bug hunt.
  • 80-200 requests: one feature across a few files.
  • 200+ requests: refactor country.

So no, 50 free requests is not a daily coding subscription. It is one careful debugging session or a couple of clean patches if you keep the agent on a leash.

That leash matters. Ask for one change. Run the test yourself. Paste the failure back. Do not hand the agent your whole roadmap and watch it burn the day on vibes.

The Cheat Code: free-coding-models

free-coding-models is the star here.

It is not a coding agent. It is better than that. It is the thing that makes free coding agents less annoying to use.

The annoying part is never just “find a free model.” The annoying part is finding one that is alive right now, fast enough right now, and wired correctly into the tool you want to use right now.

free-coding-models does that boring work. It tracks free and free-limited models, checks what is working, shows what needs keys, and writes the config for your coding tool.

Install free-coding-models

1
2
npm install -g free-coding-models
free-coding-models

The flow is basically:

  1. Add a free provider key.
  2. Let it test the models.
  3. Pick something alive.
  4. Launch your coding tool from the same screen.

It supports providers like OpenRouter, Groq, NVIDIA, Cerebras, Mistral, and more. You press P to add keys, Z to cycle tools, and Ctrl+P when your brain forgets the hotkeys.

How It Works

The cool part is that it is not reading a stale leaderboard and praying.

It pings the free providers from your machine, in parallel, and builds the table from what is actually alive right now. That matters because free APIs are moody. A model can be great at noon and useless by dinner.

It also scores stability, not just raw speed. The score mixes p95 latency, jitter, spike rate, and uptime. That is the right instinct. A model that answers once in 300ms but randomly stalls for six seconds is not fast. It is annoying.

Once you pick a model, free-coding-models writes the endpoint and model ID into the coding tool config for you. That is the boring step everyone hates doing manually, and it is exactly the kind of boring step computers should eat.

Router mode is the really neat bit. Your agent talks to one local OpenAI-compatible endpoint, usually http://localhost:19280/v1, with model fcm. Behind that, free-coding-models keeps probing your active model set and routes around the usual free-tier nonsense: 429, timeouts, random 500s, provider maintenance pages, bad JSON.

This is not a privacy invisibility cloak. Your prompts still go to whichever provider handles the request. But the tool keeps the switching logic local, keeps your setup from becoming config soup, and limits telemetry to product analytics: no API keys, prompts, source code, file paths, or secrets. You can also turn it off with --no-telemetry.

That is why I like it. Free coding usually fails because one provider gets slow or rate-limited and your whole session dies. free-coding-models treats free models like flaky network routes: keep checking, keep scoring, move when one path breaks.

For OpenCode, this is the lazy path:

Launch OpenCode With A Free Model

1
free-coding-models --opencode

If you use Crush or Cline workflows, launch them from here too:

Launch Crush With A Free Model

1
free-coding-models --crush

Launch Cline With A Free Model

1
free-coding-models --cline

To actually run that router, start the daemon:

Start The Local Router

1
2
free-coding-models --daemon-bg
free-coding-models --daemon-status

Then configure your tool with:

  • Base URL: http://localhost:19280/v1
  • Model: fcm
  • API key: fcm-local

That is why I would put it first. OpenCode is the agent. free-coding-models is the switchboard. It turns “which free model is not dead today?” into a menu.

In real work, this does not make quotas infinite. It just wastes fewer requests. When one free provider slows down or throws a 429, you can move instead of staring at the wall.

OpenCode Zen

OpenCode is the first actual agent I would install.

It feels like a real CLI, not a prompt box pretending to be an IDE. It can read the repo, edit files, run commands, and stay out of the way enough that you still feel in control.

Install OpenCode

1
2
curl -fsSL https://opencode.ai/install | bash
opencode

OpenCode also has Zen, which includes free hosted models. Inside OpenCode, open /models and pick an opencode/ model.

The live Zen list currently includes:

  • opencode/big-pickle
  • opencode/deepseek-v4-flash-free
  • opencode/mimo-v2.5-free
  • opencode/qwen3.6-plus-free
  • opencode/minimax-m2.5-free
  • opencode/nemotron-3-super-free

The important bit: these are free, but not holy scripture. Some are feedback-window models. The list can move.

OpenCode does not publish a clean “X free requests per day” table for Zen. There is rate limiting, and the public source shows IP/window-based limits, but production numbers are not posted.

My practical budget for OpenCode free: a few small patches, one real bug hunt, or an evening of learning in a public repo. If the pool is generous that day, great. If it tightens up, switch models or stop.

Do not throw secrets at the free hosted models. Zen’s privacy docs have exceptions for free models, including trial logging and model improvement cases. Public code is fine. Sensitive client code is not the place to be heroic.

If you want predictable volume later, OpenCode Go is the cheap paid lane. Not free, but at least the math is visible: $12 of model usage per 5 hours, $30 weekly, $60 monthly.

Freebuff

Freebuff is the “I do not want to configure anything” option.

No API key dance. No model-router ceremony. Install it, open a repo, and ask it to do the thing.

Install Freebuff

1
2
3
npm install -g freebuff
cd ~/your-project
freebuff

Freebuff currently shows DeepSeek V4 Flash, DeepSeek V4 Pro, MiniMax M2.7, and Kimi K2.6 as active models. Its README says the free tier is supported by ads in the CLI.

There is no official request-per-day number I would plan my life around. That is the trade. It may feel generous in practice, but it is not a clean quota you can schedule against.

In real work, I would use it for supervised tasks: fix this bug, review this diff, add this route, wire this component, explain this ugly function.

I would not give it “build my app” and disappear.

The price is not money. The price is ads, cloud routing, less control over model choice, and trust. Fine for personal projects. Less fine for private code you would be embarrassed to leak.

OpenRouter Free Models

OpenRouter is not a coding agent. It is a model gateway.

That still matters because free-coding-models and OpenAI-compatible tools can use it as a provider.

The current free-model rule is simple:

  • 20 requests per minute.
  • 50 :free requests per day before buying credits.
  • 1,000 :free requests per day after buying at least $10 credits.

Pure free OpenRouter is backup fuel. Fifty requests can cover one careful bug hunt or two small patches. It will not survive a messy agent session where the model keeps rereading half the repo.

After the $10 unlock, it becomes much more useful. Not free-free, but cheap enough to keep as an emergency tank.

What I Am Not Recommending

Gemini CLI is not here. The quota story moved around too much, and I am not putting stale Google numbers in a guide about free usage.

GitHub Copilot Free is also not a real daily agent budget. The official free plan has 2,000 completions and 50 premium requests per month. Nice trial. Bad backbone.

I am also skipping the pure BYOK tools. Continue, Goose, and similar tools are useful if you already have keys or local models. But “bring your own key” is not the same thing as “free coding agent.”

My Free Setup

This is what I would actually run:

  • free-coding-models as the front door.
  • OpenCode for the main session.
  • Freebuff when I want a second agent quickly.
  • OpenRouter as one provider in the stack, not home base.

Then I would keep the workflow boring:

  1. Ask for a plan.
  2. Let the agent make one small change.
  3. Run the test myself.
  4. Paste the failure back.
  5. Repeat.

Free quotas punish vague prompts. The more specific you are, the longer the day lasts.

That is the whole trick. Use the free stuff like a sharp tool, not like an unlimited intern.