AI Coding Agent Pricing in 2026: Per-Seat vs Per-Task vs Self-Hosted Math
Real annual cost math for Claude Code, Cursor, Copilot, Codex, and self-hosted Aider across a 10-engineer team. The token-volume thresholds that flip the answer.

The Quick Answer for 10-Engineer Teams
For a 10-engineer team writing code full-time in 2026, annual cost runs roughly as follows: GitHub Copilot Business at $2,280/year (cheapest hosted option, weakest agentic loop), Cursor Business at $4,800/year (best ergonomics, usage-capped), Claude Code Max at $24,000/year (highest ceiling, Claude inside the CLI subject only to a fair-use cap), and self-hosted Aider + Qwen 3.5 32B at roughly $1,800/year (one RTX 5090 box plus electricity, amortized over three years, before engineering time). The answer hinges on token volume per developer per day: under 200K tok/dev/day, the cheapest hosted seat is enough; past 6M tok/dev/day, self-hosting wins on pure economics.
Last updated: April 2026 — verified Anthropic, Cursor, GitHub Copilot, and OpenAI pricing pages; included new entrants Codebuff, Augment, Cline, Roo Code.
Pricing Table: Every Major Coding Agent in April 2026
| Tool | Per-seat / month | Annual / dev | Token model | Best for |
|---|---|---|---|---|
| Claude Code Max | $200 | $2,400 | Unlimited inside Claude Code CLI, fair-use cap | Heavy agentic loops, long sessions |
| Cursor Business | $40 | $480 | 500 fast requests + unlimited slow | IDE-first, multi-file edits |
| Cursor Pro | $20 | $240 | 500 fast / mo | Solo devs, light usage |
| GitHub Copilot Business | $19 | $228 | Unmetered chat + completions | Cost-sensitive teams, Microsoft shops |
| GitHub Copilot Individual | $10 | $120 | Unmetered | Hobbyists, students |
| Codex (ChatGPT Pro plugin) | $200 | $2,400 | Bundled with ChatGPT Pro tier | OpenAI-loyal devs already on Pro |
| Codebuff | $15 | $180 | Pay-per-task on top of base | Lightweight terminal agent |
| Augment Code | $30 (BYO key) | varies | BYO Anthropic / OpenAI key | Procurement-blocked teams |
| Cline | Free | API costs only | BYO API key, OSS | Cost-conscious power users |
| Roo Code | Free | API costs only | BYO API key, OSS fork of Cline | Same as Cline, faster updates |
| Aider + Qwen 3.5 32B (self-hosted) | n/a | ~$1,800 for the whole team (~$180/dev at 10 devs) | Hardware + electricity | Compliance-blocked teams, very heavy use |
Three things change the picture from this table. First, "unmetered" never quite means it — Copilot has soft rate limits at roughly 60 requests/hour per user, Cursor's "unlimited slow" requests queue under load, and Claude Code Max enforces a fair-use review if you sustain more than ~50M tokens/dev/day. Second, every per-seat plan is priced for the median dev — heavy users subsidize light ones. Third, BYO-API-key tools (Augment, Cline, Roo) shift cost from the seat to the underlying model API; for most teams that's Anthropic Claude or OpenAI billing, which can dwarf any seat cost.
Real Workload Math: 10 Devs, 22 Working Days
The number nobody runs publicly: 200K tokens/dev/day × 22 working days × 10 devs = 44M tokens/month. That's a fair midpoint for teams using AI agents seriously — frequent multi-file edits, code review, test generation, refactors. Light users hit 50K/dev/day; heavy agentic-loop users (everything-via-Claude-Code workflow) cross 1M/dev/day.
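The volume arithmetic above is easy to rerun with your own numbers. The sketch below uses the article's three usage profiles and its 22-working-day month; the function and dictionary names are illustrative, not taken from any tool.

```python
WORKING_DAYS_PER_MONTH = 22  # figure used in the article


def monthly_team_tokens(tokens_per_dev_per_day: int,
                        devs: int = 10,
                        working_days: int = WORKING_DAYS_PER_MONTH) -> int:
    """Team-wide tokens per month: per-dev daily volume x working days x headcount."""
    return tokens_per_dev_per_day * working_days * devs


# Daily per-dev volumes quoted in the article for each usage profile.
PROFILES = {"light": 50_000, "midpoint": 200_000, "heavy": 1_000_000}

if __name__ == "__main__":
    for name, per_day in PROFILES.items():
        millions = monthly_team_tokens(per_day) / 1e6
        print(f"{name:9s} {millions:6.0f}M tokens/month")
```

The midpoint profile reproduces the 44M tokens/month figure; the light and heavy profiles bracket it at 11M and 220M.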
Hosted Per-Seat (Cursor Business, 10 seats)
Flat $40 × 10 × 12 = $4,800/year. Cursor's 500 fast requests/dev/mo budget covers the 200K-tok/dev/day midpoint comfortably. Past roughly 1M tokens/dev/day, the slow-request queue starts hurting iteration speed and you'd push toward Claude Code Max.
Claude Code Max (10 seats)
$200/dev/mo × 10 × 12 = $24,000/year. Buys you uncapped Claude Sonnet 4.6 and Opus 4.7 inside Claude Code, including the agentic harness, sub-agents, and skills. The pricing only makes sense if your team genuinely uses Opus daily — at the 200K-tok/dev/day midpoint, equivalent API spend would be ~$360/dev/mo, so the bundle saves roughly 45%.
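As a rough check on the seat math and the bundle-versus-API comparison, here is a minimal sketch. The $360/dev/mo equivalent API spend is the article's estimate, not a measured value, and the function names are illustrative.

```python
def annual_seat_cost(price_per_seat_month: float, seats: int = 10) -> float:
    """Flat seat pricing for a year: price x seats x 12 months."""
    return price_per_seat_month * seats * 12


def bundle_savings(bundle_price: float, equivalent_api_spend: float) -> float:
    """Fraction saved by paying the flat bundle instead of metered API rates.

    Negative when the bundle costs more than the API spend it replaces.
    """
    return 1 - bundle_price / equivalent_api_spend


if __name__ == "__main__":
    print(f"Cursor Business: ${annual_seat_cost(40):,.0f}/year")
    print(f"Claude Code Max: ${annual_seat_cost(200):,.0f}/year")
    # Article's estimate of equivalent API spend per dev per month.
    print(f"Max vs $360 API spend: {bundle_savings(200, 360):.0%} saved")
```

The savings fraction turns negative as soon as a developer's equivalent API spend drops below the $200 seat price, which is the case for occasional users.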
BYO-Key with Cline + Anthropic API (10 devs)
Cline is free. Anthropic's prompt caching, applied aggressively, brings the effective input rate down to about $0.30/M (vs the headline $3/M for Sonnet 4.6). At 44M tokens/month with a 70/30 input/output split, that's roughly 30M cached input + 13M output ≈ $9/mo + $195/mo (an implied output rate of $15/M) = ~$204/mo for the team, or roughly $2,448/year. That is about half of Cursor Business and close to Copilot Business, but it assumes nearly every input token is a cache hit; the cache hit rate is the entire ballgame.
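To see how much the cache hit rate matters, the sketch below reruns the estimate with the hit rate as a parameter. The $0.30/M and $3/M input rates come from the paragraph above; the $15/M output rate is inferred from the article's $195 for 13M output tokens; the linear blending of cached and uncached input is a simplifying assumption, not Anthropic's billing formula.

```python
def byo_key_monthly_cost(total_tokens_m: float,
                         cache_hit_rate: float = 1.0,
                         input_share: float = 0.70,
                         cached_input_rate: float = 0.30,
                         headline_input_rate: float = 3.00,
                         output_rate: float = 15.00) -> float:
    """Estimated monthly API bill in USD for a token volume given in millions.

    Assumption: input cost blends linearly between the cached and headline
    rates according to the cache hit rate.
    """
    input_m = total_tokens_m * input_share
    output_m = total_tokens_m * (1 - input_share)
    effective_input_rate = (cache_hit_rate * cached_input_rate
                            + (1 - cache_hit_rate) * headline_input_rate)
    return input_m * effective_input_rate + output_m * output_rate


if __name__ == "__main__":
    for hit_rate in (1.0, 0.9, 0.5, 0.0):
        cost = byo_key_monthly_cost(44, cache_hit_rate=hit_rate)
        print(f"cache hit rate {hit_rate:4.0%}: ${cost:7.2f}/month")
```

With unrounded token counts (30.8M input, 13.2M output) the all-cached case comes to about $207/month rather than $204, because the article rounds the split down to 30M and 13M.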
Self-Hosted Aider + Qwen 3.5 32B
One RTX 5090 box at ~$3,500 amortized over 3 years = $1,167/year hardware. Electricity at 350 W average × 50% utilization × $0.14/kWh × 8,760 h/year = ~$215/year (assuming a residential US rate). Aider is free OSS. Hardware plus power comes to about $1,380/year; the ~$1,800/year team figure used elsewhere in this article leaves roughly $400/year for costs not itemized here. Quality is the question: Qwen 3.5 32B in Q4_K_M lands at roughly 78% of Claude Sonnet 4.6 on SWE-Bench Pro, which is enough for refactors, doc generation, and test scaffolding, but falls short on subtle architectural reasoning. See self-hosted AI coding agents for the full setup playbook and Qwen 3.5 VRAM requirements if you're sizing the box.
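The hardware and electricity lines can be recomputed for a different box or power price with a short sketch like the one below. All default values are the article's; the function names are illustrative. Note that the two itemized lines sum to less than the ~$1,800 team figure quoted in the article, so treat the difference as unitemized overhead.

```python
HOURS_PER_YEAR = 8760


def hardware_per_year(price_usd: float = 3500, amortization_years: int = 3) -> float:
    """Straight-line amortization of the GPU box."""
    return price_usd / amortization_years


def electricity_per_year(avg_watts: float = 350,
                         utilization: float = 0.5,
                         usd_per_kwh: float = 0.14) -> float:
    """Annual power cost: average draw x utilization x hours x price per kWh."""
    kwh = avg_watts / 1000 * utilization * HOURS_PER_YEAR
    return kwh * usd_per_kwh


if __name__ == "__main__":
    hw = hardware_per_year()
    power = electricity_per_year()
    print(f"hardware:    ${hw:,.0f}/year")
    print(f"electricity: ${power:,.0f}/year")
    print(f"itemized:    ${hw + power:,.0f}/year")
```

Engineering time to operate the inference stack is deliberately left out here, and it is usually the larger number.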
The Verdict on Token Volume
- Under 200K tok/dev/day: GitHub Copilot Business is fine. Don't overbuy.
- 200K-1M tok/dev/day: Cursor Business or Claude Code Max — pick by IDE preference.
- 1M-6M tok/dev/day: Claude Code Max wins; the bundle absorbs heavy use that would push BYO-key API bills past $500/dev/mo.
- Past 6M tok/dev/day per dev: Self-hosted Aider + Qwen 3.5 32B starts paying off in pure economics. You give up Opus-tier quality but for high-volume CI fixers, mass migrations, and test generation, "good enough" wins.
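The four tiers above can be written as a small lookup, which is handy when you feed it measured usage. This is only a sketch of the article's thresholds; how the exact boundary values (200K, 1M, 6M) are assigned to a tier is an assumption, since the list does not say.

```python
def recommend(tokens_per_dev_per_day: int) -> str:
    """Map measured per-dev daily token volume to the article's suggested tier."""
    if tokens_per_dev_per_day < 200_000:
        return "GitHub Copilot Business"
    if tokens_per_dev_per_day < 1_000_000:
        return "Cursor Business or Claude Code Max"
    if tokens_per_dev_per_day <= 6_000_000:
        return "Claude Code Max"
    return "Self-hosted Aider + Qwen 3.5 32B"


if __name__ == "__main__":
    for volume in (50_000, 200_000, 2_000_000, 8_000_000):
        print(f"{volume:>9,} tok/dev/day -> {recommend(volume)}")
```

Compliance constraints override this lookup entirely: a team that cannot send code outside the firewall lands on self-hosted regardless of volume.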
What "Per-Task" Actually Costs
Per-task pricing (Codebuff, some Augment plans, OpenRouter-based agents) charges roughly $0.05-0.20 per agent task, where a "task" is one bounded request like "write tests for this file" or "fix the failing CI". The math: a focused dev runs maybe 80 tasks per day. At $0.10/task, that's $8 per dev per day, or $176/mo over 22 working days: far more than Cursor Business ($40) or Copilot Business ($19), and only slightly under the $200 Claude Code Max seat. Per-task pricing makes sense when usage is genuinely bursty (a dev who runs 5 tasks one day and 50 the next can average cheaper than a flat seat), but for full-time AI-augmented coding it loses to the mid-priced per-seat plans.
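A break-even calculation makes the per-task trade-off concrete. The sketch below uses the article's $0.10/task midpoint and 22 working days; it ignores any base subscription fee charged on top of per-task billing, and the function names are illustrative.

```python
def per_task_monthly_cost(tasks_per_day: float,
                          price_per_task: float = 0.10,
                          working_days: int = 22) -> float:
    """Monthly per-dev spend under per-task billing."""
    return tasks_per_day * price_per_task * working_days


def break_even_tasks_per_day(seat_price_month: float,
                             price_per_task: float = 0.10,
                             working_days: int = 22) -> float:
    """Average daily task count at which per-task billing equals a flat seat."""
    return seat_price_month / (price_per_task * working_days)


if __name__ == "__main__":
    print(f"80 tasks/day: ${per_task_monthly_cost(80):.0f}/month")
    for name, seat in (("Copilot Business", 19), ("Cursor Business", 40),
                       ("Claude Code Max", 200)):
        print(f"break-even vs {name}: {break_even_tasks_per_day(seat):.1f} tasks/day")
```

At $0.10/task, per-task billing only beats a $19 seat below about 9 tasks a day on average, and a $40 seat below about 18.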
Pro tip: Track tokens per dev per day in your first month with any tool. Most teams discover their actual usage is 3-5x lower than they assumed during procurement, and they're paying for headroom they don't use.
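One way to do that tracking is to aggregate whatever usage export your tool provides. The sketch below assumes a hypothetical record shape of `(dev, day, tokens)` tuples; real exports differ by vendor, so adapt the parsing step. It averages over days on which a developer actually used the tool, which is itself an assumption worth revisiting.

```python
from collections import defaultdict
from statistics import mean


def tokens_per_dev_per_day(records):
    """Return {dev: mean tokens per active day}.

    `records` is an iterable of (dev, day, tokens) tuples; this shape is
    hypothetical and stands in for a vendor usage export.
    """
    # First sum all requests that fall on the same dev and day.
    daily_totals = defaultdict(int)
    for dev, day, tokens in records:
        daily_totals[(dev, day)] += tokens

    # Then average each dev's daily totals.
    per_dev = defaultdict(list)
    for (dev, _day), total in daily_totals.items():
        per_dev[dev].append(total)
    return {dev: mean(totals) for dev, totals in per_dev.items()}


if __name__ == "__main__":
    sample = [
        ("ana", "2026-04-01", 120_000),
        ("ana", "2026-04-01", 30_000),
        ("ana", "2026-04-02", 90_000),
        ("raj", "2026-04-01", 400_000),
    ]
    for dev, avg in tokens_per_dev_per_day(sample).items():
        print(f"{dev}: {avg:,.0f} tok/day")
```

Comparing these per-dev averages against the volume tiers shows quickly whether a seat tier is oversized for most of the team.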
Hidden Costs Nobody Lists in the Comparison
The headline number is rarely what you pay. Hidden costs that catch teams during the first quarterly review:
- Rate-limit upgrades: Cursor's 500 fast requests per month look ample until your team blows past the cap and you're forced into the per-call overage tier.
- Model upgrade premiums: Some plans default to Sonnet/GPT-5; switching the team to Opus/GPT-5.4 doubles or triples the underlying API cost on BYO-key plans.
- Audit log seats: SSO, audit logging, and SCIM provisioning are usually only on the highest tier — even if you don't need agentic features, compliance often forces you up the ladder.
- Onboarding tax: Each tool needs configuration files (.cursorrules, CLAUDE.md, .github/copilot-instructions.md) per repo. Switching tools is a real engineering project, not a config change.
- Retraining: Switching from Cursor to Claude Code (or vice versa) costs roughly 2-3 weeks of partial productivity per dev while they relearn keybindings and prompting habits.
When Self-Hosted Genuinely Wins
Self-hosted (Aider, Continue, OpenHands on Qwen / DeepSeek / GLM) wins in three scenarios. Compliance lockout: regulated industries (healthcare, finance, defense) often can't ship code or context outside the firewall; hosted is structurally banned. Extreme volume: above 6M tokens/dev/day, electricity beats API pricing. Learning: running your own inference forces you to understand quantization, KV cache, prompt caching, and tool-use loops — knowledge that compounds. See self-hosted AI coding agents for the full landscape and self-hosted LLM TCO for the full hardware-amortization math.
It does NOT win when the headcount is small (under 5 devs the hardware amortization breaks down), when SWE-Bench-tier quality matters daily (Opus 4.7 remains ~6 percentage points ahead of Qwen 3.5 32B on hard problems), or when the team can't operate inference infrastructure (Triton vs vLLM tuning, batching, GPU memory pressure are real engineering work). For most 10-30 dev shops, hosted is structurally cheaper if you account for engineering time at fully-loaded cost.
Decision Matrix
| Situation | Pick | Why |
|---|---|---|
| Microsoft / GitHub-native team, cost-sensitive | GitHub Copilot Business | SSO included, audit logs, $19/dev/mo |
| VS Code-style IDE, multi-file edits, mid-budget | Cursor Business | Best in-IDE ergonomics, $40/dev/mo |
| Heavy CLI agent loops, willing to pay for ceiling | Claude Code Max | Uncapped Opus 4.7, sub-agents, skills |
| Already on ChatGPT Pro, OpenAI-loyal | Codex plugin | Bundled, no extra spend |
| Procurement blocked on hosted-AI vendors | Cline / Augment + BYO-key | Vendor-of-record stays your existing API contract |
| Compliance bans cloud AI entirely | Aider + self-hosted Qwen / GLM | Air-gapped, open-weight model under a permissive license |
| Heavy bursty usage, <20 devs | Codebuff per-task | Pay only for what you run |
Procurement Notes Most Teams Miss
Beyond price, three contract clauses matter for any agent your developers run against your codebase:
- Data handling: Does the vendor train on your prompts? Anthropic and OpenAI explicitly do not on enterprise tiers; some smaller vendors do unless you opt out. Read the DPA, not the marketing page.
- IP on output: Most vendors disclaim IP rights on generated code, but the language varies. GitHub Copilot has the strongest indemnity (it'll defend you if a copyright claim arises against generated code matching public training data).
- Vendor risk: New entrants (Augment, Codebuff, Cline-as-a-service) may not survive 2026. For mission-critical workflows, prefer tools where the underlying model API contract is your fallback (BYO-key tools degrade gracefully).
For broader tool-side comparison see AI coding assistants compared, and for the underlying API economics that drive every BYO-key calculation, see LLM API pricing.
Frequently Asked Questions
Is Claude Code worth $200/month per developer?
If your dev runs Opus 4.7 daily for agentic loops, sub-agent dispatch, and long sessions, the equivalent API cost would be $300-450/mo per dev, so the $200 bundle saves roughly 33-56%. If they use it occasionally for chat-style help, you're overpaying; Cursor Pro at $20/mo or Copilot Business at $19/mo will do.
Cursor vs GitHub Copilot — which is cheaper for a 10-person team?
GitHub Copilot Business at $19/dev/mo ($2,280/year for 10) is cheaper than Cursor Business at $40/dev/mo ($4,800/year for 10). Copilot is the right choice if cost dominates and your team is already in the GitHub ecosystem. Cursor wins on agentic multi-file edits, which Copilot's chat doesn't match in 2026.
When does self-hosted AI coding pay off vs hosted?
Three signals: token volume above 6M/dev/day (hardware amortization beats API), compliance lockout (regulated industries can't ship context out), or learning value. Below 6M tok/dev/day for a 10-dev team, hosted is cheaper after factoring engineering time to operate the inference stack.
What does "per-task" pricing actually mean for AI agents?
Per-task pricing (Codebuff, OpenRouter-based agents) charges $0.05-0.20 for one bounded agent run, like "fix the failing test" or "scaffold a CRUD endpoint." Math: 80 tasks/dev/day at $0.10 over 22 working days = $176/dev/mo, pricier than Cursor Business or Copilot Business and just under the $200 Claude Code Max seat. It only makes sense for bursty users averaging under 10 tasks/day.
Can I use Claude Code with my own Anthropic API key?
Yes — Claude Code accepts ANTHROPIC_API_KEY for direct API billing. You lose the Max plan's bundled-unlimited usage and pay metered rates instead. For light or sporadic use, BYO-key is cheaper than the $200/mo Max bundle. For heavy daily use, Max wins.
What about IP ownership of code generated by AI agents?
All major vendors (Anthropic, OpenAI, GitHub, Cursor) disclaim ownership and assign output rights to the user. GitHub Copilot ships the strongest IP indemnity (they'll defend you against copyright suits arising from training-data-similar code). For regulated industries, get the DPA reviewed before procurement — clauses vary on training-on-prompts, retention, and audit access.
Bottom Line
Don't buy ceiling you won't hit. Most 10-engineer teams in 2026 are either on Cursor Business ($4,800/year) or GitHub Copilot Business ($2,280/year) and are fine. The story changes if you've got a heavy agentic workflow (Claude Code Max), a compliance wall (self-hosted Aider + Qwen), or a sub-5-dev team with bursty usage (per-task billing). Track tokens per dev per day before renewal; most teams discover their actual usage is 3-5x lower than the headroom they're paying for.
Written by
Abhishek Patel
Infrastructure engineer with 10+ years building production systems on AWS, GCP, and bare metal. Writes practical guides on cloud architecture, containers, networking, and Linux for developers who want to understand how things actually work under the hood.