AI Stopped Being a Subscription. Here Is What Comes Next.

I remember the first time someone showed me their AWS bill after a misconfigured Lambda function ran for a weekend. The number made no sense for a few days of testing. Everyone in the room learned the same lesson at once: pay-as-you-go sounds friendly until it isn't.

GitHub Copilot gave the developer world the same lesson on June 1, 2026.

GitHub switched all Copilot plans from flat-rate billing to usage-based billing. Instead of a fixed monthly fee covering a set number of AI interactions, every token you send and receive now counts against a credit balance. Developers are feeling it.

What Changed

The new system runs on GitHub AI Credits. One credit equals $0.01. Your monthly plan includes a credit allowance matching your subscription price: Copilot Pro at $10/month gets $10 in credits, Pro+ at $39 gets $39 in credits. Business plans at $19/user get $19 in credits.

Sounds manageable. Here is the problem: those credits burn fast in agentic mode.

The model you choose sets the cost. GPT-5.5 runs $5.00 per million input tokens and $30.00 per million output tokens through Copilot. Claude Sonnet costs $3.00 input / $15.00 output per million tokens. An intensive agentic session, where the AI reads large files, writes code, runs commands, checks output, and iterates, burns millions of tokens in an hour.
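To make the arithmetic concrete, here is a minimal sketch that prices one session using the per-million-token rates quoted above and converts the result into credits at $0.01 each. The session size (3 million input tokens, 400,000 output tokens) and the model keys are assumptions chosen for illustration, not measured figures.

```python
from decimal import Decimal

# USD per million tokens (input, output), as quoted in the text above.
# The dictionary keys are illustrative labels, not official model IDs.
RATES = {
    "gpt-5.5": (Decimal("5.00"), Decimal("30.00")),
    "claude-sonnet": (Decimal("3.00"), Decimal("15.00")),
}
CREDIT_USD = Decimal("0.01")  # one credit equals $0.01
MILLION = Decimal(1_000_000)


def session_cost_usd(model, input_tokens, output_tokens):
    """Cost in USD of one session at the per-million-token rates."""
    in_rate, out_rate = RATES[model]
    return (in_rate * input_tokens + out_rate * output_tokens) / MILLION


def to_credits(usd):
    """Convert a USD amount into credits."""
    return usd / CREDIT_USD


# Hypothetical agentic session: 3M input tokens, 400k output tokens.
gpt_cost = session_cost_usd("gpt-5.5", 3_000_000, 400_000)           # 27 USD
sonnet_cost = session_cost_usd("claude-sonnet", 3_000_000, 400_000)  # 15 USD
gpt_credits = to_credits(gpt_cost)                                   # 2700 credits
```

Under these assumed numbers, a single session on the more expensive model costs $27, which is more than the entire $10 monthly allowance on Copilot Pro.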

Some developers are reporting monthly costs jumping from $29 to $750. Others are seeing $50 turn into $3,000. For the heaviest users, the reported multiples run from roughly 10x to as much as 60x.

For developers who use Copilot mainly for autocomplete and tab completions, the change is minimal. Code completions do not consume credits at all. The hit lands on people doing the most sophisticated AI-assisted work.

And those are exactly the people worth watching.

GitHub did add a grace period: Business customers get $30/month in promotional credits through August 2026, Enterprise customers get $70/month. Consider this a runway to get your usage under control, not a reason to postpone the audit.

The AWS Parallel

If you were building on AWS around 2012, you learned a pattern. First, AWS offered pricing that looked impossibly cheap. Teams built real workflows on top of it. Then the bills arrived.

Not because AWS did anything wrong. Usage-based pricing works differently from subscription pricing. With subscriptions, you budget once and move on. With usage-based billing, your bill is a function of your behavior, and behavior is unpredictable.

The developer who builds an agent to automatically review every pull request, scan for security issues, write commit summaries, and respond to review comments is not doing four small tasks. They are building a token-consuming machine. Every PR, every commit, every comment is a billing event. Multiply by a team of 20 and you have an infrastructure cost hiding inside your developer tooling budget.

AI tool providers have run the same playbook as AWS. They subsidized usage to drive adoption. Developers got comfortable building heavily with these tools. Now the economics need to work, and the subsidy ends.

74% of software suppliers have already adopted usage-based pricing models. GitHub Copilot is not early here. It is late.

Copilot Is Not the Only One

Every AI tool in your stack is heading the same direction.

Claude Code, Cursor, Codeium, Cline, Continue... all are in various stages of the same transition. Some already bill by token. Some bill by premium request, which is tokens with extra steps. Some still offer flat subscriptions but with usage limits tightening as compute costs bite. One analysis of eight months of daily Claude Code usage found consumption of 10 billion tokens. At current API pricing, the total runs around $50,000. No flat subscription was designed to absorb numbers like those.

The comfortable era of "unlimited" AI assistance for $20/month is ending. It was never real. Anthropic, OpenAI, and Google do not run frontier models for free. They were building market share. Market share is built. Now they are building margins.

Gartner projects 35% of point-product SaaS tools will be replaced by AI agents by 2030. AI agents consume tokens at a rate no flat-rate subscription was designed to support. An agent reading your entire codebase, running tests, checking results, and writing a fix burns through a week's "unlimited" credits in an afternoon.

What To Do Now

The practical response is straightforward.

Start tracking. If you do not know how many tokens your team consumes per month, find out now. Most platforms show usage dashboards. GitHub Copilot's billing UI makes token consumption visible by user. Look at your heaviest users. The spread will surprise you.

Benchmark before you scale. If you are rolling out AI coding tools to a larger team, run a one-month usage test with a representative group first. The cost-per-developer figure from marketing is based on average users. Your agentic power users will blow through it. I have seen estimates where the top 10% of users consume 60% of the tokens. Your budget needs to reflect your actual distribution, not the mean.
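One way to read a pilot's results is to measure how concentrated the spend is. The sketch below computes the share of total spend that comes from the top 10% of users and compares the mean with the median. The pilot numbers are invented so that the top 10% account for 60% of spend, matching the estimate mentioned above. They are not real data.

```python
import statistics


def top_share(spend_by_user, top_percent=10):
    """Fraction of total spend attributable to the top `top_percent` of users."""
    ordered = sorted(spend_by_user, reverse=True)
    # Ceiling division in integers; always count at least one user.
    k = max(1, (len(ordered) * top_percent + 99) // 100)
    total = sum(ordered)
    return sum(ordered[:k]) / total if total else 0.0


# Hypothetical one-month pilot: 20 developers, spend in USD.
pilot = [600, 600] + [50] * 8 + [40] * 10

share = top_share(pilot)                  # 0.6: two users drive 60% of spend
mean_spend = statistics.mean(pilot)       # 100
median_spend = statistics.median(pilot)   # 45.0
```

In this made-up pilot the typical developer spends about $45, yet the team averages $100 per head because two power users dominate. A per-seat figure quoted for average users would miss that shape.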

Budget for overages. With flat subscriptions you budget once a year and forget. With token billing, AI spend is a variable cost line, the same way cloud infrastructure is. Finance needs to know this now, not after the first big invoice. The developer who had uncapped AI access on a $39/month plan now has uncapped AI access billed per token. Two different business relationships.

Match model to task. Not every task needs GPT-5.5 or Claude Opus. Quick autocomplete, documentation searches, and simple refactors work fine on cheaper models. Using a $0.75/million input token model instead of a $5.00/million model cuts input costs by 85% on low-complexity work. Build a habit of picking the right model for the job, not the most capable one by default. This is not a sacrifice in output quality for most tasks. It is a precision choice.
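As a rough illustration of that habit, the sketch below routes task types to a price tier and checks the savings figure. The tier names and the routing table are hypothetical. Only the $5.00 and $0.75 input rates come from the text.

```python
from decimal import Decimal

# USD per million input tokens. Rates are from the text; tier names are placeholders.
INPUT_RATE = {"frontier": Decimal("5.00"), "light": Decimal("0.75")}

# Hypothetical routing table: which tier handles which kind of task.
TIER_FOR_TASK = {
    "autocomplete": "light",
    "doc_search": "light",
    "simple_refactor": "light",
    "multi_file_agent": "frontier",
}


def pick_tier(task_kind):
    """Return the tier for a task; unknown kinds fall back to the capable tier."""
    return TIER_FOR_TASK.get(task_kind, "frontier")


def input_savings(cheap, expensive):
    """Fractional reduction in input-token cost when using the cheaper tier."""
    return 1 - INPUT_RATE[cheap] / INPUT_RATE[expensive]


savings = input_savings("light", "frontier")  # Decimal("0.85"), i.e. 85%
```

Falling back to the capable tier for unknown tasks favors quality over cost. A team that cares more about the bill could reverse that default.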

Watch for context compounding. In agentic workflows, the AI reads the full conversation history at the start of every action. A long session means large input token costs on every step. If you are building agents handling multi-hour tasks, context management is now a cost management skill. Prune your context. Use summaries. Know when to start a fresh session.
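The compounding effect is easy to underestimate, so here is a small model of it. It assumes every step appends a fixed number of tokens to the history and that the full history is re-sent as input on each step. It ignores prompt caching and the cost of writing a summary, so treat it as an upper-bound sketch rather than a billing formula.

```python
def cumulative_input_tokens(steps, tokens_per_step, base_context=0):
    """Total input tokens billed when the whole history is re-read every step."""
    total = 0
    history = base_context
    for _ in range(steps):
        history += tokens_per_step  # the conversation grows by one step
        total += history            # and the entire history is sent again
    return total


def with_resets(steps, tokens_per_step, reset_every):
    """Same model, but starting a fresh session every `reset_every` steps."""
    full_sessions, remainder = divmod(steps, reset_every)
    return (full_sessions * cumulative_input_tokens(reset_every, tokens_per_step)
            + cumulative_input_tokens(remainder, tokens_per_step))


# Hypothetical run: 50 steps, each adding 2,000 tokens of history.
one_long_session = cumulative_input_tokens(50, 2_000)  # 2,550,000 tokens
fresh_every_ten = with_resets(50, 2_000, 10)           # 550,000 tokens
```

In this model the cost grows with the square of the step count: 50 steps of 2,000 tokens bill about 2.55 million input tokens, not 100,000. Restarting every ten steps cuts that to 550,000.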

Set hard limits. GitHub Copilot and most API providers now offer spend caps and per-user limits. Use them. A $200/month cap per developer means no single agent run turns into a four-figure surprise. The AWS world learned this the hard way. Do not repeat the lesson.
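If you run your own agents against an API, you can also enforce a cap on your side. The class below is a hypothetical client-side guard, not GitHub's or any provider's actual API. It tracks spend per user and refuses any charge that would exceed the monthly cap.

```python
from decimal import Decimal


class SpendCap:
    """Hypothetical in-process guard that enforces a per-user monthly cap."""

    def __init__(self, monthly_cap_usd):
        self.cap = Decimal(str(monthly_cap_usd))
        self.spent = {}  # user -> Decimal spent so far this month

    def try_charge(self, user, cost_usd):
        """Record the charge and return True, or return False if it would exceed the cap."""
        cost = Decimal(str(cost_usd))
        current = self.spent.get(user, Decimal("0"))
        if current + cost > self.cap:
            return False  # caller should stop the agent run here
        self.spent[user] = current + cost
        return True


cap = SpendCap(200)                       # the $200/month example from the text
first = cap.try_charge("alice", 150)      # True: 150 of 200 used
second = cap.try_charge("alice", 60)      # False: would reach 210
third = cap.try_charge("alice", 50)       # True: exactly at the cap
```

A real deployment would need persistent storage and a monthly reset. The point is only that the check happens before the tokens are spent, not after.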

The Bigger Pattern

The developers most hit by GitHub Copilot's billing change are the ones doing the most with it. Agentic coders. People building AI-driven workflows at scale. The people whose workflows represent where the rest of us will be in 18 months.

No coincidence there. Early adopters absorb the true cost of a new technology while tooling and pricing models mature around them. The same happened with cloud storage, with API calls, with container orchestration. Each time, the initial price looked like a new paradigm. Each time, real economics arrived later.

AI productivity and AI cost are now directly linked. The more you use AI, the more you spend. This is normal for infrastructure. Anthropic, OpenAI, and Microsoft always knew flat-rate AI was a growth tactic, not a business model.

For individual developers, this is a workflow audit. For engineering managers, this is a budget conversation with finance. For CTOs, this is a line item on the infrastructure P&L, sitting right next to AWS and Datadog.

The tools are not going away. The costs are going up. Build cost awareness into how you use them now, or let the bill tell you later.

I know which one I prefer. What is your AI spend going to look like this time next year?