Microsoft terminates Anthropic licenses as token costs skyrocket. Analysis of hidden AI budgets and optimization strategies for SMEs and mid-market companies.


Microsoft just terminated its internal Anthropic licenses. The reason: in just a few months, the shift to token-based billing caused their annual budgets to explode. If a tech giant with virtually unlimited resources deems these costs unsustainable, what should French and Belgian SMEs and mid-market companies expect?
This decision isn't trivial. It reveals a reality that many executives discover too late: the true cost of generative AI in business bears no resemblance to the advertised rates. Between initial estimates and the final invoice, the gap can reach 300 to 500% depending on usage patterns. And this phenomenon will intensify in 2025-2026.
This article breaks down the mechanics of token-based billing, identifies the hidden costs no one mentions, and offers concrete strategies to control your AI budget without sacrificing performance.
A token represents approximately 0.75 words in French. Each interaction with an AI model like GPT-4, Anthropic's Claude, or Google's Gemini consumes input tokens (your request) and output tokens (the generated response). Billing distinguishes between these two flows, with different rates.
Here are the average rates observed in 2025 for premium models:
These figures seem modest. They become staggering at organizational scale.
A simple conversation with an AI assistant consumes between 1,000 and 4,000 tokens. But professional use cases involve much heavier contexts: reference documents, conversation history, detailed system instructions. A single business query can reach 50,000 to 100,000 tokens.
Let's take a concrete example. A team of 20 salespeople uses an AI assistant to draft commercial proposals. Each proposal requires:
That's 35,000 tokens per proposal. With 10 proposals per salesperson per week, the team consumes 7 million tokens weekly. Over a year: 364 million tokens, representing between EUR 15,000 and 40,000 depending on the model used. For a single use case.
Employees naturally learn to get better responses. How? By providing more context, requesting more detailed answers, following up to refine results. Each quality improvement translates to increased token consumption.
At AISOS, we observe that average consumption per user increases by 15 to 25% each month during the first six months of deployment. Without caps, initial budgets become obsolete within a quarter.
Every API call includes system instructions that define AI behavior. These instructions are billed with each request, even if they never change. A 2,000-token system prompt repeated 10,000 times daily represents 20 million monthly tokens: between EUR 200 and 600 per month for text that nobody reads.
AI models don't always succeed on the first try. Format errors, incomplete responses, timeouts: each failure consumes tokens. Robust architectures include automatic retry mechanisms. Result: 10 to 20% additional consumption to handle edge cases.
Providers update their models regularly. Each new version can modify behaviors, requiring prompt adjustments and testing phases. These iterations consume tokens without producing direct value. The most active companies can dedicate 5 to 10% of their annual budget to this.
Once your workflows are built around a specific model, migrating to a less expensive alternative means rewriting prompts, retesting use cases, training teams. This migration cost strengthens the original provider's negotiating power. Price increases become difficult to contest.
The Microsoft case illustrates a systemic phenomenon. According to available information, the company found that its internal teams had consumed the equivalent of their entire annual Anthropic budget in just a few months.
Several factors explain this drift:
Microsoft isn't abandoning AI. The company is rationalizing its investments by favoring its own models via Azure OpenAI, where it has better control over costs and margins. This decision is strategic, not defeatist.
Before any deployment, clearly define:
This governance isn't bureaucratic constraint. It's the condition for transforming AI into a controlled investment rather than a financial sinkhole.
Not all use cases require GPT-4 or Claude Opus. A simple classification can reduce costs by 40 to 70%:
AISOS audits reveal that 60 to 75% of enterprise queries can be handled by mid-range models without perceptible quality loss.
Every token counts. Simple techniques can reduce consumption by 20 to 40%:
Public rates are starting points, not final prices. Beyond a certain volume, negotiate:
Companies that negotiate regularly achieve 15 to 30% reductions compared to standard rates.
For non-critical use cases or very high volumes, open source models like Llama 3, Mistral, or Falcon offer comparable performance at virtually zero marginal cost once infrastructure is deployed.
The economic calculation becomes favorable when:
Current trends paint a contrasted landscape for the coming years:
Factors driving unit cost reduction:
Factors driving total cost increase:
The most realistic projection: unit costs will decrease by 20 to 30%, but volumes will increase by 100 to 200%. Companies' overall AI budgets will continue to grow, but more predictably if best practices are in place.
The Microsoft case isn't a defeat for enterprise AI. It's a signal of maturity. Organizations that survive the euphoria phase will be those that have learned to measure, optimize, and prioritize their AI investments.
For SMEs and mid-market companies, this discipline is even more crucial as budget flexibility is limited. But it's also an opportunity: a mid-sized company that masters its AI costs can deploy use cases that competitors will deem too expensive.
Three priority actions to launch this week:
Mastering AI token costs is no longer a technical subject reserved for IT teams. It's a general management issue, on par with payroll or procurement. Leaders who integrate it into their financial management now will gain a decisive head start.

Microsoft Drops Anthropic: When AI Costs Skyrocket, How Companies Adapt

Microsoft cancels its Anthropic licenses: how businesses can avoid their AI budget spiraling out of control

Claude Surpasses ChatGPT: 2028 Scenarios and New Opportunities for B2B Companies