Microsoft has discontinued its Anthropic licenses following a cost explosion. Here's how SMEs and mid-market companies can control their AI budget in 2025.


In May 2025, news shook the enterprise artificial intelligence world: Microsoft cancelled its internal Anthropic licenses. The reason? An uncontrolled explosion of costs linked to token-based billing. What was supposed to be an annual budget was consumed in just a few months.
If a tech giant like Microsoft, with dedicated teams and considerable internal expertise, can be caught off guard by escalating AI costs, what about French and Belgian SMEs and mid-market companies? This situation reveals a structural problem that many executives discover too late: the usage-based billing model for LLMs is a budgetary time bomb.
This article gives you the keys to understanding what happened, anticipating the risks for your business, and implementing an AI cost control strategy without sacrificing your visibility in generative search engines.
A token represents approximately 4 characters in English, or about 0.75 words. In French, this ratio is often less favorable due to accents and grammatical structure. Each interaction with an LLM like Claude (Anthropic), GPT-4 (OpenAI), or Gemini (Google) consumes tokens on input (your query) and output (the generated response).
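As a rough illustration of the rule of thumb above, the following Python sketch estimates token counts from character and word counts. The function names are hypothetical, and the ratios are only the English approximations quoted in this article; real tokenizers differ by model and language, so treat the result as an order of magnitude, not a billing figure.

```python
import math

CHARS_PER_TOKEN = 4.0    # English rule of thumb quoted above
WORDS_PER_TOKEN = 0.75   # English rule of thumb quoted above


def estimate_tokens_from_chars(text: str) -> int:
    """Rough token estimate based on character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(text: str) -> int:
    """Rough token estimate based on whitespace-separated word count."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)


if __name__ == "__main__":
    prompt = "Summarize the attached contract in five bullet points."
    print("by characters:", estimate_tokens_from_chars(prompt))
    print("by words:", estimate_tokens_from_words(prompt))
```

The two estimates rarely agree exactly, which is a useful reminder that any budget built on them needs a safety margin.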
Pricing varies considerably across models:
These figures seem modest. But a single complex conversation can consume 10,000 to 50,000 tokens. Multiply this by hundreds of employees using these tools daily, and the amounts become astronomical.
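To make the multiplication concrete, here is a minimal sketch of a monthly cost projection. Only the 10,000 to 50,000 tokens per conversation comes from the text above; the headcount, the number of conversations per day, the 21 working days, and the blended price of EUR 10 per million tokens are placeholder assumptions to replace with your own contract figures.

```python
def monthly_cost_eur(
    users: int,
    conversations_per_user_per_day: float,
    tokens_per_conversation: int,
    price_per_million_tokens_eur: float,
    working_days: int = 21,  # assumed number of working days per month
) -> float:
    """Projected monthly spend for a team using one blended token price."""
    total_tokens = (
        users
        * conversations_per_user_per_day
        * tokens_per_conversation
        * working_days
    )
    return total_tokens / 1_000_000 * price_per_million_tokens_eur


if __name__ == "__main__":
    # Placeholder scenario: 300 users, 5 conversations each per day,
    # at a hypothetical blended price of EUR 10 per million tokens.
    for tokens in (10_000, 50_000):
        cost = monthly_cost_eur(300, 5, tokens, 10.0)
        print(f"{tokens:>6} tokens/conversation -> EUR {cost:,.0f}/month")
```

Because every factor multiplies the others, a modest underestimate on two or three inputs is enough to push the total far beyond the initial projection.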
The Microsoft case illustrates a phenomenon we regularly observe at AISOS during our audits: actual consumption systematically exceeds initial projections by 300 to 800%. Why?
Before reducing costs, you need to understand them. Identify precisely:
This mapping often reveals surprises. A 200-person company may discover it's simultaneously paying for ChatGPT Team licenses, OpenAI API access, Perplexity Pro subscriptions, and Claude credits, with no coordination between teams.
Not all use cases require the most powerful models. Structure your AI access in three levels:
This tiered approach can reduce your bill by 60 to 75% with no perceptible impact on productivity.
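A tiered setup of this kind can be sketched as a simple routing table. Everything in the example below is an assumption made for illustration: the tier names, the task categories, the relative costs, and the traffic mix are hypothetical and do not describe any vendor's pricing.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Tier:
    name: str
    relative_cost: float  # cost per token relative to the premium tier (made up)


# Hypothetical tiers and relative costs
TIERS = {
    "light": Tier("light", 0.05),
    "standard": Tier("standard", 0.30),
    "premium": Tier("premium", 1.00),
}

# Hypothetical mapping from task category to tier
TASK_TO_TIER = {
    "classification": "light",
    "summarization": "standard",
    "drafting": "standard",
    "complex_analysis": "premium",
}


def pick_tier(task_category: str) -> Tier:
    """Route a task to its assigned tier; unknown tasks default to 'standard'."""
    return TIERS[TASK_TO_TIER.get(task_category, "standard")]


def blended_relative_cost(token_share_by_task: Mapping[str, float]) -> float:
    """Average cost relative to sending every token to the premium tier."""
    return sum(
        share * pick_tier(task).relative_cost
        for task, share in token_share_by_task.items()
    )


if __name__ == "__main__":
    # Assumed share of total tokens per task category (sums to 1.0)
    mix = {"classification": 0.5, "summarization": 0.3, "complex_analysis": 0.2}
    saving = 1 - blended_relative_cost(mix)
    print(f"Estimated saving vs. premium-only: {saving:.1%}")
```

The saving depends entirely on the traffic mix and price ratios you plug in, so measure your own usage before relying on any percentage.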
A well-designed prompt consumes fewer tokens and produces better results. Key principles:
Training your teams in prompt engineering is a small investment that pays back immediately in lower costs.
Professional platforms allow you to configure:
Activating these safeguards would likely have spared Microsoft its Anthropic mishap. Many companies neglect these features even though they are readily available.
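For teams calling an API directly, the same idea can be approximated in application code. The sketch below is a hypothetical in-house guard, not a feature of any specific platform: the class name, the EUR 1,000 cap, and the 80% alert threshold are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class BudgetGuard:
    monthly_cap_eur: float
    alert_ratio: float = 0.8  # assumed alert threshold: 80% of the cap
    spent_eur: float = 0.0

    def record(self, cost_eur: float) -> str:
        """Register a request cost and return 'ok', 'alert', or 'blocked'."""
        if self.spent_eur + cost_eur > self.monthly_cap_eur:
            return "blocked"  # request refused, recorded spend unchanged
        self.spent_eur += cost_eur
        if self.spent_eur >= self.alert_ratio * self.monthly_cap_eur:
            return "alert"  # still allowed, but someone should be notified
        return "ok"


if __name__ == "__main__":
    guard = BudgetGuard(monthly_cap_eur=1000.0)
    for cost in (500.0, 350.0, 200.0):
        print(f"EUR {cost:.0f}: {guard.record(cost)}")
```

A hard block is the bluntest option; in practice you might downgrade to a cheaper model or require approval instead of refusing the request outright.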
For certain internal use cases, open source models like Llama 3, Mistral, or Qwen may suffice. Advantages:
The trade-off: performance is generally below that of cutting-edge commercial models, and deployment requires in-house technical skills.
Some companies, frightened by costs, give up on AI investment. This is a strategic mistake. In 2025, nearly 40% of B2B searches go through conversational interfaces like ChatGPT, Perplexity, or Google AI Overviews.
Not appearing in these generative engines' responses means becoming invisible to a growing portion of your prospects. The challenge isn't to spend less on AI, but to spend intelligently.
To maximize your AI ROI, focus your resources on:
AISOS audits regularly reveal that companies invest in generic AI tools when their priority should be optimizing their presence in generative engine responses.
Take the example of a 450-employee manufacturing company in the Lyon region. In January 2025, the company accumulates:
Total: approximately EUR 7,400/month, or EUR 89,000 annually, with a 15% monthly growth trend.
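To see what a 15% monthly growth trend implies, the sketch below projects the bill over twelve months. It assumes the growth compounds month over month, which is one reading of "growth trend"; the function name is hypothetical.

```python
def projected_monthly_spend(
    start_eur: float, monthly_growth: float, months: int
) -> list:
    """Monthly spend with compound growth; index 0 is the starting month."""
    return [start_eur * (1 + monthly_growth) ** m for m in range(months)]


if __name__ == "__main__":
    spend = projected_monthly_spend(7_400, 0.15, 12)
    flat_annual = 7_400 * 12        # no growth: the "EUR 89,000" figure, rounded
    compounded_annual = sum(spend)  # total if the 15% trend continues all year
    print(f"Flat annual spend:        EUR {flat_annual:,.0f}")
    print(f"Spend in month 12:        EUR {spend[-1]:,.0f}")
    print(f"Compounded 12-month cost: EUR {compounded_annual:,.0f}")
```

Under this assumption, the annualized figure based on the current month significantly understates what the company would pay if nothing changed.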
After a three-week audit:
New monthly budget: EUR 3,100, a 58% reduction. Measured productivity hasn't decreased. The heaviest user teams even report improvement thanks to the training received.
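The reduction can be checked with a few lines of arithmetic. The sketch below uses only the two monthly figures from this example and assumes both stay flat when annualizing; the function name is hypothetical.

```python
def reduction_ratio(before_eur: float, after_eur: float) -> float:
    """Fraction of the original spend that was eliminated."""
    return (before_eur - after_eur) / before_eur


if __name__ == "__main__":
    before, after = 7_400, 3_100
    monthly_saving = before - after
    print(f"Reduction:      {reduction_ratio(before, after):.0%}")
    print(f"Monthly saving: EUR {monthly_saving:,}")
    print(f"Annual saving:  EUR {monthly_saving * 12:,} (assuming flat usage)")
```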
Enterprise contracts with multi-year commitments may seem attractive. But in a market where prices and technologies change every quarter, they quickly become burdens. Prioritize flexibility, even if it means paying slightly more in the short term.
Unlimited offers often hide limitations: query quotas, restrictions on premium models, response length throttling. Read the terms carefully.
Imposing a single tool across the enterprise without consulting teams generates shadow IT. Employees find workarounds, often more expensive and less secure. Involve key users in the decision.
GDPR and sector requirements impose constraints on data processing by LLMs. A compliance incident costs far more than the savings realized on a subscription.
Microsoft's mishap with Anthropic isn't an isolated case. It foreshadows what many companies will experience in the coming months if they don't plan ahead.
A sustainable AI policy rests on four pillars:
Companies that master these four dimensions transform AI from an unpredictable cost center into a measurable competitive advantage.
An AI budget explosion isn't inevitable. The Microsoft-Anthropic case simply demonstrates that even the biggest players can be caught off guard by an economic model that's still young and poorly understood.
For SME and mid-market executives, the lesson is clear: implement AI usage governance now. Audit, structure, train, measure. The tools and methods exist.
The challenge isn't to slow AI adoption in your company. It's to accelerate it in a controlled manner, maximizing the return on every euro invested, including your visibility in generative engines, where a growing share of your customer acquisition will be decided.
Want to assess your exposure to AI budget drift risks and optimize your presence in LLMs? Contact AISOS for a personalized diagnosis.