AI companies are not raising prices for users while Nvidia and Micron collect record profits
Nvidia and Micron Technology are breaking profit records on the wave of the AI boom, and memory shortage for neural networks will persist at least through…
AI-processed from 3DNews AI; edited by Hamidun News
Makers of AI chips and memory are posting record profits amid the generative AI boom. But the companies developing the models themselves are in no hurry to pass rising infrastructure costs on to end users, and this is changing the pricing logic of the entire industry.
Windfall profits for hardware suppliers
For several quarters in a row, Nvidia has reported margins that are unusual even for the best periods of the semiconductor industry. Memory manufacturers have now joined it. Micron Technology is showing comparable results on the back of soaring demand for HBM (High Bandwidth Memory), the specialized memory used in AI accelerators. SK Hynix, Samsung, and Micron combined cannot increase HBM production as fast as data center demand is growing. Analysts predict that the shortage will persist at least through the end of 2026. That means component prices will stay high and supplier margins will keep growing.
- HBM3E is a key component of next-generation GPUs (Nvidia H200, B200)
- HBM prices have more than doubled over the past two years
- Production capacity is expanding but still lags behind demand
- The shortage is forecast to last through 2025–2026
- New players (Broadcom, Marvell) are entering the segment, but so far are not changing the situation
Investors, not users, are closing the gap
Despite rising costs, major AI model developers are generally holding or cutting their API prices. Competition at this level is so intense that openly passing costs on to clients is risky: a company would simply lose market share to a cheaper competitor. Infrastructure expenses are currently being absorbed in three ways.
First, model optimization: more efficient architectures reduce inference costs. Second, economies of scale: the more requests flow through a single cluster, the lower the cost per request. Third, venture capital subsidies: investors are putting in billions in expectation of future monetization, covering current losses.
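The second and third factors can be illustrated with simple arithmetic. The sketch below is a toy model, not data from any provider: the function names and every number in it are hypothetical and chosen only to show the shape of the relationship. It assumes a cluster has a fixed cost per period plus a small variable cost per request, and that any gap between cost and price is covered by outside funding.

```python
def cost_per_request(fixed_cluster_cost: float,
                     variable_cost_per_request: float,
                     requests: int) -> float:
    """Average cost of one request on a cluster (toy model).

    The fixed cost is spread over all requests, so the average
    falls as the request volume grows.
    """
    if requests <= 0:
        raise ValueError("requests must be positive")
    return fixed_cluster_cost / requests + variable_cost_per_request


def subsidy_per_request(cost: float, price: float) -> float:
    """Part of the cost not covered by the price charged to the user."""
    return max(0.0, cost - price)


if __name__ == "__main__":
    # All figures below are hypothetical and purely illustrative.
    fixed = 1_000.0   # fixed cluster cost per period
    variable = 0.01   # variable cost per request
    price = 0.015     # price charged per request

    for volume in (10_000, 100_000, 1_000_000):
        cost = cost_per_request(fixed, variable, volume)
        gap = subsidy_per_request(cost, price)
        print(f"{volume:>9} requests: cost {cost:.4f}, subsidy {gap:.4f}")
```

In this toy model the per-request cost falls toward the variable cost as volume grows, and the subsidy shrinks to zero once the average cost drops below the price. Real inference economics are far more complicated, so the sketch only shows why higher volume and outside funding can both hold prices down.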
According to analysts' estimates, OpenAI still runs an operating loss. Anthropic receives investments from Amazon and Google that effectively subsidize developer access to its models. The result is artificially low market prices that do not reflect the true cost of computation.
Who ultimately pays
The value chain in the AI industry today works paradoxically: hardware manufacturers earn money, model developers run at a loss or break even, and end users receive services at subsidized prices. The question is how long this can continue.
If the market consolidates and only two or three major players survive, the price war will die down and service prices will begin to rise. If a technological breakthrough arrives, costs will fall naturally. And if major corporate clients start switching to their own models and infrastructure, the entire balance of power will change.
For now, venture capital is paying for the "feast," but its patience is not infinite.
What this means
The current pricing situation benefits everyone building AI-based products right now: API prices are historically low relative to the true cost of computation. For businesses, this is a window of opportunity: while the market is subsidizing infrastructure, it makes sense to build up AI expertise and AI-based products as far as possible. A repricing is inevitable; the only question is when.