DeepSeek cuts V4-Pro prices by 75% and reduces cache costs tenfold across entire API

Q: What is the source?

Originally published on TNW. Hamidun News processes and adapts the material with AI.

Q: When was it published?

Apr 27, 2026. Reading time: 3 min.

DeepSeek has intensified the price war in the AI API market. The company temporarily cut V4-Pro prices by 75% until May 5, 2026, and simultaneously reduced…

Hamidun News Editorial

AI monitoring · TNW

Apr 27, 2026· 2 min

AI-processed from TNW; edited by Hamidun News

DeepSeek cuts V4-Pro prices by 75% and reduces cache costs tenfold across entire API — Source: TNW. Collage: Hamidun News.

◐ Listen to article

DeepSeek has sharply intensified its price war on the AI API market: the company announced a temporary 75% discount on the DeepSeek-V4-Pro model and simultaneously reduced the cost of cache hits tenfold across the entire API lineup. For developers, this is not merely a promotional offer valid until May 5, 2026, but an attempt to make the transition to the Chinese model almost painless financially, even for teams already working with OpenAI, Anthropic, or Google.

According to DeepSeek's current pricing table, the standard rate for V4-Pro is $1.74 per million input tokens on cache miss, $0.0145 per million cached input tokens, and $3.48 per million output tokens. The temporary promotion, valid until May 5, 2026, 15:59 UTC, reduces these values to $0.435, $0.003625, and $0.87 respectively. In parallel, the company updated caching rules for the entire API lineup: the price of a cache hit now stands at one-tenth of the original launch level. For production agents, this is particularly important because they continuously reuse the same system instructions, long prefixes, and context fragments.

This move looks strategic rather than merely marketing-driven. DeepSeek has long pressured the market on price, especially after the R1 release in January 2025, when it became clear that the Chinese company was willing to compete not only on quality but also on the cost of inference. Now the stakes are even higher: V4-Pro launched on April 24, 2026, and by April 27, the company announced aggressive API pricing.

Against this backdrop, the offer looks like a direct challenge to American providers, who in recent months have themselves been gradually lowering prices, but not nearly as sharply. The political backdrop adds an additional effect: the Donald Trump administration is simultaneously accusing Chinese AI companies of massive distillation of American models.

V4-Pro itself is designed for more than just price competition. According to DeepSeek, it is a mixture-of-experts model with 1.6 trillion total parameters and 49 billion active parameters per task. It supports a context window of 1 million tokens and maximum output of up to 384,000 tokens, making it notably more convenient for long documents, large codebases, and multi-step agent scenarios. The company specifically emphasizes API compatibility with familiar OpenAI and Anthropic formats, as well as native integration with Claude Code, OpenClaw, and OpenCode. This reduces the cost not only of usage but also of migration itself: there is no need to change your entire stack for a new model.

Another important layer of the story relates to infrastructure. DeepSeek is promoting V4 as a model optimized for Chinese Huawei Ascend 950 chips and Cambricon hardware, not just Nvidia. For the market, this signals that competition is no longer just between individual models, but between entire technological stacks: their own accelerators, their own API layer, their own agent tools, and their own pricing policy. If such a combination truly delivers stable quality on long context, the pressure on closed American suppliers will shift from episodic to systemic.

The conclusion is simple: DeepSeek is trying to win not just one news cycle but a share in real development. When a model has open weights, a million tokens of context, compatibility with popular SDKs, and simultaneously pricing that sharply reduces the cost of repetitive requests, the arguments for staying on a more expensive API diminish. For startups and small teams, this is a chance to launch agent products at lower cost, and for large players, a reminder that the next phase of competition in AI will be determined not only by answer quality but by the price of each working request.

Hamidun News

AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.

Telegram channel RSS hamidun.com

Want to stop reading about AI and start using it?

AI News is a curated feed of AI/tech news. Hamidun Academy teaches you to use AI systematically in your work.

🎓 Academy — 7 days free Free consultation