
Nvidia wants to cover the entire AI data center stack — from chips to networking and storage

At GTC 2026, Nvidia outlined a new bet: selling not just GPUs, but the entire AI data center for enterprise AI workloads. The stack includes Vera Rubin…

AI-processed from ZDNet AI; edited by Hamidun News
Source: ZDNet AI. Collage: Hamidun News.

At GTC 2026, Nvidia showed that it no longer wants to be just a GPU supplier. The company is offering customers an entire AI data center as a single product, from compute and networking to context storage and management software.

Betting on the Vertical

The main signal from Jensen Huang's presentation was not a single new chip but a business architecture. Nvidia lined up a row of racks on stage and essentially told the market that AI infrastructure will be cheaper, faster, and more profitable if you buy the entire stack from a single supplier rather than individual components. This is no longer the model of "we sell accelerators, and you assemble the rest yourselves," but an attempt to turn the data center into a system designed entirely by Nvidia.

Huang has long promoted the idea of an AI factory, a facility that produces tokens and intelligence the way a plant produces parts. That idea has now been pushed further: Nvidia wants to control not just compute but also the network, memory, intermediate data storage, the CPU layer, and the software that ties it all together. Given the boom in agentic systems, the approach is logical, because the bottleneck is no longer just GPUs but everything that moves data between them.

What Went Into the Stack

In the new configuration, Nvidia assembled several racks, each solving a separate problem in an AI data center. Together they form a nearly complete kit for companies that are building large training and inference clusters and do not want to integrate dozens of different suppliers into a single design.

  • Vera Rubin NVL72: flagship rack-scale system with 72 Rubin GPUs and 36 Vera CPUs.
  • Vera CPU Rack: separate rack with 256 CPUs for agentic AI tasks, where tool calling, SQL, and code execution matter.
  • BlueField-4 STX: storage layer with fast KV-cache delivery, which large language models need during inference.
  • Spectrum-6 SPX: new Ethernet networking for connecting racks and scaling clusters.
  • Groq 3 LPX: inference rack with 256 LPU accelerators, focused on low latency and large context.

The point of this lineup is that Nvidia no longer sells just GPU "horsepower." It covers the parts of the system where milliseconds, watts, and money are usually lost: moving data between chips, handling context, network latency, CPU tasks for agents, and overall load balancing. The larger the model and the longer its context, the more noticeable these overheads become.

Where Nvidia Sees the Advantage

Nvidia's most obvious argument is inference economics. The company claims that combining Vera Rubin with Groq 3 LPX reduces external DRAM accesses thanks to the large SRAM capacity in the LPU, which lowers latency and speeds up token delivery. According to Nvidia, this setup can deliver up to 35 times more throughput per megawatt for models with trillions of parameters, and up to 10 times more revenue per watt in scenarios with expensive "premium" tokens.

"What used to require a whole day of requests will now be done in less than an hour."
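
As a rough sanity check on these figures, the sketch below works out what a 35x gain in throughput per megawatt would mean for a fixed batch of requests served at a fixed power budget. Tying the "whole day" in the quote to the 35x figure is an assumption made here for illustration, not something the source states, and the function name is hypothetical.

```python
def hours_after_speedup(baseline_hours: float, throughput_gain: float) -> float:
    """Time to serve the same workload at the same power budget
    when throughput per megawatt improves by `throughput_gain`."""
    if throughput_gain <= 0:
        raise ValueError("throughput_gain must be positive")
    return baseline_hours / throughput_gain


# Assumption: a 24-hour workload and the claimed "up to 35x" upper bound.
hours = hours_after_speedup(24.0, 35.0)
minutes = hours * 60
print(f"{hours:.2f} h, about {minutes:.0f} min")  # prints "0.69 h, about 41 min"
```

Under those assumptions the result is consistent with "less than an hour." Because 35x is stated as an upper bound ("up to"), real-world gains could be smaller.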

Nvidia is placing a particular bet on the CPU layer. The company says outright that even in the GPU era, agents constantly run into conventional compute tasks: tool calling, SQL queries, code compilation, and sandboxed code execution. That is why it puts its own Vera CPUs front and center and pairs them with DPUs and specialized context storage. This extends Nvidia's ambitions far beyond accelerators and shows that the company wants to capture even more of the margin in AI infrastructure.

What This Means

For the market, this is another step toward vertically integrated AI data centers, where a single vendor is responsible for almost everything. For customers, the model can mean simpler deployment and better efficiency, but at the cost of deeper Nvidia lock-in. For competitors, from CPU and networking players to storage vendors, it is a signal that Nvidia is no longer playing only in the GPU field.

