Hardware

AI Data Center

An AI data center is a facility purpose-built or heavily retrofitted to house GPU and accelerator clusters, designed specifically to handle the extreme power density, cooling demands, and high-bandwidth networking that large-scale AI training and inference workloads require.

Conventional data centers are designed around servers drawing 5–15 kW per rack. A rack of eight H100 GPUs consumes 60–80 kW, and next-generation accelerator systems push beyond 100–150 kW per rack, requiring fundamentally different infrastructure. AI data centers deploy direct liquid cooling — cold plates on GPUs and CPUs, rear-door heat exchangers, or full immersion tanks — along with high-capacity power distribution and facility designs that prioritize thermal density over floor-area efficiency. A cluster of 30,000 H100s can draw more than 100 MW continuously, comparable to a small city's residential load.

The networking layer is equally distinctive. AI data centers support fat-tree or rail-optimized InfiniBand or RoCE Ethernet topologies with switches aggregating hundreds of high-bandwidth ports per rack. Storage must deliver training data fast enough to prevent GPU starvation, requiring all-flash arrays or high-throughput distributed file systems such as Lustre, WEKA, or IBM Spectrum Scale. Site selection is heavily driven by power: facilities are increasingly co-located with substations or built near abundant energy sources — hydroelectric in Scandinavia and the U.S. Pacific Northwest, nuclear-adjacent sites in the U.S. Midwest — because multi-hundred-megawatt demands strain regional grids.

AI data centers became a major capital investment category in 2024–2025, with Microsoft, Google, Amazon, Meta, and Oracle collectively announcing hundreds of billions of dollars in construction commitments. Microsoft's partnership with OpenAI includes multi-gigawatt capacity targets across multiple continents. Geographic site selection is governed by power cost and availability, water access for cooling, subsea cable connectivity, regulatory environment, and tax policy. The U.S., Sweden, Finland, Singapore, and the UAE have emerged as significant hubs, and several jurisdictions have introduced permitting pathways specifically for AI-scale facilities.

As of 2026, a new generation of facilities is being designed around 100+ kW per rack, co-packaged optics for reduced interconnect power, and in some cases on-site power generation (gas turbines, small modular nuclear reactors under development). The 'AI factory' — a facility whose primary output is sold inference compute — has emerged as a distinct real-estate and infrastructure asset class. Persistent challenges include power grid capacity, water consumption for cooling, and the 18–36 month construction timeline, which creates a structural mismatch with GPU hardware cycles that turn over every 12–18 months.

Example

A cloud provider building a 30,000-H100 cluster selects a campus with 120 MW of contracted power, engineers each rack for direct liquid cooling at 80 kW load, and deploys a fat-tree 400G InfiniBand network to keep all-reduce latency within the budget needed to maintain high GPU utilization.

Related terms

Latest news on this topic

← Glossary