OpenAI Blog→ оригинал

OpenAI introduced MRC — a network protocol for 100,000-GPU AI training clusters

Through the Open Compute Project, OpenAI released the MRC specification, a new network protocol for training large models. It splits the traffic of a single com

OpenAI introduced MRC — a network protocol for 100,000-GPU AI training clusters
Source: OpenAI Blog. Коллаж: Hamidun News.
◐ Слушать статью

Through the Open Compute Project, OpenAI released the MRC specification, a new network protocol for training large models. It splits the traffic of a single communication across hundreds of paths, routes around failures faster, and simplifies network architecture. According to the company, MRC is already running in the largest clusters with NVIDIA GB200 and can withstand link failures and even switch reboots without interrupting training.

ЖХ
Hamidun News
AI‑новости без шума. Ежедневный редакторский отбор из 400+ источников. Продукт Жемала Хамидуна, Head of AI в Alpina Digital.
What do you think?
Loading comments…