Together AI: how kernel optimizations close the gap between models and GPUs

Together AI’s team adapted CUDA kernels for the new Blackwell GPUs in one week, work that NVIDIA had spent a year on. The team credits FlashAttention (2022) and ThunderKittens for the speed. The result narrows the gap between a model's mathematics and the performance the hardware can actually deliver.