Together AI introduced ATLAS: a speculator that speeds up LLMs 4x

Together AI introduced ATLAS, an adaptive, machine-learning-based speculator that speeds up LLM inference by up to 4x without manual tuning. The system automatically learns from and adapts to your workload as you use it. On DeepSeek-V3.1, it reaches 500 tokens per second, which is 2.65x faster than standard decoding and outperforms specialized Groq hardware.