Together AI at NVIDIA GTC 2026: Dynamo, multi-agent models, and voice AI
Together AI made four major announcements at NVIDIA GTC 2026. The company integrated NVIDIA Dynamo 1.0 into its inference stack and, together with NVIDIA…
AI-processed from Together AI Blog; edited by Hamidun News
Together AI appeared at NVIDIA GTC 2026 in San Jose with four major announcements — from a new inference engine to voice AI and a powerful model for multi-agent workflows. The main theme of the conference aligns with the company's strategic direction: AI systems are becoming more open, agentic, and ready for industrial deployment. For developers and AI teams, this means a new level of accessibility to tools that previously required significant resources.
Dynamo 1.0 and agentic infrastructure
NVIDIA released Dynamo 1.0 — an open-source software engine for generative and agentic inference at production scale. Together AI has already integrated Dynamo into its inference stack and actively applies it to optimize real-world workloads.
According to the company, this enables higher performance when processing heavy requests with lower costs. In parallel, NVIDIA and Together AI jointly launched NVIDIA NemoClaw — an open-source stack that simplifies the deployment of AI assistants with persistent connections. In a single deployment, it sets up NVIDIA OpenShell — a secure environment for running autonomous agents — and provides access to open models, including NVIDIA Nemotron.
For developers building agentic systems, this means direct access to Together's library of 150+ optimized models with dedicated endpoints scaled for production workloads. The combination of NemoClaw and Together's dedicated infrastructure lowers the barrier to entry for teams looking to launch agentic AI products without lengthy setup.
Nemotron 3 Super: complex reasoning and multi-agent tasks
NVIDIA Nemotron 3 Super is a hybrid mixture-of-experts model built on the Mamba-Transformer architecture. It is specifically designed for complex tasks with long reasoning horizons and scenarios where multiple agents interact within a single workflow.
Key characteristics of the model:
- 120B parameters total — with only 12B active per token, significantly reducing computational overhead
- 1M token context window for long-horizon reasoning tasks
- Optimization for parallel multi-agent operation — even on a single GPU
- Applications: software development, financial analysis, cybersecurity automation
The model is available through Together AI's Dedicated Model Inference. Developers get a simple and scalable way to run advanced reasoning models in production without building custom infrastructure from scratch.
Voice AI: Parakeet for real-time transcription
A separate announcement was the arrival of NVIDIA Parakeet TDT 0.6B V3 in Together AI's model library. This is a low-latency ASR (automatic speech recognition) model optimized for real-time applications. Parakeet brings high transcription accuracy combined with the performance required by conversational AI agents. Combined with Together's high-performance inference infrastructure, developers get a ready-made stack for building voice agents — from accurate transcription to scalable request handling. Potential applications span voice interfaces in customer support, healthcare, education, and corporate communications, where recognition speed and reliability are critical.
"AI systems are becoming more open, agentic, and ready for production" —
Together AI on the main theme of GTC 2026.
At the conference, the Together AI team also conducted technical sessions with customers — including Cursor (an AI assistant for developers) and Decagon (customer support automation) — demonstrating real-world platform applications in software development and business process automation.
What this means
Together AI is consistently strengthening its position as an "AI Native Cloud" — a unified platform where open models, agentic infrastructure, and voice AI are available to developers from a single point. Tight integration with the NVIDIA ecosystem through Dynamo, NemoClaw, and Parakeet makes Together a real alternative to closed solutions for teams that value infrastructure flexibility, predictable costs, and full control over the models used.
Need AI working inside your business — not just in your newsfeed?
I build production AI for companies — custom CRM, internal tools, autonomous agents, workflow automation. Owned by you, shaped to your process, no per-seat tax. Built by Zhemal Khamidun, CPO of AlpinaGPT (AI platform, 6,000+ users).
The AI world, distilled — once a week
Seven stories that actually mattered, hand-picked. No noise, no reposts, no press releases.
Done! Check your inbox for a confirmation.