Arcee AI Released Trinity Large Thinking — Open Reasoning Model for AI Agents
Arcee AI released Trinity Large Thinking — an open reasoning model under Apache 2.0 for long-running agent tasks and tool use. It's a 400B sparse MoE with…
AI-processed from MarkTechPost; edited by Hamidun News
Arcee AI released Trinity Large Thinking on April 1, 2026 — an open reasoning model designed not for short chat responses, but for long agentic scenarios with multiple steps, tool calls, and context preservation between turns. For the market, this is an important signal: in a segment where closed models from major labs set the tone, an open variant has appeared with an Apache 2.0 license that can not only be called via API, but also run, fine-tune, and embed in your own systems without licensing gray zones.
Trinity Large Thinking is the reasoning version of the Trinity Large family, which Arcee is developing as an open alternative to major proprietary models. The company released the weights on Hugging Face and simultaneously launched the model on its own API. The key bet here is not on a "universal chat for everything," but on tasks where an agent needs to maintain a plan, remember previous steps, carefully use tools, and not fall apart after several iterations.
These are precisely the scenarios becoming fundamental to AI development: code agents, operators of internal systems, corporate assistants, and pipelines with multiple external service calls. By architecture, the model belongs to the sparse MoE class: Trinity Large Thinking has approximately 398–400 billion parameters, but roughly 13 billion are activated per token. Inside are 256 experts, with only four working simultaneously.
This design is needed to maintain high quality ceiling without making inference completely impractical. Arcee also mentions support for context up to 512 thousand tokens after window expansion, which is especially important for long agentic cycles, large repositories, voluminous documentation, and complex multi-step tasks. Another detail — the model generates an explicit "reasoning layer" before the response, and developers are recommended to preserve this reasoning context between turns, otherwise the quality of multi-step work can noticeably degrade.
The most interesting part of the release is not only the license, but also the stated focus on practical agency. According to Arcee, Trinity Large Thinking ranks second on PinchBench, yielding only to Claude Opus 4.6, and is significantly stronger than early Trinity Large Preview precisely in multi-step tool work, instruction following, and maintaining coherence over long runs.
In the model card, the company also lists strong results on a number of agentic benchmarks, including τ²-Bench and LiveCodeBench. Simultaneously, Arcee emphasizes economics: at the time of announcement, the company valued inference cost at approximately $0.90 per million output tokens, positioning the model as a substantially cheaper alternative to closed reasoning systems for production agents.
The context around the release is also important. Trinity Large Preview, presented in late January 2026, according to the company, processed 3.37 trillion tokens through OpenRouter in its first two months.
Arcee claims that the preview version became the most-used open model in the United States in the OpenClaw collection and fourth globally. For a small team, this is a way to show that demand for open models in real agentic scenarios already exists — not at the demo level, but at the level of steady production workload. Technically, the project proved non-trivial: Trinity Large was pretrained on 17 trillion tokens, used 2048 NVIDIA B300 GPUs, and the company previously valued the entire path to the Large family at approximately $20 million.
The main takeaway from the release is this: the open-source AI market is shifting from the race of "who writes better text" to the race of "which model reliably conducts long work." Trinity Large Thinking is important not because it instantly surpassed closed leaders on all metrics, but because it gives developers and companies another genuinely open option for building agentic systems without API-only limitations. Now the question is not whether you can release an open reasoning model of this class, but how stably it will perform in production, where beautiful demos don't matter — what matters are multi-hour cycles, the cost of errors, and behavior predictability.
Want to stop reading about AI and start using it?
AI News is a curated feed of AI/tech news. Hamidun Academy teaches you to use AI systematically in your work.