Composer 2.5 from Cursor: a coding agent trained to handle long-running tasks better
Cursor has updated its IDE coding agent to Composer 2.5, which handles long-running tasks better. The main innovation is training with a new targeted feedback method.
AI-processed from Cursor Blog; edited by Hamidun News
Cursor has released Composer 2.5, an update to its AI coding agent for the IDE. The release is a significant step up in intelligence and behavior compared to version 2.
What Changed
Composer 2.5 handles long-running tasks better, follows complex instructions more accurately, and is more pleasant to work with. The Cursor team improved the model by scaling up training, building more complex reinforcement learning (RL) environments, and introducing new training methods. Some of the gains, such as communication style and the ability to calibrate effort to the task, are not captured by tests and benchmarks, yet they are what make the model more practical in real work.
Targeted Feedback
The main innovation is a new training method that uses textual feedback. The problem: when the RL reward is computed over an entire multi-step trajectory (hundreds of thousands of tokens), it is hard to tell which action caused an error, so the learning signal is noisy. The solution: insert a hint at the exact point where the error occurs. For example, if the model tries to call a non-existent tool, a hint is added to the context: "Available tools: [list]". This helps the model correct itself immediately and avoid making the same error again.
"This gives the model a local learning signal for the behavior we want to change, while maintaining the broader RL task across the entire trajectory."
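The hint-insertion idea can be sketched in a few lines of Python. This is only an illustration of the description above, not Cursor's implementation: the `Step` structure, the tool names in `AVAILABLE_TOOLS`, and the choice to cut the context just before the faulty call are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical tool registry; the article does not list Composer's real tools.
AVAILABLE_TOOLS = ["read_file", "edit_file", "run_tests"]


@dataclass
class Step:
    """One agent action in a trajectory (illustrative structure)."""
    tool: str
    argument: str


def find_first_invalid_call(trajectory, tools):
    """Return the index of the first step that calls an unknown tool, or None."""
    for index, step in enumerate(trajectory):
        if step.tool not in tools:
            return index
    return None


def build_hinted_context(trajectory, tools):
    """Render the context up to the first error and append a textual hint there.

    Returns (context_lines, error_index). If no invalid call is found, the
    whole trajectory is rendered unchanged and error_index is None.
    """
    error_index = find_first_invalid_call(trajectory, tools)
    end = len(trajectory) if error_index is None else error_index
    lines = [f"call {step.tool}({step.argument!r})" for step in trajectory[:end]]
    if error_index is not None:
        # Assumption: the hint is placed where the faulty call would have been.
        lines.append("Available tools: " + ", ".join(tools))
    return lines, error_index
```

With a trajectory whose second step calls an unknown tool such as `search_web`, `build_hinted_context` returns the first step followed by the hint line, and `error_index` marks the single position where a local learning signal would apply.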
Synthetic Data and Scaling
- Composer 2.5 is trained on 25x more synthetic tasks
- Tasks are generated dynamically during training
- Some tasks are created by removing features from real codebases
- The system selects complex tasks directly during a training run
As training progresses, Composer's coding ability grows to the point where it solves most tasks. To keep improving the model, the team both selects more complex tasks and generates new ones dynamically throughout the training run.
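One simple way to picture difficulty-based selection is to filter tasks by how often recent rollouts solved them. The sketch below is an assumption for illustration only: the article does not say how Cursor measures difficulty, and the function names, the task names in the usage note, and the `max_solve_rate` threshold are hypothetical.

```python
def solve_rate(outcomes):
    """Fraction of rollouts that solved the task (outcomes are booleans).

    A task with no recorded rollouts is treated as unsolved (rate 0.0).
    """
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def select_hard_tasks(rollout_outcomes, max_solve_rate=0.5):
    """Return task ids whose solve rate is at most max_solve_rate, hardest first.

    rollout_outcomes maps a task id to a list of booleans, one per rollout.
    The 0.5 default is an arbitrary illustrative threshold.
    """
    rates = {task: solve_rate(o) for task, o in rollout_outcomes.items()}
    hard = [task for task, rate in rates.items() if rate <= max_solve_rate]
    # Sort by solve rate, then by name so the order is deterministic.
    return sorted(hard, key=lambda task: (rates[task], task))
```

For instance, given outcomes for three hypothetical tasks where `fix_typo` is always solved, `refactor_api` is solved half the time, and `port_module` a quarter of the time, the function keeps only the last two and lists `port_module` first.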
What's Next
Composer 2.5 is based on Kimi K2.5, an open checkpoint from Moonshot AI, but this is a transitional step. Together with SpaceX, Cursor is training a much larger model from scratch, using 10 times more compute. The Colossus 2 supercomputer has the equivalent of one million H100 GPUs. Cursor expects the new model to be a major leap in capabilities.
What This Means
Coding agents are moving from auxiliary tools toward full-fledged assistants for long-running projects. This brings closer the point at which AI can lead project development almost independently.