MarkTechPost→ original

Z.ai releases GLM-5.2: real million tokens and two levels of deep thinking

Z.ai released GLM-5.2—a code model with a real 1-million-token context window and two thinking modes: High and Max. The model integrates directly into Claude…

AI-processed from MarkTechPost; edited by Hamidun News
Z.ai releases GLM-5.2: real million tokens and two levels of deep thinking
Source: MarkTechPost. Collage: Hamidun News.
◐ Listen to article

Z.ai released GLM-5.2 — an updated code model with a genuinely usable context window of one million tokens, two deep reasoning modes, and seamless integration into popular development tools.

Million tokens: "usable" is fundamental

Long context windows have long become a standard line in marketing descriptions. Claiming a million tokens is easy — ensuring the model actually works with them is far harder. Most competitors degrade at the limit: they "lose" information from the middle of long documents, start ignoring early instructions, or produce noticeably less accurate answers.

Z.ai deliberately highlighted the word "usable" in the release description. This means the team is betting on actual processing of the entire context, not just a number in the specification.

For developers, this opens specific scenarios: load an entire large codebase into a single request, multiple long documents at once, or full discussion history from an issue tracker — and work with them without losing context.

Two levels of "thinking"

Instead of a single generation mode, GLM-5.2 offers two effort levels:

  • High — balanced mode for everyday tasks: fast, accurate, without unnecessary computation overhead
  • Max — extended reasoning: the model builds an internal chain of reasoning before answering, providing greater depth for complex tasks

This approach is already familiar from OpenAI products (o1/o3 series) and Anthropic (extended thinking in Claude 3.7). The advantage of GLM-5.2 — both modes are available in a single model through one endpoint, with no switching between versions. High is convenient for refactoring and autocomplete, Max — for architectural review, test writing, and debugging tangled errors.

Integration in minutes

GLM-5.2 is delivered through an Anthropic-compatible API format. For a developer already using one of the supported tools, connection takes minutes — no adapters or logic rewrites:

  • Claude Code — terminal-first AI assistant from Anthropic
  • Cline — popular open-source agent inside VS Code
  • OpenClaw — Z.ai's own multi-agent platform

The release covers all GLM Coding Plan tiers without exception. No waitlists or priority access programs.

Without benchmarks — for now

Z.ai has not published standard evaluations alongside the release. For a market where it's customary to open announcements with comparison tables on HumanEval, MMLU, and Codeforces, this is an atypical move. In exchange, the company promises open MIT weights within the next week. When the weights arrive, independent researchers will be able to verify the model's capabilities independently, without relying on numbers from the press release. This is either a signal of confidence in the results, or a deliberate choice not to disclose details before the open version launches.

What this means

GLM-5.2 appears in the developer's working environment without friction: one endpoint, two thinking modes, large real context. Z.ai continues to occupy a position between closed commercial models and the open-source community — and MIT weights in a week will make it accessible for local deployment without any restrictions.

ZK
Hamidun News
AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.

Want to stop reading about AI and start using it?

AI News is a curated feed of AI/tech news. Hamidun Academy teaches you to use AI systematically in your work.

What do you think?
Loading comments…