Liquid AI Releases LFM2.5-8B: A Compact MoE Model with 128K Context

Q: What is the source?

Originally published on MarkTechPost. Hamidun News processes and adapts the material with AI.

Q: When was it published?

May 29, 2026. Reading time: 3 min.

Liquid AI released LFM2.5-8B-A1B — a MoE model that activates only 1.5B out of 8.3B parameters. Runs on standard PCs, supports a 128K token context window…

Hamidun News Editorial

Liquid AI has released LFM2.5-8B-A1B, a new model that can run directly on a personal computer without a cloud server or subscription.

How MoE Architecture Works

This model uses the Mixture of Experts (MoE) architecture, which became a trend in 2024–2025. LFM2.5-8B-A1B has 8.3 billion parameters in total, but only about 1.5 billion are active for any given token. It's like a large library: instead of reading every book in sequence, you take only the ones you need for the current task.

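This approach has two advantages. First, it cuts the computation needed per token, so the model runs faster than a dense model of the same total size. Second, that lower compute cost makes it practical to run the model on a single GPU or a standard PC. Memory savings are more limited, because all of the expert weights generally still have to be stored even though only some are used at a time. Performance remains high while hardware requirements are lower.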
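The routing idea can be sketched in a few lines of plain Python. This is only a toy illustration of top-k expert routing in general: the expert count, the top-k value, the vector size, and all function names are assumptions made up for the example, not details of Liquid AI's implementation. The last line uses the 1.5B and 8.3B figures quoted in the article's summary.

```python
import math
import random


def softmax(xs):
    """Turn a list of scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def moe_forward(x, router, experts, top_k):
    """Run only the top_k highest-scoring experts for input x."""
    # One router score per expert: a dot product with the input.
    logits = [sum(r_i * x_i for r_i, x_i in zip(r, x)) for r in router]
    # Keep the indices of the top_k experts; the rest are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    weights = softmax([logits[i] for i in chosen])
    out = [0.0] * len(x)
    for w, i in zip(weights, chosen):
        y = matvec(experts[i], x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, chosen


# Hypothetical toy sizes, chosen only for illustration.
DIM, NUM_EXPERTS, TOP_K = 4, 8, 2
rng = random.Random(0)
experts = [
    [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]
    for _ in range(NUM_EXPERTS)
]
router = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

x = [0.5, -1.0, 0.25, 2.0]
out, chosen = moe_forward(x, router, experts, TOP_K)

# Share of parameters active per token, from the article's figures (about 18%).
active_fraction = 1.5 / 8.3
```

Here only 2 of the 8 toy experts do any work for a given input, which is where the compute saving comes from. All 8 expert matrices still exist in memory.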

Capabilities and Characteristics

LFM2.5-8B-A1B supports a 128,000-token context window. By the common rule of thumb of about 0.75 English words per token, that is roughly 96,000 words, or a couple of hundred pages of 12pt text. The model is capable of reasoning: it solves complex tasks step by step, showing intermediate results. It can also invoke tools, for example to search the internet, perform calculations, or call external APIs.

128,000 token context for long documents
Step-by-step logical reasoning
Tool and function invocation
Running on consumer PCs without the cloud
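Tool invocation usually means the model emits a structured request that the host application parses and executes. The sketch below shows that host-side step under loud assumptions: the JSON shape with "name" and "arguments" keys, the `add` tool, and the `dispatch_tool_call` helper are all hypothetical, and the article does not describe the actual format LFM2.5-8B-A1B uses.

```python
import json


def add(a: float, b: float) -> float:
    """A trivial local tool the model is allowed to call."""
    return a + b


# Registry of callable tools, keyed by the name the model would emit.
TOOLS = {"add": add}


def dispatch_tool_call(model_output: str):
    """Parse a (hypothetical) JSON tool call and run the matching function."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


# Stand-in for text a model might produce; not real model output.
result = dispatch_tool_call('{"name": "add", "arguments": {"a": 2, "b": 3}}')
```

In a real application the result would be passed back to the model so it can continue its answer. Keeping an explicit registry means the model can only trigger functions the developer has chosen to expose.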

The model is targeted at developers who want to embed AI in their applications without dependence on cloud services.

Local Execution — The Key Advantage

The key difference between LFM2.5 and cloud-based models is that it runs entirely on the user's device. Requests are not sent to a server; responses are generated locally. This removes network round-trip latency and keeps data private, since it never leaves your computer.

For developers, this means no subscriptions and no dependence on Liquid AI's API. They simply download the model and integrate it into their product. There are no limits on the number of requests and no per-call charges. This makes it a good fit for mobile applications, embedded systems, and software that needs to work offline.
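Whether a downloaded model fits on a given device depends largely on how much memory its weights need. The following back-of-the-envelope estimate assumes the 8.3B total parameter count from the article's summary. The precision levels are common choices used for illustration only, not formats confirmed for this model, and the estimate ignores the context cache and runtime overhead.

```python
# Total parameter count as quoted in the article's summary.
TOTAL_PARAMS = 8.3e9

# Bytes per parameter at some common precisions (illustrative assumptions).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


estimates = {
    name: weight_memory_gb(TOTAL_PARAMS, size)
    for name, size in BYTES_PER_PARAM.items()
}

for name, gb in estimates.items():
    print(f"{name}: about {gb:.1f} GB for weights alone")
```

Under these assumptions the weights take roughly 16.6 GB at 16-bit precision and about 4.2 GB at 4-bit, which is why lower-precision formats matter for running on consumer hardware.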

What This Means

The world of AI models is gradually shifting from cloud services to local solutions. Where previously only large corporations (OpenAI, Google, Meta*) could serve millions of users, any developer can now embed a capable model directly in their software. LFM2.5-8B-A1B is another step in this direction. This is especially useful for applications that require privacy, fast response times, or offline operation.

*Meta is recognized as an extremist organization and is banned in the Russian Federation.
