Liquid AI Releases LFM2.5-8B: A Compact MoE Model with 128K Context
Liquid AI released LFM2.5-8B-A1B — a MoE model that activates only 1.5B out of 8.3B parameters. Runs on standard PCs, supports a 128K token context window…
AI-processed from MarkTechPost; edited by Hamidun News
Liquid AI has released LFM2.5-8B-A1B, a new model that can run directly on a personal computer without requiring a cloud server or subscription.
How MoE Architecture Works
This model uses the Mixture of Experts (MoE) architecture, which became a trend in 2024–2025. LFM2.5-8B-A1B has 8.3 billion parameters in total, but only about 1.5 billion are active for any given token. It's like a large library: instead of reading every book in sequence, you take only the ones you need for the current task.
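This approach provides two advantages. First, it saves computation: each token passes through only the active 1.5 billion parameters, so the model generates text at roughly the speed of a much smaller dense model. Second, the model keeps the capacity of its full parameter count while remaining small enough to run on a single GPU or an ordinary PC. Note that all parameters still have to be loaded into memory; the savings are in compute per token, not in storage. Quality remains high while hardware requirements are lower.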
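To make the routing idea concrete, here is a minimal sketch of top-k expert selection in plain Python. It is a toy illustration, not Liquid AI's implementation: the number of experts, the value of k, the hand-picked router scores, and the function names (`route_top_k`, `moe_layer`) are all assumptions made for the example. Only the 1.5B and 8.3B parameter counts come from the article.

```python
import math

def softmax(scores):
    """Convert raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_layer(x, experts, router_scores, k):
    """Run only the chosen experts and blend their outputs by weight."""
    chosen = route_top_k(router_scores, k)
    output = sum(weight * experts[i](x) for i, weight in chosen)
    return output, [i for i, _ in chosen]

# Toy setup (assumption): 8 experts, 2 active per token.
# Each "expert" just scales its input; real experts are feed-forward networks.
experts = [lambda x, scale=scale: x * scale for scale in range(1, 9)]

# Hand-picked router scores; a real router computes them from the token.
scores = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.9]

output, used = moe_layer(1.0, experts, scores, k=2)
print(f"experts used: {used} out of {len(experts)}")

# Parameter counts reported for LFM2.5-8B-A1B in the article.
active_fraction = 1.5e9 / 8.3e9
print(f"share of parameters active per token: {active_fraction:.0%}")
```

In this toy run only experts 1 and 4 are evaluated; the other six are skipped entirely, which is where the compute savings come from. With the article's figures, about 18% of the parameters take part in processing each token.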
Capabilities and Characteristics
LFM2.5-8B-A1B supports a 128,000 token context, which is on the order of a few hundred pages of text printed in 12pt font (assuming roughly 0.75 English words per token). The model is capable of reasoning: it solves complex tasks step by step, showing intermediate results. It can also invoke tools, for example to search the internet, perform calculations, or call external APIs.
- 128,000 token context for long documents
- Step-by-step logical reasoning
- Tool and function invocation
- Running on consumer PCs without the cloud
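To check whether a document is likely to fit in the 128,000 token window before sending it to the model, a rough estimate is often enough. The sketch below assumes about four tokens for every three English words, a common rule of thumb rather than a property of this model's tokenizer. The names `estimate_tokens` and `fits_in_context` and the output reserve are hypothetical, chosen for illustration.

```python
CONTEXT_TOKENS = 128_000  # context window stated in the article

def estimate_tokens(text: str) -> int:
    """Estimate the token count from whitespace-separated words.

    Assumption: about 4 tokens per 3 English words. The real count
    depends on the tokenizer and the language of the text.
    """
    words = len(text.split())
    return (words * 4 + 2) // 3  # integer ceiling of words * 4 / 3

def fits_in_context(text: str, reserve_for_output: int = 2_000) -> bool:
    """Check that the prompt plus room for the answer fits the window."""
    return estimate_tokens(text) + reserve_for_output <= CONTEXT_TOKENS

# Synthetic documents of 90,000 and 100,000 words.
short_doc = "word " * 90_000
long_doc = "word " * 100_000

print(fits_in_context(short_doc))  # about 120,000 estimated tokens
print(fits_in_context(long_doc))   # about 133,000 estimated tokens
```

For exact counts, use the tokenizer that ships with the model instead of a word-based heuristic.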
The model is targeted at developers who want to embed AI in their applications without dependence on cloud services.
Local Execution — The Key Advantage
The difference between LFM2.5 and cloud-based models is that it runs entirely on the user's device. Requests are not sent to a server; responses are generated locally. This removes network round trips and keeps data private: nothing leaves your computer.
For developers, this means no subscriptions and no dependence on Liquid AI's API. They simply download the model and integrate it into their product. There are no limits on the number of requests and no charges per call. This makes the model well suited to mobile applications, embedded systems, and software that needs to work offline.
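Whether a given machine can host the model locally depends mostly on how much memory the weights need. The following back-of-the-envelope sketch uses the 8.3B total parameter count from the article. The precisions listed are illustrative assumptions (the article does not say which formats are distributed), and the estimate ignores the KV cache, activations, and runtime overhead.

```python
TOTAL_PARAMS = 8.3e9  # total parameter count from the article

# Assumed storage formats, in bytes per parameter (illustrative only).
BYTES_PER_PARAM = {
    "fp16/bf16": 2.0,
    "int8": 1.0,
    "4-bit": 0.5,
}

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9

for name, size in BYTES_PER_PARAM.items():
    print(f"{name:>9}: about {weight_memory_gb(TOTAL_PARAMS, size):.1f} GB")
```

All 8.3B parameters count toward this total, not just the 1.5B active ones, because any expert may be selected for the next token.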
What This Means
The world of AI models is gradually shifting from cloud services to local solutions. Previously, only large corporations (OpenAI, Google, Meta*) could serve millions of users; now every developer can embed a capable model directly into their software. LFM2.5-8B-A1B is another step in this direction. It is especially useful in applications that require privacy, fast response times, or offline operation.
*Meta is recognized as an extremist organization and is banned in the Russian Federation.