Ollama 0.15.5: Qwen3-Coder-Next and the joys of running locally

Q: What is the source?

Originally published on Habr AI. Hamidun News processes and adapts the material with AI.

Q: When was it published?

2026-02-08. Reading time: 2 min.

Ollama 0.15.5 has been released with Qwen3-Coder-Next, a model with 80 billion parameters. Running it locally requires at least 80 GB of VRAM or 128 GB of RAM.

Hamidun News Editorial



Ollama 0.15.5: Qwen3-Coder-Next and the joys of running locally. Source: Habr AI. Collage: Hamidun News.


Ollama has been updated to version 0.15.5, which adds support for new models. Qwen3-Coder-Next stands out among them. The model is aimed at code generation and promises to be a powerful tool for developers, but, as is often the case, high performance comes with high resource requirements.

Qwen3-Coder-Next is an 80-billion-parameter model trained predominantly on code, which lets it deliver strong results on programming tasks. Running it locally, however, takes serious hardware. Ollama offers only quantized versions of the model (q4_K_M at 52 GB and q8_0 at 85 GB), and even those sizes hint at its appetite for resources.
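To put the quoted file sizes in perspective, here is a rough back-of-the-envelope sketch of the effective bits per weight they imply. It assumes exactly 80 billion parameters, 1 GB = 10^9 bytes, and that the whole file consists of weights; none of these assumptions comes from the original article, so treat the numbers as approximate.

```python
# Rough arithmetic on the quantized sizes quoted in the article.
# Assumptions (not from the source): exactly 80e9 parameters,
# 1 GB = 1e9 bytes, and the file contains nothing but weights.

PARAMS = 80e9

# Sizes as quoted in the article, in GB.
QUANT_SIZES_GB = {"q4_K_M": 52, "q8_0": 85}


def bits_per_weight(size_gb: float, params: float = PARAMS) -> float:
    """Average number of bits stored per parameter for a file of size_gb."""
    return size_gb * 1e9 * 8 / params


def size_gb_at(bits: float, params: float = PARAMS) -> float:
    """File size in GB if every parameter took `bits` bits."""
    return params * bits / 8 / 1e9


for name, size_gb in QUANT_SIZES_GB.items():
    print(f"{name}: about {bits_per_weight(size_gb):.2f} bits per weight")

# For comparison: the same parameter count stored at 16 bits per weight.
print(f"16-bit weights: about {size_gb_at(16):.0f} GB")
```

Under these assumptions the two quantizations work out to roughly 5.2 and 8.5 bits per weight, against about 160 GB for unquantized 16-bit weights.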

To work comfortably with Qwen3-Coder-Next at high inference speed, you will need at least 80 GB of VRAM. The model can also run on a CPU with 128 GB of DDR5 RAM, but inference will be significantly slower. That puts Qwen3-Coder-Next out of reach for the average user, which, as the author of the original article notes, often draws negative reactions.
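A minimal sketch of how one might check which quantization fits a given memory budget. The file sizes are the ones quoted in the article; the `headroom_gb` value is a hypothetical allowance for context cache and runtime overhead, not a figure from the source, and real memory use depends on context length and settings.

```python
# Which quantized build fits in a given amount of VRAM or RAM?
# Sizes come from the article; headroom_gb is a hypothetical allowance
# for context cache and runtime overhead (an assumption, not a measurement).

QUANT_SIZES_GB = {"q4_K_M": 52, "q8_0": 85}


def fits(size_gb: float, budget_gb: float, headroom_gb: float = 8.0) -> bool:
    """True if the weights plus the assumed headroom fit in the budget."""
    return size_gb + headroom_gb <= budget_gb


def quantizations_that_fit(budget_gb: float, headroom_gb: float = 8.0) -> list[str]:
    """Names of the quantizations that fit, in the order listed above."""
    return [
        name
        for name, size_gb in QUANT_SIZES_GB.items()
        if fits(size_gb, budget_gb, headroom_gb)
    ]


for budget in (48, 80, 128):
    print(f"{budget} GB: {quantizations_that_fit(budget) or 'nothing fits'}")
```

With this assumed headroom, an 80 GB budget holds q4_K_M but not q8_0, while 128 GB holds either.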

Interestingly, the author offers free access to his server, which already has 10 different models loaded, including Qwen3-Coder-Next. This is a good opportunity for those who cannot run the model locally but want to test its capabilities. Access is temporary, however: it will be closed once the RAG (Retrieval-Augmented Generation) system is set up. The author also warns that he will disconnect users who try to overload the server.

Overall, the arrival of Qwen3-Coder-Next in Ollama is an important step forward for local LLMs: developers get a powerful code-generation model without having to rely on cloud services. High resource requirements remain a serious obstacle to widespread adoption of such models, though. The release also highlights a growing trend toward local, self-sufficient AI solutions that give users more control over their data and computing resources.
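For readers who want to see what working without a cloud service looks like in practice, here is a hedged sketch of a request to a locally running Ollama server through its REST API. The model tag `qwen3-coder-next` is an assumption (check `ollama list` for the actual name on your machine), and the address is Ollama's default local one; the article itself does not show any code.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address
MODEL_TAG = "qwen3-coder-next"         # assumed tag; verify with `ollama list`


def build_generate_request(
    prompt: str, model: str = MODEL_TAG, base_url: str = OLLAMA_URL
) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, **kwargs) -> str:
    """Send the prompt to the server and return the generated text."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


# Example call (requires a running Ollama server with the model pulled):
# print(generate("Write a Python function that reverses a string."))
```

Passing a different `base_url` would point the same client at a remote Ollama host instead of the local machine.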

Ollama thus continues to evolve and to offer access to cutting-edge models, but the hardware requirements keep growing with them. Testing Qwen3-Coder-Next on the author's server is a chance to assess what such models can do and to decide whether an upgrade to more powerful hardware is worth it.

