Unitree UnifoLM-VLA-0: Chinese Robots Learn to Think with Their Hands

AI-processed from 36Kr (36氪); edited by Hamidun News
Source: 36Kr (36氪). Collage: Hamidun News.

For a long time, we regarded humanoid robots as impressive pieces of machinery that could do backflips but froze when faced with an ordinary doorknob. The problem was not the motors but the "head". Now Unitree, a company best known for its affordable robot dogs, has decided to take artificial intelligence seriously.

The company has open-sourced UnifoLM-VLA-0, and the release could change the rules of the game in the industry faster than it seems at first glance. We are finally moving from neural networks that can only talk to VLA (Vision-Language-Action) models, which can control a physical body in real space.

To understand the scale, recall how robots used to learn. It was usually either rigid, hand-written software logic or reinforcement learning for a single narrow task. If you taught a robot to open a refrigerator, that was all it could do.

UnifoLM-VLA-0 works differently. It is a descendant of large language models that has been fine-tuned on physical-interaction data. The result is an "embodied brain" that understands context. It doesn't just see an apple on a table; based on the user's text command, it works out how to grab it, how hard to squeeze, and where to place it.
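
To make the vision-language-action idea concrete, here is a minimal Python sketch of the loop such a model sits in: a camera frame and a text command go in, and a motion plus a grip force come out. This is an illustration only, not Unitree's code or the UnifoLM-VLA-0 API. Every name in it (Observation, Action, VLAPolicy, StubPolicy, run_episode) and every number is a hypothetical placeholder, and the stub applies a keyword rule where a real model would run a trained network.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class Observation:
    """Input for one control step: a camera frame plus the user's text command."""
    image: Sequence[Sequence[float]]  # toy grayscale frame (rows of pixel values)
    instruction: str


@dataclass(frozen=True)
class Action:
    """Output for one control step of a hypothetical arm with a gripper."""
    delta_xyz: tuple[float, float, float]  # end-effector displacement, metres
    grip_force_n: float                    # squeeze force, newtons


class VLAPolicy(Protocol):
    """Anything that maps an observation to an action (hypothetical interface)."""
    def act(self, obs: Observation) -> Action: ...


class StubPolicy:
    """Stand-in for a trained model: a keyword rule, no real perception."""
    def act(self, obs: Observation) -> Action:
        if "grab" in obs.instruction.lower():
            # Illustrative values: move down 5 cm and squeeze gently.
            return Action(delta_xyz=(0.0, 0.0, -0.05), grip_force_n=5.0)
        return Action(delta_xyz=(0.0, 0.0, 0.0), grip_force_n=0.0)


def run_episode(policy: VLAPolicy, frames, instruction: str) -> list[Action]:
    """Pair every incoming frame with the same instruction and query the policy."""
    return [policy.act(Observation(image=f, instruction=instruction)) for f in frames]
```

Calling run_episode(StubPolicy(), frames, "Grab the apple") returns one Action per frame. The contrast with the older approach is in the interface: the task arrives as text at run time instead of being baked into the controller.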

What is most ironic is that Unitree chose the path of openness. While Western giants and even some Chinese competitors build "walled gardens" and hide the architecture of their control systems, Unitree is putting its cards on the table.

This is a strategic calculation. By open-sourcing UnifoLM-VLA-0, the company is essentially inviting thousands of developers around the world to test, improve, and adapt the model to a wide variety of hardware. It is a classic move from the software-history playbook: if you can't beat everyone alone, become the standard for everyone. If every second robotics research project ends up running on Unitree's "brains", the question of industry leadership will resolve itself.

Technically, UnifoLM-VLA-0 attempts to bridge the gap between visual understanding, the domain of vision-language models (VLMs), and real action. Ordinary models often hallucinate or fail to grasp the laws of physics: they can "say" they lifted a cup while their virtual hand passes straight through it. Unitree's new architecture aims to give the robot what engineers call "physical common sense".

This is the knowledge that objects have weight, friction, and inertia. Without it, humanoids would remain expensive exhibition toys, capable only of waving at passersby in a pre-recorded loop.

What does this mean for us? We will likely see a sharp jump in the capabilities of home and warehouse robots in the next year or two. When software becomes openly available, progress tends to accelerate. We already saw this with language models after the release of LLaMA.

Now it's the turn of the physical world. Of course, a full-fledged butler robot is still far away, but the foundation in the form of an open "brain" has already been laid. Now it's up to the community, which must teach this brain not only to understand commands, but also not to break everything in the process of executing them.

The key point: Unitree is betting on open source, trying to become the "Android" of the robotics world. Can closed, proprietary systems such as Tesla's Optimus withstand competition from the collective intelligence of developers?

Hamidun News
AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.
