LingBot-VLA: Ant Group Teaches Robots to Understand the World Without Extra Words
Chinese company Lingbo (a subsidiary of Ant Group) has open-sourced LingBot-VLA, a multimodal model for controlling robots. The headline feature: cross-ontology…
AI-processed from Jiqizhixin (机器之心); edited by Hamidun News
Imagine buying a new phone and not having to relearn how to use it: your fingers already know where to press. Robotics has worked differently. Every machine has required its own control code and thousands of hours of training in simulation. The team at Lingbo, a subsidiary of Chinese tech giant Ant Group, decided it was time to end this and released LingBot-VLA to the public, a model pitched as a universal brain for anything with motors and manipulators.
The core problem of 'physical AI' has long been what the source calls cross-ontology, better known in robotics as cross-embodiment transfer: a neural network trained to control one specific gripper is helpless when moved to a bipedal humanoid. LingBot-VLA attempts to close this gap. As a Vision-Language-Action (VLA) model, it doesn't simply 'see' an image and 'read' text; it translates that input into movement vectors that different types of robots can execute. It's as if one driver could handle a bicycle and a mining dump truck equally well without additional training.
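To make the idea of one model driving different robot bodies concrete, here is a minimal Python sketch. It is not LingBot-VLA's actual architecture, which the article does not describe. It only illustrates one common way to think about the problem: the model emits a single body-agnostic action, and a small per-robot adapter turns it into that robot's own command vector. Every name here (`SharedAction`, the two adapters, the 0.08 m finger width) is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SharedAction:
    """Hypothetical body-agnostic action: hand displacement in metres
    plus a gripper opening between 0.0 (closed) and 1.0 (fully open)."""
    dx: float
    dy: float
    dz: float
    grip: float


def parallel_gripper_adapter(action: SharedAction) -> List[float]:
    # Two-finger arm: pass the displacement through and convert the
    # opening to a finger gap in metres (0.08 m maximum is an assumption).
    return [action.dx, action.dy, action.dz, action.grip * 0.08]


def humanoid_hand_adapter(action: SharedAction) -> List[float]:
    # Five-finger hand: same displacement, but the opening becomes one
    # curl value per finger (1.0 = fully curled).
    curl = 1.0 - action.grip
    return [action.dx, action.dy, action.dz] + [curl] * 5


# Registry of illustrative embodiments; real systems would have many more.
ADAPTERS: Dict[str, Callable[[SharedAction], List[float]]] = {
    "parallel_gripper": parallel_gripper_adapter,
    "humanoid_hand": humanoid_hand_adapter,
}


def to_robot_command(action: SharedAction, embodiment: str) -> List[float]:
    """Translate one shared action into a command for the named robot body."""
    if embodiment not in ADAPTERS:
        raise ValueError(f"unknown embodiment: {embodiment}")
    return ADAPTERS[embodiment](action)


if __name__ == "__main__":
    action = SharedAction(dx=0.01, dy=0.0, dz=-0.02, grip=0.5)
    for name in ADAPTERS:
        print(name, to_robot_command(action, name))
```

The same `SharedAction` yields a four-value command for the gripper arm and an eight-value command for the humanoid hand. In a learned system the adapters would typically be trained rather than hand-written, but the split between a shared representation and a body-specific output is the point of the sketch.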
Why now? We are in a transition from 'chatty AI' to 'acting AI'. Chatbots have learned to write poetry, but they still can't dust a table without breaking a vase. To be useful in everyday life, a robot needs to generalize. According to the developers, LingBot-VLA posts record results on task-generalization tests: it understands the command 'bring me an apple' even in an unfamiliar room and with an unfamiliar gripper type. The developers trained it on a massive dataset combining images and movement trajectories, which lets the model build an internal representation of space.
The political context is also interesting. While American companies like Figure and Tesla keep their work under lock and key, Ant Group is taking the open-source path. It's a strategic move: if LingBot-VLA becomes the standard for small robot manufacturers worldwide, China will effectively control the operating system of the future 'physical internet'. It's a classic long game in which dominance at the standards level matters more than short-term profit from selling licenses.
For the industry, this means a sharply lower barrier to entry. A startup no longer needs to hire a hundred PhDs to train basic robot movements; it can take a ready-made 'foundation' and fine-tune it for a specific task. We are approaching a moment when hardware becomes secondary and software becomes decisive. If LingBot-VLA really adapts as well as the developers claim, within a couple of years we may see an invasion of robots that finally stop being helpless in front of a closed door.
Of course, questions about safety and accuracy remain: in the physical world, an error costs more than a typo in a chatbot. But the direction is clear. AI is moving off smartphone screens and into reality. All that remains is to watch how quickly these 'brains' acquire worthy 'bodies'.
The key takeaway: LingBot-VLA could become the 'Android' of robotics, making universal robot control accessible to everyone. Are we ready for open-source software that can move objects around our apartments?