Robots in a Data Trap: Why Backflip Videos Are Only the Beginning
Viral videos create the illusion of a technological breakthrough, but real progress is being held back by a "data crisis." Unlike text-based AI, robots need…
AI-processed from 36Kr (36氪); edited by Hamidun News
Over the past year, the internet has been flooded with videos of impressive robot tricks: robots jumping, dancing, throwing punches, and smashing watermelons with their feet. Investment is growing, headlines are full of optimism, and the public is convinced that the era of home robot assistants is about to begin. But a look behind the curtain of this technological theater reveals a picture that is far more complicated and far less cheerful.
Right now, in quiet data preparation centers across China, human operators in gloves are slowly, almost painfully, controlling manipulators: teaching machines to pick up parts, fold tools, and close box lids. The scene has none of the cinematic quality of the viral videos, but it is precisely this work that determines real progress in robotics. The path from an impressive trick to a useful home assistant is blocked by one fundamental problem: a catastrophic shortage of quality data.
Language models like ChatGPT and DeepSeek are built on the triumph of a simple recipe: trillions of text examples from the internet allowed AI to model language and begin generating meaningful content. Robotics faces a completely different reality. Text lives in a flat digital space where it is easy to copy and distribute, whereas the physical world is a multidimensional, continuous stream of information.
A robot must perceive the world through multiple channels simultaneously: video from several cameras, force sensors, touch sensors, and joint position readings. Each operation performed by the manipulator generates structured data: 57 measurements in some systems. All of these streams must be synchronized to the millisecond; otherwise the model learns spurious associations, in effect hallucinations, instead of cause-and-effect relationships.
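To make the synchronization requirement concrete, here is a minimal sketch of nearest-timestamp alignment. It is an illustration rather than any data center's actual pipeline: the stream names, the sample timestamps, and the 1 ms tolerance are all assumptions chosen for the example. Each camera frame is matched to the closest sample in every other stream, and frames with no match inside the tolerance are dropped.

```python
from bisect import bisect_left


def nearest_index(timestamps_ms, t_ms):
    """Return the index of the sample in sorted `timestamps_ms` closest to `t_ms`."""
    i = bisect_left(timestamps_ms, t_ms)
    if i == 0:
        return 0
    if i == len(timestamps_ms):
        return len(timestamps_ms) - 1
    before, after = timestamps_ms[i - 1], timestamps_ms[i]
    return i if (after - t_ms) < (t_ms - before) else i - 1


def align(reference_ms, streams, tolerance_ms=1.0):
    """Match each reference timestamp to the nearest sample of every stream.

    A reference frame is kept only if every stream has a sample within
    `tolerance_ms`; otherwise the whole frame is dropped.
    """
    aligned = []
    for t in reference_ms:
        row = {"t_ms": t}
        for name, ts in streams.items():
            if not ts:
                break  # an empty stream can never be matched
            j = nearest_index(ts, t)
            if abs(ts[j] - t) > tolerance_ms:
                break  # this stream is out of sync for this frame
            row[name] = j
        else:
            aligned.append(row)
    return aligned


# Hypothetical timestamps in milliseconds (illustrative values only).
camera_ms = [0.0, 33.3, 66.7, 100.0]
streams = {
    "force": [0.2, 10.1, 20.0, 33.9, 50.0, 66.5, 99.5],
    "joints": [0.0, 33.0, 70.0, 100.4],
}
aligned = align(camera_ms, streams, tolerance_ms=1.0)
```

In this toy run the frame at 66.7 ms is discarded, because the closest joint reading is 3.3 ms away. Real systems may interpolate instead of dropping frames; which choice is better depends on the sensor and is not something the source specifies.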
In China, more than fifty centers for collecting and processing robotics data are already operating. One such center in Beijing alone produces around six thousand training recordings a day. If every center matched that pace, a rough extrapolation puts annual output on the order of a hundred million examples.
This sounds impressive, but measured against the need it is a drop in the ocean. Experts at PowerTech made a conservative estimate: teaching a robot a single movement takes roughly one to five thousand examples, and a simple task made up of several movements takes ten to twenty thousand.
For a universal robot capable of handling eighty percent of the human work in a single industry, the dataset would have to run to hundreds of millions of examples. If the ambition extends to thousands of industries, the requirement reaches trillions. The shortfall is four to five orders of magnitude.
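The size of that gap can be checked with back-of-the-envelope arithmetic. The sketch below takes the quoted figures at face value and assumes, purely for illustration, that exactly fifty centers each match the Beijing rate of six thousand recordings a day and that the target is one trillion examples (the low end of "trillions"). The constant and function names are hypothetical.

```python
import math

RECORDINGS_PER_CENTER_PER_DAY = 6_000  # Beijing center figure quoted in the article
CENTERS = 50                           # "more than fifty"; assumes all match Beijing
TARGET_EXAMPLES = 1e12                 # low end of "trillions" (assumption)


def annual_output(per_day, centers, days=365):
    """Total recordings per year across all centers."""
    return per_day * centers * days


def orders_of_magnitude_gap(needed, available):
    """How many powers of ten separate demand from supply."""
    return math.log10(needed / available)


supply = annual_output(RECORDINGS_PER_CENTER_PER_DAY, CENTERS)
gap = orders_of_magnitude_gap(TARGET_EXAMPLES, supply)

print(f"annual supply: {supply:,} recordings")
print(f"gap to target: {gap:.1f} orders of magnitude")
```

Under these assumptions a year of output is about 110 million recordings, roughly four orders of magnitude short of a trillion. A higher target or a lower average output per center pushes the gap toward five.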
But this is not even the most serious problem. Equipment incompatibility has proved far more insidious. Different manufacturers build robots with different sensor configurations, different control protocols, and different physical parameters. Data collected on one manipulator model often simply does not work on another: one machine's language remains completely foreign to the next. As a result, knowledge does not accumulate into a single asset for the industry. Every manufacturer is forced to collect its own dataset from scratch, repeating the same expensive work over and over.
Some centers address this dilemma by focusing on popular models, essentially ignoring diversity. Others take a more ambitious path: collecting data from robots of different manufacturers in a single space and trying to teach the model to generalize across heterogeneous equipment. Neither approach has yet been shown to work universally.
All of this recalls the early days of autonomous driving, an era when the problem seemed to lie in algorithms rather than data. Nearly twenty years and billions in investment later, the answer still seems close at hand, yet not quite where the industry has been looking. Before robots actually enter our homes, long and tedious work lies ahead in data centers, where people in gloves will patiently teach machines to understand the physical world. Viral videos are marketing. Real progress is a completely different story.