
Boston Dynamics and Google DeepMind Teach Spot to Reason During Industrial Inspections

Boston Dynamics has integrated Google DeepMind's Gemini Robotics-ER 1.6 model into Spot and is betting on industrial inspection. The robot can now…

AI-processed from IEEE Spectrum AI; edited by Hamidun News
Source: IEEE Spectrum AI. Collage: Hamidun News.

Boston Dynamics is moving Spot out of the category of impressive demos and into a more practical class of industrial robots. The four-legged robot has been given Google DeepMind's Gemini Robotics-ER 1.6 model and is now expected not just to execute commands but to interpret its surroundings, notice anomalies, and make decisions during inspections without constant operator involvement. For robotics, this is an important shift.

For a long time, robots could do many things, but only if a human had scripted the scenario in advance, almost like a program. The more complex the task, the harder it was to make the interface convenient. Embodied AI, meaning AI with a physical body and access to the real world, is meant to bridge precisely this gap.

Boston Dynamics is one of the few companies that has brought walking robots to commercial scale: thousands of Spot units are already in operation. Integrating the new model is therefore not an academic experiment but an attempt to improve a product already used in the field. The primary scenario is not household chores from demo videos but industrial inspection.

At factories, power facilities, and other complex sites, Spot is meant to patrol the premises and check whether anything dangerous is happening. With Gemini Robotics-ER 1.6, the robot can autonomously look for spilled liquids and foreign debris, read complex sensors, pressure gauges, and sight glasses, and call on vision-language models when it needs to better understand the situation around it.

In other words, the goal is not to teach the robot to fetch things gracefully but to reduce risk in places where a missed problem can be costly. The most interesting part, though, is what developers call understanding and reasoning. These words come up more and more often in robotics, although in practice they refer not to philosophy but to how closely a machine's behavior matches human logic.

If a person asks a robot to clear cans from a room, they expect not just that the command is carried out but also common sense: picking up a can without spilling any leftover liquid, not placing a glass of water on the edge of a table, not creating a new hazard while eliminating an old one. Google DeepMind says it tracks such cases with internal semantic-safety scenarios. The goal is for the robot not just to understand the verb in the command but to consider the consequences of the action in the physical world.

At the same time, the limitations of the current approach are quite noticeable. The current version of the model for Spot relies mainly on vision. For example, one of the new features evaluates the success of grasping an object through multiple cameras.

This is useful, but robotics has long had other ways to tell that an object has been grasped securely: force sensors, touch sensors, contact feedback. The problem is data. The Internet is full of visual examples of how to pick up a pen or open a door, but there are almost no large-scale datasets with tactile information.

Teaching models the physics of contact is therefore much harder today than teaching them about images and text. To close this gap, Boston Dynamics intends to collect more field data from customers who use Spot's new inspection features. There is also a second practical question: trust.

Boston Dynamics openly acknowledges that it is rolling out new capabilities through beta programs and advertising only what it is confident in. Commercial inspection does not require absolute perfection from a robot, but there is a threshold of usefulness. If the system makes mistakes or raises false alarms too often, operators will stop listening to it.

The company believes that real value begins somewhere above the 80 percent level, when the robot is already helping rather than annoying. This matters especially at facilities where some of the critical infrastructure is instrumented with sensors while other potentially dangerous details can still only be spotted by eye during rounds. The conclusion is simple: the partnership between Boston Dynamics and Google DeepMind is not a story about another flashy robot video but an attempt to turn embodied AI into a practical tool with measurable benefit.
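
To make the 80 percent threshold concrete, here is a minimal Python sketch of how a site operator might track it. It rests on an assumption the article does not confirm: that the figure refers to the share of raised alerts that operators later confirm as real problems. The names `InspectionAlert`, `alert_precision`, and `is_useful` are hypothetical and are not part of any Boston Dynamics or Google DeepMind API.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class InspectionAlert:
    """One alert raised during a patrol, plus the operator's verdict (hypothetical record)."""
    description: str
    confirmed: bool  # True if an operator verified a real problem


def alert_precision(alerts: Sequence[InspectionAlert]) -> float:
    """Return the share of raised alerts that operators confirmed as real."""
    if not alerts:
        raise ValueError("no alerts to evaluate")
    return sum(1 for a in alerts if a.confirmed) / len(alerts)


def is_useful(alerts: Sequence[InspectionAlert], threshold: float = 0.8) -> bool:
    """Check the alert log against the assumed 80 percent usefulness threshold."""
    return alert_precision(alerts) >= threshold


# Illustrative, made-up log: four confirmed problems and one false alarm.
log = [
    InspectionAlert("puddle near pump 3", confirmed=True),
    InspectionAlert("pressure gauge above limit", confirmed=True),
    InspectionAlert("debris on walkway", confirmed=True),
    InspectionAlert("low level in sight glass", confirmed=True),
    InspectionAlert("reflection mistaken for a spill", confirmed=False),
]

print(f"precision: {alert_precision(log):.2f}, useful: {is_useful(log)}")
```

A metric like this counts only false alarms. Missed problems would need a separate measure, which requires knowing what the robot failed to report.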

If Spot really learns to reliably detect leaks, read instruments, and act more safely in ambiguous environments, the market will get one of the first convincing examples of reasoning AI working not on a screen but on the shop floor. The accumulated experience could then be transferred to other platforms, including more complex humanoid robots.

