General Motors showed how it trains autopilot in simulations 50,000 times faster than reality

AI-processed from IEEE Spectrum AI; edited by Hamidun News
Source: IEEE Spectrum AI. Collage: Hamidun News.

In a sponsored article, General Motors explained how it builds scalable AI for autonomous driving. The emphasis is on simulations, reinforcement learning, and VLA models, which let the system train not on typical trips but on the rare and dangerous situations that determine whether it can actually be released onto roads.

Why edge cases are hard

For an autopilot, the hard part is not driving on an empty highway in good weather. The main risk lies in the so-called long tail: rare, ambiguous, hard-to-predict episodes that occur infrequently yet determine whether a system can be deployed on roads without constant human oversight. GM states directly that the path to eyes-off mode on highways, and from there toward full autonomy, hinges on this final percent of complexity.

This includes not only exotic cases like a mattress on the road, a burst hydrant, or a widespread traffic light outage. Equally problematic are everyday scenarios in dense city traffic, where a driver must show courtesy and common sense and quickly read the context: for example, how to join a parking queue without blocking the flow of traffic, or how to get through a construction site where movement is directed by a worker's gestures rather than standard signs.

Typical long-tail situations include:

* unexpected obstacles on the road
* temporary traffic patterns in construction zones
* traffic controller gestures that contradict traffic light signals
* complex maneuvers in tight parking spaces
* cascading failures of city infrastructure

How GM trains its model

One of the key components is Vision Language Action (VLA) models. Essentially, the company takes a base vision-language architecture that understands images at the level of general concepts and fine-tunes it for driving tasks. After this, the model not only "sees" an image but also interprets vehicle trajectories, identifies 3D objects, and helps work out what is actually happening in a road scene.

This is what lets the vehicle recognize that a police officer's gesture takes precedence over a red light, or that the area ahead is a terminal drop-off zone rather than an ordinary lane. The problem is that deep semantic understanding often adds latency, and in driving every fraction of a second matters. So GM is developing a Dual Frequency VLA scheme: a large model runs at a lower rate and is responsible for high-level semantic decisions, while a compact one handles the fast control loops for steering, braking, and trajectory keeping.

This hybrid, according to the company's plan, should combine the "common sense" of foundation models and reaction speed sufficient for real roads.
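
The two-rate idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the 50 Hz and 2 Hz rates, the `Intent` structure, and the rule-based `slow_planner` and `fast_controller` stand-ins are hypothetical and are not taken from GM's description. The only point is the scheduling pattern, in which a slow component refreshes a high-level decision occasionally while a fast component acts on every tick.

```python
from dataclasses import dataclass


@dataclass
class Intent:
    """High-level decision from the slow model (hypothetical structure)."""
    target_speed: float  # m/s


def slow_planner(scene: dict) -> Intent:
    """Stand-in for the large, slow model: scene semantics -> intent."""
    if scene.get("officer_signals_go"):
        return Intent(target_speed=5.0)  # the officer overrides the red light
    if scene.get("light") == "red":
        return Intent(target_speed=0.0)
    return Intent(target_speed=15.0)


def fast_controller(speed: float, intent: Intent, dt: float) -> float:
    """Stand-in for the compact, fast model: proportional speed tracking."""
    accel = max(-4.0, min(2.0, 1.5 * (intent.target_speed - speed)))
    return max(0.0, speed + accel * dt)


def run(scenes, speed=10.0, fast_hz=50, slow_hz=2):
    """Control on every tick; refresh the intent every fast_hz // slow_hz ticks."""
    ticks_per_plan = fast_hz // slow_hz
    dt = 1.0 / fast_hz
    intent = None
    planner_calls = 0
    for tick, scene in enumerate(scenes):
        if tick % ticks_per_plan == 0:
            intent = slow_planner(scene)
            planner_calls += 1
        speed = fast_controller(speed, intent, dt)
    return speed, planner_calls


# Two seconds of simulated time in front of a red light.
final_speed, calls = run([{"light": "red"}] * 100)
```

In this toy run the planner is consulted only 4 times across 100 control ticks, which is the trade-off the scheme aims for: the expensive reasoning happens rarely, and the cheap loop never waits for it.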

Simulations instead of roads

The bulk of training happens not on actual streets but in simulators. GM reports that it runs millions of high-fidelity closed-loop scenarios every day, equivalent to tens of thousands of days of human driving compressed into hours of computation. The company can take real drives and change the weather and lighting with diffusion models, add new vehicles, or assemble scenes from scratch from text descriptions and spatial bounding boxes. For tactical behavior tasks, photorealism is not always necessary, so GM uses an abstract environment called Boxworld within its own RL simulator, GM Gym.

Only the parameters that matter remain: object positions, velocities, traffic rules, and vehicle interactions. This makes it possible to run enormous numbers of experiments in which the model learns not to copy humans but to find a strategy against measurable goals such as safety and progress. The reported figures for this training:

* up to 50,000 times faster than real time
* approximately 1,000 km of virtual driving per second of GPU time
* thousands of virtual drivers per second in a single environment
* 30 minutes of distillation versus approximately 12 hours of raw RL

After this, knowledge from the abstract environment is transferred to a more realistic model through On Policy Distillation: a simplified RL policy acts as a "teacher" for the model that will then run in the vehicle.
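
To make the idea of an abstract environment concrete, here is a minimal sketch of a one-lane world whose state holds nothing but kinematics and whose reward combines progress with a collision penalty. It is not Boxworld and does not use any GM Gym API: the `State` fields, the 0.1 s step, the penalty of 100, and both policies are hypothetical choices made for illustration. The `simulated_hours` helper simply converts a speed-up factor into simulated time.

```python
from dataclasses import dataclass

DT = 0.1  # simulated seconds per step (assumed)


@dataclass
class State:
    gap: float         # metres to the vehicle ahead
    ego_speed: float   # m/s
    lead_speed: float  # m/s


def step(state: State, accel: float):
    """Advance one tick; return (next_state, reward, crashed)."""
    ego_speed = max(0.0, state.ego_speed + accel * DT)
    gap = state.gap + (state.lead_speed - ego_speed) * DT
    crashed = gap <= 0.0
    # Reward = distance covered this tick, minus a large penalty on collision.
    reward = ego_speed * DT - (100.0 if crashed else 0.0)
    return State(gap, ego_speed, state.lead_speed), reward, crashed


def rollout(policy, state: State, max_steps: int = 200) -> float:
    """Total reward of one episode under the given policy."""
    total = 0.0
    for _ in range(max_steps):
        state, reward, crashed = step(state, policy(state))
        total += reward
        if crashed:
            break
    return total


def keep_headway(state: State) -> float:
    """Brake when closer than two seconds to the lead vehicle."""
    return -3.0 if state.gap < 2.0 * state.ego_speed else 1.0


def full_throttle(state: State) -> float:
    """Always accelerate, ignoring the vehicle ahead."""
    return 2.0


def simulated_hours(wall_clock_hours: float, speedup: float) -> float:
    """Simulated driving time covered in the given wall-clock time."""
    return wall_clock_hours * speedup


start = State(gap=30.0, ego_speed=10.0, lead_speed=8.0)
safe_return = rollout(keep_headway, start)
reckless_return = rollout(full_throttle, start)
```

Because the reward is explicit, the two policies can be compared by number rather than by resemblance to a human driver: the headway-keeping policy ends with a positive return, while the always-accelerate policy hits the lead vehicle and ends negative. For scale, `simulated_hours(1.0, 50_000)` shows that at a 50,000-fold speed-up one hour of computation covers 50,000 hours of simulated driving, roughly 5.7 years.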

Separately, GM uses a SHIFT3D pipeline to deliberately generate objects on which the perception system is likely to fail, and adds an epistemic uncertainty module that flags scenes where the model is genuinely "uncertain." According to the company, fine-tuning on such difficult cases has already reduced near-miss collisions by more than 30%.
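
The article does not say how GM's uncertainty module works. One common proxy for epistemic uncertainty is disagreement within an ensemble: if several independently trained models give very different confidences for the same scene, the scene is a candidate for review and retraining. The sketch below assumes that approach; the scene names, the probabilities, and the 0.02 threshold are made up for illustration.

```python
from statistics import pvariance


def epistemic_score(member_probs):
    """Variance of the ensemble members' predicted probabilities for one scene."""
    return pvariance(member_probs)


def flag_uncertain(scenes, threshold=0.02):
    """Return ids of scenes whose ensemble disagreement exceeds the threshold."""
    return [
        scene_id
        for scene_id, probs in scenes.items()
        if epistemic_score(probs) > threshold
    ]


# Hypothetical "obstacle present" confidences from four ensemble members.
scenes = {
    "clear_highway": [0.97, 0.96, 0.98, 0.97],
    "mattress_on_road": [0.15, 0.80, 0.45, 0.92],
}

hard_cases = flag_uncertain(scenes)
```

Here only `mattress_on_road` is flagged: the members agree closely on the highway scene and disagree widely on the mattress. Scenes selected this way are the kind of difficult cases that fine-tuning would then target.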

What this means

GM's approach shows where the autonomous driving industry is heading: not toward one "smart" model but toward an entire ecosystem of simulators, generative world models, RL, and uncertainty assessment systems. If such a scheme truly scales, the key asset in the autopilot race will not only be a fleet of vehicles on roads but also the quality of infrastructure that can quickly imagine, test, and break rare scenarios before users encounter them.

