A Physical AI pipeline for SO-101 was assembled on top of ROS2 and LeRobot for 30,000 rubles

A rare end-to-end Physical AI pipeline that can be reproduced at home has appeared: an SO-101 manipulator, ROS2-native control, demonstration recording, conversion into a LeRobot dataset, and running the policy back on the robot. An important detail is that inference can be offloaded to a remote GPU via policy_server without breaking the ROS2 loop. Such a stack significantly lowers the barrier to entry for embodied AI and imitation learning.

Khamidun Zhemal

AI monitoring · Habr AI

Apr 30, 2026· 3 min

AI-processed from Habr AI; edited by Hamidun News



The open-source community has built a practical Physical AI stack for the inexpensive SO-101 manipulator that covers the entire path from teleoperated demonstration to autonomous task execution on a real robot. Instead of scattered scripts, the project connects ROS2, LeRobot, and imitation learning into a single reproducible pipeline with a setup cost of roughly 30,000 rubles.

How the stack is organized

The main idea of the project is not a new model but the fact that robotics and ML no longer live separately. At the bottom is the SO-101 manipulator itself. Above it sits the ros2_control layer with a hardware interface for the Feetech STS3215 servos, and teleoperation, cameras, episode recording, and inference are layered on top. As a result, the system sees the robot as a normal ROS2 device rather than as a set of scripts tied to a single board. This makes the stack portable and easy to modify.
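To make the hardware-interface layer more concrete, here is a minimal sketch of the unit conversion such a layer has to perform between raw servo positions and the joint angles that ROS2 works with. It is not taken from the project. It assumes the STS3215 reports position as a 12-bit value (4096 ticks per revolution) and that the joint zero sits at mid-range; the names ticks_to_radians, radians_to_ticks, and zero_offset are hypothetical. Real ros2_control hardware interfaces are normally written in C++, so Python is used here only for illustration.

```python
import math

# Assumption: 12-bit position encoding, 4096 ticks per full revolution.
TICKS_PER_REV = 4096


def ticks_to_radians(ticks: int, zero_offset: int = 2048) -> float:
    """Convert a raw servo position into a joint angle in radians."""
    return (ticks - zero_offset) * 2.0 * math.pi / TICKS_PER_REV


def radians_to_ticks(angle: float, zero_offset: int = 2048) -> int:
    """Convert a joint angle into the nearest raw position, clamped to the valid range."""
    ticks = round(angle * TICKS_PER_REV / (2.0 * math.pi)) + zero_offset
    return max(0, min(TICKS_PER_REV - 1, ticks))
```

In a real setup the zero offset would come from per-joint calibration rather than a fixed default.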

Leader/follower teleoperation runs on top of this: the operator demonstrates the required movement, and the follower arm repeats it while generating training data. During demonstrations, the project records episodes in rosbag or MCAP, works with multiple cameras, and lets you inspect observations and actions visually in Rerun. This matters because data can be not only collected but also quickly filtered before training if synchronization, camera angles, or trajectories turn out to be poor.
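One way to automate part of that filtering is a timestamp check between camera frames and joint samples. The sketch below is an assumption about how such a check could look, not the project's code: the function names and the 20 ms threshold are hypothetical, and timestamps are assumed to be plain floats in seconds already extracted from the recording.

```python
from bisect import bisect_left


def nearest_gap(sorted_stamps, t):
    """Return the absolute distance from t to the closest value in sorted_stamps."""
    i = bisect_left(sorted_stamps, t)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = sorted_stamps[max(0, i - 1): i + 1]
    return min(abs(t - s) for s in candidates)


def max_sync_skew(frame_stamps, joint_stamps):
    """Worst-case gap in seconds between any camera frame and its nearest joint sample."""
    joint_sorted = sorted(joint_stamps)
    return max(nearest_gap(joint_sorted, t) for t in frame_stamps)


def episode_is_usable(frame_stamps, joint_stamps, max_skew_s=0.02):
    """Hypothetical filter: reject empty episodes and episodes with too much skew."""
    if not frame_stamps or not joint_stamps:
        return False
    return max_sync_skew(frame_stamps, joint_stamps) <= max_skew_s
```

A check like this does not replace looking at the episode in a viewer, but it can flag recordings where a camera or the joint stream stalled.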

From data to policy

After recording, the project converts episodes from ROS formats into a LeRobot dataset. This is the bridge between the ROS2 world and ML: it removes homemade intermediate formats and speeds up the move to policy training. Next, you can try end-to-end imitation learning with models such as ACT or SmolVLA and then bring the resulting policy back into the robot's ROS2 control loop. This path also matters because it relies on an existing ecosystem of tools.
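The core of such a conversion is resampling asynchronous ROS messages onto a fixed-rate grid of frames. The sketch below illustrates that step only and is not the project's converter or the LeRobot API. It assumes the messages are already decoded into time-sorted (timestamp, vector) pairs; build_frames and latest_at are hypothetical names, and the dictionary keys merely follow the naming style used in LeRobot datasets.

```python
from bisect import bisect_right


def latest_at(stamps, values, t):
    """Return the most recent value recorded at or before time t (zero-order hold)."""
    i = bisect_right(stamps, t) - 1
    return values[max(i, 0)]


def build_frames(state_msgs, action_msgs, fps, episode_index=0):
    """Resample time-sorted (timestamp, vector) lists onto a fixed-rate grid of frames."""
    state_t, state_v = zip(*state_msgs)
    action_t, action_v = zip(*action_msgs)
    # Use only the time window covered by both streams.
    start = max(state_t[0], action_t[0])
    end = min(state_t[-1], action_t[-1])
    n = int((end - start) * fps + 1e-9) + 1
    frames = []
    for k in range(n):
        t = start + k / fps
        frames.append({
            "episode_index": episode_index,
            "frame_index": k,
            "timestamp": k / fps,  # seconds since the start of the episode
            "observation.state": list(latest_at(state_t, state_v, t)),
            "action": list(latest_at(action_t, action_v, t)),
        })
    return frames
```

Camera frames would be aligned to the same grid in a similar way. Zero-order hold is the simplest choice here; interpolation is an alternative for smooth joint signals.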

In practice, the entire workflow looks like this:

robot manipulator bringup and ros2_control startup
demonstration collection through leader/follower teleop
episode recording in rosbag or MCAP
checking camera streams, actions, and observations in Rerun
conversion to LeRobot dataset, policy training, and deployment on the robot

A separate strength of the stack is that the robot-side runtime is decoupled from the heavy model. If local compute near the arm is insufficient, the policy can run on an external GPU server through policy_server, while the robot side keeps only the inference client and the execution loop. For Physical AI this is not cosmetic but ordinary engineering decoupling: the control loop stays near the hardware, and the "brain" scales independently. This simplifies experiments with heavier models and reduces the requirements for robot-side hardware.
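The decoupling can be pictured as a small robot-side buffer that hands out one action per control tick and refills itself from the remote policy in chunks. The sketch below is a simplified, synchronous illustration under stated assumptions, not the policy_server protocol: request_chunk is a hypothetical callable standing in for the network call, a late reply is modeled as TimeoutError, and a real client would issue the request asynchronously so the control loop never blocks.

```python
from collections import deque
from typing import Callable, Deque, List, Optional, Sequence


class ChunkedPolicyClient:
    """Robot-side buffer that serves one action per control tick from remote chunks."""

    def __init__(self,
                 request_chunk: Callable[[Sequence[float]], List[List[float]]],
                 refill_below: int = 2):
        self._request_chunk = request_chunk  # hypothetical call to the remote policy
        self._refill_below = refill_below
        self._queue: Deque[List[float]] = deque()
        self._last_action: Optional[List[float]] = None

    def step(self, observation: Sequence[float]) -> List[float]:
        """Return the action for this tick, refilling the buffer when it runs low."""
        if len(self._queue) < self._refill_below:
            try:
                # A fresh chunk replaces any stale actions still in the queue.
                self._queue = deque(self._request_chunk(observation))
            except TimeoutError:
                pass  # server was late: keep whatever is still queued
        if self._queue:
            self._last_action = self._queue.popleft()
        elif self._last_action is None:
            raise RuntimeError("no action available yet")
        # With an empty queue, the previous action is held.
        return self._last_action
```

Holding the last action on a timeout is only one possible fallback; stopping the arm may be the safer choice on real hardware.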

Where the practical value lies

Such projects usually break at the intersection of disciplines: the robot can move, but data is collected poorly; the dataset exists, but the result cannot be brought back to hardware without pain; the model trains, but does not survive in a real runtime. Here it is precisely the most boring but most valuable parts that are covered: bringup, recording, visual inspection, conversion, and integration back into ROS2. As a result, the stack looks less like a research demo and more like a practical teaching platform for embodied AI.

The project is especially useful for those who want to enter Physical AI without a lab budget. On a cheap SO-101 you can first debug the basics (power, calibration, teleop, cameras, data schema, and inference latency) and only then move to more complex manipulators. This approach saves months: first you build a reproducible pipeline, then you experiment with policies, not the other way around. For students, research teams, and small startups this noticeably lowers the entry barrier.

What this means

Physical AI is gradually moving out of presentation mode and into reproducible open-source stacks. If an inexpensive manipulator can be brought up as a ROS2 system, demonstrations collected on it, a policy trained, and that policy deployed back onto the hardware, then the entry barrier to embodied AI drops noticeably for both engineers and small teams. This means more experiments will happen not on slides but on real tables and in laboratories.

