Scientists show predictable training data can outperform complex robot-learning data

Date: 2026-06-08 09:43:40

Teaching robots to manipulate objects with human-like dexterity remains one of the biggest challenges in robotics. A new study suggests that the answer may not lie in feeding robots more complex training data, but in giving them more consistent examples to learn from.

Researchers from New York University Tandon School of Engineering and the Robotics and AI Institute found that robots trained on structured, predictable demonstrations performed significantly better than those trained on highly variable examples. The work could help improve how robots learn tasks that involve complex hand movements, changing grips, and coordination between multiple limbs.

Many robot-learning systems rely on imitation learning, where machines learn by copying demonstrations performed by humans. But collecting demonstrations for highly dexterous tasks is difficult because teleoperation systems struggle to capture the fine finger movements and contact-rich interactions involved.

To overcome that limitation, the researchers turned to motion-planning algorithms that automatically generate demonstrations inside physics simulations. Instead of learning from humans, the robots learned from virtual examples created by software.

The team soon discovered a problem. Popular planning methods known as rapidly exploring random trees (RRTs) produced solutions that varied too much from one demonstration to another, making it harder for robots to identify the behavior they were supposed to imitate.

Consistency beats randomness

“These planners are very good at finding solutions,” said lead author Huaijiang Zhu.

“But when every solution looks different, the learning system struggles to figure out what behavior it should imitate.”

According to the researchers, the randomness in RRT-generated demonstrations creates what is known as high-entropy data. While such diversity helps planning algorithms explore different solutions, it can reduce the effectiveness of imitation learning.
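To make the idea of high-entropy data concrete, the following is a minimal Python sketch, not the researchers' own measure. It assumes a simplified setting in which each demonstration records one discrete action at the same state, and it computes the Shannon entropy of those actions. The function name `shannon_entropy` and the example action labels are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(actions):
    """Shannon entropy (in bits) of the empirical distribution over actions."""
    counts = Counter(actions)
    total = len(actions)
    # Sum -p * log2(p) over every distinct action that was observed.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical actions taken at the same state across eight demonstrations.
varied_actions = ["left", "right", "up", "down", "left", "up", "right", "down"]
consistent_actions = ["left"] * 7 + ["up"]

high = shannon_entropy(varied_actions)      # four equally likely actions
low = shannon_entropy(consistent_actions)   # one action dominates

print(f"varied demonstrations:     {high:.3f} bits")
print(f"consistent demonstrations: {low:.3f} bits")
```

In this toy case the varied set spreads evenly over four actions and so carries more entropy than the set in which one action dominates. A learner imitating the varied set has less evidence about which action is the intended one.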

To address the issue, the team developed alternative planning approaches designed to generate more consistent demonstrations. One method prioritized steady progress toward a goal, while another relied on a library of predefined motions to reduce variation between examples.
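The contrast between randomized and progress-driven planning can be illustrated with a toy Python sketch. It is an assumption-laden stand-in, not the planners used in the study: a bounded random walk on a small grid takes the place of a sampling-based planner (it is not an RRT), and a greedy rule that always reduces the distance to the goal takes the place of the "steady progress" idea. The library-of-motions method is not covered. All names (`random_planner`, `greedy_planner`, `GOAL`, `GRID_SIZE`) are hypothetical.

```python
import random

GRID_SIZE = 4
GOAL = (3, 3)
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def manhattan(pos, goal=GOAL):
    """Manhattan distance from pos to the goal cell."""
    return abs(pos[0] - goal[0]) + abs(pos[1] - goal[1])


def random_planner(rng, start=(0, 0)):
    """Toy randomized planner: wander inside the grid until the goal is hit."""
    pos = start
    path = [pos]
    while pos != GOAL:
        dx, dy = rng.choice(MOVES)
        nxt = (pos[0] + dx, pos[1] + dy)
        # Ignore moves that would leave the grid.
        if 0 <= nxt[0] < GRID_SIZE and 0 <= nxt[1] < GRID_SIZE:
            pos = nxt
            path.append(pos)
    return tuple(path)


def greedy_planner(start=(0, 0)):
    """Toy consistent planner: take the first move that reduces distance to goal."""
    pos = start
    path = [pos]
    while pos != GOAL:
        for dx, dy in MOVES:
            nxt = (pos[0] + dx, pos[1] + dy)
            if manhattan(nxt) < manhattan(pos):
                pos = nxt
                break
        path.append(pos)
    return tuple(path)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
random_demos = [random_planner(rng) for _ in range(20)]
greedy_demos = [greedy_planner() for _ in range(20)]

print("distinct random demonstrations:", len(set(random_demos)))
print("distinct greedy demonstrations:", len(set(greedy_demos)))
```

Both planners reach the goal every time, but the greedy one produces the same path on every run, while the randomized one produces many different paths for the same task. That difference in variation between demonstrations is the property the article describes, shown here only in miniature.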

The researchers evaluated the approach using two challenging manipulation tasks. In one experiment, two robotic arms had to rotate a large cylinder by 180 degrees while repeatedly adjusting their grips. In another, a dexterous robotic hand manipulated a cube within its palm to match target orientations.

Virtual lessons, real results

Robots trained on the more consistent demonstrations achieved substantially higher success rates than those trained on standard RRT-generated data. In the dual-arm task, the system reached near-perfect performance using only 100 demonstrations.

The team also transferred the learned policies directly from simulation to physical hardware without additional retraining. The dual-arm robot succeeded in 90% of real-world trials, while the robotic hand succeeded in about 62% of its attempts.

The findings highlight a growing trend in robotics that combines traditional motion planning with machine learning. Rather than treating the two approaches separately, researchers are increasingly using planning algorithms to generate training data for learning systems.

The study also reinforces a broader lesson in artificial intelligence: larger amounts of data do not always lead to better learning. In some cases, carefully structured examples may be more valuable than large collections of noisy or inconsistent demonstrations.

The study was published in the journal IEEE Robotics and Automation Letters.
