NVIDIA AI Robots Learn To Install Graphics Cards Without Human Help
The system is a closed feedback loop with four parts. An Environment module resets the scene, runs safety checks, and verifies each result. A Policy Improvement module writes and refines the control code by learning from reward signals, camera footage, run-time traces, and whatever went wrong. A Rollout module runs the physical trials, often across several robots at once, and logs everything. An Evolution module then compares agent branches, keeping code that works and scrapping what flops. The loop works much like continuous testing on a software team, except the trials play out on physical robot arms.
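To make the four-part loop concrete, here is a toy sketch in Python. It is not NVIDIA's code: every name in it (`Branch`, `verify`, `rollout`, `improve`, `evolve`, `run_loop`) is hypothetical, and a single number called `gain` stands in for the control code an agent would actually write. It only illustrates the shape of the cycle: propose a variant, try it, verify the outcome, keep the best branches.

```python
import random
from dataclasses import dataclass


@dataclass
class Branch:
    """Hypothetical stand-in for one agent's branch of control code."""
    name: str
    gain: float          # toy proxy for how good the control code is
    score: float = 0.0   # success rate measured in the latest rollout


def verify(outcome: float, threshold: float = 0.5) -> bool:
    """Environment role: decide whether a single trial succeeded."""
    return outcome >= threshold


def rollout(branch: Branch, rng: random.Random, trials: int = 20) -> float:
    """Rollout role: run noisy trials and return the observed success rate."""
    successes = 0
    for _ in range(trials):
        outcome = branch.gain + rng.uniform(-0.2, 0.2)  # simulated trial
        if verify(outcome):
            successes += 1
    return successes / trials


def improve(branch: Branch, rng: random.Random) -> Branch:
    """Policy Improvement role: propose a modified copy of a branch."""
    return Branch(branch.name + "'", branch.gain + rng.uniform(-0.1, 0.2))


def evolve(branches: list[Branch], keep: int) -> list[Branch]:
    """Evolution role: keep the best-scoring branches, drop the rest."""
    return sorted(branches, key=lambda b: b.score, reverse=True)[:keep]


def run_loop(generations: int = 5, seed: int = 0) -> list[Branch]:
    rng = random.Random(seed)
    population = [Branch("a", 0.3), Branch("b", 0.4)]
    for _ in range(generations):
        candidates = population + [improve(b, rng) for b in population]
        for b in candidates:
            b.score = rollout(b, rng)   # every candidate is re-tested
        population = evolve(candidates, keep=2)
    return population


if __name__ == "__main__":
    for b in run_loop():
        print(b.name, round(b.gain, 2), b.score)
```

In the real system the trials are physical and the "improve" step is a coding agent reading logs and footage, so each stage is far more expensive than this simulation suggests.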
To find the best performer, researchers set three coding agents loose: Codex on GPT-5.5, Claude Code on Opus 4.7, and Kimi Code on Kimi K2.6. Each pitched ideas, tested them on hardware, and kept only what improved. The result was striking. On fiddly, high-precision tasks, the trained robots hit a 99% success rate under pass@8, a metric that allows up to eight attempts per subtask, with each retry shaped by the last failure. That gauges genuine recovery, not luck.
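As a rough illustration of how a pass@8-style number could be tallied, the sketch below counts a subtask as passed if any of its first eight attempts succeeded. The function name `pass_at_k` and the sample logs are invented for the example, and the exact scoring procedure NVIDIA used is not given here. This also differs from the sampling-based pass@k estimator common in code-generation benchmarks, since the attempts described above are sequential retries rather than independent samples.

```python
def pass_at_k(attempts_per_subtask: list[list[bool]], k: int = 8) -> float:
    """Fraction of subtasks with at least one success in the first k attempts.

    Each inner list is the ordered record of attempts for one subtask,
    True meaning that attempt succeeded.
    """
    if not attempts_per_subtask:
        raise ValueError("need at least one subtask")
    passed = sum(1 for attempts in attempts_per_subtask if any(attempts[:k]))
    return passed / len(attempts_per_subtask)


# Made-up logs for four subtasks (illustrative only, not real results).
logs = [
    [True],                   # succeeded on the first try
    [False, False, True],     # recovered on the third try
    [False] * 8,              # never succeeded within eight tries
    [False] * 8 + [True],     # succeeded on try nine, too late for k=8
]

print(pass_at_k(logs, k=8))   # 0.5
print(pass_at_k(logs, k=1))   # 0.25
```

The gap between the k=1 and k=8 figures is what the metric is meant to expose: how often a retry turns a failure into a success.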
The jobs were no joke. Robots sorted tiny pins into a box, cut zip ties with real cutters, and seated expansion cards and graphics cards directly into motherboard slots. Anyone who has fought to get a stubborn card into a PCIe slot knows that last one earns respect.
Fleet size shifts the math. The team tested groups of one, four, and eight agents running in parallel. A single agent took close to five hours to crack a task. Eight cut that to about two. To measure the tradeoffs, NVIDIA introduced two metrics, Mean Robot Utilization and Mean Token Utilization.
There is a catch. Larger teams reach a working policy much faster, yet token use balloons, since more agents spend more time reading logs, summarizing peers' branches, and coordinating. Robot utilization also slips when the models stall on debugging or wait for inference, leaving costly hardware idle. Speed buys results and burns tokens doing it.
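The tradeoff can be put in back-of-the-envelope terms using the rough figures quoted above: about five hours for one agent, about two for eight. The helper below is a hypothetical sketch using the standard definitions of speedup and parallel efficiency. It is not a reconstruction of NVIDIA's Mean Robot Utilization or Mean Token Utilization, whose formulas are not given here.

```python
def parallel_summary(t_single_h: float, t_parallel_h: float, n_agents: int):
    """Return (speedup, parallel efficiency, total agent-hours)."""
    speedup = t_single_h / t_parallel_h      # how much faster wall-clock
    efficiency = speedup / n_agents          # speedup per agent added
    agent_hours = t_parallel_h * n_agents    # total agent time consumed
    return speedup, efficiency, agent_hours


# Approximate figures from the article: ~5 h solo, ~2 h with eight agents.
speedup, efficiency, agent_hours = parallel_summary(5.0, 2.0, 8)

print(f"speedup: {speedup:.2f}x")          # speedup: 2.50x
print(f"efficiency: {efficiency:.0%}")     # efficiency: 31%
print(f"agent-hours: {agent_hours:.0f}")   # agent-hours: 16
```

On these rounded numbers, eight agents finish about 2.5 times sooner but consume roughly 16 agent-hours against 5 for the solo run, which is consistent with the ballooning token use described above.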
Jim Fan, who co-leads the GEAR lab, framed the project as enabling "AutoResearch in the physical world for the first time." NVIDIA plans to open-source it, which could let universities, startups, and hobbyists build their own self-improving robot labs.
The project fits NVIDIA's past year of pushing physical AI hard, including the robotics agenda it laid out ahead of GTC 2026. According to NVIDIA's research, the bottleneck was never the robots. It was us, flesh-and-blood humans.
