RL Adventures with Garry

From SolidWorks CAD ➜ MuJoCo simulation ➜ TD3 training — chasing a stable bipedal gait on a custom-built robot.

Getting Garry into MuJoCo

I converted the SolidWorks assemblies to .urdf and then to MuJoCo XML. The conversion didn’t preserve the passive joints, so I exported two variants of the model, measured the offsets, and stitched the passive joints back in with <joint …/> constraints inside an <equality> block.

Dual-export trick for passive joint coordinates
Custom script to patch equality joints in XML
One-to-one sensor mapping for MuJoCo logs
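
A minimal sketch of what a patch step like this could look like, using only Python's standard library. It is not the actual script: the model snippet, the joint names (knee_active, knee_passive), the 0.12 offset, and the helper add_joint_equality are all placeholders for illustration. The only MuJoCo-specific part is the polycoef attribute, which constrains joint1 to a polynomial of joint2.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-joint model; a real export has many more bodies.
MODEL = """<mujoco model="demo">
  <worldbody>
    <body name="thigh">
      <joint name="knee_active" type="hinge"/>
      <body name="rod">
        <joint name="knee_passive" type="hinge"/>
      </body>
    </body>
  </worldbody>
</mujoco>"""


def add_joint_equality(xml_text, joint1, joint2, offset=0.0, ratio=1.0):
    """Return the XML with a joint equality constraint appended.

    polycoef "a0 a1 a2 a3 a4" ties joint1 to a quartic polynomial of
    joint2, so "offset ratio 0 0 0" means joint1 = offset + ratio * joint2
    (both measured relative to their reference positions).
    """
    root = ET.fromstring(xml_text)

    # Fail early if either joint name is not present in the model.
    names = {j.get("name") for j in root.iter("joint")}
    for name in (joint1, joint2):
        if name not in names:
            raise ValueError(f"joint {name!r} not found in model")

    # Reuse an existing <equality> block or create one at the top level.
    equality = root.find("equality")
    if equality is None:
        equality = ET.SubElement(root, "equality")

    ET.SubElement(equality, "joint", {
        "joint1": joint1,
        "joint2": joint2,
        "polycoef": f"{offset:g} {ratio:g} 0 0 0",
    })
    return ET.tostring(root, encoding="unicode")


# The offset would come from comparing the two exported variants.
patched = add_joint_equality(MODEL, "knee_passive", "knee_active", offset=0.12)
print(patched)
```

Keeping the patch as a pure function from XML text to XML text makes it easy to re-run after every fresh export from CAD.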

TD3 + Reward Shaping

A lightweight MuJoCo wrapper mimics OpenAI Gym’s step() API. Transitions are stored in a replay buffer, and the policy is trained on them with TD3. Most of the iteration happened in reward design:

Phase 1 — reward Δx only ⇒ learned to jump
Phase 2 — add upright penalty ⇒ cautious one-leg hop
Phase 3 — ankle joint → ball joint hack ⇒ baby-steps
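
The first two phases can be illustrated with a small reward function. This is a sketch under assumptions, not the reward actually used for Garry: the tilt measure (combined pitch and roll), the argument names, and the default weights are all placeholders.

```python
import math


def shaped_reward(dx, torso_pitch, torso_roll, w_forward=1.0, w_upright=0.5):
    """Forward progress minus a penalty for tilting away from vertical.

    dx: forward displacement of the torso over one control step (m).
    torso_pitch, torso_roll: torso tilt angles (rad).
    The weights are illustrative placeholders, not tuned values.
    """
    tilt = math.hypot(torso_pitch, torso_roll)
    return w_forward * dx - w_upright * tilt


# Phase 1: only forward progress counts, however the robot achieves it.
phase1 = shaped_reward(0.02, 0.3, 0.4, w_upright=0.0)

# Phase 2: the same step is penalised for leaning.
phase2 = shaped_reward(0.02, 0.3, 0.4)
```

With the upright weight at zero, nothing discourages lunging or jumping forward. Once the penalty is on, the same tilted step scores worse than standing still, which is consistent with the more cautious behaviour seen in Phase 2.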

After fine-tuning the weights, Garry produced a semi-successful walk in sim — but the real servos couldn’t handle torque control, so deployment awaits a new hardware build.
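
For readers unfamiliar with the training side, here is a rough plain-Python sketch of the two pieces named at the start of this section: a replay buffer and the TD3 learning target (clipped double-Q with target policy smoothing). The class and function names, the hyperparameter defaults, and the callable critics are assumptions for illustration. A real implementation would use neural networks and batched tensors.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)  # oldest entries drop out first

    def __len__(self):
        return len(self.storage)

    def add(self, state, action, reward, next_state, done):
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        return rng.sample(list(self.storage), batch_size)


def clip(x, lo, hi):
    return max(lo, min(hi, x))


def td3_target(reward, done, next_state, next_action, q1_target, q2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, max_action=1.0,
               rng=random):
    """TD3 target: r + gamma * (1 - done) * min(Q1', Q2')(s', a~).

    next_action is the target policy's action for next_state. a~ is that
    action plus clipped Gaussian noise, clipped again to the action range.
    q1_target and q2_target are callables (state, action) -> float.
    """
    smoothed = [
        clip(a + clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip),
             -max_action, max_action)
        for a in next_action
    ]
    # Taking the smaller of the two critics limits value overestimation.
    q_min = min(q1_target(next_state, smoothed),
                q2_target(next_state, smoothed))
    return reward + gamma * (1.0 - float(done)) * q_min


# Toy usage with constant critics standing in for the target networks.
buffer = ReplayBuffer(capacity=2)
for i in range(3):
    buffer.add([float(i)], [0.1], 1.0, [float(i + 1)], False)

state, action, reward, next_state, done = buffer.sample(1)[0]
target = td3_target(reward, done, next_state, [0.1],
                    q1_target=lambda s, a: 2.0,
                    q2_target=lambda s, a: 1.0,
                    gamma=0.5)
```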

Read the full build log →