
RL Adventures with Garry

From SolidWorks CAD ➜ MuJoCo simulation ➜ TD3 training — chasing a stable bipedal gait on a custom-built robot.

Garry robot model overlay

Getting Garry into MuJoCo

SolidWorks assemblies were converted to .urdf and then to MuJoCo XML. The conversion didn’t preserve passive joints, so I exported two variants of the model, measured the offsets between them, and stitched the passive joints back in with <equality joint …/> tags.

  • Dual-export trick for passive joint coordinates
  • Custom script to patch equality joints in XML
  • One-to-one sensor mapping for MuJoCo logs
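
The patching step could look roughly like the sketch below. It is not the original script: the joint names, the offset value, and the choice to store the measured offset as the constant term of MuJoCo's polycoef attribute are all assumptions made for illustration. It uses only Python's standard xml.etree.ElementTree module to append <joint/> constraints to the model's <equality> section.

```python
import xml.etree.ElementTree as ET


def add_equality_joints(mjcf_xml, couplings):
    """Return MJCF text with one <equality><joint/> constraint per coupling.

    couplings is a list of (passive_joint, driving_joint, offset, ratio).
    All joint names are hypothetical placeholders. The constraint is written
    as polycoef = "offset ratio 0 0 0", i.e. a linear coupling (assumption).
    """
    root = ET.fromstring(mjcf_xml)

    # Reuse the existing <equality> section if the model already has one.
    equality = root.find("equality")
    if equality is None:
        equality = ET.SubElement(root, "equality")

    for passive, driver, offset, ratio in couplings:
        ET.SubElement(
            equality,
            "joint",
            {
                "joint1": passive,
                "joint2": driver,
                "polycoef": f"{offset:g} {ratio:g} 0 0 0",
            },
        )

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    model = "<mujoco><worldbody/></mujoco>"
    print(add_equality_joints(model, [("knee_passive", "knee_drive", 0.12, 1.0)]))
```

In MuJoCo, polycoef holds the five coefficients of a quartic polynomial relating joint1 to joint2, measured relative to the joints' reference positions, so a pure offset-plus-ratio coupling only needs the first two terms.
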
CAD to MuJoCo
TD3 training graph

TD3 + Reward Shaping

A lightweight MuJoCo wrapper mimics OpenAI Gym’s step() API. Transitions are stored in a replay buffer, and the policy is trained on them with TD3. Most of the iteration went into reward design:

  • Phase 1 — reward Δx only ⇒ learned to jump
  • Phase 2 — add upright penalty ⇒ cautious one-leg hop
  • Phase 3 — ankle joint → ball joint hack ⇒ baby-steps
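
The wrapper and the first two reward phases can be sketched as follows. This is a minimal illustration, not the project's code: the sim object and its methods (reset, apply, torso_x, torso_tilt, observation, fallen), the weight values, and the 1 - cos(tilt) form of the upright penalty are all hypothetical. Setting the upright weight to zero gives the Phase 1 reward (Δx only); a positive weight gives a Phase 2 style reward.

```python
import math
from dataclasses import dataclass


@dataclass
class RewardWeights:
    forward: float = 1.0  # weight on forward progress (delta x)
    upright: float = 0.5  # weight on the tilt penalty (hypothetical value)


def shaped_reward(x_prev, x_now, torso_tilt_rad, w):
    """Forward progress minus a penalty that grows as the torso tilts."""
    dx = x_now - x_prev
    tilt_penalty = 1.0 - math.cos(torso_tilt_rad)  # 0 when perfectly upright
    return w.forward * dx - w.upright * tilt_penalty


class GaitEnv:
    """Gym-style wrapper around a simulator object (hypothetical interface)."""

    def __init__(self, sim, weights):
        self.sim = sim
        self.w = weights
        self._x_prev = 0.0

    def reset(self):
        self.sim.reset()
        self._x_prev = self.sim.torso_x()
        return self.sim.observation()

    def step(self, action):
        # Advance the simulation, then score the change in forward position.
        self.sim.apply(action)
        x_now = self.sim.torso_x()
        reward = shaped_reward(self._x_prev, x_now, self.sim.torso_tilt(), self.w)
        self._x_prev = x_now
        done = self.sim.fallen()
        # Classic Gym convention: (observation, reward, done, info).
        return self.sim.observation(), reward, done, {}
```

Each (observation, action, reward, next observation, done) tuple returned by step() is what would be pushed into the replay buffer for TD3 to sample from.
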

After fine-tuning the weights, Garry produced a semi-successful walk in sim — but the real servos couldn’t handle torque control, so deployment awaits a new hardware build.

Semi-successful walk