RL Adventures with Garry
From SolidWorks CAD ➜ MuJoCo simulation ➜ TD3 training — chasing a stable bipedal gait on a custom-built robot.

Getting Garry into MuJoCo
SolidWorks assemblies were converted to .urdf and then to MuJoCo XML. The conversion didn’t preserve passive joints, so I exported two variants of the model, measured the offsets between them, and stitched the joints back together with <joint …/> constraints inside MuJoCo’s <equality> section.
- Dual-export trick for passive joint coordinates
- Custom script to patch equality joints in XML
- One-to-one sensor mapping for MuJoCo logs
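The patching step can be sketched roughly as follows. This is a minimal illustration, not the original script: the model snippet, the joint names (`knee_drive`, `knee_passive`), and the 0.15 rad offset are all hypothetical placeholders. It uses only Python's standard `xml.etree.ElementTree` and MuJoCo's `<equality>` / `<joint>` elements with the `joint1`, `joint2`, and `polycoef` attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the converted MuJoCo XML; all names are made up.
MJCF = """
<mujoco model="garry">
  <worldbody>
    <body name="thigh">
      <joint name="knee_drive" type="hinge" axis="0 1 0"/>
      <geom type="capsule" size="0.02" fromto="0 0 0 0 0 -0.2"/>
      <body name="link" pos="0 0 -0.2">
        <joint name="knee_passive" type="hinge" axis="0 1 0"/>
        <geom type="capsule" size="0.02" fromto="0 0 0 0 0 -0.2"/>
      </body>
    </body>
  </worldbody>
</mujoco>
"""


def add_joint_couplings(root, couplings):
    """Append <joint> equality constraints, creating <equality> if missing.

    couplings: iterable of (passive_joint, driving_joint, offset) tuples.
    """
    known = {j.get("name") for j in root.iter("joint")}
    equality = root.find("equality")
    if equality is None:
        equality = ET.SubElement(root, "equality")
    for passive, driver, offset in couplings:
        if passive not in known or driver not in known:
            raise ValueError(f"unknown joint in coupling: {passive}, {driver}")
        ET.SubElement(equality, "joint", {
            "joint1": passive,
            "joint2": driver,
            # polycoef = "a0 a1 a2 a3 a4": joint1 follows a0 + a1*joint2 + ...
            # (measured relative to the joints' reference positions).
            "polycoef": f"{offset:g} 1 0 0 0",
        })
    return root


root = ET.fromstring(MJCF)
add_joint_couplings(root, [("knee_passive", "knee_drive", 0.15)])
patched_xml = ET.tostring(root, encoding="unicode")
```

Here the measured offset goes into the constant term of `polycoef` and the linear term is fixed at 1, which assumes the passive joint simply tracks the driven one; a real linkage may need different coefficients.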


TD3 + Reward Shaping
A lightweight MuJoCo wrapper mimics OpenAI Gym’s step() API. Transitions are stored in a replay buffer, and the policy is optimised with TD3. Most of the iteration happened in reward design:
- Phase 1 — reward Δx only ⇒ learned to jump
- Phase 2 — add upright penalty ⇒ cautious one-leg hop
- Phase 3 — ankle joint → ball joint hack ⇒ baby-steps
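To make the first two phases concrete, here is a minimal sketch of a shaped reward that combines forward progress with an upright penalty. The exact formula used for Garry isn't given above, so the function name, the tilt measure (pitch and roll in radians), and the weights `w_forward` and `w_upright` are assumptions for illustration only.

```python
import math


def shaped_reward(x_prev, x_now, torso_pitch, torso_roll,
                  w_forward=1.0, w_upright=0.5):
    """Forward progress minus a penalty for tilting away from vertical."""
    forward = x_now - x_prev                    # Phase 1 term: delta x per step
    tilt = math.hypot(torso_pitch, torso_roll)  # Phase 2 term: tilt in radians
    return w_forward * forward - w_upright * tilt


# Same forward progress, with and without a 0.3 rad forward lean.
r_upright = shaped_reward(0.00, 0.02, torso_pitch=0.0, torso_roll=0.0)
r_tilted = shaped_reward(0.00, 0.02, torso_pitch=0.3, torso_roll=0.0)
```

With only the forward term, nothing discourages a leap that ends in a fall; raising `w_upright` trades speed for stability, which is one plausible reading of why Phase 2 produced a more cautious gait.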
After fine-tuning the reward weights, Garry produced a semi-successful walk in sim, but the real servos couldn’t handle torque control, so deployment awaits a new hardware build.
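For reference, the replay buffer and the two TD3-specific pieces of the critic update (target policy smoothing and the clipped double-Q target) can be sketched in plain Python. This is a generic illustration of the algorithm, not the training code used for Garry; the class and function names and the hyperparameter defaults (`noise_std`, `noise_clip`, `gamma`) are assumptions.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size FIFO store of (state, action, reward, next_state, done)."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement.
        return random.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


def smoothed_target_action(target_action, noise_std=0.2, noise_clip=0.5,
                           max_action=1.0):
    """Target policy smoothing: add clipped Gaussian noise, then clip to bounds."""
    smoothed = []
    for a in target_action:
        noise = max(-noise_clip, min(noise_clip, random.gauss(0.0, noise_std)))
        smoothed.append(max(-max_action, min(max_action, a + noise)))
    return smoothed


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller target-critic value."""
    return reward + gamma * (1.0 - float(done)) * min(q1_next, q2_next)


buffer = ReplayBuffer(capacity=1000)
for t in range(10):
    buffer.add([float(t)], [0.1], 0.01 * t, [float(t + 1)], t == 9)
batch = buffer.sample(4)

noisy_action = smoothed_target_action([0.9, -0.9])
y = td3_target(reward=1.0, done=False, q1_next=2.0, q2_next=3.0)
y_terminal = td3_target(reward=1.0, done=True, q1_next=2.0, q2_next=3.0)
```

In a full implementation the two `q*_next` values would come from the target critics evaluated at the smoothed action, and the actor would be updated less often than the critics; both are omitted here to keep the sketch short.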
