Instant-NGP Mobile NeRF Demos
Technical Details
The original MobileNeRF paper represents a scene as a fixed-topology triangle mesh whose per-face textures store an 8-D learned feature vector plus a binary opacity mask. At render time the mesh is rasterised with a conventional Z-buffer, and a tiny, view-dependent MLP embedded in the fragment shader converts those features into final pixel colours. Training happens in three stages: (1) continuous vertex/feature optimisation, (2) opacity binarisation with a straight-through estimator, and (3) baking everything into an OBJ + texture atlas, so the scene ships as OBJ + PNGs + a 2-layer MLP.
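For readers unfamiliar with MobileNeRF's deferred shading, here is a minimal PyTorch sketch of the kind of tiny view-dependent MLP that gets baked into the fragment shader. The class name, layer widths, and activations are illustrative assumptions on my part, not the exact configuration from the paper or from my scripts.

```python
import torch
import torch.nn as nn

class DeferredShadingMLP(nn.Module):
    """Tiny view-dependent MLP applied per pixel after rasterisation.

    Takes the 8-D feature sampled from the texture atlas plus the viewing
    direction and outputs an RGB colour. Hidden widths here are illustrative;
    the real network is small enough to run inside a WebGL fragment shader.
    """

    def __init__(self, feat_dim: int = 8, hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),  # RGB in [0, 1]
        )

    def forward(self, features: torch.Tensor, view_dir: torch.Tensor) -> torch.Tensor:
        # features: (N, 8) baked per-pixel features, view_dir: (N, 3) unit vectors
        return self.net(torch.cat([features, view_dir], dim=-1))
```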
- Instant-NGP hash encoding. Swapped MobileNeRF’s plain MLPs for the multiresolution hash grid from ashawkey’s torch-ngp. The hash encoder delivers high-frequency detail with far fewer parameters, and tiny-cuda-nn kernels give a 6–8× speed-up in the first optimisation stage (see the first sketch after this list).
- CUDA / PyTorch rewrite. The Google prototype is written in JAX; I re-implemented every stage with PyTorch + tiny-cuda-nn so it trains on commodity NVIDIA GPUs without TPU/JAX tooling.
- Opacity-first ray marcher. Instant-NGP is density-based by default, whereas MobileNeRF expects alpha values. I patched the ray marcher and loss to predict opacity directly, avoiding an extra σ→α conversion and keeping baked textures compatible with the WebGL viewer (see the compositing sketch after this list).
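To make the first bullet concrete, here is roughly how a hash-grid encoder plugs in through the tiny-cuda-nn PyTorch bindings (`tinycudann`). The hyper-parameters are typical Instant-NGP defaults, and the 8-D output and the `feature_field` name are my assumptions, not necessarily the settings in my training scripts.

```python
import torch
import tinycudann as tcnn

# Multiresolution hash-grid encoder followed by a small fused MLP, in the
# style of Instant-NGP / torch-ngp. Config values are common defaults.
feature_field = tcnn.NetworkWithInputEncoding(
    n_input_dims=3,   # xyz position
    n_output_dims=8,  # MobileNeRF-style 8-D feature vector (assumption)
    encoding_config={
        "otype": "HashGrid",
        "n_levels": 16,
        "n_features_per_level": 2,
        "log2_hashmap_size": 19,
        "base_resolution": 16,
        "per_level_scale": 1.5,
    },
    network_config={
        "otype": "FullyFusedMLP",
        "activation": "ReLU",
        "output_activation": "None",
        "n_neurons": 64,
        "n_hidden_layers": 2,
    },
)

xyz = torch.rand(4096, 3, device="cuda")  # positions normalised to [0, 1]^3
features = feature_field(xyz)             # (4096, 8) learned features
```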
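And a minimal sketch of the opacity-first compositing from the last bullet, assuming the field already outputs per-sample alpha in [0, 1]; the function and tensor names are hypothetical, not the actual code in the fork.

```python
import torch

def composite_alpha(alphas: torch.Tensor, rgbs: torch.Tensor) -> torch.Tensor:
    """Front-to-back compositing when the network predicts opacity directly.

    alphas: (n_rays, n_samples) values already in [0, 1]
    rgbs:   (n_rays, n_samples, 3)

    A density-based field would first need alpha = 1 - exp(-sigma * delta)
    per sample; predicting alpha skips that conversion, so the values baked
    into the texture atlas are exactly what the WebGL viewer expects.
    """
    # Transmittance: probability the ray reaches sample i unoccluded.
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alphas[:, :1]), 1.0 - alphas + 1e-10], dim=-1),
        dim=-1,
    )[:, :-1]
    weights = alphas * trans                            # (n_rays, n_samples)
    return (weights.unsqueeze(-1) * rgbs).sum(dim=1)    # (n_rays, 3)
```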
Result → a full MobileNeRF trains from scratch in ≈ 5 h on a single consumer-grade GPU (≈ 2 h continuous stage, 2 h binarised stage, 1 h bake).
I have forked the torch-ngp repo and implemented my training scripts there. Source code is below!
https://github.com/wwangg22/torch-ngp/