RL 101 - Lesson 7 - Neural Painter
10 Nov 2025
What if a neural network were the image? Instead of classifying or compressing pictures, we train a network $f_\theta(x, y) \to (R, G, B)$ that maps pixel coordinates directly to colors. This is called an implicit neural representation.
The spectral bias problem
Vanilla MLPs with ReLU activations have a known weakness: they learn low-frequency components first and struggle to represent fine detail. A network fit to a checkerboard pattern will initially output a blurry gray blob.
The fix is Fourier feature encoding: lift the 2-D coordinate $(x, y)$ into a higher-dimensional space using sinusoids at multiple frequencies before feeding it to the network:
\[\gamma(x, y) = \bigl[x,\; y,\; \sin(2\pi k_1 x),\; \cos(2\pi k_1 x),\; \ldots,\; \sin(2\pi k_m y),\; \cos(2\pi k_m y)\bigr].\]

This effectively gives the network a basis for representing high-frequency spatial patterns, bypassing the spectral bias.
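A minimal NumPy sketch of the encoding. The frequency schedule $k_i = i$ for $i = 1, \ldots, m$ is an assumption; the lesson leaves it unspecified:

```python
import numpy as np

def fourier_features(coords, m=10):
    """Lift (x, y) coords in [0, 1]^2 to [x, y, sin/cos at m frequencies per axis].

    coords: array of shape (N, 2).
    Frequencies k_i = 1..m are an assumption, not pinned down by the post.
    """
    ks = np.arange(1, m + 1)                      # k_1 .. k_m
    angles = 2 * np.pi * coords[:, :, None] * ks  # shape (N, 2, m)
    return np.concatenate(
        [coords,
         np.sin(angles).reshape(len(coords), -1),
         np.cos(angles).reshape(len(coords), -1)],
        axis=1)                                   # shape (N, 2 + 4*m)

xy = np.array([[0.25, 0.5]])
print(fourier_features(xy).shape)  # (1, 42) for m = 10
```

With $m = 10$ frequencies per axis the 2-D coordinate becomes a 42-dimensional feature: the two raw coordinates plus a sin/cos pair for each of 10 frequencies on each axis.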
Architecture
FourierFeatures(10 frequencies) → [2 → 42 dim]
→ Dense(48) → ReLU → Dense(48) → ReLU → Dense(3) → Sigmoid
The encoder maps the 2-D coordinate to 42 dimensions: the raw $(x, y)$ pair plus sin/cos pairs at 10 frequencies per axis, $2 + 2 \cdot 2 \cdot 10 = 42$.
The output is an $(R, G, B)$ triple in $[0, 1]$. We minimize per-pixel MSE against the 48×48 target image.
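The pipeline can be sketched end to end in NumPy. The He-style initialization, integer frequency schedule, and random placeholder target are assumptions for illustration; the post does not specify them:

```python
import numpy as np

rng = np.random.default_rng(0)

M = 10              # frequencies per axis
IN = 2 + 4 * M      # 42-dim Fourier features
H = 48              # hidden width from the post

# He-style random init -- a sketch only; the post doesn't specify the init.
W1 = rng.normal(0, np.sqrt(2 / IN), (IN, H)); b1 = np.zeros(H)
W2 = rng.normal(0, np.sqrt(2 / H), (H, H));   b2 = np.zeros(H)
W3 = rng.normal(0, np.sqrt(2 / H), (H, 3));   b3 = np.zeros(3)

def encode(coords):
    """Fourier features with assumed frequencies k_i = 1..M."""
    ks = np.arange(1, M + 1)
    ang = 2 * np.pi * coords[:, :, None] * ks
    return np.concatenate([coords,
                           np.sin(ang).reshape(len(coords), -1),
                           np.cos(ang).reshape(len(coords), -1)], axis=1)

def paint(coords):
    """Map coords (N, 2) in [0, 1]^2 to RGB values in [0, 1]."""
    h = np.maximum(encode(coords) @ W1 + b1, 0)  # Dense(48) + ReLU
    h = np.maximum(h @ W2 + b2, 0)               # Dense(48) + ReLU
    return 1 / (1 + np.exp(-(h @ W3 + b3)))      # Dense(3) + Sigmoid

# A 48x48 grid of pixel-center coordinates, matching the target resolution.
g = (np.arange(48) + 0.5) / 48
xy = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)
target = rng.uniform(size=(48 * 48, 3))          # placeholder target image
pred = paint(xy)
mse = np.mean((pred - target) ** 2)              # per-pixel MSE loss
print(pred.shape, float(mse))
```

Training would minimize `mse` over $\theta = (W_1, b_1, W_2, b_2, W_3, b_3)$ with gradient descent (omitted here); in practice you would use an autodiff framework rather than hand-derived gradients. Because `paint` accepts any coordinates in $[0, 1]^2$, the same network can be queried on a finer grid than it was trained on.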
Live demo
The left panel is the target; the right panel is the network’s current reconstruction. Watch color blobs sharpen into recognizable shapes as training steps accumulate.
Key takeaways
- Implicit neural representations are a powerful alternative to discrete grid-based image storage — they scale to arbitrary resolution and support smooth interpolation.
- Fourier feature encoding is a simple but highly effective technique to overcome spectral bias in coordinate networks.
- The same idea scales to 3-D (NeRF — Neural Radiance Fields) for view synthesis of real scenes.
Next up — Lesson 8: we bridge supervised learning and sequential decision-making with Deep Q-Networks.