RL 101 - Lesson 6 - Autoencoders & Latent Space
05 Nov 2025
What if we could learn a compact description of data with no labels at all? Autoencoders do exactly that: an encoder compresses input to a tiny bottleneck, and a decoder reconstructs the original from that bottleneck alone.
Encoder–decoder structure
\[z = f_\phi(x), \qquad \hat x = g_\theta(z), \qquad \mathcal{L} = \|x - \hat x\|^2.\]

The network minimizes reconstruction error end to end. The 2-D bottleneck $z$ is forced to carry the most information-dense summary of $x$; anything redundant gets discarded.
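The three equations can be sketched in a few lines of numpy. This is a deliberately minimal version: the encoder and decoder here are single linear maps with untrained random weights (just to show the shapes and the loss), whereas the lesson's actual network adds hidden layers and ReLUs.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.random(64)                       # one 64-D input

# Untrained, purely linear encoder/decoder weights, for illustration only
W_enc = 0.1 * rng.normal(size=(64, 2))
W_dec = 0.1 * rng.normal(size=(2, 64))

z = x @ W_enc                            # z = f_phi(x): compress 64-D -> 2-D
x_hat = z @ W_dec                        # x_hat = g_theta(z): reconstruct 64-D
loss = float(np.sum((x - x_hat) ** 2))   # L = ||x - x_hat||^2
```

Training would adjust `W_enc` and `W_dec` jointly to drive `loss` down over the whole dataset.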
What the latent space reveals
Once trained, we can scatter the 2-D latent codes $z$ colored by class label. A good autoencoder will cluster similar inputs nearby in latent space without ever seeing class labels during training — the structure emerges purely from reconstruction pressure.
Architecture
Input(64) → Dense(32) → ReLU → Dense(2) ← encoder
Dense(2) → Dense(32) → ReLU → Dense(64) ← decoder
The full network is treated as one flat sequence of layers during backprop — the encoder and decoder aren’t trained separately; gradients flow through the bottleneck in a single pass.
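To make "one flat sequence of layers, gradients through the bottleneck in a single pass" concrete, here is a hand-rolled numpy training step for the 64 → 32 → 2 → 32 → 64 architecture above. The random uniform data, initialization scale, and learning rate are assumptions for the sketch; a real framework's autograd would compute the same chain of gradients automatically.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in batch of 64-D inputs (hypothetical; the demo's dataset isn't shown here)
X = rng.random((256, 64))

# Parameters for 64 -> 32 -> 2 -> 32 -> 64
W1, b1 = 0.1 * rng.normal(size=(64, 32)), np.zeros(32)   # encoder Dense(32)
W2, b2 = 0.1 * rng.normal(size=(32, 2)), np.zeros(2)     # encoder Dense(2)
W3, b3 = 0.1 * rng.normal(size=(2, 32)), np.zeros(32)    # decoder Dense(32)
W4, b4 = 0.1 * rng.normal(size=(32, 64)), np.zeros(64)   # decoder Dense(64)

relu = lambda a: np.maximum(a, 0.0)

def step(X, lr=0.5):
    """One forward/backward pass through the whole stack; no separate phases."""
    n = len(X)
    # Forward: encoder, then decoder, straight through the bottleneck
    h1 = relu(X @ W1 + b1)
    z = h1 @ W2 + b2                 # 2-D bottleneck
    h2 = relu(z @ W3 + b3)
    x_hat = h2 @ W4 + b4
    loss = np.mean((X - x_hat) ** 2)
    # Backward: a single gradient chain, decoder -> bottleneck -> encoder
    d = 2.0 * (x_hat - X) / (n * 64)
    gW4, gb4 = h2.T @ d, d.sum(0)
    d = (d @ W4.T) * (h2 > 0)        # ReLU mask
    gW3, gb3 = z.T @ d, d.sum(0)
    d = d @ W3.T                     # gradient flows through the bottleneck
    gW2, gb2 = h1.T @ d, d.sum(0)
    d = (d @ W2.T) * (h1 > 0)
    gW1, gb1 = X.T @ d, d.sum(0)
    # In-place SGD update on every parameter at once
    for p, g in [(W1, gW1), (b1, gb1), (W2, gW2), (b2, gb2),
                 (W3, gW3), (b3, gb3), (W4, gW4), (b4, gb4)]:
        p -= lr * g
    return loss

losses = [step(X) for _ in range(200)]
```

Note that no layer is frozen and no phase is skipped: every update touches encoder and decoder weights together, driven by the same reconstruction loss.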
Live demo
The left panel plots reconstruction quality (input vs output side-by-side). The right panel is the 2-D latent scatter plot — watch clusters form as training progresses.
Key takeaways
- Autoencoders are the simplest form of representation learning — useful for compression, anomaly detection, and pre-training.
- A 2-D bottleneck is small enough to visualize directly and drastic enough to force meaningful compression.
- The latent geometry reflects training data structure: well-separated clusters indicate the network has discovered discriminative features despite having no supervision signal.
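The anomaly-detection use case from the takeaways can be demonstrated without training a neural network at all: the optimal *linear* autoencoder with a 2-D bottleneck is PCA, so projecting onto the top two principal directions and reconstructing gives a reconstruction-error anomaly score. The synthetic data below (64-D points lying near a 2-D subspace) is an assumed stand-in, not the lesson's dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "normal" data: 64-D points near a 2-D subspace, plus small noise
V = rng.normal(size=(2, 64))
X = rng.normal(size=(500, 2)) @ V + 0.01 * rng.normal(size=(500, 64))

# Linear autoencoder via PCA: encode = project onto top-2 principal
# directions, decode = project back
mu = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
W = Vt[:2]                            # (2, 64): rows are principal directions

def recon_error(x):
    z = (x - mu) @ W.T                # encode: 64-D -> 2-D
    x_hat = z @ W + mu                # decode: 2-D -> 64-D
    return float(np.mean((x - x_hat) ** 2))

in_dist = max(recon_error(x) for x in X)   # worst score on normal data
outlier = rng.normal(size=64)              # a point far from the 2-D manifold
anomaly = recon_error(outlier)             # much larger reconstruction error
```

A point the model can't reconstruct well doesn't fit the structure the bottleneck learned, which is exactly the signal anomaly detection thresholds on.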
Next up — Lesson 7: we use the same idea of latent coordinates to teach a network to paint an image pixel by pixel.