CV Lab 4 - Color Clustering and Segmentation
20 Apr 2026
Not every computer vision task needs a labelled dataset. Unsupervised methods cluster or group pixels based on their raw values — no annotation required. K-means colour clustering is one of the oldest and most widely used such methods: it partitions an image’s pixels into $k$ groups so that each pixel is assigned to the group whose mean colour is closest to it. The result is a palette-quantised image and, implicitly, a coarse segmentation.
The k-means algorithm
Given a set of pixels $\{p_1, \ldots, p_N\} \subset \mathbb{R}^3$ (each pixel is an RGB triple) and a target number of clusters $k$:
1. Initialise $k$ centroids $\{\mu_1, \ldots, \mu_k\}$ by sampling $k$ distinct random pixels.
2. Assign each pixel to its nearest centroid: $a_i = \arg\min_j \|p_i - \mu_j\|^2$.
3. Update each centroid to the mean of its assigned pixels: $\mu_j = \frac{1}{|C_j|}\sum_{i \in C_j} p_i$.
4. Repeat steps 2–3 until convergence (no reassignments) or a fixed number of iterations.
The objective minimised is the within-cluster sum of squared distances:
\[J = \sum_{j=1}^{k} \sum_{i \in C_j} \|p_i - \mu_j\|^2\]
K-means is guaranteed to converge (since $J$ decreases or stays constant at each step) but not to find the global minimum — different initialisations produce different results.
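The steps above can be sketched in NumPy. This is a minimal illustrative implementation, not the demo's actual code; it assumes the pixels arrive as an $N \times 3$ array, and the `kmeans` name and its parameters are this sketch's own:

```python
import numpy as np

def kmeans(pixels, k, iters=20, seed=0):
    """Lloyd's algorithm on an N x 3 array of RGB pixels.

    Returns the k x 3 centroid array and the per-pixel assignments.
    """
    rng = np.random.default_rng(seed)
    # Step 1: initialise centroids from k distinct random pixels.
    idx = rng.choice(len(pixels), size=k, replace=False)
    centroids = pixels[idx].astype(float)
    assign = None
    for _ in range(iters):
        # Step 2: assign each pixel to its nearest centroid
        # (squared Euclidean distance in RGB space).
        d = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_assign = d.argmin(axis=1)
        if assign is not None and np.array_equal(new_assign, assign):
            break  # converged: no reassignments
        assign = new_assign
        # Step 3: move each centroid to the mean of its assigned pixels.
        for j in range(k):
            members = pixels[assign == j]
            if len(members):  # leave an empty cluster's centroid in place
                centroids[j] = members.mean(axis=0)
    return centroids, assign
```

Because each initial centroid is an actual pixel, every cluster starts non-empty (a centroid is at distance zero from itself), which sidesteps the most common degenerate case.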
Colour quantisation vs. segmentation
This demo performs colour quantisation: every pixel in the output is replaced by its centroid’s colour, reducing the image to exactly $k$ distinct colours. The same idea underlies GIF and other palette-based image formats, though in practice those formats typically use faster quantisation algorithms than k-means.
When objects in an image happen to have distinct colours (a red car on a grey road, a green apple on a white table), colour quantisation also produces a rough segmentation — each segment corresponds to a cluster. This breaks down when objects share colours or when lighting creates gradients across a uniform surface.
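In code, quantisation and segmentation really are the same operation: the per-pixel labels are the segmentation, and indexing a palette with them gives the quantised image. A small sketch, where `palette` and `labels` are hypothetical stand-ins for k-means centroids and assignments:

```python
import numpy as np

# Hypothetical output of k-means with k = 3: a palette of centroid
# colours and an H x W map of per-pixel cluster labels.
palette = np.array([[200, 40, 40],   # reddish centroid
                    [40, 200, 40],   # greenish centroid
                    [40, 40, 200]],  # bluish centroid
                   dtype=np.uint8)
labels = np.array([[0, 0, 2],
                   [1, 1, 2]])  # the segmentation: values in 0..k-1

# Fancy indexing replaces every label with its centroid's colour,
# producing an H x W x 3 image that uses exactly k distinct colours.
quantised = palette[labels]
```

Reading `labels` directly gives the segment map; reading `palette[labels]` gives the quantised image.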
Choosing k
| k | Effect |
|---|---|
| 2–3 | Coarse foreground/background split, heavy colour loss. |
| 4–8 | Recognisable image with a distinct palette. |
| 8–16 | Most structural detail preserved; subtle colours may merge. |
| > 16 | Near-lossless for natural images. |
There is no universally correct $k$ — it depends on the image and the task. Methods like the elbow criterion (plot $J$ vs $k$ and look for the kink) or the silhouette score can help choose automatically.
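The elbow criterion can be scripted by re-running k-means over a range of $k$ values and recording $J$ for each. A sketch under the same assumptions as before (an $N \times 3$ pixel array; `inertia` is an illustrative helper, not a library function):

```python
import numpy as np

def inertia(pixels, k, iters=15, seed=0):
    """Run k-means and return J, the within-cluster sum of squares."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(pixels), size=k, replace=False)
    centroids = pixels[idx].astype(float)
    for _ in range(iters):
        # Assignment step, then update step (Lloyd's algorithm).
        d = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            members = pixels[assign == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    # Measure J against the final centroids.
    d = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float(d.min(axis=1).sum())

# Elbow plot: J drops steeply until k matches the number of natural
# colour groups in the image, then flattens.
# js = [inertia(pixels, k) for k in range(1, 9)]
```

Because initialisation is random, a careful elbow plot would average $J$ over several seeds per $k$ rather than trusting a single run.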
Live demo
Things to try
- Set $k = 2$ and observe which colours end up on each side of the boundary.
- Increase $k$ until the gradient background is faithfully reproduced — note how the palette grows.
- Click “New image” to get a different scene and run k-means again — the result depends on what colours dominate.
- Run k-means twice on the same image with the same $k$ — do you always get the same palette? (Probably not, because random initialisation varies.)
Key takeaways
- K-means partitions pixels into $k$ groups by minimising within-cluster colour variance.
- It is unsupervised: no labels, no training set, no gradient descent.
- Colour quantisation and coarse segmentation are two faces of the same operation.
- The result is sensitive to initialisation — running k-means several times and keeping the lowest-$J$ result is standard practice.