Color Space History

I recently read about the history of color spaces, especially the experiments conducted in the 1920s by William David Wright and John Guild, in which observers matched light of a given wavelength using some combination of three fixed primary wavelengths. Since not every wavelength can be matched visually by a mixture of their primaries (nor by any other choice of three primaries), they allowed negative values as well, which meant that the primary was added to the reference color instead. That this would work out was not completely intuitive to me, so I wanted to dig into the math a bit.

The simplified model of human color vision operates as follows: The retina contains three types of cone cells, referred to as $\mathsf{L}$ (long-wavelength sensitive), $\mathsf{M}$ (medium-wavelength sensitive), and $\mathsf{S}$ (short-wavelength sensitive). When these cone cells are individually stimulated, they produce sensations of roughly red, green, or blue, respectively. Each cone type $t\in\{\mathsf{L},\mathsf{M},\mathsf{S}\}$ is associated with a spectral sensitivity function $s_t$, which maps the wavelength $\lambda$ to a sensitivity value $s_t(\lambda)\geq 0$ at that wavelength.

Any source of electromagnetic radiation (like visible light) comes with a spectral power distribution (SPD), which tells us how strongly each wavelength is present in the spectrum. There are more variables describing the physical reality of waves, like polarization, but these are not important for vision itself. For simplicity, we assume a large but finite set of wavelengths $\Lambda$ within the visible spectrum. The SPD is then a function $P$, associating each wavelength $\lambda\in\Lambda$ with the non-negative spectral irradiance value $P(\lambda)$. We write $P_\lambda$ for a monochromatic source emitting only wavelength $\lambda$ with intensity $1$. The sensory response of a cone type $t$ is then given by the sum $S_t(P):=\sum_\lambda P(\lambda)s_t(\lambda)$. This is a linear functional of $P$, so $S_t$ is uniquely determined by the values $s_t(\lambda)=S_t(P_\lambda)$.
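To make the sum $S_t(P)$ concrete, here is a minimal Python sketch. The Gaussian sensitivity curves, their peak wavelengths and their width are assumptions made up for illustration, not measured cone data, and the helper names (`sensitivity`, `response`, `monochromatic`, `mix`) are my own.

```python
import math

# ASSUMPTION: toy Gaussian sensitivities, not measured cone fundamentals.
PEAKS_NM = {"L": 565.0, "M": 535.0, "S": 445.0}  # rough, assumed peak wavelengths
WIDTH_NM = 40.0                                   # assumed common width


def sensitivity(t, lam):
    """Toy s_t(lambda) >= 0 for cone type t in {"L", "M", "S"}."""
    return math.exp(-(((lam - PEAKS_NM[t]) / WIDTH_NM) ** 2) / 2)


def response(t, spd):
    """S_t(P) = sum over lambda of P(lambda) * s_t(lambda).

    An SPD is a dict mapping a wavelength in nm to a non-negative power.
    """
    return sum(power * sensitivity(t, lam) for lam, power in spd.items())


def monochromatic(lam, intensity=1.0):
    """P_lambda: all power concentrated at a single wavelength."""
    return {lam: intensity}


def mix(p, q):
    """Pointwise sum of two SPDs (superposition of two lights)."""
    out = dict(p)
    for lam, power in q.items():
        out[lam] = out.get(lam, 0.0) + power
    return out


p = monochromatic(600.0, 2.0)
q = {500.0: 0.5, 650.0: 1.5}
for t in ("L", "M", "S"):
    # Linearity: the response to a mixture is the sum of the responses.
    print(t, response(t, mix(p, q)), response(t, p) + response(t, q))
```

The two printed numbers in each row agree up to rounding, which is just the linearity of $S_t$ in action.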

As a result, the retina’s cone response to any light stimulus can be represented as a triple of non-negative values, or put differently, the retina’s inner color space is three-dimensional. This property is known as trichromacy. Certain forms of color blindness are due to the presence of only two cone cell types, called dichromacy, and there is speculation that some human individuals are actually tetrachromats, possessing four distinct types of cone cells. Additionally, rod cells, which play a significant role under low-light conditions, have a broader sensitivity curve than cone cells. While rods could theoretically enhance color perception at certain brightness levels, in practice they do not contribute significantly to color vision.

So my questions about Wright and Guild’s experiments were:

Does this only work with monochromatic primary sources? And can they choose any primary sources they like?
Can they really measure the actual cone responses, or just build an isomorphic model?

Let $p_c$ for $c\in\{\mathsf{R},\mathsf{G},\mathsf{B}\}$ denote the SPDs of the three primary sources (or channels), tuned to wavelengths of ~700 nm, ~546 nm and ~435 nm respectively. In the test, the observer is shown a monochromatic light source of wavelength $\lambda$ and tries to find values $m_\mathsf{R}(\lambda)$, $m_\mathsf{G}(\lambda)$, $m_\mathsf{B}(\lambda)$, which can be negative as well, such that for each cone type $t\in\{\mathsf{L},\mathsf{M},\mathsf{S}\}$ we have:

$$S_t\Big(P_\lambda+\sum_c\max(0,-m_c(\lambda))\,p_c\Big)=S_t\Big(\sum_c\max(0,m_c(\lambda))\,p_c\Big)$$

The trick is to allow negative values in the SPD as well. Then $S_t$ becomes a linear functional on $\rr^\Lambda$, and the equality above holds if and only if it holds after subtracting $\sum_c\max(0,-m_c(\lambda))\,p_c$ on both sides:

$$S_t(P_\lambda)=S_t\Big(\sum_c\max(0,m_c(\lambda))\,p_c-\sum_c\max(0,-m_c(\lambda))\,p_c\Big)=S_t\Big(\sum_c m_c(\lambda)\,p_c\Big)=\sum_c m_c(\lambda)\,S_t(p_c)$$

Combining the three functionals $S_\mathsf{L},S_\mathsf{M},S_\mathsf{S}$ into a single linear map

$$S:\rr^\Lambda\to\rr^3,\quad P\mapsto \big(S_\mathsf{L}(P),S_\mathsf{M}(P),S_\mathsf{S}(P)\big),$$

we find that each monochromatic stimulus satisfies $S(P_\lambda)=\sum_c m_c(\lambda) S(p_c)$. In other words, every cone response vector $\mathbf{s}(\lambda):=S(P_\lambda)=(s_\mathsf{L}(\lambda),s_\mathsf{M}(\lambda),s_\mathsf{S}(\lambda))$ can be expressed as a linear combination of the three vectors $S(p_c)$, and by linearity, so can every $S(P)$. If these vectors are linearly independent in $\rr^3$, then for every wavelength $\lambda$ there exists a unique triple $\mathbf{m}(\lambda)=(m_\mathsf{R}(\lambda),m_\mathsf{G}(\lambda),m_\mathsf{B}(\lambda))$. Also, we never made use of the $p_c$ being monochromatic in the experiment, so in theory, any choice of three primary sources inducing linearly independent responses in cone space will do.
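The matching condition $S(P_\lambda)=\sum_c m_c(\lambda)S(p_c)$ is a 3×3 linear system, which the following sketch solves. It uses made-up Gaussian sensitivity curves (an assumption, not real cone data) together with monochromatic primaries at the three wavelengths mentioned above. The function names are hypothetical, and the resulting numbers only illustrate the mechanism, not the actual color matching functions.

```python
import math

# ASSUMPTION: toy Gaussian cone sensitivities, not measured data.
PEAKS_NM = {"L": 565.0, "M": 535.0, "S": 445.0}
WIDTH_NM = 40.0
CONES = ("L", "M", "S")
PRIMARIES_NM = (700.0, 546.0, 435.0)  # monochromatic R, G, B primaries


def cone_vector(lam):
    """S(P_lambda) as [L, M, S] in the toy model."""
    return [math.exp(-(((lam - PEAKS_NM[t]) / WIDTH_NM) ** 2) / 2) for t in CONES]


def det3(a):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def solve3(a, b):
    """Solve a @ x = b for a 3x3 matrix using Cramer's rule."""
    d = det3(a)
    if d == 0:
        raise ValueError("primaries are linearly dependent in cone space")
    solution = []
    for j in range(3):
        # Replace column j of a by the right-hand side b.
        replaced = [[b[i] if k == j else a[i][k] for k in range(3)] for i in range(3)]
        solution.append(det3(replaced) / d)
    return solution


def match(lam):
    """Return [m_R, m_G, m_B] with sum_c m_c * S(p_c) = S(P_lambda)."""
    columns = [cone_vector(p) for p in PRIMARIES_NM]           # S(p_R), S(p_G), S(p_B)
    a = [[columns[c][i] for c in range(3)] for i in range(3)]  # row i = cone type i
    return solve3(a, cone_vector(lam))


m = match(500.0)
print(m)  # in this toy model the red coefficient comes out negative
```

A negative coefficient means that the corresponding primary has to be added to the test light rather than to the mixture. The large magnitude of the red coefficient here is an artifact of the toy curves, whose tails at 700 nm are far too small.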

The measured outcomes of Wright and Guild’s experiments do not reveal the absolute cone sensitivities $\mathbf{s}(\lambda)=S(P_\lambda)$ directly. Instead, they describe these responses in the coordinate system constructed from the vectors $S(p_c)$, the so-called RGB color space.

In order to express all colors using only non-negative coefficients, a linear isomorphism $F:\rr^3\to\rr^3$ was chosen so that all transformed vectors $F(\mathbf{m}(\lambda))$ lie in the non-negative orthant $[0,\infty)^3$. Such an $F$ must exist: the map $\mathbf{m}\mapsto\sum_c m_c S(p_c)$ sends $\mathbf{m}(\lambda)$ to the cone response $S(P_\lambda)$, whose components $s_t(\lambda)$ are non-negative. The particular change of basis that was standardized leads to the CIE 1931 XYZ color space.

Because perception of luminance is largely relative, color science often ignores the magnitude of vectors and focuses on chromaticity, typically by setting $x:=X/(X+Y+Z)$ and $y:=Y/(X+Y+Z)$. Together, these two coordinates define the chromaticity plane. For example, orange and brown describe the same chromaticity, but how the brain interprets it depends a lot on context. An orange square in dim light and a brown square under bright light might elicit the same response on the retina but get parsed totally differently by the brain.
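As a small sketch of the projection onto the chromaticity plane: the function name `chromaticity` and the XYZ values below are made up for illustration and do not come from any measurement.

```python
def chromaticity(X, Y, Z):
    """Project an XYZ tristimulus value onto the chromaticity plane."""
    total = X + Y + Z
    if total <= 0:
        raise ValueError("chromaticity is undefined for a zero stimulus")
    return X / total, Y / total


# Hypothetical XYZ values: the same stimulus at two intensity levels.
bright = (0.40, 0.30, 0.05)
dim = tuple(0.1 * v for v in bright)

print(chromaticity(*bright))
print(chromaticity(*dim))  # same (x, y) up to rounding, only the magnitude differs
```

Scaling all three components by the same factor cancels out in the quotient, which is exactly why chromaticity discards the magnitude.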

If we trace out the locus $\{F(\mathbf{m}(\lambda)):\lambda\in\Lambda\}$, we obtain a curve in color space that starts and ends near the origin, reflecting the weak cone responses to wavelengths at the edges of the visible spectrum. When projected onto the chromaticity plane, this curve forms the spectral locus, the characteristic sail-shaped arc shown in the famous plot below. The resulting chromaticity diagram is the convex hull of this locus, obtained by connecting its endpoints with a straight segment (the line of purples, shown as a dashed line) and including all points within. It represents all chromaticities producible by physical mixtures of light:

This also explains why displays using three fixed primaries can never reproduce the entire visible gamut: their possible mixtures form the convex hull of the three primaries in chromaticity space, a triangle that always fits inside the sail. Adding more primaries expands this triangle to a larger polygon, but a polygon with finitely many vertices can never fully cover the curved spectral locus.
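Whether a chromaticity lies inside such a triangle can be checked with barycentric coordinates. The sketch below uses the sRGB primaries as an example display, which is my choice for illustration and not something the experiments above depend on. The $xy$ value for 520 nm is an approximate figure from the CIE 1931 tables.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of point p with respect to the triangle a, b, c."""
    (x, y), (xa, ya), (xb, yb), (xc, yc) = p, a, b, c
    det = (yb - yc) * (xa - xc) + (xc - xb) * (ya - yc)
    wa = ((yb - yc) * (x - xc) + (xc - xb) * (y - yc)) / det
    wb = ((yc - ya) * (x - xc) + (xa - xc) * (y - yc)) / det
    return wa, wb, 1.0 - wa - wb


def in_gamut(p, primaries):
    """True if chromaticity p is a mixture of the three primaries."""
    return all(w >= 0.0 for w in barycentric(p, *primaries))


# xy chromaticities of the sRGB primaries (example display, my assumption).
SRGB = ((0.64, 0.33), (0.30, 0.60), (0.15, 0.06))
D65_WHITE = (0.3127, 0.3290)
SPECTRAL_520NM = (0.0743, 0.8338)  # approximate point on the spectral locus

print(in_gamut(D65_WHITE, SRGB))       # white lies inside the triangle
print(in_gamut(SPECTRAL_520NM, SRGB))  # a saturated spectral green does not
```

A negative weight plays the same role as a negative value in the matching experiment: the point can only be reached by subtracting one of the primaries, which a display cannot do.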

Everything above describes the range of possible stimuli produced by light sources. The situation is more complex for materials, because the same surface can look different under different illumination spectra. In principle, if the reflectances of two materials matched at all wavelengths except one, the difference would still be visible under monochromatic light of exactly that wavelength.

Conversely, two light sources with SPDs $P_1$ and $P_2$ are metamers if and only if $S(P_1)=S(P_2)$, meaning that they evoke the same cone response and hence look identical to the human eye, even when $P_1(\lambda)\neq P_2(\lambda)$ for most $\lambda$. But when we look not at the light itself but at materials illuminated by it, the two sources can sometimes be told apart easily. For example, red and green light together is perceived as yellow, just like the light from low-pressure sodium-vapor lamps. The latter is almost monochromatic, however, so every material illuminated by such a lamp reflects the same wavelength (at varying intensity) back to the eye, and the scene looks like a black-and-white photo with a yellow tint. Materials illuminated by a mixture of red and green light, on the other hand, can look red, yellow, green or anything in between.
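The sodium lamp effect can be reproduced in a toy model. In the sketch below, the Gaussian cone curves, the two step-shaped reflectance functions and the chosen wavelengths are all assumptions for illustration. In particular, the red and green mixture is not tuned to be an exact metamer of the sodium line.

```python
import math

# ASSUMPTION: toy Gaussian cone sensitivities, not measured data.
PEAKS_NM = {"L": 565.0, "M": 535.0, "S": 445.0}
WIDTH_NM = 40.0
CONES = ("L", "M", "S")


def sensitivity(t, lam):
    return math.exp(-(((lam - PEAKS_NM[t]) / WIDTH_NM) ** 2) / 2)


def cone_response(spd):
    """S(P) as [L, M, S] for an SPD given as {wavelength_nm: power}."""
    return [sum(p * sensitivity(t, lam) for lam, p in spd.items()) for t in CONES]


def reflect(spd, reflectance):
    """SPD of the light a material sends back: P(lambda) * R(lambda)."""
    return {lam: p * reflectance(lam) for lam, p in spd.items()}


def shares(v):
    """Normalize a cone response so that only the ratios remain."""
    total = sum(v)
    return [x / total for x in v]


def reddish(lam):   # hypothetical material reflecting long wavelengths
    return 0.8 if lam >= 580 else 0.1


def greenish(lam):  # hypothetical material reflecting medium wavelengths
    return 0.7 if 500 <= lam < 580 else 0.1


SODIUM = {589.0: 1.0}                 # idealized monochromatic lamp
RED_GREEN = {650.0: 1.0, 540.0: 1.0}  # two-line mixture

for name, light in (("sodium", SODIUM), ("red+green", RED_GREEN)):
    for material in (reddish, greenish):
        print(name, material.__name__, shares(cone_response(reflect(light, material))))
```

Under the sodium line, both materials yield the same cone ratios and differ only in brightness. Under the two-line mixture, the ratios differ, so the materials can be told apart by hue.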

Beyond this, there are so-called impossible colors, points in cone space which no physical light source can ever produce. These typically correspond to strong activation of the M-cones with very little activation of the other cone types, since the sensitivity curve of the M-cones overlaps everywhere with that of at least one other cone type. On the other hand, light at the ends of the visible spectrum can activate L- or S-cones fairly well in isolation. Recently, an experiment allowed participants to see such impossible colors by stimulating isolated cone cells using lasers, bypassing the usual optical constraints.

One last concept I want to bring up is color temperature. I always wondered how the two-dimensional space of chromaticities can be reduced to a single number. The idea comes from the SPD that materials emit when heated, like the filament of a glowing light bulb. For most materials, this SPD closely follows black-body radiation, a special SPD parametrized by temperature $T$, following Planck’s law with $c_1$ and $c_2$ being the two radiation constants:

$$B(\lambda,T)=\frac{c_1}{\lambda^5\left(\exp\left(c_2/(\lambda T)\right)-1\right)}$$

This is a rather broad SPD, producing reddish hues at low temperatures (where most of its energy lies in the infrared spectrum), then whitish hues around 6000 K (the surface of the sun is around 5800 K) and bluish hues at higher temperatures. Plotting the chromaticities of black-body spectra at different temperatures produces the Planckian locus

$$\{(x(B(\cdot,T)),y(B(\cdot,T))):T>0\},$$

a curve running roughly through the center of the chromaticity diagram. The color temperature of any light source (strictly speaking, its correlated color temperature) is defined as the temperature of the point on the Planckian locus closest to the chromaticity of that source, where in practice the distance is measured in the CIE 1960 $uv$ diagram rather than in $xy$ itself. It therefore mostly makes sense for reddish, whitish or bluish hues and not so much for green or purple hues, which sit further away from the locus.
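Planck’s law is easy to evaluate numerically. The following sketch computes $B(\lambda,T)$ from the SI values of the physical constants and locates the peak wavelength on a 1 nm grid. The function names and the grid bounds are my own choices. Taking $c_1=2\pi hc^2$ is also an assumption, since other conventions differ by a constant factor, which scales the curve but does not change its shape or chromaticity.

```python
import math

H = 6.62607015e-34   # Planck constant in J s
C = 299792458.0      # speed of light in m/s
K_B = 1.380649e-23   # Boltzmann constant in J/K

C1 = 2.0 * math.pi * H * C**2  # first radiation constant (assumed convention)
C2 = H * C / K_B               # second radiation constant, about 1.4388e-2 m K


def planck(lam_m, temp_k):
    """B(lambda, T) for a wavelength in meters and a temperature in kelvin."""
    x = C2 / (lam_m * temp_k)
    if x > 700.0:
        return 0.0  # exp(x) would overflow; the true value is negligibly small
    return C1 / (lam_m**5 * math.expm1(x))


def peak_wavelength_nm(temp_k, lo_nm=100, hi_nm=3000):
    """Wavelength (on a 1 nm grid) at which the black-body SPD is largest."""
    return max(range(lo_nm, hi_nm + 1), key=lambda nm: planck(nm * 1e-9, temp_k))


for temp in (2000.0, 5800.0, 10000.0):
    ratio = planck(450e-9, temp) / planck(700e-9, temp)
    print(temp, peak_wavelength_nm(temp), ratio)  # blue-to-red ratio grows with T
```

At 2000 K the peak lies in the infrared and the visible part is dominated by red, while at 10000 K the peak lies in the ultraviolet and blue dominates, which matches the progression of hues described above.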

All of this only scratches the surface of color science. The CIE 1931 space remains foundational, but later work has gone further, from perceptually uniform spaces like the Oklab color space, where geometric distance more closely reflects perceived difference, to appearance models that account for adaptation and context.