79

In music, when two or more pitches are played at the same time, they form a chord. If each pitch has a corresponding wave frequency (a pure, or fundamental, tone), the pitches played together make a superposition waveform, which is obtained by simple addition. This wave is no longer a pure sinusoidal wave.

For example, when you play a low note and a high note on a piano, the resulting sound has a wave that is the mathematical sum of the waves of each note. The same is true for light: when you shine a 500 nm wavelength (green light) and a 700 nm wavelength (red light) at the same spot on a white surface, the reflection will be a superposition waveform that is the sum of green and red.

My question is about our perception of these combinations. When we hear a chord on a piano, we’re able to discern the pitches that comprise that chord. We’re able to “pick out” that there are two (or three, etc) notes in the chord, and some of us who are musically inclined are even able to sing back each note, and even name it. It could be said that we’re able to decompose a Fourier Series of sound.

But it seems we cannot do this with light. When you shine green and red light together, the reflection appears to be yellow, a “pure hue” of 600 nm, rather than an overlay of red and green. We can’t “pick out” the individual colors that were combined. Why is this?

Why can’t we see two hues of light in the same way we’re able to hear two pitches of sound? Is this a characteristic of human psychology? Animal physiology? Or is this due to a fundamental characteristic of electromagnetism?

Qmechanic
  • 220,844
chharvey
  • 878

5 Answers

77

This is because of the physiological differences in the functioning of the cochlea (for hearing) and the retina (for color perception).

The cochlea separates out a single channel of complex audio signals into their component frequencies and produces an output signal that represents that decomposition.
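As a rough illustration of that kind of decomposition (a sketch only, not a model of the cochlea), the Python below takes a plain discrete Fourier transform of two summed sine tones and reads back the two strongest frequencies. The tone frequencies (220 Hz and 330 Hz), the sample rate, and all function names are assumptions chosen for the example.

```python
import math

def tone_sum(freqs_hz, sample_rate, n_samples):
    """Sample the sum of unit-amplitude sine tones (simple superposition)."""
    return [
        sum(math.sin(2.0 * math.pi * f * n / sample_rate) for f in freqs_hz)
        for n in range(n_samples)
    ]

def dft_magnitudes(signal):
    """Magnitude of each DFT bin from 0 up to (not including) the Nyquist bin."""
    n = len(signal)
    mags = []
    for k in range(n // 2):
        re = sum(x * math.cos(2.0 * math.pi * k * i / n) for i, x in enumerate(signal))
        im = -sum(x * math.sin(2.0 * math.pi * k * i / n) for i, x in enumerate(signal))
        mags.append(math.hypot(re, im) / n)
    return mags

def dominant_frequencies(signal, sample_rate, count):
    """Return the `count` strongest frequencies in Hz, sorted ascending."""
    mags = dft_magnitudes(signal)
    strongest = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:count]
    return sorted(k * sample_rate / len(signal) for k in strongest)

# Two tones a fifth apart; 400 samples at 2000 Hz gives 5 Hz per bin,
# so both tones fall exactly on a bin.
chord = tone_sum([220.0, 330.0], sample_rate=2000, n_samples=400)
print(dominant_frequencies(chord, 2000, 2))  # [220.0, 330.0]
```

The summed waveform is a single channel, yet both component frequencies can be recovered from it, which is the sense in which the ear "picks out" the notes of a chord.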

The retina instead exhibits what is called metamerism: only three sensor types (roughly R/G/B) encode an output signal that represents the entire spectrum of possible colors as variable combinations of those three levels, so physically different spectra can produce the same signal and therefore the same perceived color.
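To make the three-sensor idea concrete, here is a minimal sketch under a deliberately crude assumption: each cone type is modeled as a Gaussian sensitivity curve. The peak wavelengths and widths in `CONES`, the choice of 540 nm, 580 nm and 620 nm lines, and the function names are all hypothetical; real cone sensitivities are broader and asymmetric. The point is only that a green-plus-red mixture can be tuned to give nearly the same three numbers as a single yellow line.

```python
import math

# Hypothetical cone model: name -> (peak wavelength in nm, Gaussian width in nm).
CONES = {"S": (440.0, 25.0), "M": (535.0, 40.0), "L": (565.0, 40.0)}

def sensitivity(cone, wavelength_nm):
    """Toy sensitivity of one cone type to monochromatic light."""
    peak, width = CONES[cone]
    return math.exp(-((wavelength_nm - peak) ** 2) / (2.0 * width ** 2))

def cone_response(spectrum):
    """Cone responses to a list of (wavelength_nm, intensity) spectral lines."""
    return {cone: sum(i * sensitivity(cone, w) for w, i in spectrum) for cone in CONES}

def match_mixture(target_nm, green_nm, red_nm):
    """Intensities of a green and a red line that reproduce the M and L
    responses of a unit-intensity line at target_nm (2x2 system, Cramer's rule)."""
    a, b = sensitivity("M", green_nm), sensitivity("M", red_nm)
    c, d = sensitivity("L", green_nm), sensitivity("L", red_nm)
    m_t, l_t = sensitivity("M", target_nm), sensitivity("L", target_nm)
    det = a * d - b * c
    return (m_t * d - b * l_t) / det, (a * l_t - m_t * c) / det

yellow = cone_response([(580.0, 1.0)])
g, r = match_mixture(580.0, 540.0, 620.0)
mixture = cone_response([(540.0, g), (620.0, r)])
print(yellow)
print(mixture)
```

In this toy model the M and L responses match by construction, and the S response is negligible for both inputs, so the two spectra are indistinguishable to the three sensors even though one contains a single wavelength and the other contains two.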

niels nielsen
  • 99,024
44

Our sensory organs for light and sound work quite differently on a physiological level. The eardrum directly reacts to pressure waves while the photoreceptors on the retina are only sensitive to a narrow range around the frequencies associated with red, green and blue. All light frequencies in between partly excite these receptors and the impression of seeing for example yellow arises due to the green and red receptors being excited with certain relative intensities. That's why you can fake out the color spectrum with only 3 different colors at each pixel of a display.

Seeing color in this sense is also more of a useful illusion than direct sensing of physical properties. Mixing colors in the middle of the visible spectrum retains a good approximation of the average frequency of the light mix. If colors from the edges of the spectrum are mixed, i.e. red and blue, the brain invents the color purple or pink to make sense of that sensory input. This however doesn't correspond to the average of the frequencies (which would result in a greenish color) nor does it correspond to any physical frequency of light. Same goes for seeing white or any shade of grey, as these correspond to all receptors being activated with equal intensity.

Mammal eyes also evolved in a way to distinguish intensity rather than color, since most mammals are nocturnal creatures. But I'm not sure if the ability to see in color was only established recently; that would be a question for a biologist.

Halbeard
  • 754
21

This is due mostly to physiology. There is a fundamental difference in the way we perceive sound vs. light: for sound we can sense the actual waveform, whereas for light we can sense only the intensity. To elaborate:

  • Sound waves entering your ear cause synchronous vibrations in your cochlea. Different regions of the cochlea have tiny hairs which vibrate in a frequency-selective way. The vibrations of these hairs are turned into electrical signals which are passed on to the brain. Due to the frequency selectivity of the hairs, the cochlea essentially performs a Fourier transform, which is why we can perceive superpositions of waves.
  • Light has such a high frequency that almost nothing can resolve the actual waveform (even state-of-the-art electronics cannot do this). All we can effectively measure is the intensity of the light, and this is all that the eyes can perceive as well. Knowing the intensity of a light beam is not sufficient to determine its spectral content. E.g. a superposition of two monochromatic waves can have the same intensity as a pure monochromatic wave of a different frequency.

    We can differentiate superpositions of light in a limited way, due to the fact that eyes perceive three separate color channels (roughly RGB). This is why we can distinguish equal intensities of red and blue light. People with colorblindness have a defective receptor, and so color combinations that most humans can distinguish appear identical to them.

    Not all colors that we perceive correspond to a color of a monochromatic light wave. Famously, there is an entire "line of purples" which do not represent any monochromatic light wave. So people trained in distinguishing purple colors can actually differentiate superpositions of light waves in a limited way.

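The intensity argument above can be checked numerically. This is a minimal sketch that assumes an idealized detector reporting only the time-averaged squared amplitude; the cycle counts, amplitudes, and the function name `mean_intensity` are made up for the example and are not tied to real optical frequencies.

```python
import math

def mean_intensity(components, n_samples=1000):
    """Time-averaged squared amplitude of a sum of sine waves.

    components: list of (amplitude, whole number of cycles in the window).
    """
    total = 0.0
    for n in range(n_samples):
        t = n / n_samples
        field = sum(a * math.sin(2.0 * math.pi * cycles * t) for a, cycles in components)
        total += field ** 2
    return total / n_samples

# Two unit-amplitude waves at different frequencies...
two_waves = mean_intensity([(1.0, 5), (1.0, 7)])
# ...versus one wave at a third frequency with amplitude sqrt(2).
one_wave = mean_intensity([(math.sqrt(2.0), 6)])
print(two_waves, one_wave)
```

Both averages come out equal (up to rounding), so a detector that sees only this one number cannot tell the two-frequency superposition from the single wave. Adding more detector types with different spectral sensitivities, as the three cone types do, recovers some but not all of that lost information.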

Yly
  • 3,729
16

Rod (1 type) plus cone (3 types) neurons in the eye give you the potential for 4-D sensation. Since the rod signal is nearly redundant to the totality of cone signals, this is effectively a 3-D sensation.

Cochlear (roughly 3500 "types" simply due to 3500 different inner hair positions) neurons in the ear give you the potential for 3500-D sensation, so trained ears can potentially recognize the simultaneous amplitudes of thousands of frequencies.

So, to answer your question, eyes simply didn't evolve to have many cone types. An improvement, however, is seen through the eyes of mantis shrimp (with the potential for 16-D sensation). Notice the trade-off between spatial image resolution and color perception (and that audio spatial resolution was less important in evolution, and more difficult due to the longer wavelength).

bobuhito
  • 1,046
6

The hairs form a 1D array along the frequency axis, while rods and cones form a spatial 2D array. In addition, that 2D array has 4 channels (rods and 3 types of cones). So the 2 ears have poor spatial resolution, while the eyes have poor frequency resolution.

You could imagine an eye with many more types of cones, giving you a better frequency resolution. However, that would mean that the cones for a single color would be spaced further apart, limiting spatial resolution. In the end, that's an evolutionary trade-off. Physics tells us you can't have both at the same time, but biology is why we end up with this particular outcome.

MSalters
  • 5,649