Neuroscience 6e Web Topic 13.1 - Music

Even though everyone recognizes music when they hear it, music is difficult to define. The Oxford English Dictionary defines it as “The art or science of combining vocal or instrumental sounds with a view toward beauty or coherence of form and expression of emotion.” Music obviously entails temporal aspects such as rhythm that are closely tied to motor behavior (think of foot-tapping and dancing) and emotional aspects that are augmented by lyrics and cultural preferences. In terms of the present chapter, however, music chiefly concerns the aspect of human audition that is experienced as tones. The stimuli that give rise to tonal percepts are periodic, meaning that they repeat systematically over time, like the sine wave in Figure 13.1 of the textbook. However, natural periodic stimuli do not occur as sine waves, but rather as complex repetitions involving several different frequencies; such stimuli give rise to a sense of harmony when sounded relatively close together in appropriate combinations, and they generate a sense of melody when they occur sequentially.

Although we usually take for granted the way tone-evoking stimuli are heard, this aspect of audition presents some profound puzzles. The most obvious of these is that humans perceive periodic stimuli whose fundamental frequencies have a 2:1 ratio as highly similar and, for the most part, musically interchangeable. These 2:1 intervals are called octaves. Thus, in Western musical terminology, any two tones related by an interval of one or more octaves are given the same name (i.e., the notes A, B, C, … G) and are distinguished only by a qualifier that denotes their relative ordinal position (e.g., C1, C2, C3). A key question, then, is why periodic sound stimuli whose fundamental frequencies have a 2:1 ratio are perceived as similar.
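As a rough illustration of what octave equivalence means numerically, the following Python sketch moves any frequency by whole octaves (repeated doubling or halving) into a single reference octave. The function name `fold_into_octave` and the 220 Hz reference are assumptions chosen for illustration; they do not come from the text.

```python
import math

def fold_into_octave(freq_hz, ref_hz=220.0):
    """Return freq_hz shifted by whole octaves into [ref_hz, 2 * ref_hz)."""
    if freq_hz <= 0 or ref_hz <= 0:
        raise ValueError("frequencies must be positive")
    # Number of whole octaves (factors of 2) separating freq_hz from ref_hz.
    octaves_above_ref = math.floor(math.log2(freq_hz / ref_hz))
    return freq_hz / (2.0 ** octaves_above_ref)

# Frequencies related by 2:1 ratios fold to the same value;
# 660 Hz (a 3:1 ratio to the reference) does not.
for f in (110.0, 220.0, 440.0, 880.0, 660.0):
    print(f, "->", fold_into_octave(f))
```

In this sketch, 110, 220, 440, and 880 Hz all fold to the same position in the reference octave, which mirrors the shared note name they receive, whereas 660 Hz lands elsewhere in the octave.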

A second puzzling feature is that most, if not all, musical traditions subdivide octaves into a relatively small set of intervals for composition and performance, each interval being defined by its relationship to the lowest tone of the set. Such sets are called musical scales. The scales predominantly employed in all cultures over the centuries have used some (or occasionally all) of the 12 tonal intervals that in Western musical terminology are referred to as the chromatic scale (Figure 1). But certain intervals of the chromatic scale—such as the fifth, fourth, major third, and major sixth—are used more often than others in composition and performance. These form the majority of the intervals employed in the pentatonic and diatonic major scales, the two most frequently used scales in music worldwide. Again, there is no principled explanation for these preferences among all the possible intervals within the octave that humans can discriminate (~240 over an octave in the middle range of human hearing).
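The claim that the fifth, fourth, major third, and major sixth make up the majority of the intervals in the major pentatonic and major diatonic scales can be checked with a small Python sketch. The semitone positions used here are the standard Western ones and are an assumption brought in for illustration; the text itself does not list them, and the names `NAMED_INTERVALS` and `share_of_named` are hypothetical.

```python
# Semitone distance above the lowest tone for the four intervals named in the text
# (standard Western positions within the 12-step chromatic scale; assumed, not from the text).
NAMED_INTERVALS = {"major third": 4, "fourth": 5, "fifth": 7, "major sixth": 9}

# Scales written as semitone steps above the lowest tone (0 is the lowest tone itself).
MAJOR_PENTATONIC = [0, 2, 4, 7, 9]
MAJOR_DIATONIC = [0, 2, 4, 5, 7, 9, 11]

def share_of_named(scale):
    """Return (count of named intervals in the scale, count of all intervals above the lowest tone)."""
    steps = [s for s in scale if s != 0]
    hits = [s for s in steps if s in NAMED_INTERVALS.values()]
    return len(hits), len(steps)

print("pentatonic:", share_of_named(MAJOR_PENTATONIC))
print("diatonic:  ", share_of_named(MAJOR_DIATONIC))
```

Under these assumptions, three of the four intervals in the major pentatonic scale and four of the six in the major diatonic scale belong to the named set.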

Figure 1 Ten of the 12 tones in the chromatic scale, related to a piano keyboard. The function above the keyboard indicates that these tones correspond statistically to peaks of power in normalized human speech. (After Schwartz et al., 2003.)

Perhaps the most fundamental question in music—and arguably the “common denominator” of all musical tonality—is why certain combinations of tones are perceived as relatively consonant, or “harmonious,” and others as relatively dissonant, or “inharmonious.” These perceived differences among the possible combinations of tones making up the chromatic scale are the basis for polytonal music in which the perception of relative harmony guides the composition of chords and melodic lines. The more compatible of these combinations are typically used to convey “resolution” at the end of a musical phrase or piece, whereas less compatible combinations are used to indicate a transition or a lack of resolution, or to introduce a sense of tension in a chord or melodic sequence. As with octaves and scales, the reason for this phenomenology remains a mystery.

The classic approaches to rationalizing octaves, scales, and consonance have been based on the fact that the musical intervals corresponding to octaves, fifths, and fourths (in modern musical terminology) are produced by physical sources whose relative proportions (e.g., the relative lengths of two plucked strings or their fundamental frequencies) have ratios of 2:1, 3:2, or 4:3, respectively; these relationships were first described by Pythagoras. This coincidence of numerical simplicity and perceptual effect has been so impressive over the centuries that attempts to rationalize phenomena such as consonance and scale structure in terms of mathematical relationships have tended to dominate thinking about these issues. This conceptual framework, however, fails to account for many perceptual observations. Most famously, it does not explain why people hear the pitch of the fundamental frequency of stimuli comprising only upper harmonics (called “hearing the missing fundamental”). Nor does it explain why, when the frequencies of a set of harmonics are changed by a constant value such that they lack a common divisor, the pitch heard corresponds to neither the fundamental frequency nor the frequency spacing between the harmonics (called the “pitch shift of the residue”).
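The two observations above can be made concrete with a short Python sketch. The harmonic frequencies (600, 800, and 1000 Hz) and the 30 Hz shift are hypothetical values chosen for illustration, and the helper names are assumptions, not part of the text. The sketch only computes the two quantities a purely arithmetic account would offer as the pitch; it does not model what listeners actually hear.

```python
from functools import reduce
from math import gcd

def common_divisor(freqs_hz):
    """Largest frequency (in whole Hz) that divides every component evenly."""
    return reduce(gcd, freqs_hz)

def spacing(freqs_hz):
    """Spacing between adjacent components, assuming they are equally spaced."""
    ordered = sorted(freqs_hz)
    return ordered[1] - ordered[0]

# Upper harmonics only: the 200 Hz fundamental is physically absent.
harmonics = [600, 800, 1000]
# The same components, each shifted upward by a constant 30 Hz.
shifted = [f + 30 for f in harmonics]

print("harmonic:", common_divisor(harmonics), "Hz divisor,", spacing(harmonics), "Hz spacing")
print("shifted: ", common_divisor(shifted), "Hz divisor,", spacing(shifted), "Hz spacing")
```

For the unshifted set, the common divisor and the spacing agree at 200 Hz, the missing fundamental. After the shift, the spacing is still 200 Hz but the largest common divisor collapses to 10 Hz, so the two arithmetic candidates disagree; according to the text, the pitch listeners report matches neither.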

It seems likely that a better way to consider all these musical issues is in terms of the biological rationale for evolving a sense of tonality in the first place rather than by pondering mathematical or other abstract relationships. Since the auditory system evolved in the world of natural sounds, it is presumably important that the majority of periodic sounds humans were exposed to over evolutionary time were those made by the human vocal tract in the process of communication: initially prelinguistic vocalizations and, more recently, speech sounds (see Chapter 33 of the textbook). Developing a sense of tonality would enable listeners to respond not only to the distinctions among the different speech sounds that are important for understanding spoken language, but also to information about the probable sex, age, and emotional state of the speaker. It may thus be that music reflects the advantage of facilitating a listener’s ability to glean the linguistic intent and biological state of fellow humans through vocal utterances.

In keeping with this general idea, Michael Lewicki and his colleagues have argued that both music and speech are based on the demands of processing vocal sounds compared with the demands of processing nonvocal environmental sounds. Other recent work has shown that scale preferences can be explained in terms of the harmonic series that characterize voiced speech and that the emotional impact of major and minor scales tracks the differences in the harmonic ratios evident in excited versus subdued speech. This text has emphasized that auditory systems may have evolved to deal with different categories of natural sound stimuli, a premise that seems a promising framework for ultimately rationalizing the phenomenology of tonal music.


Bowling, D. L., K. Z. Gill, J. D. Choi, J. Prinz and D. Purves (2010) Major and minor music compared to excited and subdued speech. J. Acoust. Soc. Amer. 127: 491–503.

Burns, E. M. (1999) Intervals, scales, and tuning. In The Psychology of Music, D. Deutsch (ed.). New York: Academic Press, pp. 215–264.

Carterette, E. C. and R. A. Kendall (1999) Comparative music perception and cognition. In The Psychology of Music, D. Deutsch (ed.). New York: Academic Press, pp. 725–791.

Gill, K. Z. and D. Purves (2009) A biological rationale for musical scales. PLoS ONE 4: e8144. doi: 10.1371/journal.pone.0008144.

Lewicki, M. S. (2002) Efficient coding of natural sounds. Nature Neurosci. 5: 356–363.

Pierce, J. R. (1983, 1992) The Science of Musical Sound. New York: W. H. Freeman, Chapters 4–6.

Plomp, R. and W. J. Levelt (1965) Tonal consonance and critical bandwidth. J. Acoust. Soc. Amer. 38: 548–560.

Rasch, R. and R. Plomp (1999) The perception of musical tones. In The Psychology of Music, D. Deutsch (ed.). New York: Academic Press, pp. 89–113.

Schwartz, D. A., C. Q. Howe and D. Purves (2003) The statistical structure of human speech sounds predicts musical universals. J. Neurosci. 23: 7160–7168.

Schwartz, D. A. and D. Purves (2004) Pitch is determined by naturally occurring periodic sounds. Hearing Res. 194: 31–46.

Smith, E. and M. S. Lewicki (2006) Efficient auditory coding. Nature 439: 978–982.

Terhardt, E. (1974) Pitch, consonance, and harmony. J. Acoust. Soc. Amer. 55: 1061–1069.