On the heels of research showing that music you find pleasant does, in fact, reduce pain (although, given what I find “pleasant,” I’m going to need a private hospital bed) comes this bit of fun: “Musical Intervals in Speech,” by Duke University neuroscientists Deborah Ross, Jonathan Choi, and Dale Purves. Ross et al. analyzed the vowel formants of everyday speech and found that, more often than not, the frequency relationships correspond to the intervals of the 12-note chromatic scale.
To test the hypothesis that chromatic scale intervals are specifically embedded in the frequency relationships in voiced speech sounds (i.e., phones whose acoustical structure is characterized by periodic repetition), we analyzed the spectra of different vowel nuclei in neutral speech uttered by adult native speakers of American English, as well as a smaller database of Mandarin.
… [We calculated] the distribution of all F2/F1 ratios derived from the spectra of the 8 different vowels uttered by the 10 English-speaking participants (i.e., the relationships in 1,000 utterances of each of the vowels). Sixty-eight percent of these ratios fall on intervals of the chromatic scale (red bars), and all 12 chromatic intervals are represented over a span of 4 octaves.
In other words, the 12-note scale isn’t so arbitrary after all. Interestingly, there’s a preference among tuning systems in speech as well:
In so far as the observations here inform this argument, the observed ratios in speech spectra accord most closely with a just intonation tuning system. Ten of the 12 intervals generated by the analysis of either English or Mandarin vowel spectra are those used in just intonation tuning, whereas 4 of the 12 match the Pythagorean tuning and only 1 of the 12 intervals matches those used in equal temperament. The two anomalies in our data with respect to just intonation concern the minor second and the tritone.
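For a sense of how far apart those three tuning systems actually sit, here is a rough Python sketch that converts each interval to cents (hundredths of an equal-tempered semitone). The ratio tables are my assumption, not the paper's: they are the common textbook 5-limit just and Pythagorean values, and the just tritone and minor seventh in particular have several competing ratios. The names `cents` and `comparison_table` are made up for illustration.

```python
import math
from fractions import Fraction as F

# Interval names from the minor second up to the octave.
NAMES = [
    "minor second", "major second", "minor third", "major third",
    "perfect fourth", "tritone", "perfect fifth", "minor sixth",
    "major sixth", "minor seventh", "major seventh", "octave",
]

# ASSUMPTION: one common textbook 5-limit just intonation table.
# Other sources use different ratios for the tritone and minor seventh.
JUST = [F(16, 15), F(9, 8), F(6, 5), F(5, 4), F(4, 3), F(45, 32),
        F(3, 2), F(8, 5), F(5, 3), F(9, 5), F(15, 8), F(2, 1)]

# ASSUMPTION: a common Pythagorean table built from stacked 3:2 fifths.
PYTHAGOREAN = [F(256, 243), F(9, 8), F(32, 27), F(81, 64), F(4, 3),
               F(729, 512), F(3, 2), F(128, 81), F(27, 16), F(16, 9),
               F(243, 128), F(2, 1)]


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(float(ratio))


def comparison_table():
    """Return (name, equal, just, pythagorean) rows, all in cents."""
    rows = []
    for k, name in enumerate(NAMES, start=1):
        equal = 100.0 * k  # equal temperament: k semitones, ratio 2**(k/12)
        rows.append((name, equal, cents(JUST[k - 1]), cents(PYTHAGOREAN[k - 1])))
    return rows


if __name__ == "__main__":
    for name, equal, just, pyth in comparison_table():
        print(f"{name:15s} equal {equal:7.1f}  just {just:7.1f}  pythagorean {pyth:7.1f}")
```

With these tables, the just major third (5:4) lands almost 14 cents flat of its equal-tempered cousin, while the Pythagorean one lands about 8 cents sharp. The only interval on which all three tables agree exactly is the octave.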
That minor-second/tritone anomaly brings up a good chicken-and-egg question, given that composers who work with more chromatic than diatonic sounds tend not to explore alternate tunings so much: does a preference for crunchy dissonance mean that just intonation sounds “wrong”? Or is it that, in our predominantly equal-temperament world, it’s those clashing seconds that sound the most “natural,” so that’s where the preference comes from? As someone who likes the sound of diatonic music in pure ratios, but opts for equal-tempered dissonance in my own, I’m inclined towards the latter, but I would imagine this is a highly personal impression.
Anyway, turns out Harold Hill was right: singing is just sustained talking.