The harmonic structure of speech sounds and musical intervals. (A) The spectrum of a voiced speech sound comprises a single harmonic series generated by the vibration of the vocal folds (the vertical green lines indicate the loci of harmonic peaks); the relative amplitudes of the harmonics are modulated by the resonances of the rest of the vocal tract, thus defining the speech formants (asterisks indicate the harmonic peaks of the first two formants, F1 and F2). The voiced speech segment shown here as an example was taken from the single-word database and has a fundamental frequency of 150 Hz (black arrow in the lower panel). (B) The spectra of musical intervals entail two harmonic series, one from each of the two relevant notes (see text). The example in the upper panel shows the superimposed spectra of two musical notes related by a major third (the harmonic series of the lower note is shown in orange, that of the higher note in green, and the harmonics common to both series in brown). Each trace was generated by averaging the spectra of 100 recordings of tones played on an acoustic guitar with fundamentals of ∼440 and ∼550 Hz, respectively; the recordings were made under the same conditions as speech (see Sec. II). The implied fundamental frequency (black arrow in the lower panel) is the greatest common divisor of the two harmonic series.
Source: ResearchGate
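
The arithmetic behind the implied fundamental in panel (B) can be illustrated with a minimal sketch. It assumes idealized integer fundamentals of exactly 440 and 550 Hz (the caption gives only approximate values) and an arbitrary 5000 Hz upper limit; the function names are hypothetical and are not taken from the paper.

```python
from math import gcd

def implied_fundamental(f_low_hz: int, f_high_hz: int) -> int:
    """Greatest common divisor of two fundamentals, in Hz (idealized integers)."""
    return gcd(f_low_hz, f_high_hz)

def harmonics(f0_hz: int, max_hz: int) -> list[int]:
    """Integer multiples of f0_hz up to and including max_hz."""
    return list(range(f0_hz, max_hz + 1, f0_hz))

# Assumed exact values for the major third in the caption (a 5:4 ratio).
low_hz, high_hz = 440, 550
max_hz = 5000  # arbitrary analysis limit, chosen only for this illustration

f_implied = implied_fundamental(low_hz, high_hz)

# Harmonics shared by both series (the peaks drawn in brown in the figure).
shared = sorted(set(harmonics(low_hz, max_hz)) & set(harmonics(high_hz, max_hz)))

print(f_implied)  # 110
print(shared)     # [2200, 4400]
```

Under these assumptions every harmonic of either note is an integer multiple of 110 Hz, and the two series coincide at multiples of 2200 Hz. Measured fundamentals are not exact integers, so a real analysis would need a tolerance rather than an exact divisor.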