For the sake of explanation, I have divided vocal production into three parts: breathing, the vocal cords, and resonance. This final section focuses on resonance, the stage where certain frequencies of the original sound produced by the vocal cords are attenuated or emphasized. In synthesizer terms, this corresponds to a filter.
■ Formant
The sound produced by the vocal cords alone is like a buzzer: it has pitch, but it cannot form vowels such as “a, i, u, e, o.” Vowels are created primarily by the shape of the oral cavity, which controls the intensity of different frequency bands within the sound. The frequency bands that are emphasized are called formants. Each vowel is produced by a combination of several formants, so the behavior is somewhat complex. In addition, some formant frequency bands remain fixed, while others shift depending on the pitch. The combination of these characteristics gives each person a unique voice.
■ Vowels
In Japanese, there are five vowels, but if you include the “n” sound, you could say there are six. The number of vowels varies by language. For example, some languages of the Nordic region are said to have around forty, with fine distinctions made among many different vowels. Speech sounds are composed of vowels and consonants. Generally speaking, vowels are accompanied by vibration of the vocal cords, while many consonants are produced without it.
Resonance plays an important role in vowels: it forms the vowel and, at the same time, maintains volume. I have electronically synthesized the vowels “a, i, u, e, o,” and the video below shows these sounds visualized by frequency. The synthesis applies the formants for “a, i, u, e, o” to a sawtooth wave, which contains many harmonic components. Although the specific formant frequencies are difficult to identify, you can see the peaks of the harmonics shift with each vowel. In practice, this harmonic distribution serves only as a rough guide, since it can vary greatly depending on the vocal range.

You can also observe that “i” and “e” require higher-order harmonics. Meanwhile, “u” and “o” are somewhat similar, but “u” contains fewer harmonics, producing a darker and more muffled sound.
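The synthesis described above, formants applied to a sawtooth wave, can be sketched roughly as follows. This is a minimal illustration and not the patch used for the video: the formant frequencies in FORMANTS, the bell-shaped resonance curve, the 1500 Hz split point, and all function names are assumptions chosen only to make the idea concrete.

```python
import math

# Illustrative (F1, F2) formant centres in Hz. These are rough, assumed
# values chosen for demonstration; they are not the author's settings.
FORMANTS = {
    "a": (800.0, 1200.0),
    "i": (300.0, 2300.0),
    "u": (350.0, 1300.0),
    "e": (500.0, 1900.0),
    "o": (500.0, 800.0),
}


def resonance_gain(freq, centre, half_width=100.0):
    """Bell-shaped peak: gain 1.0 at the centre, 0.5 one half_width away."""
    return 1.0 / (1.0 + ((freq - centre) / half_width) ** 2)


def vowel_spectrum(vowel, f0=110.0, max_freq=4000.0):
    """Return (frequency, amplitude) pairs for a sawtooth shaped by formants."""
    spectrum = []
    k = 1
    while k * f0 <= max_freq:
        freq = k * f0
        source = 1.0 / k  # the k-th harmonic of a sawtooth has amplitude 1/k
        gain = sum(resonance_gain(freq, c) for c in FORMANTS[vowel])
        spectrum.append((freq, source * gain))
        k += 1
    return spectrum


def high_band_share(vowel, split=1500.0, f0=110.0):
    """Fraction of the signal energy carried by harmonics above `split` Hz."""
    spectrum = vowel_spectrum(vowel, f0)
    total = sum(a * a for _, a in spectrum)
    high = sum(a * a for f, a in spectrum if f > split)
    return high / total


def render(vowel, f0=110.0, seconds=0.2, rate=16000):
    """Additive synthesis: sum one sine wave per harmonic."""
    spectrum = vowel_spectrum(vowel, f0)
    return [
        sum(a * math.sin(2.0 * math.pi * f * n / rate) for f, a in spectrum)
        for n in range(round(seconds * rate))
    ]


for v in "aiueo":
    print(v, round(high_band_share(v), 4))
```

With these assumed values, “i” and “e” carry a larger share of their energy in the upper harmonics than “u” and “o” do, which matches the tendency visible in the video. The samples returned by render() could be written to a WAV file for listening.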
Vowels sung at a low pitch, like those above, contain a rich array of harmonics, so “a, i, u, e, o” can be expressed fairly clearly. At higher pitches, however, the harmonics are spaced farther apart and fewer of them fall within the formant bands, so the vowels become less distinct. This is one of the reasons why soprano singers find it difficult to articulate vowels clearly in their upper range.
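The effect of pitch on the number of available harmonics can be checked with simple arithmetic. The sketch below is only an illustration: the 4000 Hz upper limit, the 300 Hz formant centre, and the function names are assumptions, not measured values.

```python
def harmonics_below(f0, limit=4000.0):
    """Number of harmonics of f0 (fundamental included) at or below `limit` Hz."""
    return int(limit // f0)


def nearest_harmonic_offset(f0, formant):
    """Distance in Hz from a formant centre to the closest harmonic of f0."""
    k = max(1, round(formant / f0))
    return abs(k * f0 - formant)


# 110 Hz, 440 Hz and 880 Hz correspond to the notes A2, A4 and A5.
for f0 in (110.0, 440.0, 880.0):
    print(f0, harmonics_below(f0), nearest_harmonic_offset(f0, 300.0))
```

At 110 Hz there are 36 harmonics below the assumed limit and one of them lies within 30 Hz of the assumed formant. At 880 Hz only 4 harmonics remain, and the closest one is 580 Hz away, so the formant has almost nothing to emphasize.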
■ The Cat’s “ieaou (eeyao) [Meow]”
The sequence “a, i, u, e, o” can be a bit difficult to say quickly, don’t you think? If I rearrange the vowels to move from brighter to darker sounds, we get “i, e, a, o, u.” When this sequence is spoken rapidly while the pitch descends, it sounds like “nyaon,” which resembles a cat’s meow in Japanese. Even for a human speaker, “i, e, a, o, u” is relatively easy to pronounce because the shape of the oral cavity transitions smoothly between each vowel. However, if the “a” is articulated too distinctly, it becomes harder to pronounce smoothly—keeping it somewhat vague makes the transition more natural. In this way, a cat’s meow can be heard as an efficient vocalization that includes many vowels in a single sound.

■ Easily Pronounced Sounds
There are certain pronunciations that are common across the world. Sequences of sounds that are easy for humans to pronounce tend to evolve naturally into meaningful words. Perhaps the most universal sound made by babies is “mama.” In Japanese, this becomes “ma-mm-ma,” while in English-speaking regions it is “mama.” It is simply a sound that comes out naturally when the mouth is opened and a random vocalization is made. I once conducted an experiment using an artificial vocal cord, and the sound that came out most easily was also “mama.” When the consonant was changed to a plosive, “papa” was likewise easy to produce.
■ Nasal Resonance
While vowels are formed by changes in the shape of the oral cavity, resonance becomes especially important when singing. Although resonance can occur in the mouth alone, incorporating nasal resonance adds greater richness and projection to the sound.
As I mentioned in the section on breathing, many mammals vocalize while inhaling. This is often because doing so makes effective use of nasal resonance. The anatomy of other mammals differs greatly from that of humans, so a direct comparison is not possible, but it is true that inhaled sounds tend to resonate more easily. Humans do not normally vocalize during inhalation, so most people never try it, but if you do experiment with it, you can directly feel the resonant power of the nasal cavity. It is easier to do this by placing the voice in a falsetto position, keeping the vocal cords half-open and allowing partial vibration. It is similar to the sound that escapes as you yawn. If you can retain that sensation while singing on the exhale, your voice will resonate more effectively. Inhaled vocalization is primitive but quite logical, and it can provide useful insights. However, since most articulation occurs in the oral cavity, increasing the proportion of nasal resonance makes vowel pronunciation less distinct. Additionally, as pitch rises, the proportion of nasal resonance naturally increases. This is another reason why producing and articulating high-pitched sounds is more difficult.
■ In Conclusion
We have briefly covered the topics of breathing, vocal cords, and resonance in relation to the voice. Among these, the most complex and troublesome aspect is likely the vocal cords. There are still many things we do not fully understand, and it is an area where subjective interpretation easily comes into play. Although it’s possible to observe the movement of the vocal cords using a scope or similar equipment, doing so requires quite an elaborate setup, and the relationship between the physical state and the resulting sound has yet to be fully systematized.