For the sake of explanation, I’ll be dividing vocal production into three parts: breathing, vocal cords, and resonance. This final section focuses on resonance. Here, the original sound produced by the vocal cords is either reduced or emphasized in certain frequencies. In synthesizer terms, this corresponds to a filter.
■ Formant
The sound produced by the vocal cords alone is like a buzzer—it has pitch, but it can’t form vowels such as “a, i, u, e, o.” Vowels are primarily created by the shape of the oral cavity. By controlling the intensity of different frequency bands within the sound, vowels are realized. The frequency bands that are emphasized are called formants. When producing vowels, multiple formants combine, resulting in a somewhat complex behavior. Additionally, some formant frequency bands remain fixed, while others shift depending on the pitch. The combination of these characteristics creates an individual’s unique voice.
■ Vowels
Japanese has five vowels, or six if you count the “n” sound. The number of vowels varies by language. Some Nordic languages, for example, are said to distinguish around forty vowel sounds, with fine distinctions among them. Speech sounds are composed of vowels and consonants. Generally speaking, vowels are always accompanied by vibration of the vocal cords, while many consonants (the voiceless ones, such as “s” or “k”) are not.
Resonance plays an important role in vowels. Through resonance, vowels are formed, and at the same time, volume is maintained. I have electronically synthesized the vowels “a, i, u, e, o.” The video below shows these sounds visualized by frequency. From a synthesis standpoint, formants for pronouncing “a, i, u, e, o” are applied to a sawtooth wave, which contains many harmonic components. Although the specific formant frequencies are difficult to identify, you can observe the peaks of the harmonics shifting depending on each vowel. In practice, the harmonic distribution serves only as a rough guide. It can vary greatly depending on the vocal range.

You can also observe that “i” and “e” require higher-order harmonics. Meanwhile, “u” and “o” are somewhat similar, but “u” contains fewer harmonics, producing a darker and more muffled sound.
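The synthesis described above, formants applied to a sawtooth wave, can be sketched in the frequency domain. This is only a minimal illustration, not the setup used for the video: the formant frequencies in `FORMANTS`, the bell-shaped `resonance_gain` curve, and its 100 Hz bandwidth are all assumed round numbers chosen for the example.

```python
import math

# Rough first and second formant frequencies in Hz for each vowel.
# ASSUMPTION: illustrative round numbers, not values measured for this article.
FORMANTS = {
    "a": (800.0, 1200.0),
    "i": (300.0, 2300.0),
    "u": (350.0, 1100.0),
    "e": (500.0, 1900.0),
    "o": (500.0, 800.0),
}


def resonance_gain(freq, center, bandwidth=100.0):
    """Bell-shaped gain of one resonance: 1.0 at the center, lower further away."""
    return 1.0 / (1.0 + ((freq - center) / bandwidth) ** 2)


def vowel_spectrum(vowel, f0=110.0, max_freq=4000.0):
    """Return (frequency, amplitude) pairs for the harmonics of a filtered sawtooth."""
    spectrum = []
    k = 1
    while k * f0 <= max_freq:
        freq = k * f0
        source = 1.0 / k  # sawtooth: the k-th harmonic has amplitude proportional to 1/k
        gain = sum(resonance_gain(freq, center) for center in FORMANTS[vowel])
        spectrum.append((freq, source * gain))
        k += 1
    return spectrum


def render(vowel, f0=110.0, sample_rate=16000, duration=0.1):
    """Sum the filtered harmonics into a list of waveform samples."""
    spectrum = vowel_spectrum(vowel, f0)
    count = int(sample_rate * duration)
    return [
        sum(amp * math.sin(2.0 * math.pi * freq * n / sample_rate)
            for freq, amp in spectrum)
        for n in range(count)
    ]


if __name__ == "__main__":
    for vowel in "aiueo":
        high = sum(amp for freq, amp in vowel_spectrum(vowel) if freq > 1500.0)
        print(vowel, round(high, 4))
```

With these assumed values, “i” and “e” keep noticeably more energy in the harmonics above 1500 Hz than “u” and “o”, which matches the tendency seen in the visualization. Real voices have more formants and wider variation, so treat the numbers as a rough guide only.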
At lower pitches, as in the example above, the voice contains a rich array of harmonics, so “a, i, u, e, o” can be expressed fairly clearly. At higher pitches, however, the harmonics are spaced farther apart and fewer of them fall within the formant bands, which makes the vowels less distinct. This is one of the reasons why soprano singers find it difficult to articulate vowels clearly in their upper range.
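The effect of pitch on the available harmonics comes down to simple arithmetic: harmonics sit at whole-number multiples of the fundamental, so a higher fundamental leaves fewer of them in any given range. The sketch below counts them. The 4000 Hz ceiling and the 700 to 900 Hz band (standing in for one formant region) are assumptions made for illustration.

```python
import math


def harmonics_below(f0, ceiling_hz=4000.0):
    """Count the harmonics of f0 (including the fundamental) at or below the ceiling."""
    return int(ceiling_hz // f0)


def harmonics_in_band(f0, low_hz, high_hz):
    """List the harmonic frequencies of f0 that fall inside [low_hz, high_hz]."""
    first = max(math.ceil(low_hz / f0), 1)
    last = math.floor(high_hz / f0)
    return [k * f0 for k in range(first, last + 1)]


if __name__ == "__main__":
    # ASSUMPTION: 700-900 Hz is used as a stand-in for a single formant band.
    for f0 in (110.0, 220.0, 440.0, 880.0, 1000.0):
        print(f0, harmonics_below(f0), harmonics_in_band(f0, 700.0, 900.0))
```

With these assumed numbers, a voice at 110 Hz places two harmonics inside the 700 to 900 Hz band, while a voice at 1000 Hz places none, so the resonance has nothing to emphasize there.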
■ The Cat’s “ieaou (eeyao) [Meow]”
The sequence “a, i, u, e, o” can be a bit difficult to say quickly, don’t you think? If I rearrange the vowels to move from brighter to darker sounds, we get “i, e, a, o, u.” When this sequence is spoken rapidly while the pitch descends, it sounds like “nyaon,” which resembles a cat’s meow in Japanese. Even for a human speaker, “i, e, a, o, u” is relatively easy to pronounce because the shape of the oral cavity transitions smoothly between each vowel. However, if the “a” is articulated too distinctly, it becomes harder to pronounce smoothly—keeping it somewhat vague makes the transition more natural. In this way, a cat’s meow can be heard as an efficient vocalization that includes many vowels in a single sound.

■ Easily Pronounced Sounds
There are certain pronunciations that are common across the world. Sequences of sounds that are easy for humans to pronounce naturally evolve into meaningful words. Perhaps the most universal sound made by babies is “mama.” In Japanese, this becomes “ma-mm-ma,” while in English-speaking regions it’s “mama-”. It’s simply a sound that comes out naturally when the mouth is opened and a random vocalization is made. I once conducted an experiment using an artificial vocal cord, and the sound that came out most easily was also “mama”. When made into a plosive, “papa” was likewise easy to produce.
■ Nasal Resonance
While vowels are shaped by changing the form of the oral cavity, resonance becomes especially important when singing. Resonance can occur in the mouth alone, but incorporating nasal resonance adds greater richness and projection to the sound.
As I mentioned in the section on breathing, many mammals vocalize while inhaling. This is often because they make effective use of nasal resonance. The anatomy of other mammals differs greatly from that of humans, so a direct comparison is not possible, but it is true that inhaled sounds tend to resonate more easily. Humans don’t normally vocalize during inhalation, so it’s not something most people try, but if you do experiment with it, you can directly feel the resonant power of the nasal cavity. It’s easier to do this by placing the voice in a falsetto position, keeping the vocal cords half-open and allowing partial vibration. It’s similar to the sound that escapes when you yawn. If you can retain that sensation while singing on the exhale, your voice will resonate more effectively. Inhaled vocalization is primitive but quite logical, and it can provide useful insights. However, since most articulation occurs in the oral cavity, increasing the proportion of nasal resonance makes vowels less distinct. Additionally, as pitch rises, the proportion of nasal resonance naturally increases. This is another reason why high notes are harder to produce and articulate.
■ In Conclusion
We have briefly covered the topics of breathing, vocal cords, and resonance in relation to the voice. Among these, the most complex and troublesome aspect is likely the vocal cords. There are still many things we do not fully understand, and it is an area where subjective interpretation easily comes into play. Although it’s possible to observe the movement of the vocal cords using a scope or similar equipment, doing so requires quite an elaborate setup, and the relationship between the physical state and the resulting sound has yet to be fully systematized.
The “sound & person” column is made up of contributions from you.