Why Creating Sounds with FM Sound Sources Is Difficult
FM sound sources offer a wide range of timbres and can reproduce most instrument sounds. However, the sound creation process is so complicated that many users have given up on making their own sounds, which is why preset sounds have sold so well.
Sounds are built by combining roughly two to six operators, each of which generates a sine wave like the one shown below. The operators can be connected in various arrangements, such as the following. The operator at the bottom of a stack, called the carrier, produces the output.

What are vertically stacked operators doing? The output of the upper operator is added to the phase of the operator below it. With two layers, the formula is as follows.
sin(2pi * (x + sin(2pi * x)))
The sound is then shaped by setting the frequency ratio, amplitude, and envelope (the change in level over time) of each operator. Since this kind of information is widely available on the internet, this article focuses on a topic that is rarely discussed.
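As a rough illustration of the two-layer formula and the three per-operator settings, here is a minimal Python sketch. The names `fm_sample`, `ratio`, `index`, and `envelope`, and the exponential-decay envelope shape, are assumptions made for illustration; this is not how the DX7 or any particular instrument implements its operators.

```python
import math

def envelope(t, decay_s=0.5):
    """Hypothetical envelope: exponential decay from 1.0 toward 0."""
    return math.exp(-t / decay_s)

def fm_sample(t, carrier_hz=220.0, ratio=1.0, index=1.0):
    """One output sample of a two-operator stack at time t (seconds).

    ratio: modulator frequency as a multiple of the carrier frequency
    index: modulator amplitude (how strongly it bends the carrier's phase)
    """
    x = carrier_hz * t  # carrier phase, measured in cycles
    modulator = index * envelope(t) * math.sin(2 * math.pi * ratio * x)
    # Same shape as sin(2pi * (x + sin(2pi * x))) when ratio = index = 1
    # and the envelope is ignored.
    return math.sin(2 * math.pi * (x + modulator))

def render(seconds=0.01, sample_rate=48000, **kwargs):
    """Render a short buffer of samples as a list of floats."""
    n = int(seconds * sample_rate)
    return [fm_sample(i / sample_rate, **kwargs) for i in range(n)]
```

Calling `render(index=0.0)` gives a plain sine wave; raising `index` or changing `ratio` adds harmonics, and the envelope makes the tone grow duller over time.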
Why FM Sound Sources Are So Complicated
When creating sounds with FM synthesis, I think it is important to understand the Nyquist frequency and aliasing noise. Another important factor is the relationship between harmonics and timbre. Sounds with a clear sense of pitch consist mainly of integer harmonics, that is, components at whole-number multiples of the fundamental frequency. As the proportion of components other than integer harmonics increases, the sense of pitch is lost. The figure below shows the spectrum of a timbre made up of integer harmonics. Because the frequency axis is linear rather than logarithmic, you can see that the harmonics are evenly spaced.

Nyquist Frequency
Anyone working with digital audio who does not know this will run into trouble. The highest frequency that digital audio can represent is fixed. This upper limit is called the Nyquist frequency, and it is half the sampling frequency. For example, at a sampling frequency of 48,000 Hz, the Nyquist frequency is 24,000 Hz. The upper limit of human hearing is said to be about 20,000 Hz.
When recording through a microphone, the signal is filtered, so anything beyond the usable frequency range is simply not recorded. When sound is synthesized internally, however, the upper limit becomes an issue. If the internally calculated waveform contains components above the Nyquist frequency, they are not removed; instead, they bounce back into the range below it. In a sense, FM sound sources use these bounced-back components to create their sound.
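The bounce-back can be checked numerically. The following Python sketch samples a 30,000 Hz sine at 48,000 Hz and compares it with an 18,000 Hz sine, which lies as far below the Nyquist frequency as 30,000 Hz lies above it. The function name `sampled_sine` and the test frequencies are assumptions chosen for illustration.

```python
import math

def sampled_sine(freq_hz, sample_rate, n_samples):
    """Sample sin(2*pi*f*t) at the instants t = n / sample_rate."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

SAMPLE_RATE = 48000
NYQUIST = SAMPLE_RATE / 2  # 24,000 Hz

above = sampled_sine(30000, SAMPLE_RATE, 64)   # 6 kHz above Nyquist
folded = sampled_sine(18000, SAMPLE_RATE, 64)  # 6 kHz below Nyquist

# After sampling, the two tones differ only in sign (a phase inversion),
# so the sum of corresponding samples is zero up to rounding error.
max_diff = max(abs(a + b) for a, b in zip(above, folded))
print(f"Nyquist: {NYQUIST:.0f} Hz, largest |above + folded|: {max_diff:.2e}")
```

Because the sample values match apart from the sign, nothing downstream can tell the two tones apart: the 30,000 Hz component is heard as 18,000 Hz.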
Folding Noise (Aliasing Noise)
This bouncing back of sound is referred to here as “folding noise”. As you raise the parameters of an FM sound source, folding noise creeps in and the sound starts to seem random. This is confusing because the sound changes in ways you could not predict from how it sounded before. Furthermore, because digital processing is discrete and jumps to the next value at each step, you cannot expect smooth changes. On early DX7s, with their low resolution, values change abruptly from one step to the next, which adds to the confusion.
The video below shows the frequency spectrum on a linear axis as the pitch of an FM sound is raised smoothly. The Nyquist frequency is at the far right, and the harmonics gradually move toward the high-frequency side. When they reach the far right, they bounce back and reappear in the audible range. The folded components almost always land on non-integer multiples of the fundamental, so they impair the sense of pitch. At low levels this is not very noticeable, but as shown, it is quite noticeable when the harmonics are strong.
FM sound sources create sound by working with this folding noise, but the changes it produces do not match human intuition. You do not get the natural changes you would expect from an analog synthesizer. That said, as long as the folding noise does not significantly affect the sound, it is not much of a problem.
Folding noise is always a concern when generating sound digitally, but in the 1980s instruments were built without any countermeasures because of the computational cost. On FM sound sources, you could say it was exploited to create metallic sounds. Ideally it would be eliminated. Today it can be made far less noticeable by oversampling, or removed entirely. Doing so, however, spoils the traditional FM sound, so many of today's software synthesizers deliberately keep the traditional approach and its folding noise.
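One way to see why raising a parameter can suddenly bring in folding noise is to estimate how far up the sidebands reach. The sketch below uses Carson's rule, a rule of thumb from FM radio that this article does not itself mention: most of the energy lies within about (index + 1) sidebands of the carrier, where the index is the peak phase deviation in radians. The function names and the 2,000 Hz example values are assumptions, and weaker sidebands do extend beyond the estimate.

```python
def highest_sideband_hz(carrier_hz, modulator_hz, index):
    """Approximate upper edge of the FM spectrum (Carson's rule estimate).

    index is the peak phase deviation in radians.
    """
    return carrier_hz + (index + 1) * modulator_hz

def folds_back(carrier_hz, modulator_hz, index, sample_rate=48000):
    """True if the estimated upper edge lies above the Nyquist frequency."""
    edge = highest_sideband_hz(carrier_hz, modulator_hz, index)
    return edge > sample_rate / 2

# Raise the index step by step for a 2,000 Hz carrier and modulator.
for index in (1, 4, 8, 16):
    edge = highest_sideband_hz(2000, 2000, index)
    print(index, edge, folds_back(2000, 2000, index))
```

With these values the estimated edge stays below 24,000 Hz up to an index of 8 and crosses it at 16, which is the point where the character of the sound would be expected to change abruptly.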
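The arithmetic behind this bounce can be sketched in a few lines of Python. The helper names `fold` and `harmonic_table`, and the 5,000 Hz fundamental, are assumptions chosen for illustration.

```python
def fold(freq_hz, sample_rate=48000):
    """Frequency at which a component is heard after sampling."""
    f = freq_hz % sample_rate  # the sampled spectrum repeats every sample_rate
    return sample_rate - f if f > sample_rate / 2 else f

def harmonic_table(fundamental_hz, count, sample_rate=48000):
    """(harmonic number, heard frequency, ratio to fundamental) per harmonic."""
    rows = []
    for k in range(1, count + 1):
        heard = fold(k * fundamental_hz, sample_rate)
        rows.append((k, heard, heard / fundamental_hz))
    return rows

for k, heard, ratio in harmonic_table(5000, 8):
    print(k, heard, ratio)
```

For a 5,000 Hz fundamental, harmonics 1 to 4 stay in place, while the fifth (25,000 Hz) comes back at 23,000 Hz, which is 4.6 times the fundamental. That non-integer ratio is why the folded components blur the sense of pitch.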
I have long wondered why, but I have never seen a magazine article or anything else that mentions the points above. Personally, I think they are very important, and that understanding them alone makes FM sounds easier to create. I also consider a spectrum analyzer essential for monitoring harmonics while creating sounds. However, most analyzers use a logarithmic frequency axis, which I find a little awkward for that purpose. The spectrum analyzer shown above has been modified to use a linear frequency axis.
Keys to Creating Sounds with the Above in Mind
Creating sounds with FM sound sources is interesting and I would like to write about it in detail, but there would be no end to it, so I will note only the points directly related to the above.
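For readers without a linear-axis analyzer, a similar view can be approximated in code. The sketch below is not the analyzer used in this article; it is a naive discrete Fourier transform (slow, but dependency-free) whose bins are evenly spaced in frequency. The function names and the 3,000 Hz test tone with an index of 2.0 are assumptions for illustration.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N/2 (only practical for short buffers)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags

def strongest_bins(samples, sample_rate, count=5):
    """The `count` strongest (frequency_hz, magnitude) pairs, sorted by frequency."""
    n = len(samples)
    mags = magnitude_spectrum(samples)
    top = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:count]
    return [(k * sample_rate / n, mags[k]) for k in sorted(top)]

# Two-operator FM tone: carrier and modulator both at 3,000 Hz.
SAMPLE_RATE, N = 48000, 480  # bin spacing = 48000 / 480 = 100 Hz
tone = [math.sin(2 * math.pi * 3000 * i / SAMPLE_RATE
                 + 2.0 * math.sin(2 * math.pi * 3000 * i / SAMPLE_RATE))
        for i in range(N)]
for freq, mag in strongest_bins(tone, SAMPLE_RATE):
    print(f"{freq:7.0f} Hz  {mag:.3f}")
```

Because the bins are 100 Hz apart on a linear scale, the peaks of this tone fall on evenly spaced multiples of 3,000 Hz, which makes integer harmonics easy to tell apart from folded components.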
- Folding noise consists of non-integer harmonics, so it sounds metallic or lacks a sense of pitch.
- Most pitched instruments are made up of integer harmonics. To create that kind of sound, use folding noise only in the attack and build the sustain from integer harmonics so that it sounds like an instrument.
- Conversely, if you want to weaken the sense of pitch, or create the sound of struck metal or percussion, use folding noise actively.
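The second point can be sketched as a modulation amount that starts high and settles quickly. In the Python example below, the envelope shape, the parameter values, and the names `index_envelope`, `pluck_sample`, and `upper_edge_hz` are all assumptions for illustration. The upper-edge estimate uses Carson's rule (about index + 1 significant sidebands), which is only a rough guide.

```python
import math

def index_envelope(t, attack_index=12.0, sustain_index=1.5, decay_s=0.05):
    """Modulation index that starts high and settles quickly (hypothetical values)."""
    return sustain_index + (attack_index - sustain_index) * math.exp(-t / decay_s)

def pluck_sample(t, carrier_hz=1000.0, ratio=3.0):
    """Two-operator FM sample whose brightness follows index_envelope."""
    phase = 2 * math.pi * carrier_hz * t
    mod = index_envelope(t) * math.sin(ratio * phase)
    return math.sin(phase + mod)

def upper_edge_hz(t, carrier_hz=1000.0, ratio=3.0):
    """Rough upper edge of the spectrum at time t (Carson's rule estimate)."""
    return carrier_hz + (index_envelope(t) + 1) * ratio * carrier_hz

NYQUIST = 48000 / 2
for t in (0.0, 0.05, 0.3):
    print(t, round(upper_edge_hz(t)), upper_edge_hz(t) > NYQUIST)
```

With these values the estimated edge is above the Nyquist frequency only at the very start of the note. Because the frequency ratio is an integer, the sustained part is left with integer harmonics of 1,000 Hz.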
Sounds That FM Sound Sources Are Not Good At
FM sound sources built from sine waves have trouble producing a sawtooth wave with a clean edge. The result tends to be somewhat rounded and woodwind-like, and if you force it, it becomes metallic.
The DX7 had a function called feedback, which could generate everything from pseudo-sawtooth waves to noise. The feedback appears simply to average the operator's output values, yet it is simple, effective, and a natural fit for the original FM scheme. Even so, it does not produce a clean sawtooth wave, but rather a sound characteristic of FM sound sources, as shown in the diagram below. Its soft string sounds are no match for those of analog synthesizers.
In recent years, FM sound sources have often included basic waveforms other than sine waves, as well as filters, so the range of sounds they can create has grown. Listening to the result alone, you would never guess it came from an FM sound source. Still, when it comes to standing apart from other sound sources, the slightly inflexible FM sound sources of the past retain a strong individuality.
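The feedback idea can be sketched as a sine operator that modulates its own phase with the average of its two most recent output samples. This follows the article's description of averaging; the exact scaling and structure inside the DX7 are not given here, so the function name, the `feedback` scale, and the chosen values are assumptions.

```python
import math

def feedback_operator(freq_hz, feedback, n_samples, sample_rate=48000):
    """Sine operator whose own output is fed back into its phase.

    Assumption: the fed-back value is the average of the previous two
    output samples, scaled by `feedback` (in radians).
    """
    out = []
    prev1 = prev2 = 0.0
    for n in range(n_samples):
        phase = 2 * math.pi * freq_hz * n / sample_rate
        y = math.sin(phase + feedback * 0.5 * (prev1 + prev2))
        prev2, prev1 = prev1, y  # shift the two-sample history
        out.append(y)
    return out

plain = feedback_operator(440.0, 0.0, 256)    # no feedback: a pure sine
skewed = feedback_operator(440.0, 1.2, 256)   # moderate feedback: skewed wave
```

With `feedback` at zero the output is a plain sine. Moderate values skew the wave toward a sawtooth-like shape, and large values tend to push it toward noise, in line with the range described above.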

The “sound & person” column is made up of contributions from you.