Choosing the Right Window Size

Time and Frequency Resolution of the FFT

Comparison of two window resolutions. The left diagram shows better time resolution, with many short windows succeeding one another in time. The right one shows better frequency resolution, with a large number of bins within the same window.

The analysis window has a fixed size, which determines whether the analysis offers good frequency resolution (frequency components that are close together can be separated) or good time resolution (the moment at which frequencies change can be located precisely). A wide window gives good frequency resolution but poor time resolution. A narrow window gives good time resolution but poor frequency resolution. These are called narrowband and wideband transforms, respectively. Increasing the size of the FFT can improve the frequency definition of the analysis.
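As an illustration of this trade-off, the following minimal Python sketch analyses two sine waves 30 Hz apart through a single window and counts the spectral peaks that stand out. The 440 and 470 Hz test tones, the 44.1 kHz sampling rate, the Hann window, the peak threshold and the helper name count_peaks are all assumptions made for the example. A naive DFT is used only to keep the sketch free of dependencies.

```python
import cmath
import math

SAMPLE_RATE = 44100  # Hz; assumed sampling rate


def count_peaks(tones_hz, window_size, lo_hz=300.0, hi_hz=600.0):
    """Count distinct spectral peaks between lo_hz and hi_hz for one window.

    Hypothetical helper: the signal is a sum of equal-amplitude sines,
    shaped by a Hann window and analysed with a naive DFT.
    """
    # One Hann-windowed frame containing all the test tones.
    frame = []
    for n in range(window_size):
        hann = 0.5 - 0.5 * math.cos(2 * math.pi * n / window_size)
        sample = sum(math.sin(2 * math.pi * f * n / SAMPLE_RATE) for f in tones_hz)
        frame.append(hann * sample)

    # Magnitudes of the DFT bins that cover the band of interest.
    k_lo = int(lo_hz * window_size / SAMPLE_RATE)
    k_hi = int(hi_hz * window_size / SAMPLE_RATE) + 1
    mags = []
    for k in range(k_lo, k_hi + 1):
        w = -2j * math.pi * k / window_size
        mags.append(abs(sum(x * cmath.exp(w * n) for n, x in enumerate(frame))))

    # A peak is a local maximum reaching at least half of the strongest bin.
    threshold = 0.5 * max(mags)
    return sum(
        1
        for i in range(1, len(mags) - 1)
        if mags[i] > threshold and mags[i - 1] < mags[i] >= mags[i + 1]
    )


for size in (512, 8192):
    print(size, "samples:", count_peaks([440.0, 470.0], size), "peak(s)")
```

Under these assumptions, the 512-sample window merges the two tones into a single peak, while the 8192-sample window separates them, at the cost of a window that lasts sixteen times longer.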

Sound Characteristics

Not all sounds have the same characteristics, and these characteristics may or may not change over time. Selecting an FFT size involves a compromise between time accuracy and frequency accuracy: the more accurate the analysis is in one domain, the less accurate it will be in the other. The user most often makes a compromise...

Temporal Variations

Variations in a stable sound occur every 2000 to 4000 samples, that is, roughly every 45 to 91 ms at a 44.1 kHz sampling rate.

Variations in a rhythmic sound occur every 500 to 1000 samples, that is, every 11.3 to 22.7 ms.

  • If we adapt the window size to the frequency of a 100 Hz sound (approximately G2) and take a 2048-sample analysis window, about 50 ms long, we can easily analyse a stable sound, but not a fluctuating sound.

  • If we adapt the window size to the frequency of a 440 Hz sound (A4 in scientific pitch notation), we get a 512-sample analysis window, about 11 ms long, which is more appropriate for a fluctuating sound.

  • Nevertheless, with a 2048-sample window, the frequency resolution is 44100/2048, that is, 21.5 Hz, which is quite precise. With a 512-sample window, the frequency resolution is 86 Hz, which is poor.
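The figures in this list can be reproduced with a short sketch. It assumes the 44.1 kHz sampling rate used in the text; the function name window_resolution is only illustrative.

```python
SAMPLE_RATE = 44100  # Hz; the rate used in the text


def window_resolution(window_size, sample_rate=SAMPLE_RATE):
    """Return (duration in ms, bin spacing in Hz) for a window of window_size samples."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_hz = sample_rate / window_size
    return duration_ms, bin_hz


for size in (512, 1024, 2048, 4096, 8192):
    duration_ms, bin_hz = window_resolution(size)
    print(f"{size:5d} samples: {duration_ms:6.1f} ms, {bin_hz:5.1f} Hz per bin")
```

Each doubling of the window size halves the bin spacing and doubles the duration, so the duration in seconds multiplied by the bin spacing in Hz always equals 1.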

Frequency Variations

If we want to analyse a sound with a low and/or fluctuating pitch, we should choose a large window size.

In the case of a low pitch, C1 for instance, the frequency is about 32 Hz and the period about 31 ms. We would need an 8192-sample window.
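The window sizes quoted on this page (2048 samples for 100 Hz, 512 for 440 Hz, 8192 for 32 Hz) are consistent with a window covering about five periods of the sound, rounded to the nearest power of two. The sketch below applies that rule. The five-period figure and the rounding are inferred from these examples rather than stated in the text, and the function name window_for_pitch is hypothetical.

```python
import math

SAMPLE_RATE = 44100  # Hz; assumed sampling rate
PERIODS = 5          # assumed number of periods covered by one window


def window_for_pitch(freq_hz, periods=PERIODS, sample_rate=SAMPLE_RATE):
    """Suggest a power-of-two window size covering about `periods` periods."""
    samples = periods * sample_rate / freq_hz
    # Round to the nearest power of two on a logarithmic scale.
    return 2 ** round(math.log2(samples))


for freq in (32.0, 100.0, 440.0):
    print(f"{freq:6.1f} Hz -> {window_for_pitch(freq)} samples")
```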

Frequency Resolution Linearity and Human Ear

The frequency scale of the FFT is linear, but the response of the human ear to frequency is logarithmic.

For instance, with a 50 Hz frequency resolution, bins go from 0 to 50 Hz, 50 to 100 Hz, 100 to 150 Hz, etc. If we take the frequencies of the octaves from G2 to G6, we get approximately 100, 200, 400, 800, 1600 Hz...

In a low frequency range, 50 Hz is quite a wide interval: above G2 (about 100 Hz), it spans a fifth. Above G6 (about 1600 Hz), 50 Hz represents only about half a semitone.

The same FFT therefore has very fine pitch resolution at high frequencies, but very poor pitch resolution at low frequencies.
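To make the comparison concrete, a fixed bin width can be converted into a musical interval measured in cents (100 cents to a semitone). This is a minimal sketch: the function name bin_width_in_cents is illustrative, and the octave frequencies are the rounded values used above.

```python
import math


def bin_width_in_cents(freq_hz, bin_hz):
    """Interval, in cents, between freq_hz and freq_hz + bin_hz."""
    return 1200.0 * math.log2((freq_hz + bin_hz) / freq_hz)


for freq in (100, 200, 400, 800, 1600):
    cents = bin_width_in_cents(freq, 50.0)
    print(f"{freq:5d} Hz: {cents:6.1f} cents ({cents / 100:.1f} semitones)")
```

The same 50 Hz step shrinks from about seven semitones at 100 Hz to about half a semitone at 1600 Hz.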
