Successive windows are not placed end to end: they overlap, which improves the precision of the analysis. The window step setting determines how often a new spectrogram line appears on the image. Just as the oversampling factor of the FFT improves the frequency resolution of the analysis, oversampling the window step improves its temporal resolution.
The window type determines the windowing curve with which successive windows are overlapped.
The overlap factor sets the window step as a proportion of the window size. This proportion is defined by the Adaptive Oversampling factor. By default, this factor is set to 8, so the step is 12.5% of the window size, as shown in the Window Step part of the dialogue window.
Note that the number of samples in the window step is also known as hop size.
Window Step = Window Size / Adaptive Oversampling Factor
For instance, with a 44100 Hz sample rate, a window size of 1024 samples and a factor of 8, we have Window Step = 1024 / 8 = 128 samples. The window step is shorter than the buffer filling time, so we get more spectrogram strips, each with almost the same representation. A window size of 1024 samples lasts about 23 ms; with an oversampling factor of 8, a spectral representation is calculated every 2.9 ms (window duration / 8).
T (window step) = WS / (SR * Adaptive Oversampling Factor), where WS is the window size in samples and SR is the sample rate.
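The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `window_step` and its parameter names are illustrative and do not come from the application.

```python
def window_step(window_size, oversampling_factor, sample_rate):
    """Return the window step (hop size) in samples and in seconds.

    Hypothetical helper: it only restates the two formulas given in the text.
    """
    step_samples = window_size // oversampling_factor
    step_seconds = step_samples / sample_rate
    return step_samples, step_seconds


# Values taken from the example in the text.
step_samples, step_seconds = window_step(1024, 8, 44100)
window_ms = 1000 * 1024 / 44100

print(f"window duration: {window_ms:.1f} ms")
print(f"window step: {step_samples} samples, {1000 * step_seconds:.1f} ms")
```

With a factor of 8, the step is one eighth of the window, so consecutive windows share seven eighths of their samples.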
The window step can be set independently of the window size. With a large window size suited to a low-frequency sound, which gives good frequency definition but poor temporal definition, a short window step can improve the detection of transients or temporal variations of the signal.
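As a rough illustration of this trade-off, the sketch below counts how many analysis windows fit in one second of sound for a large window and two different steps. The window size of 4096 samples, the two steps and the helper `frame_starts` are assumptions chosen for the example, not settings taken from the application.

```python
def frame_starts(num_samples, window_size, window_step):
    """Start index of every full analysis window that fits in the signal."""
    return list(range(0, num_samples - window_size + 1, window_step))


sample_rate = 44100
num_samples = sample_rate          # one second of sound
window_size = 4096                 # large window, assumed for a low-frequency sound

coarse = frame_starts(num_samples, window_size, window_size // 2)   # factor 2
fine = frame_starts(num_samples, window_size, window_size // 16)    # factor 16

# The spacing between FFT bins depends only on the window size,
# so it is identical in both cases.
bin_spacing_hz = sample_rate / window_size

print(f"bin spacing: {bin_spacing_hz:.2f} Hz")
print(f"factor 2:  {len(coarse)} spectrogram lines per second")
print(f"factor 16: {len(fine)} spectrogram lines per second")
```

The shorter step leaves the frequency definition untouched and only adds more spectrogram lines along the time axis, which is what helps when looking for transients.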
The Adaptive Oversampling mode adapts the window step automatically for some treatments, such as a time stretch or a transposition with temporal correction.
In Manual mode, the oversampling rate is not modified, whatever type of treatment is applied. This mode is not advised for treatments involving time stretch or transposition.
In most cases, an overlapping factor of 4 is enough, except for transient detection, for which a higher factor can be more efficient.
An overlapping factor of 8 is generally advised. A lower factor produces a slight degradation of the rendering. An increase up to 16 can noticeably improve the detection and preservation of transients.
The overlap determines the precision of the calculation, but it can also increase the computation time. Overlapping can make it harder to locate a rhythm precisely, introducing slight delays or anticipations. An exaggerated overlap does not guarantee an efficient analysis: the spectrogram runs quickly, the computation time increases, and the same data are repeated more and more times on the same FFT strip. The space that the signal occupies on the image is wasted.
This amounts to taking a picture with a 60° field of view and stretching it to get a 180° view: the image will be distorted.
The dialogue window offers three types of analysis windows: Hanning, Hamming and Blackman. Each window has a specific curve, which determines how the windows are convolved. The main effect of this convolution is that some energy spreads around each spectral component, creating an insignificant background noise.
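The spreading of energy around a spectral component can be observed with NumPy. The sketch below is an illustration under assumed settings (a 1000 Hz sinusoid, a 1024-sample window, a 44100 Hz sample rate), and it uses `numpy.hanning` as a stand-in for the application's Hanning window, which may be computed differently. It compares the level found far from the peak with and without the window curve.

```python
import numpy as np

sample_rate = 44100
window_size = 1024

# 1000 Hz does not fall on an exact FFT bin (bins are about 43 Hz apart),
# so the window cuts the sinusoid at an arbitrary point of its period.
t = np.arange(window_size) / sample_rate
signal = np.sin(2 * np.pi * 1000.0 * t)


def spectrum_db(frame):
    """Magnitude spectrum in dB, normalised so that the peak is at 0 dB."""
    magnitude = np.abs(np.fft.rfft(frame))
    return 20 * np.log10(magnitude / magnitude.max() + 1e-12)


untapered_db = spectrum_db(signal)                         # no window curve
hanning_db = spectrum_db(signal * np.hanning(window_size)) # tapered window

# Highest level found in a band well away from the 1000 Hz peak
# (bins 100 to 199, roughly 4.3 kHz to 8.6 kHz).
far = slice(100, 200)
print(f"untapered: {untapered_db[far].max():.0f} dB")
print(f"Hanning:   {hanning_db[far].max():.0f} dB")
```

Under these assumptions, the windowed spectrum should sit well below the untapered one away from the peak: the energy that spreads around the component stays close to it instead of reaching distant frequencies.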
Analysis windows are rarely square (rectangular), because the beginning and end of a window rarely coincide with the period of the sound. To avoid clicks, the analysis window is tapered at its endpoints. The shape of this taper depends on the window type.
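To see how the taper differs from one window type to another, the three curves can be generated with NumPy and compared at their endpoints. This is a sketch only: `numpy.hanning`, `numpy.hamming` and `numpy.blackman` are NumPy's definitions of these windows and are assumed here to be close to the ones used by the application.

```python
import numpy as np

window_size = 1024

# A square window is included for comparison: it does not taper at all.
windows = {
    "square": np.ones(window_size),
    "Hanning": np.hanning(window_size),
    "Hamming": np.hamming(window_size),
    "Blackman": np.blackman(window_size),
}

for name, curve in windows.items():
    edge = abs(curve[0])                 # weight applied to the first sample
    centre = curve[window_size // 2]     # weight applied near the middle
    print(f"{name:9s} edge weight {edge:.2f}, centre weight {centre:.2f}")
```

In NumPy's definitions, the Hanning and Blackman curves fall to zero at the endpoints, while the Hamming curve stops at a small non-zero value (0.08).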
These windows should be used for the analysis of stable sounds with a low pitch.
This window should be used for unstable and noisy sounds.