At the beginning of the analysis, a sinusoid with a linear attack shows a COGE
Although musical signals have more than one sound source, the COGE can be calculated for each peak.
To determine whether we have a transient or not, a detection threshold must be defined; for this, a limiting signal is used.
The COGE of the limiting signal is used as a temporal threshold: a COGE located to the left of the threshold belongs to a peak that is not forming an attack.
The COGE of the limiting signal is about 7.4% of the length of the window: to count as part of an attack, a COGE must be at least 1.4 times the limiting value, i.e. about 10.4% of the length of the analysis window.
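The arithmetic behind the 10.4% figure can be checked with a short sketch. The names `LIMIT_COGE`, `FACTOR` and `exceeds_threshold` are illustrative assumptions, not part of the original method, and the use of "greater than or equal" at the boundary is also an assumption.

```python
# Values taken from the text; COGE is expressed as a fraction of the window length.
LIMIT_COGE = 0.074   # COGE of the limiting signal (about 7.4% of the window)
FACTOR = 1.4         # safety factor applied to the limiting value

# Detection threshold: 0.074 * 1.4 = 0.1036, i.e. about 10.4% of the window.
threshold = LIMIT_COGE * FACTOR


def exceeds_threshold(coge, limit=threshold):
    """Return True if a peak's COGE lies at or to the right of the threshold.

    Assumption: a peak exactly on the threshold is counted as a candidate.
    """
    return coge >= limit
```

A peak with a COGE of 20% of the window would pass this test, while a peak at 5% would be treated as not forming an attack.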
If no noise precedes the attack, the COGE is large. It can reach 50% of the analysis window. Here, on the left, the attack transient starts from silence, and has a large COGE.
With background noise or preceding notes, the maximum value of the COGE will generally be lower than 50%. The energy is distributed over the whole window and the COGE is reduced, as shown on the right.
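The effect of background energy on the COGE can be illustrated with a simplified time-domain sketch. This is not the per-peak spectral computation used in the analysis, which is not detailed here. It only computes the center of gravity of the energy of a windowed frame, measured from the window center as a fraction of the window length. The function name `coge`, the Hann window, the test signal and the noise level are all assumptions chosen for illustration.

```python
import numpy as np


def coge(frame, window):
    """Center of gravity of the energy of a windowed frame.

    Returned as a fraction of the window length, measured from the window
    center (assumed convention): -0.5 is the far left, +0.5 the far right.
    """
    n = len(frame)
    energy = (frame * window) ** 2
    total = energy.sum()
    if total == 0.0:
        return 0.0  # silent frame: no meaningful center of gravity
    position = (np.arange(n) - (n - 1) / 2) / n
    return float((position * energy).sum() / total)


n = 1024
window = np.hanning(n)
t = np.arange(n)

# Sinusoid that starts abruptly in the last quarter of the window.
sine = np.sin(2 * np.pi * 0.05 * t)
attack = np.where(t >= 3 * n // 4, sine, 0.0)

# Same attack on top of background noise (hypothetical level).
rng = np.random.default_rng(0)
background = 0.3 * rng.standard_normal(n)

coge_from_silence = coge(attack, window)
coge_with_background = coge(attack + background, window)
```

In this sketch the attack from silence gives a COGE well to the right of the window center. Adding background energy spreads the energy over the whole window and pulls the value back toward the center.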
The detection threshold is a means of controlling the sensitivity of the detection algorithm. With a larger threshold, the number of false alarms in noise regions is lower and the detection of transients is more reliable. But the sensitivity of the detector is reduced: attacks with a small level compared to the background energy produce only a small increase in the COGE and are no longer detected.
Onsets are non-transient peaks. To distinguish transient spectral peaks from non-transient signal components, the onset detector needs:
a short detection delay
a high frequency resolution
a precise estimate of the location of the steepest energy rise at the attack.
Soft onsets do not need to be detected, since their duration is equal to or greater than the length of the analysis window.
In the case of transients, the COGE is at the far right side of the analysis window. Peaks in transients are synchronized with one another.
In the case of noise signals, the COGE can also be far from the center of the window, but the peaks are not synchronized with one another.
This synchronization makes it possible to discriminate between noise peaks and transients.
These characteristics make it possible to detect a transient in most cases. Nevertheless, spectral peaks in noise components can have their COGE anywhere, so a peak with a large COGE may either form a transient or be part of a random signal (noise). The use of a statistical model makes it possible to distinguish random peaks, and onsets, from transients.
To distinguish a random peak or an onset from a transient signal, a statistical model is calculated from a number of tested peaks. The spectrum is analysed band by band, using overlapping frequency bands of equal width (statistical bands).
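One possible layout of overlapping, equal-width frequency bands is sketched below. The number of bands, the amount of overlap and the function name `overlapping_bands` are assumptions made for illustration; the text does not specify them.

```python
def overlapping_bands(f_max, n_bands, overlap=0.5):
    """Return (low, high) edges of equal-width bands covering [0, f_max].

    Each band overlaps its neighbour by `overlap` (a fraction of the band
    width). Both `n_bands` and `overlap` are hypothetical parameters.
    """
    # The bands must satisfy: hop * (n_bands - 1) + width = f_max,
    # with hop = width * (1 - overlap).
    width = f_max / (1 + (n_bands - 1) * (1 - overlap))
    hop = width * (1 - overlap)
    return [(i * hop, i * hop + width) for i in range(n_bands)]


# Example: 7 half-overlapping bands up to 8 kHz, each 2 kHz wide.
bands = overlapping_bands(8000.0, 7)
```

With half-overlapping bands, every frequency away from the edges falls into two bands, so a group of synchronized peaks that straddles a band boundary is still seen as a whole by one of the bands.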
For each band, the average probability of a transient peak is estimated: the average number of transient peaks for an attack and the time between transients are defined. The number of transient peaks in the current frames is then compared with the number of peaks in the model.
We have a transient when :
We have a random peak (a noise component) if the number of peaks is not larger than the number in the model.
We have an onset when:
the number of peaks increases in time,
the change of state concerns a single sinusoid in the frequency band
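The per-band comparison with the model can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the COGE threshold value and the example peaks are hypothetical, and the sketch covers only the count comparison for the noise case. It does not implement the transient or onset conditions.

```python
def count_transient_peaks(peak_freqs, peak_coges, band, coge_threshold):
    """Count the peaks in `band` whose COGE reaches the threshold.

    `band` is a (low, high) pair in the same unit as `peak_freqs`;
    COGE values are fractions of the window length.
    """
    low, high = band
    return sum(
        1
        for freq, c in zip(peak_freqs, peak_coges)
        if low <= freq < high and c >= coge_threshold
    )


def exceeds_model(observed_count, model_count):
    """True if more candidate peaks are observed than the model expects.

    If this is False, the peaks are treated as random (noise) peaks.
    """
    return observed_count > model_count


# Hypothetical peaks: frequencies in Hz and COGE as a fraction of the window.
freqs = [100.0, 300.0, 500.0, 1200.0]
coges = [0.20, 0.05, 0.30, 0.40]
observed = count_transient_peaks(freqs, coges, band=(0.0, 1000.0), coge_threshold=0.104)
```

Here two of the three peaks below 1 kHz exceed the COGE threshold. The band would be passed on as a candidate only if the model expects fewer than two such peaks from noise alone.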