Introduction

Energy Histogram and Feature Scoring

Principles

Application

Both methods are designed primarily for detecting the F0 of harmonic instruments and the voice.

F0 estimation

SuperVP can detect an F0 with two different algorithms: feature scoring and energy histogram. In both cases, the fundamental frequency is estimated from the harmonicity of the spectrum and the smoothness of the amplitude envelope.

Harmonicity and Envelope Smoothness

These two criteria are important for the detection of the fundamental frequency.

  • The frequency of each partial, or harmonic, is the product of the fundamental frequency and an integer.

  • The amplitude of a harmonic partial is usually close to the amplitudes of the nearby partials of the same sound. The spectral envelope of a harmonic sound is smooth and varies slowly with the frequency.
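The two criteria above can be illustrated with a minimal Python sketch. The function names and the roughness measure (mean absolute amplitude step between neighbouring partials) are assumptions chosen for illustration; they are not the measures SuperVP computes internally.

```python
def harmonic_frequencies(f0, n_partials):
    """Ideal partial frequencies: integer multiples of the fundamental."""
    return [k * f0 for k in range(1, n_partials + 1)]


def envelope_roughness(amplitudes_db):
    """Mean absolute amplitude step between neighbouring partials, in dB.

    Lower values mean a smoother spectral envelope. Illustrative measure
    only (an assumption of this sketch).
    """
    steps = [abs(b - a) for a, b in zip(amplitudes_db, amplitudes_db[1:])]
    return sum(steps) / len(steps)


# Partials of a 110 Hz fundamental.
partials = harmonic_frequencies(110.0, 5)

# A slowly decaying envelope versus one that jumps between partials.
smooth = envelope_roughness([-6.0, -9.0, -12.0, -15.0, -18.0])
jagged = envelope_roughness([-6.0, -30.0, -8.0, -35.0, -10.0])
```

An F0 candidate whose predicted partials line up with the observed peaks, and whose partial amplitudes give a low roughness, satisfies both criteria.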

Energy Histogram and Feature Scoring Common Parameters

Fundamental Frequency Range

The range, in Hz, within which the fundamental frequency is searched. By default, it spans 50 to 1000 Hz.

Maximum Frequency in Spectrum

A cutoff frequency in Hz for the analysis: spectral peaks located above this frequency are ignored.

Smooth Order

Smooth order controls the median smoothing of the F0 analysis results. Using this median filter amounts to lowering points that are higher than the immediately adjacent points – for instance, because of noise – and increasing the value of points that are lower. This naturally leads to a smoother signal.

SuperVP smooths the F0 values over time in order to avoid local estimation errors. The filter slides a window over n observed F0 values, where n is the smoothing order, that is, the order of the median filter applied to the data.

  • By default, the smoothing order is 3. Its value should always be odd. If the smoothing order is 1, no smoothing is applied.

  • The higher the value, the smoother the resulting F0 curve and the fewer the isolated estimation errors, at the cost of F0 precision.
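The effect of the smoothing order can be sketched with a plain sliding median in Python. The function name `median_smooth` is hypothetical, and the edge handling (shrinking the window at the start and end of the track) is an assumption of this sketch, since the behaviour of SuperVP at the edges is not described here.

```python
from statistics import median


def median_smooth(f0_values, order=3):
    """Sliding median over `order` consecutive F0 values (order must be odd).

    At the edges the window shrinks to the available values; this is an
    assumption of the sketch, not documented SuperVP behaviour.
    """
    if order < 1 or order % 2 == 0:
        raise ValueError("order must be a positive odd integer")
    half = order // 2
    smoothed = []
    for i in range(len(f0_values)):
        window = f0_values[max(0, i - half): i + half + 1]
        smoothed.append(median(window))
    return smoothed


# The 440.0 value is an isolated octave error in an otherwise steady track.
track = [220.0, 221.0, 440.0, 222.0, 223.0]
smoothed = median_smooth(track, order=3)
unchanged = median_smooth(track, order=1)
```

With an order of 3 the isolated outlier is replaced by a neighbouring value, while an order of 1 returns the track unchanged. Larger orders remove longer error bursts but also flatten genuine fast pitch movements.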

Relative Noise Threshold

A noise level: an amplitude difference threshold, in dB, between two peaks. If the amplitude difference exceeds this value, the weaker peak is ignored.

By default, the threshold value is set to 50 dB.
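As a rough illustration, the threshold can be read as a noise floor relative to the strongest peak of a frame: any peak lying more than the threshold below it is treated as noise and dropped. That reading, the function name `drop_weak_peaks`, and the `(frequency_hz, amplitude_db)` pair layout are assumptions of this Python sketch.

```python
def drop_weak_peaks(peaks, threshold_db=50.0):
    """Keep peaks within `threshold_db` of the strongest peak.

    `peaks` is a list of (frequency_hz, amplitude_db) pairs. Comparing
    against the strongest peak of the frame is an assumption of this sketch.
    """
    if not peaks:
        return []
    strongest = max(amp for _, amp in peaks)
    return [(freq, amp) for freq, amp in peaks if strongest - amp <= threshold_db]


peaks = [(220.0, -10.0), (440.0, -25.0), (3150.0, -70.0)]
kept = drop_weak_peaks(peaks)  # the -70 dB peak lies 60 dB below the strongest
```

Lowering the threshold discards more low-level peaks, which helps with noisy recordings but may also remove weak genuine partials.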

Feature Scoring Expert Settings

Use

The feature scoring analysis has an additional control, found in the expert settings. In some cases, as with clarinet sounds for instance, the envelope is not very smooth. The weight of the envelope smoothness then needs to be balanced against the spectral match of the peaks, depending on the instrument's characteristics. The energy histogram relies on the same criteria internally, but the user cannot weight them. The feature scoring analysis is a good method for estimating the F0 of harmonic instrumental sounds.

Envelope Weight and Spectral Match Weight

Weighting the envelope smoothness is an alternative way to favor the correct F0 estimate, by minimizing the centroid (center of gravity) of the sequence of partial amplitudes.

  • The envelope weight (EW) gives the impact of the smoothness measure. By default, the weight is set to 0.14.

  • The spectral match weight (SMW) gives the impact of the energy match measure. By default, the weight is set to 0.26.

  • 1 - (SMW + EW) = impact of the harmonic centroid measure.

The presets delivered with AS set these weights differently for the various types of instruments.

The sum of the two weights should not exceed 1. If it does, the values are corrected automatically in the dialogue window.
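The constraint on the two weights can be sketched as follows in Python. How the dialogue window actually corrects the values is not specified, so the proportional rescaling used here is purely an assumption, as are the function name `normalize_weights` and the returned remainder (the share left over once both weights are accounted for).

```python
def normalize_weights(envelope_weight=0.14, spectral_match_weight=0.26):
    """Return (EW, SMW, remainder) with EW + SMW capped at 1.

    Proportional rescaling is an assumption of this sketch; the actual
    correction rule used by the dialogue window is not documented here.
    """
    if envelope_weight < 0 or spectral_match_weight < 0:
        raise ValueError("weights must be non-negative")
    total = envelope_weight + spectral_match_weight
    if total > 1.0:
        # Scale both weights down so that they sum to exactly 1.
        envelope_weight /= total
        spectral_match_weight /= total
    remainder = max(0.0, 1.0 - envelope_weight - spectral_match_weight)
    return envelope_weight, spectral_match_weight, remainder


defaults = normalize_weights()            # default weights already sum to less than 1
corrected = normalize_weights(0.8, 0.8)   # sum of 1.6 is rescaled
```

With the default weights nothing is changed and a sizeable share remains. When the two weights sum to more than 1 they are scaled back and no share remains.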
