Time-scale-modification using the phase vocoder
Diploma thesis (Acrobat Reader), sound files
The phase vocoder has been used as a time-scale-modification tool for several
decades.
When large positive modification factors are applied to different kinds of
sounds (time-stretching), the result always sounds "phasy" or "reverberant".
The sound quality can be improved by "locking" the phases. Phase-locking
preserves the phase relations around a local maximum in the magnitude spectrum.
For large modification factors, locking the entire phase spectrum sounds "rigid".
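
As an illustration of what "locking" means, here is a minimal sketch of one common scheme, often called identity phase locking. It is not necessarily the variant used in this work. The peak definition (a bin larger than both neighbours), the nearest-peak assignment, and the names `find_peaks`, `lock_phases` and `peak_synth_phase` are assumptions made for the example.

```python
def find_peaks(mag):
    """Indices of bins strictly larger than both neighbours (simple local maxima)."""
    return [k for k in range(1, len(mag) - 1)
            if mag[k] > mag[k - 1] and mag[k] > mag[k + 1]]


def lock_phases(mag, analysis_phase, peak_synth_phase):
    """Lock every bin to its nearest spectral peak.

    Each bin receives the synthesis phase chosen for its peak plus the
    original (analysis) phase difference between the bin and that peak,
    so the phase relations around each local maximum are preserved.

    peak_synth_phase: dict mapping peak index -> synthesis phase (assumed
    to come from the usual phase-vocoder phase propagation).
    """
    peaks = find_peaks(mag)
    if not peaks:
        # Assumption: with no peak there is nothing to lock to.
        return list(analysis_phase)
    locked = []
    for k in range(len(mag)):
        p = min(peaks, key=lambda q: abs(q - k))  # nearest peak, ties go to the lower bin
        locked.append(peak_synth_phase[p] + analysis_phase[k] - analysis_phase[p])
    return locked
```

Because every bin is tied to some peak, this sketch locks the entire phase spectrum, which is the case described above as sounding "rigid" for large modification factors.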
In this work, the deterministic and stochastic components of a sound are
separated in the frequency domain, and only the phases of sinusoids are locked,
while the remaining phases are set to random numbers. The deterministic part
is detected within a "reduced variance" magnitude spectrum using heuristic
conditions.
The reduced variance spectrum is calculated as a weighted combination of the
current magnitude spectrum, the spectrum of the frame about 11 ms ahead, and
the reduced variance spectrum from 11 ms before the current one. This reduces
the variance of the spectrum used for peak detection and yields better results
for noise-like signals.
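
The weighting can be sketched as a simple recursion over frames. The weights below (0.5, 0.25, 0.25), the start-up and end-of-signal handling, and the function name are assumptions for illustration; the abstract does not state the actual values. Frames are assumed to be spaced about 11 ms apart.

```python
def reduced_variance_spectra(frames, w_cur=0.5, w_next=0.25, w_prev=0.25):
    """Smooth a sequence of magnitude spectra over time.

    frames: list of magnitude spectra (lists of floats), one per analysis
            frame, assumed to be about 11 ms apart.
    For each frame the result combines the current spectrum, the next
    (look-ahead) spectrum, and the previous *reduced variance* spectrum.
    """
    out = []
    prev = frames[0]  # assumption: start the recursion from the first spectrum
    for t, cur in enumerate(frames):
        # Assumption: at the last frame, reuse the current spectrum as look-ahead.
        nxt = frames[t + 1] if t + 1 < len(frames) else cur
        reduced = [w_cur * c + w_next * n + w_prev * p
                   for c, n, p in zip(cur, nxt, prev)]
        out.append(reduced)
        prev = reduced
    return out
```

With weights that sum to one, a stationary spectrum passes through unchanged, while frame-to-frame fluctuations of a noise-like signal are damped before peak detection.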
For resynthesis, the deterministic magnitudes are combined with the smoothed
stochastic magnitudes, and the locked phases are combined with the random
phases.
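
A minimal sketch of this combination for a single frame follows. It assumes that a boolean mask `is_sinusoid` marks the bins detected as deterministic and that the stochastic magnitudes have already been smoothed. All names are hypothetical, and the uniform phase distribution on [-pi, pi] is an assumption.

```python
import cmath
import math
import random


def combine_spectrum(det_mag, stoch_mag_smoothed, locked_phase, is_sinusoid,
                     rng=random):
    """Build one complex synthesis spectrum.

    Sinusoidal bins keep their deterministic magnitude and locked phase;
    all other bins take the smoothed stochastic magnitude and a random phase.
    """
    spectrum = []
    for k in range(len(det_mag)):
        if is_sinusoid[k]:
            spectrum.append(cmath.rect(det_mag[k], locked_phase[k]))
        else:
            random_phase = rng.uniform(-math.pi, math.pi)
            spectrum.append(cmath.rect(stoch_mag_smoothed[k], random_phase))
    return spectrum
```

Passing a seeded `random.Random` instance as `rng` makes the random phases reproducible, which helps when comparing sound examples.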