Mixing Perceptual Coded Audio Streams
Diploma thesis (999 KB PDF)
Perceptual audio coders based on the Modified Discrete Cosine Transform (MDCT) that use psychoacoustically motivated noise shaping to remove irrelevant signal components are widely used today. After an overview of the algorithms behind perceptual audio coders and of current coding standards, this thesis explores ways of mixing two MDCT-based audio streams without fully decoding them back into the time domain. To this end, methods are developed for adjusting the block lengths of the streams and for mixing them within the MDCT domain. The resulting scheme is evaluated in terms of latency, computational complexity, and its effects on psychoacoustic processing. As an example application, a simple mixer that combines two Ogg Vorbis files with fixed window lengths is implemented in MATLAB.
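The property that makes mixing in the MDCT domain possible at all is that the MDCT is a linear transform: if two blocks share the same length and the same window, adding their coefficients gives the coefficients of the summed signal. The following is a minimal Python sketch of that property only, not the scheme or the MATLAB mixer developed in the thesis. The function names (`sine_window`, `mdct`, `mix_mdct`), the sine window, the block size, and the test signals are all assumptions chosen for illustration. Quantization, psychoacoustic modelling, and block-length adjustment are deliberately left out.

```python
import math


def sine_window(length):
    """Sine window of the given (even) length; an illustrative choice."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(block, window):
    """Direct O(N^2) MDCT of one windowed block of 2N samples -> N coefficients."""
    two_n = len(block)
    if two_n % 2 != 0 or len(window) != two_n:
        raise ValueError("block and window must share the same even length")
    n = two_n // 2
    phase = 0.5 + n / 2.0  # standard MDCT time offset
    return [
        sum(
            window[i] * block[i]
            * math.cos(math.pi / n * (i + phase) * (k + 0.5))
            for i in range(two_n)
        )
        for k in range(n)
    ]


def mix_mdct(coeffs_a, coeffs_b, gain_a=1.0, gain_b=1.0):
    """Mix two coefficient blocks of equal length by weighted addition."""
    if len(coeffs_a) != len(coeffs_b):
        raise ValueError("blocks must have the same number of coefficients")
    return [gain_a * a + gain_b * b for a, b in zip(coeffs_a, coeffs_b)]


# Two hypothetical 16-sample test blocks (N = 8 coefficients each).
BLOCK = 16
a = [math.sin(2 * math.pi * 3 * n / BLOCK) for n in range(BLOCK)]
b = [0.5 * math.cos(2 * math.pi * 5 * n / BLOCK) for n in range(BLOCK)]
w = sine_window(BLOCK)

# Mixing in the MDCT domain ...
mixed = mix_mdct(mdct(a, w), mdct(b, w))
# ... versus transforming the time-domain sum.
direct = mdct([x + y for x, y in zip(a, b)], w)

# Largest difference between the two results; only rounding error remains.
max_err = max(abs(m - d) for m, d in zip(mixed, direct))
```

The equality holds only while both streams use identical block lengths and windows, which is why the abstract treats block-length adjustment as a separate step. In a real coder the coefficients would also have to be dequantized before mixing and requantized afterwards, and the sketch does not model that.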