Vocal melody transcription from popular music recordings
Diploma thesis (PDF, 5,043 KB)
Extracting the vocal melody from polyphonic music recordings is a nontrivial research problem. Knowledge of the sound source and of the mixing techniques common in popular music can help to formulate assumptions that simplify the overall problem.
This work focuses on the design, development and implementation of procedures that facilitate the detection, transcription and possibly the removal of the vocal melody from polyphonic popular music recordings. Vocals in music, and voiced speech in general, exhibit a harmonic structure (formants) that differs from the structure musical instruments tend to produce (strong fundamental, smooth roll-off). This knowledge should help to discriminate between concurrent voice and instrument sounds within a single analysis frame. In music recordings, “panning” is used to increase the clarity and discriminability of the instruments in a mix. This spatial information may be valuable for detecting the vocal melody, since it is common to “place” the main voice in the center, with no panning at all.
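The panning cue can be illustrated with a minimal sketch. It is not the procedure developed in the thesis. It assumes a single stereo analysis frame and compares the left and right magnitude spectra bin by bin: a source mixed to the center has roughly equal level in both channels, so its similarity value is close to 1. The function names `dft`, `panning_similarity` and `center_bins`, the similarity measure itself, and the values of `threshold` and `eps` are all assumptions made for illustration.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) discrete Fourier transform; returns bins 0..N/2."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def panning_similarity(left, right, eps=1e-9):
    """Per-bin similarity in [0, 1] between the two channel magnitudes.

    A value near 1 means equal level in both channels (center-panned);
    lower values mean the bin is dominated by one channel. The small
    constant eps (an assumed value) pushes near-silent bins towards 0.
    """
    spec_l, spec_r = dft(left), dft(right)
    return [
        2 * abs(l) * abs(r) / (abs(l) ** 2 + abs(r) ** 2 + eps)
        for l, r in zip(spec_l, spec_r)
    ]


def center_bins(left, right, threshold=0.95):
    """Indices of frequency bins that look center-panned (assumed threshold)."""
    sim = panning_similarity(left, right)
    return [k for k, s in enumerate(sim) if s >= threshold]


# Synthetic frame: a "voice" partial in bin 4 mixed equally to both channels,
# and an "instrument" partial in bin 10 panned towards the left channel.
N = 64
voice = [math.cos(2 * math.pi * 4 * t / N) for t in range(N)]
instrument = [math.cos(2 * math.pi * 10 * t / N) for t in range(N)]
left = [v + 1.0 * i for v, i in zip(voice, instrument)]
right = [v + 0.2 * i for v, i in zip(voice, instrument)]

print(center_bins(left, right))
```

In this synthetic frame only the bin holding the equally mixed partial passes the threshold. Real recordings would need windowing, overlapping frames and an FFT, and center-panned instruments such as bass or snare drum would pass the same test, so a cue like this can only narrow down candidates for the vocal melody.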