Detection of singing voice segments and singer identification
The detection of singing voice segments within music signals is an important research topic in the field of music information retrieval, since it serves as an essential preprocessing stage for applications such as singer identification, lyrics recognition, and singing melody extraction.
The objective of this thesis is the implementation and evaluation of a pattern recognition system capable of detecting singing voice in music signals. For this purpose, a support vector machine classifier is used in conjunction with MFCC features, their delta features, and long-term features derived from both. Furthermore, an energy-based feature is proposed. Feature subset selection was carried out by means of linear discriminant analysis, using sequential forward selection and sequential backward elimination as search strategies over the subset space. The resulting subsets were evaluated in combination with the classifier using 10-fold cross-validation, yielding a mean accuracy of 75.6 % with a standard deviation of 2.5 % for the best subset.
Finally, the system was tested on a database provided by Mathieu Ramona, on which it obtained a mean accuracy of 69.7 %.
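The evaluation protocol summarized above (10-fold cross-validation reported as a mean accuracy and a standard deviation) can be illustrated with a minimal sketch. It is not the implementation used in the thesis: the data are synthetic, the one-dimensional threshold classifier is only a stand-in for the support vector machine, and all function names (`k_fold_indices`, `cross_validate`, `train_threshold`, `predict_threshold`) are hypothetical. The use of the sample standard deviation is also an assumption, since the text does not specify which estimator was used.

```python
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(features, labels, train_fn, predict_fn, k=10, seed=0):
    """Return one accuracy per fold; each fold is held out exactly once."""
    accuracies = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held_out_set = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held_out_set]
        model = train_fn([features[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, features[i]) == labels[i]
                      for i in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies


# Stand-in classifier (assumption, NOT the SVM from the thesis):
# threshold a single feature at the midpoint between the two class means.
def train_threshold(xs, ys):
    vocal = [x for x, y in zip(xs, ys) if y == 1]
    non_vocal = [x for x, y in zip(xs, ys) if y == 0]
    return (statistics.mean(vocal) + statistics.mean(non_vocal)) / 2


def predict_threshold(threshold, x):
    return 1 if x > threshold else 0


# Synthetic one-dimensional "feature" per frame: label 1 = vocal, 0 = non-vocal.
rng = random.Random(1)
labels = [i % 2 for i in range(200)]
features = [rng.gauss(1.0 if y == 1 else -1.0, 1.0) for y in labels]

accuracies = cross_validate(features, labels, train_threshold, predict_threshold)
mean_acc = statistics.mean(accuracies)
std_acc = statistics.stdev(accuracies)  # sample standard deviation (assumption)
print(f"mean accuracy: {mean_acc:.3f}, standard deviation: {std_acc:.3f}")
```

Passing the classifier as a pair of callables keeps the fold logic independent of the model, so an actual SVM could be substituted without changing how the mean and standard deviation are computed.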