Subject: Linguistic Information Processing
Book title: Proceedings of Interspeech 2009
Abstract:
In this paper, we describe emotion recognition experiments carried out for spontaneous affective speech with the aim to compare the added value of annotation of felt emotion versus annotation of perceived emotion. Using speech material available in the TNO-GAMING corpus (a corpus containing audio-visual recordings of people playing videogames), speech-based affect recognizers were developed that can predict Arousal and Valence scalar values. Two types of recognizers were developed in parallel: one trained with felt emotion annotations (generated by the gamers themselves) and one trained with perceived/observed emotion annotations (generated by a group of observers). The experiments showed that, in speech, with the methods and features currently used, observed emotions are easier to predict than felt emotions. The results suggest that recognition performance strongly depends on how and by whom the emotion annotations are carried out.