Tendencies Regarding The Effect Of Emotional Intensity In Inter Corpus Phoneme-Level Speech Emotion Modelling
Bogdan Vlasenko, University of Passau, Chair of Complex & Intelligent Systems
Björn Schuller, University of Passau, Chair of Complex & Intelligent Systems
Andreas Wendemuth, Otto von Guericke University Magdeburg

Emotion recognition from speech has matured to a degree where it is becoming suitable for real-life applications, so it is time to develop techniques for matching different types of emotional data that carry multi-dimensional and category-based annotations. The categorical approach is usually applied to acted ‘full-blown’ emotions, whereas multi-dimensional annotation is often preferred for spontaneous real-life emotions. A particularly realistic task, which we consider in this contribution, is cross-corpus emotion recognition and its evaluation. Our experimental study uses general and phoneme-level emotional models on acted and spontaneous emotions (‘very intense’ and ‘intense’). The emotional models were trained on spontaneous emotions from the complete VAM dataset and from subsets with varying emotional intensity, and were evaluated on acted emotions from the Berlin EMO-DB dataset. We observe a significant classification performance gap for general models trained on very intense spontaneous emotions. Consequently, we stress the importance of collecting large corpora with very intense emotional content for training more reliable phoneme-level emotional models.