Title: Context-Aware Attention Mechanism for Speech Emotion Recognition
Authors: Ramet, Gaetan; Garner, Philip N.; Baeriswyl, Michael; Lazaridis, Alexandros
Year: 2018 (record added 2019-02-06)
DOI: 10.1109/SLT.2018.8639633
URL: https://infoscience.epfl.ch/handle/20.500.14299/154378
Web of Science: WOS:000463141800019
Type: conference paper

Abstract: In this work, we study the use of attention mechanisms to enhance the performance of state-of-the-art deep learning models in Speech Emotion Recognition (SER). We introduce a new Long Short-Term Memory (LSTM)-based neural network attention model that takes the temporal information in speech into account when computing the attention vector. The proposed LSTM-based model is evaluated on the IEMOCAP dataset using a 5-fold cross-validation scheme and achieves 68.8% weighted accuracy on 4 classes, outperforming state-of-the-art models.

Keywords: speech emotion recognition; attention; deep learning; neural network
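The abstract describes attention pooling over frame-level speech features. As a rough illustration only (not the authors' model, which uses an LSTM rather than a static projection to produce time-aware scores), a minimal NumPy sketch of attention pooling might look like:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(hidden_states, w):
    # hidden_states: (T, d) frame-level outputs of a recurrent encoder.
    # w: (d,) scoring vector -- a hypothetical stand-in for the paper's
    # LSTM-based attention network, used here only to show the pooling step.
    scores = hidden_states @ w           # (T,) one relevance score per frame
    alpha = softmax(scores)              # attention weights, sum to 1
    return alpha @ hidden_states, alpha  # (d,) utterance-level embedding

rng = np.random.default_rng(0)
H = rng.standard_normal((50, 8))  # e.g. 50 frames of 8-dim features
w = rng.standard_normal(8)
embedding, alpha = attention_pool(H, w)
```

The utterance-level embedding would then feed a classifier over the 4 emotion classes; all dimensions above are illustrative.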