Technical Program

Paper Detail

Paper IDF-1-1.6
Paper Title DEEP MULTILAYER PERCEPTRONS FOR DIMENSIONAL SPEECH EMOTION RECOGNITION
Authors Bagus Tris Atmaja, JAIST, Japan; Masato Akagi, Japan Advanced Institute of Science and Technology, Japan
Session F-1-1: Emotion, Dialect, and Age Recognition
Time Tuesday, 08 December, 12:30 - 14:00
Presentation Time: Tuesday, 08 December, 13:45 - 14:00
All times are in New Zealand Time (UTC +13)
Topic Speech, Language, and Audio (SLA):
Abstract Modern deep learning architectures are ordinarily run on high-performance computing facilities because of the large size of their input features and the complexity of their models. This paper proposes traditional multilayer perceptrons (MLP) with deep layers and small input sizes to reduce this computational requirement. The proposed deep MLP is compared with more modern deep learning architectures, i.e., LSTM and CNN, using the same number of layers, batch size, and optimizer. The results show that the proposed deep MLP outperformed LSTM and CNN with the same number of layers and the same parameter values. Both the proposed and the benchmark methods were optimized in the same way. The deep MLP exhibited the highest performance in both speaker-dependent and speaker-independent scenarios on the IEMOCAP and MSP-IMPROV datasets.
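
As a rough illustration of the kind of model the abstract describes, the following is a minimal NumPy sketch of a deep MLP that maps a small utterance-level feature vector to continuous emotion scores. It is not the authors' implementation. The input size (46), the hidden layer widths, the ReLU activation, and the three outputs are all assumptions chosen for illustration; the abstract does not state them.

```python
import numpy as np


def init_mlp(layer_sizes, seed=0):
    """Create (weight, bias) pairs for a fully connected network."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        # He-style initialization, a common choice for ReLU layers.
        w = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        b = np.zeros(n_out)
        params.append((w, b))
    return params


def mlp_forward(params, x):
    """Forward pass: ReLU on hidden layers, linear output for regression."""
    h = x
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)
    w, b = params[-1]
    return h @ w + b


def count_parameters(params):
    """Total number of trainable weights and biases."""
    return sum(w.size + b.size for w, b in params)


# Hypothetical sizes: 46 input features, five hidden layers,
# and 3 continuous outputs (one per assumed emotion dimension).
layer_sizes = [46, 256, 128, 64, 32, 16, 3]
params = init_mlp(layer_sizes)

batch = np.random.default_rng(1).standard_normal((8, 46))
preds = mlp_forward(params, batch)
print(preds.shape)               # one row of scores per utterance
print(count_parameters(params))  # size of the model in parameters
```

With a small input vector, the parameter count stays in the tens of thousands, which is the sense in which such a model can avoid heavy computing facilities. Training (loss function, optimizer, batch size) is omitted here because the abstract does not specify it.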