Paper ID | D-1-1.5
Paper Title | 3D SKELETAL MOVEMENT ENHANCED EMOTION RECOGNITION NETWORK
Authors | Jiaqi Shi, Osaka University, Japan; Chaoran Liu, Carlos Toshinori Ishi, Advanced Telecommunications Research Institute International, Japan; Hiroshi Ishiguro, Osaka University, Japan
Session | D-1-1: Image/Video Recognition
Time | Tuesday, 08 December, 12:30 - 14:00
Presentation Time | Tuesday, 08 December, 13:30 - 13:45
All times are in New Zealand Time (UTC +13) |
Topic | Image, Video, and Multimedia (IVM)
Abstract |
Automatic emotion recognition has become an important trend in natural human-computer interaction and artificial intelligence. Although gesture is one of the most important components of nonverbal communication and has a considerable impact on emotion recognition, motion modalities are rarely considered in affective computing research. An important reason is the lack of large open emotion databases containing skeletal movement data. In this paper, we extract 3D skeleton information from video and apply the method to the IEMOCAP database to add a new modality. We propose an attention-based convolutional neural network that takes the extracted data as input to predict the speaker's emotional state. We also combine our model with models using other modalities, which provide complementary information in the emotion classification task. The combined model utilizes audio signals, text information, and skeletal data simultaneously, and it significantly outperforms the bimodal model, demonstrating the effectiveness of the method.
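The abstract only sketches the architecture at a high level (a convolutional network over extracted 3D skeleton sequences, with attention pooling over time). A minimal NumPy illustration of that core idea is below; every size, weight, and layer choice here is a hypothetical stand-in, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: T frames, J joints with (x, y, z) each, C emotion classes.
T, J, C = 32, 17, 4
D = J * 3
x = rng.normal(size=(T, D))  # one skeletal sequence: frames x joint coordinates

# 1D temporal convolution (kernel width 3, 'same' padding) -> per-frame features.
K, F = 3, 16
W_conv = rng.normal(scale=0.1, size=(K, D, F))
padded = np.pad(x, ((1, 1), (0, 0)))
feat = np.stack([np.tensordot(padded[t:t + K], W_conv, axes=([0, 1], [0, 1]))
                 for t in range(T)])          # (T, F)
feat = np.maximum(feat, 0.0)                  # ReLU

# Temporal attention: score each frame, softmax, weighted pooling over frames.
w_att = rng.normal(scale=0.1, size=(F,))
scores = feat @ w_att                         # (T,)
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()                          # attention weights, sum to 1
pooled = alpha @ feat                         # (F,) sequence-level representation

# Linear classifier over emotion classes with a softmax output.
W_cls = rng.normal(scale=0.1, size=(F, C))
logits = pooled @ W_cls
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(probs.shape)
```

In a trained model the convolution and attention weights would be learned jointly, and the pooled skeletal representation could be concatenated with audio and text features for the multimodal fusion the abstract describes.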