MMSP-L4: Pose, Gesture, and Action in Multimedia
Thu, 18 Apr, 08:20 - 10:20 (UTC +9)
Location: Room 205A
Session Type: Lecture
Session Co-Chairs: Naveen Kumar, Disney Research, US; Jianquan Liu, Visual Intelligence Research Laboratories, NEC Corporation, Japan
Track: Multimedia Signal Processing
Thu, 18 Apr, 08:20 - 08:40 (UTC +9)
 

MMSP-L4.1: MOMA: MIXTURE-OF-MODALITY-ADAPTATIONS FOR TRANSFERRING KNOWLEDGE FROM IMAGE MODELS TOWARDS EFFICIENT AUDIO-VISUAL ACTION RECOGNITION

Kai Wang, Dimitrios Hatzinakos, University of Toronto, Canada
Thu, 18 Apr, 08:40 - 09:00 (UTC +9)
 

MMSP-L4.2: MUSIC-TO-DANCE POSES: LEARNING TO RETRIEVE DANCE POSES FROM MUSIC

Bo-Wei Tseng, Kenneth Yang, Yu-Hua Hu, Wen-Li Wei, Jen-Chun Lin, Academia Sinica, Taiwan
Thu, 18 Apr, 09:00 - 09:20 (UTC +9)
 

MMSP-L4.3: GESTURE GENERATION VIA DIFFUSION MODEL WITH ATTENTION MECHANISM

Lingling Li, Weicong Li, Qiyuan Ding, Sun Yat-sen University, China; Chengpei Tang, Keze Wang, Sun Yat-sen University, China
Thu, 18 Apr, 09:20 - 09:40 (UTC +9)
 

MMSP-L4.4: Video-language Graph Convolutional Network for Human Action Recognition

Rui Zhang, Xiaoran Yan, Zhejiang Lab, China
Thu, 18 Apr, 09:40 - 10:00 (UTC +9)
 

MMSP-L4.5: Exploring Latent Cross-Channel Embedding for Accurate 3D Human Pose Reconstruction in a Diffusion Framework

Junkun Jiang, Jie Chen, Hong Kong Baptist University, Hong Kong
Thu, 18 Apr, 10:00 - 10:20 (UTC +9)
 

MMSP-L4.6: UNIFIED SPEECH AND GESTURE SYNTHESIS USING FLOW MATCHING

Shivam Mehta, Ruibo Tu, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter, KTH Royal Institute of Technology, Sweden