MMSP-P25.4
PERFORMSINGER: MULTIMODAL SINGING VOICE SYNTHESIS LEVERAGING SYNCHRONIZED LIP CUES FROM SINGING PERFORMANCE VIDEOS
Ke Gu, Zhicong Wu, Peng Bai, Xiamen University, China; Sitong Qiao, Zhiqi Jiang, University of Science and Technology Beijing, China; Junchen Lu, National University of Singapore, Singapore; Xiaodong Shi, Xiamen University, China; Xinyuan Qian, University of Science and Technology Beijing, China
Session: MMSP-P25: Talking Head Generation and Facial Animation Poster
Track: Multimedia Signal Processing [MM]
Location: Poster Area 21
Presentation Time: Fri, 8 May, 09:00 - 11:00
Session MMSP-P25
MMSP-P25.1: DepthTalk: Few-Shot Talking Head Generation with Depth-Aware 3D Gaussian Field Motion
Shucheng Ji, Junqing Huang, Yang Lian, Xiaochen Yuan, Macao Polytechnic University, China
MMSP-P25.2: Benchmarking Emotional Accuracy and Identity Consistency in Facial Image-to-Video Generation
Songchao Tan, University of Science and Technology Beijing, China; Ruiqi Li, Peking University, China; Hanwei Zhu, Nanyang Technological University, Singapore; Shiqi Wang, City University of Hong Kong, China; Huimin Ma, University of Science and Technology Beijing, China; Siwei Ma, Peking University, China
MMSP-P25.3: KSDIFF: KEYFRAME-AUGMENTED SPEECH-AWARE DUAL-PATH DIFFUSION FOR FACIAL ANIMATION
Tianle Lyu, Junchuan Zhao, Ye Wang, National University of Singapore, Singapore
MMSP-P25.4: PERFORMSINGER: MULTIMODAL SINGING VOICE SYNTHESIS LEVERAGING SYNCHRONIZED LIP CUES FROM SINGING PERFORMANCE VIDEOS
Ke Gu, Zhicong Wu, Peng Bai, Xiamen University, China; Sitong Qiao, Zhiqi Jiang, University of Science and Technology Beijing, China; Junchen Lu, National University of Singapore, Singapore; Xiaodong Shi, Xiamen University, China; Xinyuan Qian, University of Science and Technology Beijing, China
MMSP-P25.5: VT-Heads: Voice Cloning and Talking Head Generation From Text Based on V-DiT
Yali Cai, Peng Qiao, Dongsheng Li, National University of Defense Technology, China
MMSP-P25.6: VividTalker: A Modular Framework for Expressive 3D Talking Avatars with Controllable Gaze and Blink
Hangyu Xiong, Technical University of Denmark (DTU), Denmark; Jinyi Zhang, University of California, Los Angeles, United States of America; Zheng Wang, Tsinghua University, China; Tianlun Pan, Xi’an Jiaotong-Liverpool University, China; Qingzheng Hu, INTI International University, Malaysia
MMSP-P25.7: MULTIMODAL TRANSFORMER WITH MULTIPERSPECTIVE TRAINING FOR PREDICTING SELF-EXPRESSION SKILLS FROM VIDEO INTERVIEW
Ryo Masumura, Shota Orihashi, Mana Ihori, Tomohiro Tanaka, Naoki Makishima, Suzuka Yamada, Taiga Yamane, Naotaka Kawata, Satoshi Suzuki, NTT, Inc., Japan
MMSP-P25.8: UNCERTAINTY-AWARE 3D EMOTIONAL TALKING FACE SYNTHESIS WITH EMOTION PRIOR DISTILLATION
Nanhan Shen, Zhilei Liu, Tianjin University, China
MMSP-P25.9: RECOM: REALISTIC CO-SPEECH MOTION GENERATION WITH RECURRENT EMBEDDED TRANSFORMER
Yong Xie, Yunlian Sun, Nanjing University of Science and Technology, China; Hongwen Zhang, Beijing Normal University, China; Yebin Liu, Tsinghua University, China; Jinhui Tang, Nanjing Forestry University, China
MMSP-P25.10: DIFFEMOTALK: AUDIO-DRIVEN FACIAL ANIMATION WITH FINE-GRAINED EMOTION CONTROL VIA DIFFUSION MODELS
Kexin Gao, Yuyu Zhu, Jian Liu, Xin Jie Wang, Ocean University of China, China; Xiaogang Jin, Zhejiang University, China; Jie Nie, Ocean University of China, China