MLSP-L30: Diffusion Models for Audio and Video Generation
Fri, 8 May, 09:00 - 11:00
Location: Room 117
Session Type: Oral
Track: Machine Learning for Signal Processing [ML]
Fri, 8 May, 09:00 - 09:20

MLSP-L30.1: FC-VFI: FAITHFUL AND CONSISTENT VIDEO FRAME INTERPOLATION FOR HIGH-FPS SLOW MOTION VIDEO GENERATION

Ganggui Ding, Hao Chen, Zhejiang University, China; Xiaogang Xu, Chinese University of Hong Kong, China
Fri, 8 May, 09:20 - 09:40

MLSP-L30.2: PLANPERCEIVER: A UNIFIED FRAMEWORK FOR MULTI-LEVEL SCENE INFORMATION FUSION IN AUTONOMOUS DRIVING PLANNING

Yuxuan Wu, Harbin Institute of Technology, China; Guo Yang, Chengcheng Tang, Chongqing Changan Automobile Co., Ltd., China; Qiuju Gao, China Software Testing Center, China; Ping Wu, Chongqing Changan Automobile Co., Ltd., China; Jianxun Cui, Harbin Institute of Technology, China
Fri, 8 May, 09:40 - 10:00

MLSP-L30.3: RAP: Real-time Audio-driven Portrait Animation with Video Diffusion Transformer

Fangyu Du, Xi'an Jiaotong University, China; Taiqing Li, Dalian University of Technology, China; Qian Qiao, Tan Yu, Dingcheng Zhen, Ziwei Zhang, Soul AI, China; Xu Jia, Dalian University of Technology, China; Yang Yang, Xi'an Jiaotong University, China; Shunshun Yin, Siyuan Liu, Soul AI, China
Fri, 8 May, 10:00 - 10:20

MLSP-L30.4: VIRTUAL CONSISTENCY FOR AUDIO EDITING

Matthieu Cervera, Independent Researcher, Canada; Francesco Paissan, Laval University, Mila-Quebec AI Institute, Canada; Mirco Ravanelli, Concordia University, Mila-Quebec AI Institute, University of Montreal, Canada; Cem Subakan, Laval University, Mila-Quebec AI Institute, Concordia University, Canada
Fri, 8 May, 10:20 - 10:40

MLSP-L30.5: TRAINING-FREE FRAMEWORK FOR DEFENDING UNSAFE IMAGE SYNTHESIS ATTACK

Junha Park, Yonsei University, Korea, Republic of; Jaehui Hwang, Naver AI Lab, Korea, Republic of; Ian Ryu, Hyungkeun Park, Jiyoon Kim, Jong-Seok Lee, Yonsei University, Korea, Republic of
Fri, 8 May, 10:40 - 11:00

MLSP-L30.6: SIGN-SALD: A SKELETON-AWARE LATENT DIFFUSION MODEL FOR TEXT-DRIVEN SIGN LANGUAGE PRODUCTION

Jiayu Shen, Kalin Stefanov, Lay-Ki Soon, Vee Yee Chong, KokSheik Wong, Monash University, Malaysia