SLP-P47.1
VISUAL-INFORMED SPEECH ENHANCEMENT USING ATTENTION-BASED BEAMFORMING
Chihyun Liu, Jiaxuan Fan, Mingtung Sun, Michael Anthony, Mingsian R. Bai, National Tsing Hua University, Taiwan; Yu Tsao, Academia Sinica, Taiwan
Session: SLP-P47: Multimodal and Multichannel Approaches to Speech Enhancement Poster
Track: Speech and Language Processing [SL]
Location: Poster Area 27
Presentation Time: Fri, 8 May, 09:00 - 11:00
Session SLP-P47
SLP-P47.1: VISUAL-INFORMED SPEECH ENHANCEMENT USING ATTENTION-BASED BEAMFORMING
Chihyun Liu, Jiaxuan Fan, Mingtung Sun, Michael Anthony, Mingsian R. Bai, National Tsing Hua University, Taiwan; Yu Tsao, Academia Sinica, Taiwan
SLP-P47.2: Audiovisual Speech Enhancement and Voice Activity Detection Using Generative and Speech Recognition Features
Cheng Yu, Vahid Ahmadi Kalkhorani, The Ohio State University, United States of America; Anurag Kumar, Ke Tan, Buye Xu, Meta, United States of America; DeLiang Wang, The Chinese University of Hong Kong, China
SLP-P47.3: Tracking Listener Attention: Gaze-Guided Audio-Visual Speech Enhancement Framework
Hsiang-Cheng Yang, You-Jin Li, Rong Chao, National Taiwan University, Taiwan; Yu Tsao, Academia Sinica, Taiwan; Borching Su, Shao-Yi Chien, National Taiwan University, Taiwan
SLP-P47.4: BEYOND LIPS: INTEGRATING GESTURE AND LIP CUES FOR ROBUST AUDIO-VISUAL SPEAKER EXTRACTION
Zexu Pan, Alibaba Group, Singapore; Xinyuan Qian, University of Science and Technology Beijing, China; Shengkui Zhao, Kun Zhou, Bin Ma, Alibaba Group, Singapore
SLP-P47.5: PAS-SE: PERSONALIZED AUXILIARY-SENSOR SPEECH ENHANCEMENT FOR VOICE PICKUP IN HEARABLES
Mattes Ohlenbusch, Fraunhofer IDMT, Germany; Mikolaj Kegler, Marko Stamenovic, Bose Corporation, United States of America
SLP-P47.6: EGGCodec: A Robust Neural Encodec Framework for EGG Reconstruction and F0 Extraction
Rui Feng, Yuang Chen, Yu Hu, Jiahong Yuan, University of Science and Technology of China, China
SLP-P47.7: ARRAYDPS-REFINE: GENERATIVE REFINEMENT OF DISCRIMINATIVE MULTI-CHANNEL SPEECH ENHANCEMENT
Zhongweiyang Xu, University of Illinois Urbana-Champaign, United States of America; Ashutosh Pandey, Juan Azcarreta, Zhaoheng Ni, Sanjeel Parekh, Buye Xu, Meta, United States of America
SLP-P47.8: PTSE-T: PRESENTATION TARGET SPEAKER EXTRACTION USING UNALIGNED TEXT CUES
Ziyang Jiang, Jiahe Lei, Xueyan Chen, Yifan Zhang, University of Science and Technology Beijing, China; Zexu Pan, Alibaba Group, Singapore; Xue Wei, Hong Kong University of Science and Technology (HKUST), China; Xinyuan Qian, University of Science and Technology, China
SLP-P47.9: Rethinking Speech Representation Aggregation in Speech Enhancement: a Phonetic Mutual Information Perspective
Seungu Han, Sungho Lee, Kyogu Lee, Seoul National University, Republic of Korea
SLP-P47.10: MC-LExt: Multi-Channel Target Speaker Extraction with Onset-Prompted Speaker Conditioning Mechanism
Tongtao Ling, Shulin He, Southern University of Science and Technology, China; Pengjie Shen, Inner Mongolia University, China; Zhong-Qiu Wang, Southern University of Science and Technology, China