SLP-P46.10

TWO-STAGE AUDIO-VISUAL TARGET SPEAKER EXTRACTION SYSTEM FOR REAL-TIME PROCESSING ON EDGE DEVICE

Zixuan Li, Xueliang Zhang, Inner Mongolia University, China; Lei Miao, Zhipeng Yan, Ying Sun, Chong Zhu, Lenovo, China

Session:
SLP-P46: Speech Enhancement for Real-World Applications Poster

Track:
Speech and Language Processing [SL]

Location:
Poster Area 31

Presentation Time:
Thu, 7 May, 16:30 - 18:30

Session SLP-P46
SLP-P46.1: TEST-TIME ADAPTATION FOR SPEECH ENHANCEMENT VIA MASK POLARIZATION
Tobias Raichle, Erfan Amini, Bin Yang, University of Stuttgart, Germany
SLP-P46.2: LExTra: Folded Prompt and Split-Role Attention for Target Speaker Extraction
Pengjie Shen, Inner Mongolia University, China; Shulin He, Southern University of Science and Technology, China; Xueliang Zhang, Inner Mongolia University, China; Zhong-Qiu Wang, Southern University of Science and Technology, China
SLP-P46.3: TOWARDS NOISE-ROBUST SPEECH INVERSION THROUGH MULTI-TASK LEARNING WITH SPEECH ENHANCEMENT
Saba Tabatabaee, Carol Espy-Wilson, University of Maryland, United States of America
SLP-P46.4: ENSEMBLE FOR REDUCING TARGET SPEECH EXTRACTION ERRORS
Tsubasa Ochiai, Marc Delcroix, Naoyuki Kamo, Takanori Ashihara, Naohiro Tawara, Tomohiro Nakatani, NTT, Inc., Japan
SLP-P46.5: LOW-POWER END-TO-END COCHLEAR IMPLANT SPEECH DENOISING WITH SPIKING NEURAL NETWORKS
Ludovic Boulanger, Sean U N Wood, University of Sherbrooke, Canada
SLP-P46.6: NLDSI-BWE: NON LINEAR DYNAMICAL SYSTEMS-INSPIRED MULTI RESOLUTION DISCRIMINATORS FOR SPEECH BANDWIDTH EXTENSION
Tarikul Islam Tamiti, Anomadarshi Barua, George Mason University, United States of America
SLP-P46.7: GENERATING TRAINING TARGETS FOR REAL-WORLD SPEECH ENHANCEMENT VIA CLOSE-TO-DISTANT MICROPHONE PROJECTION
Tomohiro Nakatani, Rintaro Ikeshita, Naoyuki Kamo, Marc Delcroix, Shoko Araki, NTT, Inc., Japan
SLP-P46.8: JUND-F0: A Novel Deep Learning Framework for Joint Unvoiced/Voiced Detection and F0 Estimation
Yuang Chen, Rui Feng, Yin-Long Liu, Yu Hu, Jiahong Yuan, University of Science and Technology of China, China
SLP-P46.9: Joint Enhancement and Bandwidth Extension for Radar Through-Barrier Speech Acquisition
Zhi-Wei Tan, V. G. Reju, Ritesh Chandra Tewari, Ruotong Ding, Nanyang Technological University, Singapore; Andy W. H. Khong, Nanyang Technological University; Lee Kong Chian School of Medicine, Singapore
SLP-P46.10: TWO-STAGE AUDIO-VISUAL TARGET SPEAKER EXTRACTION SYSTEM FOR REAL-TIME PROCESSING ON EDGE DEVICE
Zixuan Li, Xueliang Zhang, Inner Mongolia University, China; Lei Miao, Zhipeng Yan, Ying Sun, Chong Zhu, Lenovo, China