SLP-L8: Multichannel/Multimodal Speech Recognition
Wed, 17 Apr, 08:20 - 10:20 (UTC +9)
Location: Room 104
Session Type: Lecture
Session Co-Chairs: Marc Delcroix, NTT and Lei Xie, Northwestern Polytechnical University
Track: Speech and Language Processing
Wed, 17 Apr, 08:20 - 08:40 (UTC +9)
SLP-L8.1: VISUAL SPEECH RECOGNITION FOR LANGUAGES WITH LIMITED LABELED DATA USING AUTOMATIC LABELS FROM WHISPER
Wed, 17 Apr, 08:40 - 09:00 (UTC +9)
SLP-L8.2: MULTI-MODALITY SPEECH RECOGNITION DRIVEN BY BACKGROUND VISUAL SCENES
Wed, 17 Apr, 09:00 - 09:20 (UTC +9)
SLP-L8.3: SELF-SUPERVISED ADAPTIVE AV FUSION MODULE FOR PRE-TRAINED ASR MODELS
Wed, 17 Apr, 09:20 - 09:40 (UTC +9)
SLP-L8.4: AUTOMATIC CHANNEL SELECTION AND SPATIAL FEATURE INTEGRATION FOR MULTI-CHANNEL SPEECH RECOGNITION ACROSS VARIOUS ARRAY TOPOLOGIES
Wed, 17 Apr, 09:40 - 10:00 (UTC +9)
SLP-L8.5: AGADIR: Towards Array-Geometry Agnostic Directional Speech Recognition
Wed, 17 Apr, 10:00 - 10:20 (UTC +9)