SLP-L9: Robust Speech Modeling for ASR
Wed, 6 May, 16:30 - 18:30
Location: Room 114
Session Type: Oral
Session Chair: Xiaodong Cui, IBM Research
Track: Speech and Language Processing [SL]
Wed, 6 May, 16:30 - 16:50

SLP-L9.1: WHISPER-FEST: SINGLE-CHANNEL FAR-FIELD ENHANCED SPEECH-TO-TEXT WITHOUT PARALLEL DATA

M A Basha Shaik, Samsung Research Institute Bengaluru, India; Vijendra R Apsingekar, Samsung Research America, CA, United States of America; Vineeth Rao, Manonmani Viswanathan Amarnath, Rahil Khan, Mohammed Iqbal, Manonmani Srinivasan, R V College of Engineering, India
Wed, 6 May, 16:50 - 17:10

SLP-L9.2: LATTICE-GUIDED CONSISTENCY REGULARIZATION OF DUAL-MODE TRANSDUCERS FOR AUTOMATIC SPEECH RECOGNITION

Wen Ding, Hainan Xu, Jagadeesh Balam, Junjie Lai, NVIDIA, China
Wed, 6 May, 17:10 - 17:30

SLP-L9.3: BiRQ: Bi-Level Self-Labeling Random Quantization for Self-Supervised Speech Recognition

Liuyuan Jiang, University of Rochester, United States of America; Xiaodong Cui, Brian Kingsbury, IBM, United States of America; Tianyi Chen, Cornell University, United States of America; Lisha Chen, University of Rochester, United States of America
Wed, 6 May, 17:30 - 17:50

SLP-L9.4: STREAMING SPEECH RECOGNITION WITH DECODER-ONLY LARGE LANGUAGE MODELS AND LATENCY OPTIMIZATION

Genshun Wan, University of Science and Technology of China, China; Wenhui Zhang, iFLYTEK Co., Ltd., China; Jing-Xuan Zhang, Shaanxi Normal University, China; Shifu Xiong, University of Science and Technology of China, China; Jianqing Gao, iFLYTEK Co., Ltd., China; Zhongfu Ye, University of Science and Technology of China, China
Wed, 6 May, 17:50 - 18:10

SLP-L9.5: REDUCING PROMPT SENSITIVITY IN LLM-BASED SPEECH RECOGNITION THROUGH LEARNABLE PROJECTION

Sergio Burdisso, Esaú Villatoro-Tello, Shashi Kumar, Idiap Research Institute, Switzerland; Srikanth Madikeri, University of Zurich, Switzerland; Andrés Carofilis, Pradeep Rangappa, Idiap Research Institute, Switzerland; Manjunath K E, Kadri Hacioğlu, Uniphore, India; Petr Motlicek, Idiap Research Institute, Switzerland; Andreas Stolcke, Uniphore, United States of America
Wed, 6 May, 18:10 - 18:30

SLP-L9.6: PAC: Pronunciation-Aware Contextualized Large Language Model-based Automatic Speech Recognition

Li Fu, Yu Xin, Sunlu Zeng, Lu Fan, Youzheng Wu, Xiaodong He, JD, China