SLP-L9.6
PAC: Pronunciation-Aware Contextualized Large Language Model-based Automatic Speech Recognition
Li Fu, Yu Xin, Sunlu Zeng, Lu Fan, Youzheng Wu, Xiaodong He, JD, China
Session: SLP-L9: Robust Speech Modeling for ASR Oral
Track: Speech and Language Processing [SL]
Location: Room 114
Presentation Time: Wed, 6 May, 18:10 - 18:30
Session Chair: Xiaodong Cui, IBM Research
Session SLP-L9
SLP-L9.1: WHISPER-FEST: SINGLE-CHANNEL FAR-FIELD ENHANCED SPEECH-TO-TEXT WITHOUT PARALLEL DATA
M A Basha Shaik, Samsung Research Institute Bengaluru, India; Vijendra R Apsingekar, Samsung Research America, United States of America; Vineeth Rao, Manonmani Viswanathan Amarnath, Rahil Khan, Mohammed Iqbal, Manonmani Srinivasan, R V College of Engineering, India
SLP-L9.2: LATTICE-GUIDED CONSISTENCY REGULARIZATION OF DUAL-MODE TRANSDUCERS FOR AUTOMATIC SPEECH RECOGNITION
Wen Ding, Hainan Xu, Jagadeesh Balam, Junjie Lai, NVIDIA, China
SLP-L9.3: BiRQ: Bi-Level Self-Labeling Random Quantization for Self-Supervised Speech Recognition
Liuyuan Jiang, University of Rochester, United States of America; Xiaodong Cui, Brian Kingsbury, IBM, United States of America; Tianyi Chen, Cornell University, United States of America; Lisha Chen, University of Rochester, United States of America
SLP-L9.4: STREAMING SPEECH RECOGNITION WITH DECODER-ONLY LARGE LANGUAGE MODELS AND LATENCY OPTIMIZATION
Genshun Wan, University of Science and Technology of China, China; Wenhui Zhang, iFLYTEK Co., Ltd., China; Jing-Xuan Zhang, Shaanxi Normal University, China; Shifu Xiong, University of Science and Technology of China, China; Jianqing Gao, iFLYTEK Co., Ltd., China; Zhongfu Ye, University of Science and Technology of China, China
SLP-L9.5: REDUCING PROMPT SENSITIVITY IN LLM-BASED SPEECH RECOGNITION THROUGH LEARNABLE PROJECTION
Sergio Burdisso, Esaú Villatoro-Tello, Shashi Kumar, Idiap Research Institute, Switzerland; Srikanth Madikeri, University of Zurich, Switzerland; Andrés Carofilis, Pradeep Rangappa, Idiap Research Institute, Switzerland; Manjunath K E, Kadri Hacioğlu, Uniphore, India; Petr Motlicek, Idiap Research Institute, Switzerland; Andreas Stolcke, Uniphore, United States of America
SLP-L9.6: PAC: Pronunciation-Aware Contextualized Large Language Model-based Automatic Speech Recognition
Li Fu, Yu Xin, Sunlu Zeng, Lu Fan, Youzheng Wu, Xiaodong He, JD, China