SLP-P21.3
MULTILINGUAL DISTILWHISPER: EFFICIENT DISTILLATION OF MULTI-TASK SPEECH MODELS VIA LANGUAGE-SPECIFIC EXPERTS
Thomas Palmeira Ferraz, Télécom Paris, Institut Polytechnique de Paris, France; Marcely Zanon Boito, Caroline Brun, Vassilina Nikoulina, NAVER LABS Europe, France
Session:
SLP-P21: Multilingual speech recognition and identification Poster
Track:
Speech and Language Processing
Location:
Poster Zone 6A
Poster Board PZ-6A.3
Presentation Time:
Thu, 18 Apr, 08:20 - 10:20 (UTC +9)
Session Co-Chairs:
Yangyang Shi, Meta and Peter Bell, University of Edinburgh
Session SLP-P21
SLP-P21.1: ATTENTION-GUIDED ADAPTATION FOR CODE-SWITCHING SPEECH RECOGNITION
Bobbi Aditya, Mahdin Rohmatillah, National Yang Ming Chiao Tung University, Taiwan; Liang-Hsuan Tai, Industrial Technology Research Institute, Taiwan; Jen-Tzung Chien, National Yang Ming Chiao Tung University, Taiwan
SLP-P21.2: EXTENDING MULTILINGUAL ASR TO NEW LANGUAGES USING SUPPLEMENTARY ENCODER AND DECODER COMPONENTS
Yerbolat Khassanov, Zhipeng Chen, Tianfeng Chen, Tze Yuang Chong, Wei Li, Lu Lu, Zejun Ma, ByteDance, Singapore
SLP-P21.3: MULTILINGUAL DISTILWHISPER: EFFICIENT DISTILLATION OF MULTI-TASK SPEECH MODELS VIA LANGUAGE-SPECIFIC EXPERTS
Thomas Palmeira Ferraz, Télécom Paris, Institut Polytechnique de Paris, France; Marcely Zanon Boito, Caroline Brun, Vassilina Nikoulina, NAVER LABS Europe, France
SLP-P21.4: Efficient Adapter Finetuning for Tail Languages in Streaming Multilingual ASR
Junwen Bai, Bo Li, Qiujia Li, Tara Sainath, Trevor Strohman, Google, United States of America
SLP-P21.5: Enhancing Code-switching Speech Recognition with Interactive Language Biases
Hexin Liu, Nanyang Technological University, Singapore; Leibny Paola Garcia, Johns Hopkins University, United States of America; Xiangyu Zhang, University of New South Wales, Australia; Andy W. H. Khong, Nanyang Technological University, Singapore; Sanjeev Khudanpur, Johns Hopkins University, United States of America
SLP-P21.6: ENHANCING MULTILINGUAL SPEECH RECOGNITION THROUGH LANGUAGE PROMPT TUNING AND FRAME-LEVEL LANGUAGE ADAPTER
Song Li, Yongbin You, Xuezhi Wang, Ke Ding, Guanglu Wan, Meituan, China
SLP-P21.7: MULTIMODAL MODELING FOR SPOKEN LANGUAGE IDENTIFICATION
Shikhar Bharadwaj, Min Ma, Shikhar Vashishth, Ankur Bapna, Sriram Ganapathy, Vera Axelrod, Siddharth Dalmia, Wei Han, Yu Zhang, Daan van Esch, Sandy Ritchie, Partha Talukdar, Jason Riesa, Google, India
SLP-P21.8: Boosting End-to-End Multilingual Phoneme Recognition through Exploiting Universal Speech Attributes Constraints
Hao Yen, Georgia Institute of Technology, United States of America; Sabato Marco Siniscalchi, Kore University of Enna, Italy; Chin-Hui Lee, Georgia Institute of Technology, United States of America
SLP-P21.9: SPEECH COLLAGE: CODE-SWITCHED AUDIO GENERATION BY COLLAGING MONOLINGUAL CORPORA
Amir Hussein, Johns Hopkins University, United States of America; Dorsa Zeinali, Northeastern University, United States of America; Ondřej Klejch, University of Edinburgh, United Kingdom of Great Britain and Northern Ireland; Matthew Wiesner, Johns Hopkins University, United States of America; Brian Yan, Carnegie Mellon University, United States of America; Shammur Chowdhury, Ahmed Ali, Qatar Computing Research Institute, Qatar; Shinji Watanabe, Carnegie Mellon University, United States of America; Sanjeev Khudanpur, Johns Hopkins University, United States of America
SLP-P21.10: Dynamic ASR pathways: An Adaptive Masking Approach Towards Efficient Pruning of a Multilingual ASR Model
Jiamin Xie, University of Texas at Dallas, United States of America; Ke Li, Jinxi Guo, Andros Tjandra, Shangguan Yuan, Leda Sari, Chunyang Wu, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli, Meta AI, United States of America