SLP-P44: Multilingual ASR Architectures & Scaling
Thu, 7 May, 16:30 - 18:30
Location: Poster Area 29
Session Type: Poster
Track: Speech and Language Processing [SL]

SLP-P44.2: BRIDGING THE GAP: A COMPARATIVE EXPLORATION OF SPEECH-LLM AND END-TO-END ARCHITECTURE FOR MULTILINGUAL CONVERSATIONAL ASR

Yuxiang Mei, Shanghai Normal University, China; Dongxing Xu, Jiaen Liang, Unisound AI Technology Co., Ltd., China; Yanhua Long, Shanghai Normal University, China

SLP-P44.3: DUAL-GRAINED ROUTING GUIDED MULTI-LORA EXPERTS FOR MULTILINGUAL LOW-RESOURCE SPEECH RECOGNITION

Yizhi Wang, Haofei Zhang, Huiqiong Wang, Li Sun, Mingli Song, Zhejiang University, China

SLP-P44.4: ENHANCING MULTILINGUAL LLM-BASED ASR WITH MIXTURE OF EXPERTS AND DYNAMIC DOWNSAMPLING

Guodong Lin, Ziqi Chen, Yuxiang Fu, Tsinghua University, China; Ke Li, Beijing Haitian Ruisheng Science Technology Ltd., China; Wei-Qiang Zhang, Tsinghua University, China

SLP-P44.5: LAMER-SSL: LAYER-AWARE MIXTURE OF LORA EXPERTS FOR CONTINUAL MULTILINGUAL EXPANSION OF SELF-SUPERVISED MODELS WITHOUT FORGETTING

Jing Xu, Minglin Wu, Xueyuan Chen, Xixin Wu, Helen Meng, The Chinese University of Hong Kong, Hong Kong

SLP-P44.6: MILORE-SSL: SCALING MULTILINGUAL CAPABILITIES IN SELF-SUPERVISED MODELS WITHOUT FORGETTING

Jing Xu, Minglin Wu, Xueyuan Chen, Xixin Wu, Helen Meng, The Chinese University of Hong Kong, Hong Kong

SLP-P44.7: MOSA: Mixtures of Simple Adapters Outperform Monolithic Approaches in LLM-based Multilingual ASR

Junjie Li, Jing Peng, Shanghai Jiao Tong University, China; Yangui Fang, Huazhong University of Science and Technology, China; Shuai Wang, Nanjing University, China; Kai Yu, Shanghai Jiao Tong University, China

SLP-P44.8: LOW-RANK AND SPARSE MODEL MERGING FOR MULTI-LINGUAL SPEECH RECOGNITION AND TRANSLATION

Qiuming Zhao, Tsinghua University, China; Guangzhi Sun, University of Cambridge, United Kingdom of Great Britain and Northern Ireland; Chao Zhang, Tsinghua University, China

SLP-P44.9: Multi Stage Training With Dynamic Data Balancing For Multilingual Speech Recognition and Translation

Nithin Rao Koluguri, Monica Sekoyan, Nune Tadevosyan, Nikolay Karpov, Jagadeesh Balam, Boris Ginsburg, NVIDIA, Armenia

SLP-P44.10: SCALE: Semantic Chunking And Label-delay Engine for Streaming Speech-LLM

Akshat Jaiswal, Debmalya Chakrabarty, Ritwik Kotra, Harish Arsikere, Nikhil Bhave, Sambuddha Bhattacharya, Sri Garimella, Amazon, India