SLP-P37.1
DECODER-ONLY CONFORMER WITH MODALITY-AWARE SPARSE MIXTURES OF EXPERTS FOR ASR
Jaeyoung Lee, Masato Mimura, NTT, Inc., Japan
Session: SLP-P37: Robust ASR Modeling and Analytical Approaches Poster
Track: Speech and Language Processing [SL]
Location: Poster Area 27
Presentation Time: Thu, 7 May, 14:00 - 16:00
Session SLP-P37
SLP-P37.1: DECODER-ONLY CONFORMER WITH MODALITY-AWARE SPARSE MIXTURES OF EXPERTS FOR ASR
Jaeyoung Lee, Masato Mimura, NTT, Inc., Japan
SLP-P37.2: AUDIO-CONDITIONED DIFFUSION LLMS FOR ASR AND DELIBERATION PROCESSING
Mengqi Wang, University of Illinois at Urbana-Champaign, United States of America; Zhan Liu, Zengrui Jin, Tsinghua University, China; Guangzhi Sun, University of Cambridge, United Kingdom of Great Britain and Northern Ireland; Chao Zhang, Tsinghua University, China; Philip Woodland, University of Cambridge, United Kingdom of Great Britain and Northern Ireland
SLP-P37.3: HETEROGENEOUS SELF-SUPERVISED ACOUSTIC PRE-TRAINING WITH LOCAL CONSTRAINTS
Xiaodong Cui, IBM Research, United States of America; A F M Saif, Rensselaer Polytechnic Institute, United States of America; Brian Kingsbury, IBM Research, United States of America; Tianyi Chen, Cornell Tech, United States of America
SLP-P37.4: FEDERATED HETEROGENEOUS LANGUAGE MODEL OPTIMIZATION FOR HYBRID AUTOMATIC SPEECH RECOGNITION
Mengze Hong, Hong Kong Polytechnic University, Hong Kong; Yi Gu, Ant Group, China; Di Jiang, Hong Kong Polytechnic University, Hong Kong; Hanlin Gu, Hong Kong University of Science and Technology, China; Chen Jason Zhang, Hong Kong Polytechnic University, Hong Kong; Lu Wang, Shenzhen University, China; Zhiyang Su, Hong Kong University of Science and Technology, China
SLP-P37.5: OCR-Enhanced Multimodal ASR Can Read While Listening
Junli Chen, Changli Tang, Yixuan Li, Tsinghua University, China; Guangzhi Sun, University of Cambridge, United Kingdom of Great Britain and Northern Ireland; Chao Zhang, Tsinghua University, China
SLP-P37.6: ACCENT-INVARIANT AUTOMATIC SPEECH RECOGNITION VIA SALIENCY-DRIVEN SPECTROGRAM MASKING
Mohammad Hossein Sameti, Sepehr Harfi Moridani, Sharif University of Technology, Iran (Islamic Republic of); Ali Zarean, University of Tehran, Iran (Islamic Republic of); Hossein Sameti, Sharif University of Technology, Iran (Islamic Republic of)
SLP-P37.7: COMBINING X-VECTORS AND BAYESIAN BATCH ACTIVE LEARNING: TWO-STAGE ACTIVE LEARNING PIPELINE FOR SPEECH RECOGNITION
Ognjen Kundacina, Vladimir Vincan, Dragisa Miskovic, The Institute for Artificial Intelligence Research and Development of Serbia, Serbia
SLP-P37.8: MedSpeak: A Knowledge Graph-Aided ASR Error Correction Framework for Spoken Medical QA
Yutong Song, University of California, Irvine, United States of America; Shiva Shrestha, Kennesaw State University, United States of America; Chenhan Lyu, Elahe Khatibi, Pengfei Zhang, University of California, Irvine, United States of America; Honghui Xu, Kennesaw State University, United States of America; Amir Rahmani, Nikil Dutt, University of California, Irvine, United States of America
SLP-P37.9: SILENT SPEECH SENTENCE RECOGNITION WITH SIX-AXIS ACCELEROMETERS USING CONFORMER AND CTC ALGORITHM
Yudong Xie, Zhifeng Han, Qinfan Xiao, Liwei Liang, Lu-Qi Tao, Tian-Ling Ren, Tsinghua University, China
SLP-P37.10: Leveraging Beam Search Information for Confidence Estimation in E2E ASR
Yichen Jia, Hugo Van Hamme,