SLP-P37: Robust ASR Modeling and Analytical Approaches
Thu, 7 May, 14:00 - 16:00
Location: Poster Area 27
Session Type: Poster
Track: Speech and Language Processing [SL]

SLP-P37.2: AUDIO-CONDITIONED DIFFUSION LLMS FOR ASR AND DELIBERATION PROCESSING

Mengqi Wang, University of Illinois at Urbana-Champaign, United States of America; Zhan Liu, Zengrui Jin, Tsinghua University, China; Guangzhi Sun, University of Cambridge, United Kingdom of Great Britain and Northern Ireland; Chao Zhang, Tsinghua University, China; Philip Woodland, University of Cambridge, United Kingdom of Great Britain and Northern Ireland

SLP-P37.3: HETEROGENEOUS SELF-SUPERVISED ACOUSTIC PRE-TRAINING WITH LOCAL CONSTRAINTS

Xiaodong Cui, IBM Research, United States of America; A F M Saif, Rensselaer Polytechnic Institute, United States of America; Brian Kingsbury, IBM Research, United States of America; Tianyi Chen, Cornell Tech, United States of America

SLP-P37.4: FEDERATED HETEROGENEOUS LANGUAGE MODEL OPTIMIZATION FOR HYBRID AUTOMATIC SPEECH RECOGNITION

Mengze Hong, Hong Kong Polytechnic University, Hong Kong; Yi Gu, Ant Group, China; Di Jiang, Hong Kong Polytechnic University, Hong Kong; Hanlin Gu, Hong Kong University of Science and Technology, China; Chen Jason Zhang, Hong Kong Polytechnic University, Hong Kong; Lu Wang, Shenzhen University, China; Zhiyang Su, Hong Kong University of Science and Technology, China

SLP-P37.5: OCR-Enhanced Multimodal ASR Can Read While Listening

Junli Chen, Changli Tang, Yixuan Li, Tsinghua University, China; Guangzhi Sun, University of Cambridge, United Kingdom of Great Britain and Northern Ireland; Chao Zhang, Tsinghua University, China

SLP-P37.6: ACCENT-INVARIANT AUTOMATIC SPEECH RECOGNITION VIA SALIENCY-DRIVEN SPECTROGRAM MASKING

Mohammad Hossein Sameti, Sepehr Harfi Moridani, Sharif University of Technology, Iran (Islamic Republic of); Ali Zarean, University of Tehran, Iran (Islamic Republic of); Hossein Sameti, Sharif University of Technology, Iran (Islamic Republic of)

SLP-P37.7: COMBINING X-VECTORS AND BAYESIAN BATCH ACTIVE LEARNING: TWO-STAGE ACTIVE LEARNING PIPELINE FOR SPEECH RECOGNITION

Ognjen Kundacina, Vladimir Vincan, Dragisa Miskovic, The Institute for Artificial Intelligence Research and Development of Serbia, Serbia

SLP-P37.8: MedSpeak: A Knowledge Graph-Aided ASR Error Correction Framework for Spoken Medical QA

Yutong Song, University of California, Irvine, United States of America; Shiva Shrestha, Kennesaw State University, United States of America; Chenhan Lyu, Elahe Khatibi, Pengfei Zhang, University of California, Irvine, United States of America; Honghui Xu, Kennesaw State University, United States of America; Amir Rahmani, Nikil Dutt, University of California, Irvine, United States of America

SLP-P37.9: SILENT SPEECH SENTENCE RECOGNITION WITH SIX-AXIS ACCELEROMETERS USING CONFORMER AND CTC ALGORITHM

Yudong Xie, Zhifeng Han, Qinfan Xiao, Liwei Liang, Lu-Qi Tao, Tian-Ling Ren, Tsinghua University, China