SS-L22: Efficient Modeling of Long Sequences with Applications to Speech and Audio
Fri, 19 Apr, 13:10 - 15:10 (UTC +9)
Location: Room 104
Session Type: Lecture
Session Co-Chairs: Roshan Sharma, Google and Suyoun Kim, Meta
Track: Special Sessions
Fri, 19 Apr, 13:10 - 13:30 (UTC +9)
SS-L22.1: Train Long and Test Long: Leveraging Full Document Contexts in Speech Processing
Fri, 19 Apr, 13:30 - 13:50 (UTC +9)
SS-L22.2: Updated Corpora and Benchmarks for Long-Form Speech Recognition
Fri, 19 Apr, 13:50 - 14:10 (UTC +9)
SS-L22.3: Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study
Fri, 19 Apr, 14:30 - 14:50 (UTC +9)
SS-L22.5: Investigating End-to-end ASR Architectures for Long form Audio Transcription
Fri, 19 Apr, 14:50 - 15:10 (UTC +9)