SPE-L1: End-to-end Speech Recognition I: Streaming |
Session Type: Lecture |
Time: Tuesday, 5 May, 11:30 - 13:30 |
Location: On-Demand |
Session Chair: Shinji Watanabe, Johns Hopkins University
|
SPE-L1.1: A STREAMING ON-DEVICE END-TO-END MODEL SURPASSING SERVER-SIDE CONVENTIONAL MODEL QUALITY AND LATENCY |
Tara Sainath; Google, Inc. |
Yanzhang He; Google, Inc. |
Bo Li; Google, Inc. |
Arun Narayanan; Google, Inc. |
Ruoming Pang; Google, Inc. |
Antoine Bruguier; Google, Inc. |
Shuo-Yiin Chang; Google, Inc. |
Wei Li; Google, Inc. |
Raziel Alvarez; Google, Inc. |
Zhifeng Chen; Google, Inc. |
Chung-Cheng Chiu; Google, Inc. |
David Garcia; Google, Inc. |
Alex Gruenstein; Google, Inc. |
Ke Hu; Google, Inc. |
Minho Jin; Google, Inc. |
Anjuli Kannan; Google, Inc. |
Qiao Liang; Google, Inc. |
Ian McGraw; Google, Inc. |
Cal Peyser; Google, Inc. |
Rohit Prabhavalkar; Google, Inc. |
Golan Pundak; Google, Inc. |
David Rybach; Google, Inc. |
Yuan Shangguan; Google, Inc. |
Yash Sheth; Google, Inc. |
Trevor Strohman; Google, Inc. |
Mirko Visontai; Google, Inc. |
Yonghui Wu; Google, Inc. |
Yu Zhang; Google, Inc. |
Ding Zhao; Google, Inc. |
|
SPE-L1.2: MINIMUM LATENCY TRAINING STRATEGIES FOR STREAMING SEQUENCE-TO-SEQUENCE ASR |
Hirofumi Inaguma; Kyoto University |
Yashesh Gaur; Microsoft Corporation |
Liang Lu; Microsoft Corporation |
Jinyu Li; Microsoft Corporation |
Yifan Gong; Microsoft Corporation |
|
SPE-L1.3: TOWARDS FAST AND ACCURATE STREAMING END-TO-END ASR |
Bo Li; Google, Inc. |
Shuo-Yiin Chang; Google, Inc. |
Tara Sainath; Google, Inc. |
Ruoming Pang; Google, Inc. |
Yanzhang He; Google, Inc. |
Trevor Strohman; Google, Inc. |
Yonghui Wu; Google, Inc. |
|
SPE-L1.4: STREAMING AUTOMATIC SPEECH RECOGNITION WITH THE TRANSFORMER MODEL |
Niko Moritz; Mitsubishi Electric Research Laboratories (MERL) |
Takaaki Hori; Mitsubishi Electric Research Laboratories (MERL) |
Jonathan Le Roux; Mitsubishi Electric Research Laboratories (MERL) |
|
SPE-L1.5: CIF: CONTINUOUS INTEGRATE-AND-FIRE FOR END-TO-END SPEECH RECOGNITION |
Linhao Dong; Institute of Automation, Chinese Academy of Sciences |
Bo Xu; Institute of Automation, Chinese Academy of Sciences |
|
SPE-L1.6: TRANSFORMER-BASED ONLINE CTC/ATTENTION END-TO-END SPEECH RECOGNITION ARCHITECTURE |
Haoran Miao; Key Laboratory of Speech Acoustics and Content Understanding |
Gaofeng Cheng; Key Laboratory of Speech Acoustics and Content Understanding |
Changfeng Gao; Key Laboratory of Speech Acoustics and Content Understanding |
Pengyuan Zhang; Key Laboratory of Speech Acoustics and Content Understanding |
Yonghong Yan; Key Laboratory of Speech Acoustics and Content Understanding |
|