SPE-L1: End-to-end Speech Recognition I: Streaming |
| Session Type: Lecture |
| Time: Tuesday, 5 May, 11:30 - 13:30 |
| Location: On-Demand |
| Session Chair: Shinji Watanabe, Johns Hopkins University |
| SPE-L1.1: A STREAMING ON-DEVICE END-TO-END MODEL SURPASSING SERVER-SIDE CONVENTIONAL MODEL QUALITY AND LATENCY |
| Tara Sainath; Google, Inc. |
| Yanzhang He; Google, Inc. |
| Bo Li; Google, Inc. |
| Arun Narayanan; Google, Inc. |
| Ruoming Pang; Google, Inc. |
| Antoine Bruguier; Google, Inc. |
| Shuo-Yiin Chang; Google, Inc. |
| Wei Li; Google, Inc. |
| Raziel Alvarez; Google, Inc. |
| Zhifeng Chen; Google, Inc. |
| Chung-cheng Chiu; Google, Inc. |
| David Garcia; Google, Inc. |
| Alex Gruenstein; Google, Inc. |
| Ke Hu; Google, Inc. |
| Minho Jin; Google, Inc. |
| Anjuli Kannan; Google, Inc. |
| Qiao Liang; Google, Inc. |
| Ian McGraw; Google, Inc. |
| Cal Peyser; Google, Inc. |
| Rohit Prabhavalkar; Google, Inc. |
| Golan Pundak; Google, Inc. |
| David Rybach; Google, Inc. |
| Yuan Shangguan; Google, Inc. |
| Yash Sheth; Google, Inc. |
| Trevor Strohman; Google, Inc. |
| Mirko Visontai; Google, Inc. |
| Yonghui Wu; Google, Inc. |
| Yu Zhang; Google, Inc. |
| Ding Zhao; Google, Inc. |
| SPE-L1.2: MINIMUM LATENCY TRAINING STRATEGIES FOR STREAMING SEQUENCE-TO-SEQUENCE ASR |
| Hirofumi Inaguma; Kyoto University |
| Yashesh Gaur; Microsoft Corporation |
| Liang Lu; Microsoft Corporation |
| Jinyu Li; Microsoft Corporation |
| Yifan Gong; Microsoft Corporation |
| SPE-L1.3: TOWARDS FAST AND ACCURATE STREAMING END-TO-END ASR |
| Bo Li; Google, Inc. |
| Shuo-Yiin Chang; Google, Inc. |
| Tara Sainath; Google, Inc. |
| Ruoming Pang; Google, Inc. |
| Yanzhang He; Google, Inc. |
| Trevor Strohman; Google, Inc. |
| Yonghui Wu; Google, Inc. |
| SPE-L1.4: STREAMING AUTOMATIC SPEECH RECOGNITION WITH THE TRANSFORMER MODEL |
| Niko Moritz; Mitsubishi Electric Research Laboratories (MERL) |
| Takaaki Hori; Mitsubishi Electric Research Laboratories (MERL) |
| Jonathan Le Roux; Mitsubishi Electric Research Laboratories (MERL) |
| SPE-L1.5: CIF: CONTINUOUS INTEGRATE-AND-FIRE FOR END-TO-END SPEECH RECOGNITION |
| Linhao Dong; Institute of Automation, Chinese Academy of Sciences |
| Bo Xu; Institute of Automation, Chinese Academy of Sciences |
| SPE-L1.6: TRANSFORMER-BASED ONLINE CTC/ATTENTION END-TO-END SPEECH RECOGNITION ARCHITECTURE |
| Haoran Miao; Key Laboratory of Speech Acoustics and Content Understanding |
| Gaofeng Cheng; Key Laboratory of Speech Acoustics and Content Understanding |
| Changfeng Gao; Key Laboratory of Speech Acoustics and Content Understanding |
| Pengyuan Zhang; Key Laboratory of Speech Acoustics and Content Understanding |
| Yonghong Yan; Key Laboratory of Speech Acoustics and Content Understanding |