SPE-L3: End-to-end Speech Recognition II: New Models |
| Session Type: Lecture |
| Time: Tuesday, 5 May, 16:30 - 18:30 |
| Location: On-Demand |
| Session Chair: Tara Sainath, Google |
| SPE-L3.1: JOINT PHONEME-GRAPHEME MODEL FOR END-TO-END SPEECH RECOGNITION |
| Yotaro Kubo; Google |
| Michiel Bacchiani; Google |
| SPE-L3.2: QUARTZNET: DEEP AUTOMATIC SPEECH RECOGNITION WITH 1D TIME-CHANNEL SEPARABLE CONVOLUTIONS |
| Samuel Kriman; University of Illinois at Urbana-Champaign |
| Stanislav Beliaev; University of Saint Petersburg |
| Boris Ginsburg; NVIDIA |
| Jocelyn Huang; NVIDIA |
| Oleksii Kuchaiev; NVIDIA |
| Vitaly Lavrukhin; NVIDIA |
| Ryan Leary; NVIDIA |
| Jason Li; NVIDIA |
| Yang Zhang; NVIDIA |
| SPE-L3.3: END-TO-END MULTI-TALKER OVERLAPPING SPEECH RECOGNITION |
| Anshuman Tripathi; Google |
| Han Lu; Google |
| Hasim Sak; Google |
| SPE-L3.4: END-TO-END MULTI-SPEAKER SPEECH RECOGNITION WITH TRANSFORMER |
| Xuankai Chang; Johns Hopkins University |
| Wangyou Zhang; Shanghai Jiao Tong University |
| Yanmin Qian; Shanghai Jiao Tong University |
| Jonathan Le Roux; Mitsubishi Electric Research Laboratories (MERL) |
| Shinji Watanabe; Johns Hopkins University |
| SPE-L3.5: HYBRID AUTOREGRESSIVE TRANSDUCER (HAT) |
| Ehsan Variani; Google |
| David Rybach; Google |
| Cyril Allauzen; Google |
| Michael Riley; Google |
| SPE-L3.6: LIGHTWEIGHT AND EFFICIENT END-TO-END SPEECH RECOGNITION USING LOW-RANK TRANSFORMER |
| Genta Indra Winata; Hong Kong University of Science and Technology |
| Samuel Cahyawijaya; Hong Kong University of Science and Technology |
| Zhaojiang Lin; Hong Kong University of Science and Technology |
| Zihan Liu; Hong Kong University of Science and Technology |
| Pascale Fung; Hong Kong University of Science and Technology |