SPE-P9: End-to-end Speech Recognition III: General Topics
| Session Type: Poster |
| Time: Wednesday, 6 May, 16:30 - 18:30 |
| Location: On-Demand |
| Session Chairs: Takaaki Hori, MERL and Yifan Gong, Microsoft |
| SPE-P9.1: IMPROVING SPEECH RECOGNITION USING CONSISTENT PREDICTIONS ON SYNTHESIZED SPEECH |
| Gary Wang; Simon Fraser University |
| Andrew Rosenberg; Google |
| Zhehuai Chen; Google |
| Yu Zhang; Google |
| Bhuvana Ramabhadran; Google |
| Yonghui Wu; Google |
| Pedro Moreno; Google |
| SPE-P9.2: ATTENTION-BASED ASR WITH LIGHTWEIGHT AND DYNAMIC CONVOLUTIONS |
| Yuya Fujita; Yahoo Japan Corporation |
| Aswin Shanmugam Subramanian; Johns Hopkins University |
| Motoi Omachi; Yahoo Japan Corporation |
| Shinji Watanabe; Johns Hopkins University |
| SPE-P9.3: AN ATTENTION-BASED JOINT ACOUSTIC AND TEXT ON-DEVICE END-TO-END MODEL |
| Tara Sainath; Google, Inc. |
| Ruoming Pang; Google, Inc. |
| Ron Weiss; Google, Inc. |
| Yanzhang He; Google, Inc. |
| Chung-cheng Chiu; Google, Inc. |
| Trevor Strohman; Google, Inc. |
| SPE-P9.4: STRUCTURED SPARSE ATTENTION FOR END-TO-END AUTOMATIC SPEECH RECOGNITION |
| Jiabin Xue; Harbin Institute of Technology |
| Tieran Zheng; Harbin Institute of Technology |
| Jiqing Han; Harbin Institute of Technology |
| SPE-P9.5: RNN-TRANSDUCER WITH STATELESS PREDICTION NETWORK |
| Mohammadreza Ghodsi; Google |
| Xiaofeng Liu; Google |
| James Apfel; Google |
| Rodrigo Cabrera; Google |
| Eugene Weinstein; Google |
| SPE-P9.6: SEQUENCE-LEVEL CONSISTENCY TRAINING FOR SEMI-SUPERVISED END-TO-END AUTOMATIC SPEECH RECOGNITION |
| Ryo Masumura; NTT Corporation |
| Mana Ihori; NTT Corporation |
| Akihiko Takashima; NTT Corporation |
| Takafumi Moriya; NTT Corporation |
| Atsushi Ando; NTT Corporation |
| Yusuke Shinohara; NTT Corporation |
| SPE-P9.7: INDEPENDENT LANGUAGE MODELING ARCHITECTURE FOR END-TO-END ASR |
| Van Tung Pham; Nanyang Technological University |
| Haihua Xu; Nanyang Technological University |
| Yerbolat Khassanov; Nazarbayev University |
| Zhiping Zeng; Nanyang Technological University |
| Eng Siong Chng; Nanyang Technological University |
| Chongjia Ni; Alibaba Group |
| Bin Ma; Alibaba Group |
| Haizhou Li; National University of Singapore |
| SPE-P9.8: SPEAKER-AWARE TRAINING OF ATTENTION-BASED END-TO-END SPEECH RECOGNITION USING NEURAL SPEAKER EMBEDDINGS |
| Aku Rouhe; Aalto University |
| Tuomas Kaseva; Aalto University |
| Mikko Kurimo; Aalto University |
| SPE-P9.9: GENERATING SYNTHETIC AUDIO DATA FOR ATTENTION-BASED SPEECH RECOGNITION SYSTEMS |
| Nick Rossenbach; RWTH Aachen University |
| Albert Zeyer; RWTH Aachen University |
| Ralf Schlüter; RWTH Aachen University |
| Hermann Ney; RWTH Aachen University |
| SPE-P9.10: CORRECTION OF AUTOMATIC SPEECH RECOGNITION WITH TRANSFORMER SEQUENCE-TO-SEQUENCE MODEL |
| Oleksii Hrinchuk; Moscow Institute of Physics and Technology and NVIDIA |
| Mariya Popova; Carnegie Mellon University and NVIDIA |
| Boris Ginsburg; NVIDIA |
| SPE-P9.11: EXPLORING PRE-TRAINING WITH ALIGNMENTS FOR RNN TRANSDUCER BASED END-TO-END SPEECH RECOGNITION |
| Hu Hu; Georgia Institute of Technology |
| Rui Zhao; Microsoft |
| Jinyu Li; Microsoft |
| Liang Lu; Microsoft |
| Yifan Gong; Microsoft |
| SPE-P9.12: SELF-TRAINING FOR END-TO-END SPEECH RECOGNITION |
| Jacob Kahn; Facebook |
| Ann Lee; Facebook |
| Awni Hannun; Facebook |