Technical Program

Click on the icon to view the manuscript on IEEE Xplore in the IEEE ICASSP 2020 Open Preview.

SPE-P9: End-to-end Speech Recognition III: General Topics

Session Type: Poster
Time: Wednesday, 6 May, 16:30 - 18:30
Location: On-Demand
Session Chairs: Takaaki Hori, MERL and Yifan Gong, Microsoft
 
 SPE-P9.1: IMPROVING SPEECH RECOGNITION USING CONSISTENT PREDICTIONS ON SYNTHESIZED SPEECH
         Gary Wang; Simon Fraser University
         Andrew Rosenberg; Google
         Zhehuai Chen; Google
         Yu Zhang; Google
         Bhuvana Ramabhadran; Google
         Yonghui Wu; Google
         Pedro Moreno; Google
 
 SPE-P9.2: ATTENTION-BASED ASR WITH LIGHTWEIGHT AND DYNAMIC CONVOLUTIONS
         Yuya Fujita; Yahoo Japan Corporation
         Aswin Shanmugam Subramanian; Johns Hopkins University
         Motoi Omachi; Yahoo Japan Corporation
         Shinji Watanabe; Johns Hopkins University
 
 SPE-P9.3: AN ATTENTION-BASED JOINT ACOUSTIC AND TEXT ON-DEVICE END-TO-END MODEL
         Tara Sainath; Google, Inc.
         Ruoming Pang; Google, Inc.
         Ron Weiss; Google, Inc.
         Yanzhang He; Google, Inc.
         Chung-cheng Chiu; Google, Inc.
         Trevor Strohman; Google, Inc.
 
 SPE-P9.4: STRUCTURED SPARSE ATTENTION FOR END-TO-END AUTOMATIC SPEECH RECOGNITION
         Jiabin Xue; Harbin Institute of Technology
         Tieran Zheng; Harbin Institute of Technology
         Jiqing Han; Harbin Institute of Technology
 
 SPE-P9.5: RNN-TRANSDUCER WITH STATELESS PREDICTION NETWORK
         Mohammadreza Ghodsi; Google
         Xiaofeng Liu; Google
         James Apfel; Google
         Rodrigo Cabrera; Google
         Eugene Weinstein; Google
 
 SPE-P9.6: SEQUENCE-LEVEL CONSISTENCY TRAINING FOR SEMI-SUPERVISED END-TO-END AUTOMATIC SPEECH RECOGNITION
         Ryo Masumura; NTT Corporation
         Mana Ihori; NTT Corporation
         Akihiko Takashima; NTT Corporation
         Takafumi Moriya; NTT Corporation
         Atsushi Ando; NTT Corporation
         Yusuke Shinohara; NTT Corporation
 
 SPE-P9.7: INDEPENDENT LANGUAGE MODELING ARCHITECTURE FOR END-TO-END ASR
         Van Tung Pham; Nanyang Technological University
         Haihua Xu; Nanyang Technological University
         Yerbolat Khassanov; Nazarbayev University
         Zhiping Zeng; Nanyang Technological University
         Eng Siong Chng; Nanyang Technological University
         Chongjia Ni; Alibaba Group
         Bin Ma; Alibaba Group
         Haizhou Li; National University of Singapore
 
 SPE-P9.8: SPEAKER-AWARE TRAINING OF ATTENTION-BASED END-TO-END SPEECH RECOGNITION USING NEURAL SPEAKER EMBEDDINGS
         Aku Rouhe; Aalto University
         Tuomas Kaseva; Aalto University
         Mikko Kurimo; Aalto University
 
 SPE-P9.9: GENERATING SYNTHETIC AUDIO DATA FOR ATTENTION-BASED SPEECH RECOGNITION SYSTEMS
         Nick Rossenbach; RWTH Aachen University
         Albert Zeyer; RWTH Aachen University
         Ralf Schlüter; RWTH Aachen University
         Hermann Ney; RWTH Aachen University
 
 SPE-P9.10: CORRECTION OF AUTOMATIC SPEECH RECOGNITION WITH TRANSFORMER SEQUENCE-TO-SEQUENCE MODEL
         Oleksii Hrinchuk; Moscow Institute of Physics and Technology and NVIDIA
         Mariya Popova; Carnegie Mellon University and NVIDIA
         Boris Ginsburg; NVIDIA
 
 SPE-P9.11: EXPLORING PRE-TRAINING WITH ALIGNMENTS FOR RNN TRANSDUCER BASED END-TO-END SPEECH RECOGNITION
         Hu Hu; Georgia Institute of Technology
         Rui Zhao; Microsoft
         Jinyu Li; Microsoft
         Liang Lu; Microsoft
         Yifan Gong; Microsoft
 
 SPE-P9.12: SELF-TRAINING FOR END-TO-END SPEECH RECOGNITION
         Jacob Kahn; Facebook
         Ann Lee; Facebook
         Awni Hannun; Facebook