Tue PM2.L1.1
Mask-CTC-based Encoder Pre-training for Streaming End-to-End Speech Recognition
Huaibo Zhao, Yosuke Higuchi, Waseda University, Japan; Yusuke Kida, Line Corporation, Japan; Tetsuji Ogawa, Tetsunori Kobayashi, Waseda University, Japan
Session: Tue PM2.L1: Speech Recognition Lecture
Track: ASMSP - Acoustic, Speech and Music Signal Processing
Location: EUROPAEA
Presentation Time: Tue, 5 Sep, 16:40 - 17:00 Finland Time (UTC +3)
Session Chair: Stefan Goetze, University of Sheffield
Session Tue PM2.L1
Tue PM2.L1.1: Mask-CTC-based Encoder Pre-training for Streaming End-to-End Speech Recognition
Huaibo Zhao, Yosuke Higuchi, Waseda University, Japan; Yusuke Kida, Line Corporation, Japan; Tetsuji Ogawa, Tetsunori Kobayashi, Waseda University, Japan
Tue PM2.L1.2: Low-Resource Text-to-Speech Using Specific Data and Noise Augmentation
Kishor Kayyar Lakshminarayana, Christian Dittmar, Nicola Pia, Fraunhofer Institute for Integrated Circuits (IIS), Germany; Emanuël A.P. Habets, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
Tue PM2.L1.3: Canonical Voice Conversion and Dual-Channel Processing for Improved Voice Privacy of Speech Recognition Data
Dushyant Sharma, Nuance / Microsoft, United States; Francesco Nespoli, Nuance / Imperial College, United Kingdom; Rong Gong, Nuance / Microsoft, Austria; Patrick Naylor, Imperial College, United Kingdom
Tue PM2.L1.4: Room Adaptation of Training Data for Distant Speech Recognition
James Fosburgh, Dushyant Sharma, Nuance Communications Inc., United States; Patrick Naylor, Imperial College London, United Kingdom
Tue PM2.L1.5: A Privacy-Preserving Method Using Secret Key for Convolutional Neural Network-Based Speech Classification
Shoko Niwa, Sayaka Shiota, Hitoshi Kiya, Tokyo Metropolitan University, Japan