SPE-P6: Speech Recognition: Acoustic Modelling I |
Session Type: Poster |
Time: Wednesday, 6 May, 09:00 - 11:00 |
Location: On-Demand |
Session Chairs: Sakriani Sakti, Nara Institute of Science and Technology / RIKEN AIP; Rohit Prabhavalkar, Google
|
SPE-P6.1: SNDCNN: SELF-NORMALIZING DEEP CNNS WITH SCALED EXPONENTIAL LINEAR UNITS FOR SPEECH RECOGNITION |
Zhen Huang; Apple |
Tim Ng; Apple |
Leo Liu; Apple |
Henry Mason; Apple |
Xiaodan Zhuang; Apple |
Daben Liu; Apple |
|
SPE-P6.2: ROBUST MULTI-CHANNEL SPEECH RECOGNITION USING FREQUENCY ALIGNED NETWORK |
Taejin Park; University of Southern California |
Kenichi Kumatani; Amazon, Inc. |
Minhua Wu; Amazon, Inc. |
Shiva Sundaram; Amazon, Inc. |
|
SPE-P6.3: FULLY LEARNABLE FRONT-END FOR MULTI-CHANNEL ACOUSTIC MODELING USING SEMI-SUPERVISED LEARNING |
Sanna Wager; Indiana University |
Aparna Khare; Amazon, Inc. |
Minhua Wu; Amazon, Inc. |
Kenichi Kumatani; Amazon, Inc. |
Shiva Sundaram; Amazon, Inc. |
|
SPE-P6.4: G2G: TTS-DRIVEN PRONUNCIATION LEARNING FOR GRAPHEMIC HYBRID ASR |
Duc Le; Facebook |
Thilo Koehler; Facebook |
Christian Fuegen; Facebook |
Michael L. Seltzer; Facebook |
|
SPE-P6.5: TRANSFORMER-BASED ACOUSTIC MODELING FOR HYBRID SPEECH RECOGNITION |
Yongqiang Wang; Facebook |
Abdelrahman Mohamed; Facebook |
Duc Le; Facebook |
Chunxi Liu; Facebook |
Alex Xiao; Facebook |
Jay Mahadeokar; Facebook |
Hongzhao Huang; Facebook |
Andros Tjandra; Facebook |
Xiaohui Zhang; Facebook |
Frank Zhang; Facebook |
Christian Fuegen; Facebook |
Geoffrey Zweig; Facebook |
Michael L. Seltzer; Facebook |
|
SPE-P6.6: SPECAUGMENT ON LARGE SCALE DATASETS |
Daniel Park; Google, Inc. |
Yu Zhang; Google, Inc. |
Chung-Cheng Chiu; Google, Inc. |
Youzheng Chen; Google, Inc. |
Bo Li; Google, Inc. |
William Chan; Google, Inc. |
Quoc Le; Google, Inc. |
Yonghui Wu; Google, Inc. |
|
SPE-P6.7: FAST TRAINING OF DEEP NEURAL NETWORKS FOR SPEECH RECOGNITION |
Guojing Cong; IBM |
Brian Kingsbury; IBM |
Chih-Chieh Yang; IBM |
Tianyi Liu; Georgia Institute of Technology |
|
SPE-P6.8: UNSUPERVISED PRE-TRAINING OF BIDIRECTIONAL SPEECH ENCODERS VIA MASKED RECONSTRUCTION |
Weiran Wang; Amazon, Inc. |
Qingming Tang; Amazon, Inc. |
Karen Livescu; TTI-Chicago |
|
SPE-P6.9: DISTILLING ATTENTION WEIGHTS FOR CTC-BASED ASR SYSTEMS |
Takafumi Moriya; NTT Corporation |
Hiroshi Sato; NTT Corporation |
Tomohiro Tanaka; NTT Corporation |
Takanori Ashihara; NTT Corporation |
Ryo Masumura; NTT Corporation |
Yusuke Shinohara; NTT Corporation |
|
SPE-P6.10: DEJA-VU: DOUBLE FEATURE PRESENTATION AND ITERATED LOSS IN DEEP TRANSFORMER NETWORKS |
Andros Tjandra; Nara Institute of Science and Technology |
Chunxi Liu; Facebook AI |
Frank Zhang; Facebook AI |
Xiaohui Zhang; Facebook AI |
Yongqiang Wang; Facebook AI |
Gabriel Synnaeve; Facebook AI |
Satoshi Nakamura; Nara Institute of Science and Technology |
Geoffrey Zweig; Facebook AI |
|
SPE-P6.11: FRAME-LEVEL MMI AS A SEQUENCE DISCRIMINATIVE TRAINING CRITERION FOR LVCSR |
Wilfried Michel; RWTH Aachen University |
Ralf Schlüter; RWTH Aachen University |
Hermann Ney; RWTH Aachen University |
|
SPE-P6.12: CROSS LINGUAL TRANSFER LEARNING FOR ZERO-RESOURCE DOMAIN ADAPTATION |
Alberto Abad; INESC-ID/IST |
Peter Bell; CSTR/University of Edinburgh |
Andrea Carmantini; CSTR/University of Edinburgh |
Steve Renals; CSTR/University of Edinburgh |
|