| SPE-P6: Speech Recognition: Acoustic Modelling I |
| Session Type: Poster |
| Time: Wednesday, 6 May, 09:00 - 11:00 |
| Location: On-Demand |
| Session Chairs: Sakriani Sakti, Nara Institute of Science and Technology / RIKEN AIP and Rohit Prabhavalkar, Google |
| SPE-P6.1: SNDCNN: SELF-NORMALIZING DEEP CNNS WITH SCALED EXPONENTIAL LINEAR UNITS FOR SPEECH RECOGNITION |
| Zhen Huang; Apple |
| Tim Ng; Apple |
| Leo Liu; Apple |
| Henry Mason; Apple |
| Xiaodan Zhuang; Apple |
| Daben Liu; Apple |
| SPE-P6.2: ROBUST MULTI-CHANNEL SPEECH RECOGNITION USING FREQUENCY ALIGNED NETWORK |
| Taejin Park; University of Southern California |
| Kenichi Kumatani; Amazon, Inc. |
| Minhua Wu; Amazon, Inc. |
| Shiva Sundaram; Amazon, Inc. |
| SPE-P6.3: FULLY LEARNABLE FRONT-END FOR MULTI-CHANNEL ACOUSTIC MODELING USING SEMI-SUPERVISED LEARNING |
| Sanna Wager; Indiana University |
| Aparna Khare; Amazon, Inc. |
| Minhua Wu; Amazon, Inc. |
| Kenichi Kumatani; Amazon, Inc. |
| Shiva Sundaram; Amazon, Inc. |
| SPE-P6.4: G2G: TTS-DRIVEN PRONUNCIATION LEARNING FOR GRAPHEMIC HYBRID ASR |
| Duc Le; Facebook |
| Thilo Koehler; Facebook |
| Christian Fuegen; Facebook |
| Michael L. Seltzer; Facebook |
| SPE-P6.5: TRANSFORMER-BASED ACOUSTIC MODELING FOR HYBRID SPEECH RECOGNITION |
| Yongqiang Wang; Facebook |
| Abdelrahman Mohamed; Facebook |
| Duc Le; Facebook |
| Chunxi Liu; Facebook |
| Alex Xiao; Facebook |
| Jay Mahadeokar; Facebook |
| Hongzhao Huang; Facebook |
| Andros Tjandra; Facebook |
| Xiaohui Zhang; Facebook |
| Frank Zhang; Facebook |
| Christian Fuegen; Facebook |
| Geoffrey Zweig; Facebook |
| Michael L. Seltzer; Facebook |
| SPE-P6.6: SPECAUGMENT ON LARGE SCALE DATASETS |
| Daniel Park; Google, Inc. |
| Yu Zhang; Google, Inc. |
| Chung-Cheng Chiu; Google, Inc. |
| Youzheng Chen; Google, Inc. |
| Bo Li; Google, Inc. |
| William Chan; Google, Inc. |
| Quoc Le; Google, Inc. |
| Yonghui Wu; Google, Inc. |
| SPE-P6.7: FAST TRAINING OF DEEP NEURAL NETWORKS FOR SPEECH RECOGNITION |
| Guojing Cong; IBM |
| Brian Kingsbury; IBM |
| Chih-Chieh Yang; IBM |
| Tianyi Liu; Georgia Institute of Technology |
| SPE-P6.8: UNSUPERVISED PRE-TRAINING OF BIDIRECTIONAL SPEECH ENCODERS VIA MASKED RECONSTRUCTION |
| Weiran Wang; Amazon, Inc. |
| Qingming Tang; Amazon, Inc. |
| Karen Livescu; TTI-Chicago |
| SPE-P6.9: DISTILLING ATTENTION WEIGHTS FOR CTC-BASED ASR SYSTEMS |
| Takafumi Moriya; NTT Corporation |
| Hiroshi Sato; NTT Corporation |
| Tomohiro Tanaka; NTT Corporation |
| Takanori Ashihara; NTT Corporation |
| Ryo Masumura; NTT Corporation |
| Yusuke Shinohara; NTT Corporation |
| SPE-P6.10: DEJA-VU: DOUBLE FEATURE PRESENTATION AND ITERATED LOSS IN DEEP TRANSFORMER NETWORKS |
| Andros Tjandra; Nara Institute of Science and Technology |
| Chunxi Liu; Facebook AI |
| Frank Zhang; Facebook AI |
| Xiaohui Zhang; Facebook AI |
| Yongqiang Wang; Facebook AI |
| Gabriel Synnaeve; Facebook AI |
| Satoshi Nakamura; Nara Institute of Science and Technology |
| Geoffrey Zweig; Facebook AI |
| SPE-P6.11: FRAME-LEVEL MMI AS A SEQUENCE DISCRIMINATIVE TRAINING CRITERION FOR LVCSR |
| Wilfried Michel; RWTH Aachen University |
| Ralf Schlüter; RWTH Aachen University |
| Hermann Ney; RWTH Aachen University |
| SPE-P6.12: CROSS LINGUAL TRANSFER LEARNING FOR ZERO-RESOURCE DOMAIN ADAPTATION |
| Alberto Abad; INESC-ID/IST |
| Peter Bell; CSTR/University of Edinburgh |
| Andrea Carmantini; CSTR/University of Edinburgh |
| Steve Renals; CSTR/University of Edinburgh |