SPE-P20: Speech Recognition: Acoustic Modelling II |
Session Type: Poster |
Time: Friday, 8 May, 11:45 - 13:45 |
Location: On-Demand |
Session Chairs: Dorothea Kolossa, Ruhr University Bochum and Arun Narayanan, Google |
SPE-P20.1: LIBRI-LIGHT: A BENCHMARK FOR ASR WITH LIMITED OR NO SUPERVISION |
Jacob Kahn; Facebook |
Morgane Rivière; Facebook |
Weiyi Zheng; Facebook |
Eugene Kharitonov; Facebook |
Qiantong Xu; Facebook |
Pierre-Emmanuel Mazaré; Facebook |
Julien Karadayi; ENS |
Vitaly Liptchinsky; Facebook |
Ronan Collobert; Facebook |
Christian Fuegen; Facebook |
Tatiana Likhomanenko; Facebook |
Gabriel Synnaeve; Facebook |
Armand Joulin; Facebook |
Abdelrahman Mohamed; Facebook |
Emmanuel Dupoux; Facebook / EHESS |
SPE-P20.2: A COMPREHENSIVE STUDY OF RESIDUAL CNNS FOR ACOUSTIC MODELING IN ASR |
Vitalii Bozheniuk; RWTH Aachen University |
Albert Zeyer; RWTH Aachen University |
Ralf Schlüter; RWTH Aachen University |
Hermann Ney; RWTH Aachen University |
SPE-P20.3: LAYER-NORMALIZED LSTM FOR HYBRID-HMM AND END-TO-END ASR |
Mohammad Zeineldeen; RWTH Aachen University |
Albert Zeyer; RWTH Aachen University |
Ralf Schlüter; RWTH Aachen University |
Hermann Ney; RWTH Aachen University |
SPE-P20.4: SMALL ENERGY MASKING FOR IMPROVED NEURAL NETWORK TRAINING FOR END-TO-END SPEECH RECOGNITION |
Chanwoo Kim; Samsung Research |
Kwangyoun Kim; Samsung Research |
Sathish Reddy Indurthi; Samsung Research |
SPE-P20.5: IMPROVING SEQUENCE-TO-SEQUENCE SPEECH RECOGNITION TRAINING WITH ON-THE-FLY DATA AUGMENTATION |
Thai Son Nguyen; Karlsruhe Institute of Technology |
Sebastian Stueker; Karlsruhe Institute of Technology |
Jan Niehues; Maastricht University |
Alex Waibel; Karlsruhe Institute of Technology |
SPE-P20.6: EFFECTIVENESS OF SELF-SUPERVISED PRE-TRAINING FOR ASR |
Alexei Baevski; Facebook |
Abdelrahman Mohamed; Facebook |
SPE-P20.7: HIGH-ACCURACY AND LOW-LATENCY SPEECH RECOGNITION WITH TWO-HEAD CONTEXTUAL LAYER TRAJECTORY LSTM MODEL |
Jinyu Li; Microsoft |
Rui Zhao; Microsoft |
Eric Sun; Microsoft |
Jeremy Wong; Microsoft |
Amit Das; Microsoft |
Zhong Meng; Microsoft |
Yifan Gong; Microsoft |
SPE-P20.8: DFSMN-SAN WITH PERSISTENT MEMORY MODEL FOR AUTOMATIC SPEECH RECOGNITION |
Zhao You; Tencent |
Dan Su; Tencent |
Jie Chen; Tencent |
Chao Weng; Tencent |
Dong Yu; Tencent |
SPE-P20.9: DYNAMIC TEMPORAL RESIDUAL LEARNING FOR SPEECH RECOGNITION |
Jiaqi Xie; Tsinghua University |
Ruijie Yan; Tsinghua University |
Shanyu Xiao; Tsinghua University |
Liangrui Peng; Tsinghua University |
Michael T. Johnson; University of Kentucky |
Wei-Qiang Zhang; Tsinghua University |
SPE-P20.10: E2E-SINCNET: TOWARD FULLY END-TO-END SPEECH RECOGNITION |
Titouan Parcollet; University of Oxford |
Mohamed Morchid; University of Avignon |
Georges Linarès; University of Avignon |
SPE-P20.11: SPEAKER AUGMENTATION FOR LOW RESOURCE SPEECH RECOGNITION |
Chenpeng Du; Shanghai Jiao Tong University |
Kai Yu; Shanghai Jiao Tong University |
SPE-P20.12: CGCNN: COMPLEX GABOR CONVOLUTIONAL NEURAL NETWORK ON RAW SPEECH |
Paul-Gauthier Noé; Avignon Université |
Titouan Parcollet; University of Oxford |
Mohamed Morchid; Avignon Université |