SLP-L21.1

IMPROVING ATTENTION-BASED END-TO-END SPEECH RECOGNITION BY MONOTONIC ALIGNMENT ATTENTION MATRIX RECONSTRUCTION

Ziyang Zhuang, Kun Zou, Chenfeng Miao, Ming Fang, Tao Wei, PingAn Technology, China; Zijian Li, Georgia Institute of Technology, United States of America; Wei Hu, Shaojun Wang, Jing Xiao, PingAn Technology, China

Session:
SLP-L21: End-to-end modeling for automatic speech recognition (Lecture)

Track:
Speech and Language Processing

Location:
Room 103

Presentation Time:
Thu, 18 Apr, 13:10 - 13:30 (UTC +9)

Session Co-Chairs:
Bhuvana Ramabhadran, Google and Jinyu Li, Microsoft
Papers in Session SLP-L21:
SLP-L21.1: IMPROVING ATTENTION-BASED END-TO-END SPEECH RECOGNITION BY MONOTONIC ALIGNMENT ATTENTION MATRIX RECONSTRUCTION
Ziyang Zhuang, Kun Zou, Chenfeng Miao, Ming Fang, Tao Wei, PingAn Technology, China; Zijian Li, Georgia Institute of Technology, United States of America; Wei Hu, Shaojun Wang, Jing Xiao, PingAn Technology, China
SLP-L21.2: USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models
Shaojin Ding, David Qiu, David Rim, Yanzhang He, Oleg Rybakov, Bo Li, Rohit Prabhavalkar, Weiran Wang, Tara N. Sainath, Zhonglin Han, Google, United States of America; Jian Li, Google Deepmind, United States of America; Amir Yazdanbakhsh, Google DeepMind, United States of America; Shivani Agrawal, Google, United States of America
SLP-L21.3: KEEP DECODING PARALLEL WITH EFFECTIVE KNOWLEDGE DISTILLATION FROM LANGUAGE MODELS TO END-TO-END SPEECH RECOGNISERS
Michael Hentschel, LINE WORKS Corporation, Japan; Yuta Nishikawa, Nara Institute of Science and Technology, Japan; Tatsuya Komatsu, Yusuke Fujita, LINE Corporation, Japan
SLP-L21.4: EXTREME ENCODER OUTPUT FRAME RATE REDUCTION: IMPROVING COMPUTATIONAL LATENCIES OF LARGE END-TO-END MODELS
Rohit Prabhavalkar, Zhong Meng, Weiran Wang, Adam Stooke, Xingyu Cai, Yanzhang He, Arun Narayanan, Dongseong Hwang, Tara N. Sainath, Pedro J. Moreno, Google LLC, United States of America
SLP-L21.5: IMPROVING MULTI-SPEAKER ASR WITH OVERLAP-AWARE ENCODING AND MONOTONIC ATTENTION
Tao Li, Feng Wang, Wenhao Guan, Lingyan Huang, Qingyang Hong, Lin Li, Xiamen University, China
SLP-L21.6: ON THE RELATION BETWEEN INTERNAL LANGUAGE MODEL AND SEQUENCE DISCRIMINATIVE TRAINING FOR NEURAL TRANSDUCERS
Zijian Yang, Wei Zhou, Ralf Schlüter, Hermann Ney, RWTH Aachen University, Germany