SLP-P23: End-to-end Speech Recognition V: Modeling Methods |
Session Type: Poster |
Time: Friday, May 17, 16:00 - 18:00 |
Location: Poster Area A, Ground Floor |
Session Chair: Gakuto Kurata, IBM |
SLP-P23.1: END-TO-END SPEECH RECOGNITION USING A HIGH RANK LSTM-CTC BASED MODEL |
Yangyang Shi; Mobvoi AI Lab |
Mei-Yuh Hwang; Mobvoi AI Lab |
Xin Lei; Mobvoi AI Lab |
SLP-P23.2: INVESTIGATION OF MODELING UNITS FOR MANDARIN SPEECH RECOGNITION USING DFSMN-CTC-SMBR |
Shiliang Zhang; Machine Intelligence Technology, Alibaba Group |
Ming Lei; Machine Intelligence Technology, Alibaba Group |
Yuan Liu; Machine Intelligence Technology, Alibaba Group |
Wei Li; Machine Intelligence Technology, Alibaba Group |
SLP-P23.3: END-TO-END ANCHORED SPEECH RECOGNITION |
Yiming Wang; Johns Hopkins University |
Xing Fan; Amazon |
I-Fan Chen; Amazon |
Yuzong Liu; Amazon |
Tongfei Chen; Johns Hopkins University |
Björn Hoffmeister; Amazon |
SLP-P23.4: THE SPEECHTRANSFORMER FOR LARGE-SCALE MANDARIN CHINESE SPEECH RECOGNITION |
Yuanyuan Zhao; Kwai |
Jie Li; Kwai |
Xiaorui Wang; Kwai |
Yan Li; Kwai |
SLP-P23.5: WINDOWED ATTENTION MECHANISMS FOR SPEECH RECOGNITION |
Shucong Zhang; University of Edinburgh |
Erfan Loweimi; University of Edinburgh |
Peter Bell; University of Edinburgh |
Steve Renals; University of Edinburgh |
SLP-P23.6: STREAM ATTENTION-BASED MULTI-ARRAY END-TO-END SPEECH RECOGNITION |
Xiaofei Wang; Johns Hopkins University |
Ruizhi Li; Johns Hopkins University |
Sri Harish Mallidi; Amazon |
Takaaki Hori; Mitsubishi Electric Research Laboratories |
Shinji Watanabe; Johns Hopkins University |
Hynek Hermansky; Johns Hopkins University |
SLP-P23.7: IMPROVING END-TO-END SPEECH RECOGNITION WITH PRONUNCIATION-ASSISTED SUB-WORD MODELING |
Hainan Xu; Johns Hopkins University |
Shuoyang Ding; Johns Hopkins University |
Shinji Watanabe; Johns Hopkins University |
SLP-P23.8: SELF-ATTENTION NETWORKS FOR CONNECTIONIST TEMPORAL CLASSIFICATION IN SPEECH RECOGNITION |
Julian Salazar; Amazon AI |
Katrin Kirchhoff; Amazon AI |
Zhiheng Huang; Amazon AI |
SLP-P23.9: SEMANTIC QUERY-BY-EXAMPLE SPEECH SEARCH USING VISUAL GROUNDING |
Herman Kamper; Stellenbosch University |
Aristotelis Anastassiou; Stellenbosch University |
Karen Livescu; TTI-Chicago |