SLP-P13: New Features, Models and Representations / Audio Visual ASR |
Session Type: Poster |
Time: Thursday, May 16, 13:00 - 15:00 |
Location: Poster Area A, Ground Floor |
Session Chair: Dorothea Kolossa, Ruhr University Bochum |
SLP-P13.1: LEARNED IN SPEECH RECOGNITION: CONTEXTUAL ACOUSTIC WORD EMBEDDINGS |
Shruti Palaskar; Carnegie Mellon University |
Vikas Raunak; Carnegie Mellon University |
Florian Metze; Carnegie Mellon University |
SLP-P13.2: TRULY UNSUPERVISED ACOUSTIC WORD EMBEDDINGS USING WEAK TOP-DOWN CONSTRAINTS IN ENCODER-DECODER MODELS |
Herman Kamper; Stellenbosch University |
SLP-P13.3: A FACTORIAL DEEP MARKOV MODEL FOR UNSUPERVISED DISENTANGLED REPRESENTATION LEARNING FROM SPEECH |
Sameer Khurana; Massachusetts Institute of Technology |
Shafiq Rayhan Joty; NTU |
Ahmed Ali; QCRI |
James Glass; Massachusetts Institute of Technology |
SLP-P13.4: M-VECTORS: SUB-BAND BASED ENERGY MODULATION FEATURES FOR MULTI-STREAM AUTOMATIC SPEECH RECOGNITION |
Samik Sadhu; Johns Hopkins University |
Ruizhi Li; Johns Hopkins University |
Hynek Hermansky; Johns Hopkins University |
SLP-P13.5: IMPROVING LAYER TRAJECTORY LSTM WITH FUTURE CONTEXT FRAMES |
Jinyu Li; Microsoft |
Liang Lu; Microsoft |
Changliang Liu; Microsoft |
Yifan Gong; Microsoft |
SLP-P13.6: BAYESIAN AND GAUSSIAN PROCESS NEURAL NETWORKS FOR LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION |
Shoukang Hu; The Chinese University of Hong Kong |
Max W. Y. Lam; The Chinese University of Hong Kong |
Xurong Xie; The Chinese University of Hong Kong |
Shansong Liu; The Chinese University of Hong Kong |
Jianwei Yu; The Chinese University of Hong Kong |
Xixin Wu; The Chinese University of Hong Kong |
Xunying Liu; The Chinese University of Hong Kong |
Helen Meng; The Chinese University of Hong Kong |
SLP-P13.7: IMPROVING AUDIO-VISUAL SPEECH RECOGNITION PERFORMANCE WITH CROSS-MODAL STUDENT-TEACHER TRAINING |
Wei Li; Georgia Institute of Technology |
Sicheng Wang; Georgia Institute of Technology |
Ming Lei; Alibaba |
Sabato Marco Siniscalchi; Kore University of Enna |
Chin-Hui Lee; Georgia Institute of Technology |
SLP-P13.8: MODALITY ATTENTION FOR END-TO-END AUDIO-VISUAL SPEECH RECOGNITION |
Pan Zhou; Tsinghua University |
Wenwen Yang; Sogou Technology Incorporated |
Wei Chen; Sogou Technology Incorporated |
Yanfeng Wang; Sogou Technology Incorporated |
Jia Jia; Tsinghua University |
SLP-P13.9: ROBUST AUDIO-VISUAL SPEECH RECOGNITION USING BIMODAL DFSMN WITH MULTI-CONDITION TRAINING AND DROPOUT REGULARIZATION |
Shiliang Zhang; Machine Intelligence Technology, Alibaba Group |
Ming Lei; Machine Intelligence Technology, Alibaba Group |
Bin Ma; Machine Intelligence Technology, Alibaba Group |
Lei Xie; Northwestern Polytechnical University |