Technical Program

SLP-P13: New Features, Models and Representations / Audio Visual ASR

Session Type: Poster
Time: Thursday, May 16, 13:00 - 15:00
Location: Poster Area A, Ground Floor
Session Chair: Dorothea Kolossa, Ruhr University Bochum
 
SLP-P13.1: LEARNED IN SPEECH RECOGNITION: CONTEXTUAL ACOUSTIC WORD EMBEDDINGS
         Shruti Palaskar; Carnegie Mellon University
         Vikas Raunak; Carnegie Mellon University
         Florian Metze; Carnegie Mellon University
 
SLP-P13.2: TRULY UNSUPERVISED ACOUSTIC WORD EMBEDDINGS USING WEAK TOP-DOWN CONSTRAINTS IN ENCODER-DECODER MODELS
         Herman Kamper; Stellenbosch University
 
SLP-P13.3: A FACTORIAL DEEP MARKOV MODEL FOR UNSUPERVISED DISENTANGLED REPRESENTATION LEARNING FROM SPEECH
         Sameer Khurana; Massachusetts Institute of Technology
         Shafiq Rayhan Joty; Nanyang Technological University (NTU)
         Ahmed Ali; Qatar Computing Research Institute (QCRI)
         James Glass; Massachusetts Institute of Technology
 
SLP-P13.4: M-VECTORS: SUB-BAND BASED ENERGY MODULATION FEATURES FOR MULTI-STREAM AUTOMATIC SPEECH RECOGNITION
         Samik Sadhu; Johns Hopkins University
         Ruizhi Li; Johns Hopkins University
         Hynek Hermansky; Johns Hopkins University
 
SLP-P13.5: IMPROVING LAYER TRAJECTORY LSTM WITH FUTURE CONTEXT FRAMES
         Jinyu Li; Microsoft
         Liang Lu; Microsoft
         Changliang Liu; Microsoft
         Yifan Gong; Microsoft
 
SLP-P13.6: BAYESIAN AND GAUSSIAN PROCESS NEURAL NETWORKS FOR LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION
         Shoukang Hu; The Chinese University of Hong Kong
         Max W. Y. Lam; The Chinese University of Hong Kong
         Xurong Xie; The Chinese University of Hong Kong
         Shansong Liu; The Chinese University of Hong Kong
         Jianwei Yu; The Chinese University of Hong Kong
         Xixin Wu; The Chinese University of Hong Kong
         Xunying Liu; The Chinese University of Hong Kong
         Helen Meng; The Chinese University of Hong Kong
 
SLP-P13.7: IMPROVING AUDIO-VISUAL SPEECH RECOGNITION PERFORMANCE WITH CROSS-MODAL STUDENT-TEACHER TRAINING
         Wei Li; Georgia Institute of Technology
         Sicheng Wang; Georgia Institute of Technology
         Ming Lei; Alibaba
         Sabato Marco Siniscalchi; Kore University of Enna
         Chin-Hui Lee; Georgia Institute of Technology
 
SLP-P13.8: MODALITY ATTENTION FOR END-TO-END AUDIO-VISUAL SPEECH RECOGNITION
         Pan Zhou; Tsinghua University
         Wenwen Yang; Sogou Technology Incorporated
         Wei Chen; Sogou Technology Incorporated
         Yanfeng Wang; Sogou Technology Incorporated
         Jia Jia; Tsinghua University
 
SLP-P13.9: ROBUST AUDIO-VISUAL SPEECH RECOGNITION USING BIMODAL DFSMN WITH MULTI-CONDITION TRAINING AND DROPOUT REGULARIZATION
         Shiliang Zhang; Machine Intelligence Technology, Alibaba Group
         Ming Lei; Machine Intelligence Technology, Alibaba Group
         Bin Ma; Machine Intelligence Technology, Alibaba Group
         Lei Xie; Northwestern Polytechnical University