Technical Program

SLP-P13: New Features, Models and Representations / Audio Visual ASR

Session Type: Poster

Time: Thursday, May 16, 13:00 - 15:00

Location: Poster Area A, Ground Floor

Session Chair: Dorothea Kolossa, Ruhr University Bochum

SLP-P13.1: LEARNED IN SPEECH RECOGNITION: CONTEXTUAL ACOUSTIC WORD EMBEDDINGS

Shruti Palaskar; Carnegie Mellon University

Vikas Raunak; Carnegie Mellon University

Florian Metze; Carnegie Mellon University

SLP-P13.2: TRULY UNSUPERVISED ACOUSTIC WORD EMBEDDINGS USING WEAK TOP-DOWN CONSTRAINTS IN ENCODER-DECODER MODELS

Herman Kamper; Stellenbosch University

SLP-P13.3: A FACTORIAL DEEP MARKOV MODEL FOR UNSUPERVISED DISENTANGLED REPRESENTATION LEARNING FROM SPEECH

Sameer Khurana; Massachusetts Institute of Technology

Shafiq Rayhan Joty; Nanyang Technological University (NTU)

Ahmed Ali; Qatar Computing Research Institute (QCRI)

James Glass; Massachusetts Institute of Technology

SLP-P13.4: M-VECTORS: SUB-BAND BASED ENERGY MODULATION FEATURES FOR MULTI-STREAM AUTOMATIC SPEECH RECOGNITION

Samik Sadhu; Johns Hopkins University

Ruizhi Li; Johns Hopkins University

Hynek Hermansky; Johns Hopkins University

SLP-P13.5: IMPROVING LAYER TRAJECTORY LSTM WITH FUTURE CONTEXT FRAMES

Jinyu Li; Microsoft

Liang Lu; Microsoft

Changliang Liu; Microsoft

Yifan Gong; Microsoft

SLP-P13.6: BAYESIAN AND GAUSSIAN PROCESS NEURAL NETWORKS FOR LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION

Shoukang Hu; The Chinese University of Hong Kong

Max W. Y. Lam; The Chinese University of Hong Kong

Xurong Xie; The Chinese University of Hong Kong

Shansong Liu; The Chinese University of Hong Kong

Jianwei Yu; The Chinese University of Hong Kong

Xixin Wu; The Chinese University of Hong Kong

Xunying Liu; The Chinese University of Hong Kong

Helen Meng; The Chinese University of Hong Kong

SLP-P13.7: IMPROVING AUDIO-VISUAL SPEECH RECOGNITION PERFORMANCE WITH CROSS-MODAL STUDENT-TEACHER TRAINING

Wei Li; Georgia Institute of Technology

Sicheng Wang; Georgia Institute of Technology

Ming Lei; Alibaba

Sabato Marco Siniscalchi; Kore University of Enna

Chin-Hui Lee; Georgia Institute of Technology

SLP-P13.8: MODALITY ATTENTION FOR END-TO-END AUDIO-VISUAL SPEECH RECOGNITION

Pan Zhou; Tsinghua University

Wenwen Yang; Sogou Technology Incorporated

Wei Chen; Sogou Technology Incorporated

Yanfeng Wang; Sogou Technology Incorporated

Jia Jia; Tsinghua University

SLP-P13.9: ROBUST AUDIO-VISUAL SPEECH RECOGNITION USING BIMODAL DFSMN WITH MULTI-CONDITION TRAINING AND DROPOUT REGULARIZATION

Shiliang Zhang; Machine Intelligence Technology, Alibaba Group

Ming Lei; Machine Intelligence Technology, Alibaba Group

Bin Ma; Machine Intelligence Technology, Alibaba Group

Lei Xie; Northwestern Polytechnical University