| SLP-P13: New Features, Models and Representations / Audio Visual ASR |
| Session Type: Poster |
| Time: Thursday, May 16, 13:00 - 15:00 |
| Location: Poster Area A, Ground Floor |
| Session Chair: Dorothea Kolossa, Ruhr University Bochum |
| SLP-P13.1: LEARNED IN SPEECH RECOGNITION: CONTEXTUAL ACOUSTIC WORD EMBEDDINGS |
| Shruti Palaskar; Carnegie Mellon University |
| Vikas Raunak; Carnegie Mellon University |
| Florian Metze; Carnegie Mellon University |
| SLP-P13.2: TRULY UNSUPERVISED ACOUSTIC WORD EMBEDDINGS USING WEAK TOP-DOWN CONSTRAINTS IN ENCODER-DECODER MODELS |
| Herman Kamper; Stellenbosch University |
| SLP-P13.3: A FACTORIAL DEEP MARKOV MODEL FOR UNSUPERVISED DISENTANGLED REPRESENTATION LEARNING FROM SPEECH |
| Sameer Khurana; Massachusetts Institute of Technology |
| Shafiq Rayhan Joty; NTU |
| Ahmed Ali; QCRI |
| James Glass; Massachusetts Institute of Technology |
| SLP-P13.4: M-VECTORS: SUB-BAND BASED ENERGY MODULATION FEATURES FOR MULTI-STREAM AUTOMATIC SPEECH RECOGNITION |
| Samik Sadhu; Johns Hopkins University |
| Ruizhi Li; Johns Hopkins University |
| Hynek Hermansky; Johns Hopkins University |
| SLP-P13.5: IMPROVING LAYER TRAJECTORY LSTM WITH FUTURE CONTEXT FRAMES |
| Jinyu Li; Microsoft |
| Liang Lu; Microsoft |
| Changliang Liu; Microsoft |
| Yifan Gong; Microsoft |
| SLP-P13.6: BAYESIAN AND GAUSSIAN PROCESS NEURAL NETWORKS FOR LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION |
| Shoukang Hu; The Chinese University of Hong Kong |
| Max W. Y. Lam; The Chinese University of Hong Kong |
| Xurong Xie; The Chinese University of Hong Kong |
| Shansong Liu; The Chinese University of Hong Kong |
| Jianwei Yu; The Chinese University of Hong Kong |
| Xixin Wu; The Chinese University of Hong Kong |
| Xunying Liu; The Chinese University of Hong Kong |
| Helen Meng; The Chinese University of Hong Kong |
| SLP-P13.7: IMPROVING AUDIO-VISUAL SPEECH RECOGNITION PERFORMANCE WITH CROSS-MODAL STUDENT-TEACHER TRAINING |
| Wei Li; Georgia Institute of Technology |
| Sicheng Wang; Georgia Institute of Technology |
| Ming Lei; Alibaba |
| Sabato Marco Siniscalchi; Kore University of Enna |
| Chin-Hui Lee; Georgia Institute of Technology |
| SLP-P13.8: MODALITY ATTENTION FOR END-TO-END AUDIO-VISUAL SPEECH RECOGNITION |
| Pan Zhou; Tsinghua University |
| Wenwen Yang; Sogou Technology Incorporated |
| Wei Chen; Sogou Technology Incorporated |
| Yanfeng Wang; Sogou Technology Incorporated |
| Jia Jia; Tsinghua University |
| SLP-P13.9: ROBUST AUDIO-VISUAL SPEECH RECOGNITION USING BIMODAL DFSMN WITH MULTI-CONDITION TRAINING AND DROPOUT REGULARIZATION |
| Shiliang Zhang; Machine Intelligence Technology, Alibaba Group |
| Ming Lei; Machine Intelligence Technology, Alibaba Group |
| Bin Ma; Machine Intelligence Technology, Alibaba Group |
| Lei Xie; Northwestern Polytechnical University |