Technical Program

Click on the icon to view the manuscript on IEEE Xplore in the IEEE ICASSP 2020 Open Preview.

SPE-L13: Speech Recognition: Representations and Embeddings

Session Type: Lecture
Time: Thursday, 7 May, 16:30 - 18:30
Location: On-Demand
Session Chair: Karen Livescu, Toyota Technological Institute - Chicago
 
 SPE-L13.1: MULTILINGUAL ACOUSTIC WORD EMBEDDING MODELS FOR PROCESSING ZERO-RESOURCE LANGUAGES
         Herman Kamper; Stellenbosch University
         Yevgen Matusevych; University of Edinburgh
         Sharon Goldwater; University of Edinburgh
 
 SPE-L13.2: MOCKINGJAY: UNSUPERVISED SPEECH REPRESENTATION LEARNING WITH DEEP BIDIRECTIONAL TRANSFORMER ENCODERS
         Andy T. Liu; National Taiwan University
         Shu-wen Yang; National Taiwan University
         Po-Han Chi; National Taiwan University
         Po-chun Hsu; National Taiwan University
         Hung-yi Lee; National Taiwan University
 
 SPE-L13.3: RECURRENT NEURAL AUDIOVISUAL WORD EMBEDDINGS FOR SYNCHRONIZED SPEECH AND REAL-TIME MRI
         Öykü Deniz Köse; Boğaziçi University
         Murat Saraçlar; Boğaziçi University
 
 SPE-L13.4: DEEP CONTEXTUALIZED ACOUSTIC REPRESENTATIONS FOR SEMI-SUPERVISED SPEECH RECOGNITION
         Shaoshi Ling; Amazon, Inc.
         Yuzong Liu; Amazon, Inc.
         Julian Salazar; Amazon, Inc.
         Katrin Kirchhoff; Amazon, Inc.
 
 SPE-L13.5: WHAT DOES A NETWORK LAYER HEAR? ANALYZING HIDDEN REPRESENTATIONS OF END-TO-END ASR THROUGH SPEECH SYNTHESIS
         Chung-Yi Li; National Taiwan University
         Pei-Chieh Yuan; National Taiwan University
         Hung-yi Lee; National Taiwan University
 
 SPE-L13.6: LEARNING A SUBWORD INVENTORY JOINTLY WITH END-TO-END AUTOMATIC SPEECH RECOGNITION
         Jennifer Drexler; Massachusetts Institute of Technology
         James Glass; Massachusetts Institute of Technology