Technical Program

Manuscripts can be viewed on IEEE Xplore in the IEEE ICASSP 2020 Open Preview.

SPE-P5: Deep Speaker Recognition Models

Session Type: Poster
Time: Wednesday, 6 May, 09:00 - 11:00
Location: On-Demand
Session Chair: Kong-Aik Lee, NEC Corporation
 
 SPE-P5.1: FREQUENCY AND TEMPORAL CONVOLUTIONAL ATTENTION FOR TEXT-INDEPENDENT SPEAKER RECOGNITION
         Sarthak Yadav; Staqu Technologies
         Atul Rai; Staqu Technologies
 
 SPE-P5.2: FRAME-LEVEL PHONEME-INVARIANT SPEAKER EMBEDDING FOR TEXT-INDEPENDENT SPEAKER RECOGNITION ON EXTREMELY SHORT UTTERANCES
         Naohiro Tawara; NTT Communication Science Laboratories
         Atsunori Ogawa; NTT Communication Science Laboratories
         Tomoharu Iwata; NTT Communication Science Laboratories
         Marc Delcroix; NTT Communication Science Laboratories
         Tetsuji Ogawa; Waseda University
 
 SPE-P5.3: PROTOTYPICAL NETWORKS FOR SMALL FOOTPRINT TEXT-INDEPENDENT SPEAKER VERIFICATION
         Tom Ko; Southern University of Science and Technology
         Yangbin Chen; City University of Hong Kong
         Qing Li; Hong Kong Polytechnic University
 
 SPE-P5.4: TDMF: TASK-DRIVEN MULTILEVEL FRAMEWORK FOR END-TO-END SPEAKER VERIFICATION
         Chen Chen; Harbin Institute of Technology
         Jiqing Han; Harbin Institute of Technology
 
 SPE-P5.5: AN IMPROVED DEEP NEURAL NETWORK FOR MODELING SPEAKER CHARACTERISTICS AT DIFFERENT TEMPORAL SCALES
         Bin Gu; University of Science and Technology of China
         Wu Guo; University of Science and Technology of China
         Li-Rong Dai; University of Science and Technology of China
         Jun Du; University of Science and Technology of China
 
 SPE-P5.6: PARTIAL AUC OPTIMIZATION BASED DEEP SPEAKER EMBEDDINGS WITH CLASS-CENTER LEARNING FOR TEXT-INDEPENDENT SPEAKER VERIFICATION
         Zhongxin Bai; Northwestern Polytechnical University
         Xiao-Lei Zhang; Northwestern Polytechnical University
         Jingdong Chen; Northwestern Polytechnical University
 
 SPE-P5.7: KNOWLEDGE DISTILLATION AND RANDOM ERASING DATA AUGMENTATION FOR TEXT-DEPENDENT SPEAKER VERIFICATION
         Victoria Mingote; University of Zaragoza
         Antonio Miguel; University of Zaragoza
         Dayana Ribas; University of Zaragoza
         Alfonso Ortega; University of Zaragoza
         Eduardo Lleida; University of Zaragoza
 
 SPE-P5.8: DISENTANGLED SPEECH EMBEDDINGS USING CROSS-MODAL SELF-SUPERVISION
         Arsha Nagrani; Oxford University
         Joon Son Chung; Oxford University
         Samuel Albanie; Oxford University
         Andrew Zisserman; Oxford University
 
 SPE-P5.9: IMPROVING DEEP CNN NETWORKS WITH LONG TEMPORAL CONTEXT FOR TEXT-INDEPENDENT SPEAKER VERIFICATION
         Yong Zhao; Microsoft Corporation
         Tianyan Zhou; Microsoft Corporation
         Zhuo Chen; Microsoft Corporation
         Jian Wu; Microsoft Corporation
 
 SPE-P5.10: MULTI-LEVEL DEEP NEURAL NETWORK ADAPTATION FOR SPEAKER VERIFICATION USING MMD AND CONSISTENCY REGULARIZATION
         Weiwei Lin; Hong Kong Polytechnic University
         Man-Wai Mak; Hong Kong Polytechnic University
         Na Li; Tencent AI Lab
         Dan Su; Tencent AI Lab
         Dong Yu; Tencent AI Lab
 
 SPE-P5.11: MULTI-TASK LEARNING FOR SPEAKER VERIFICATION AND VOICE TRIGGER DETECTION
         Siddharth Sigtia; Apple
         Erik Marchi; Apple
         Sachin Kajarekar; Apple
         Devang Naik; Apple
         John Bridle; Apple
 
 SPE-P5.12: STATISTICS POOLING TIME DELAY NEURAL NETWORK BASED ON X-VECTOR FOR SPEAKER VERIFICATION
         Qian-Bei Hong; National Cheng Kung University and Academia Sinica
         Chung-Hsien Wu; National Cheng Kung University and Academia Sinica
         Hsin-Min Wang; National Cheng Kung University and Academia Sinica
         Chien-Lin Huang; Ping An Technology (Shenzhen) Co., Ltd.