SPE-P5: Deep Speaker Recognition Models |
Session Type: Poster |
Time: Wednesday, 6 May, 09:00 - 11:00 |
Location: On-Demand |
Session Chair: Kong-Aik Lee, NEC Corporation |
SPE-P5.1: FREQUENCY AND TEMPORAL CONVOLUTIONAL ATTENTION FOR TEXT-INDEPENDENT SPEAKER RECOGNITION |
Sarthak Yadav; Staqu Technologies |
Atul Rai; Staqu Technologies |
SPE-P5.2: FRAME-LEVEL PHONEME-INVARIANT SPEAKER EMBEDDING FOR TEXT-INDEPENDENT SPEAKER RECOGNITION ON EXTREMELY SHORT UTTERANCES |
Naohiro Tawara; NTT Communication Science Laboratories |
Atsunori Ogawa; NTT Communication Science Laboratories |
Tomoharu Iwata; NTT Communication Science Laboratories |
Marc Delcroix; NTT Communication Science Laboratories |
Tetsuji Ogawa; Waseda University |
SPE-P5.3: PROTOTYPICAL NETWORKS FOR SMALL FOOTPRINT TEXT-INDEPENDENT SPEAKER VERIFICATION |
Tom Ko; South University of Science and Technology |
Yangbin Chen; City University of Hong Kong |
Qing Li; Hong Kong Polytechnic University |
SPE-P5.4: TDMF: TASK-DRIVEN MULTILEVEL FRAMEWORK FOR END-TO-END SPEAKER VERIFICATION |
Chen Chen; Harbin Institute of Technology |
Jiqing Han; Harbin Institute of Technology |
SPE-P5.5: AN IMPROVED DEEP NEURAL NETWORK FOR MODELING SPEAKER CHARACTERISTICS AT DIFFERENT TEMPORAL SCALES |
Bin Gu; University of Science and Technology of China |
Wu Guo; University of Science and Technology of China |
Li-Rong Dai; University of Science and Technology of China |
Jun Du; University of Science and Technology of China |
SPE-P5.6: PARTIAL AUC OPTIMIZATION BASED DEEP SPEAKER EMBEDDINGS WITH CLASS-CENTER LEARNING FOR TEXT-INDEPENDENT SPEAKER VERIFICATION |
Zhongxin Bai; Northwestern Polytechnical University |
Xiao-Lei Zhang; Northwestern Polytechnical University |
Jingdong Chen; Northwestern Polytechnical University |
SPE-P5.7: KNOWLEDGE DISTILLATION AND RANDOM ERASING DATA AUGMENTATION FOR TEXT-DEPENDENT SPEAKER VERIFICATION |
Victoria Mingote; University of Zaragoza |
Antonio Miguel; University of Zaragoza |
Dayana Ribas; University of Zaragoza |
Alfonso Ortega; University of Zaragoza |
Eduardo Lleida; University of Zaragoza |
SPE-P5.8: DISENTANGLED SPEECH EMBEDDINGS USING CROSS-MODAL SELF-SUPERVISION |
Arsha Nagrani; Oxford University |
Joon Son Chung; Oxford University |
Samuel Albanie; Oxford University |
Andrew Zisserman; Oxford University |
SPE-P5.9: IMPROVING DEEP CNN NETWORKS WITH LONG TEMPORAL CONTEXT FOR TEXT-INDEPENDENT SPEAKER VERIFICATION |
Yong Zhao; Microsoft Corporation |
Tianyan Zhou; Microsoft Corporation |
Zhuo Chen; Microsoft Corporation |
Jian Wu; Microsoft Corporation |
SPE-P5.10: MULTI-LEVEL DEEP NEURAL NETWORK ADAPTATION FOR SPEAKER VERIFICATION USING MMD AND CONSISTENCY REGULARIZATION |
Weiwei Lin; Hong Kong Polytechnic University |
Man-Wai Mak; Hong Kong Polytechnic University |
Na Li; Tencent AI Lab |
Dan Su; Tencent AI Lab |
Dong Yu; Tencent AI Lab |
SPE-P5.11: MULTI-TASK LEARNING FOR SPEAKER VERIFICATION AND VOICE TRIGGER DETECTION |
Siddharth Sigtia; Apple |
Erik Marchi; Apple |
Sachin Kajarekar; Apple |
Devang Naik; Apple |
John Bridle; Apple |
SPE-P5.12: STATISTICS POOLING TIME DELAY NEURAL NETWORK BASED ON X-VECTOR FOR SPEAKER VERIFICATION |
Qian-Bei Hong; National Cheng Kung University and Academia Sinica |
Chung-Hsien Wu; National Cheng Kung University and Academia Sinica |
Hsin-Min Wang; National Cheng Kung University and Academia Sinica |
Chien-Lin Huang; Ping An Technology (Shenzhen) Co., Ltd. |