SLP-P30.2
SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIVITY DETECTION IN ADVERSE CONDITIONS
Holger Severin Bovbjerg, Jesper Jensen, Jan Østergaard, Zheng-Hua Tan, Aalborg University, Denmark
Session:
SLP-P30: Key Word Spotting and Acoustic Event Detection Poster
Track:
Speech and Language Processing
Location:
Poster Zone 3C
Poster Board PZ-3C.2
Presentation Time:
Thu, 18 Apr, 16:30 - 18:30 (UTC +9)
Session Co-Chairs:
Kartik Audhkhasi, Google and Yong Qin, Nankai University
Session SLP-P30
SLP-P30.1: DEEPCOMBOSAD: SPECTRO-TEMPORAL CORRELATION BASED SPEECH ACTIVITY DETECTION FOR NATURALISTIC AUDIO STREAMS
Aditya Joglekar, John H.L. Hansen, The University of Texas at Dallas, United States of America
SLP-P30.2: SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIVITY DETECTION IN ADVERSE CONDITIONS
Holger Severin Bovbjerg, Jesper Jensen, Jan Østergaard, Zheng-Hua Tan, Aalborg University, Denmark
SLP-P30.3: IMPROVING VISION-INSPIRED KEYWORD SPOTTING USING DYNAMIC MODULE SKIPPING IN STREAMING CONFORMER ENCODER
Alexandre Bittar, Paul Dixon, Mohammad Samragh, Kumari Nishu, Devang Naik, Apple, Switzerland
SLP-P30.4: MAXIMUM-ENTROPY ADVERSARIAL AUDIO AUGMENTATION FOR KEYWORD SPOTTING
Zuzhao Ye, University of California, Riverside, United States of America; Gregory Ciccarelli, Amazon Inc., United States of America; Brian Kulis, Boston University, United States of America
SLP-P30.5: AS-PVAD: A FRAME-WISE PERSONALIZED VOICE ACTIVITY DETECTION NETWORK WITH ATTENTIVE SCORE LOSS
Fenting Liu, Zhejiang University/Alibaba Group, China; Feifei Xiong, Yiya Hao, Kechenying Zhou, Alibaba Group, China; Chenhui Zhang, Zhejiang University, China; Jinwei Feng, Alibaba Group, China
SLP-P30.6: ROBUST WAKE WORD SPOTTING WITH FRAME-LEVEL CROSS-MODAL ATTENTION BASED AUDIO-VISUAL CONFORMER
Haoxu Wang, Ming Cheng, Duke Kunshan University, China; Qiang Fu, Zhejiang Laboratory, China; Ming Li, Duke Kunshan University, China
SLP-P30.7: VIC-KD: VARIANCE-INVARIANCE-COVARIANCE KNOWLEDGE DISTILLATION TO MAKE KEYWORD SPOTTING MORE ROBUST AGAINST ADVERSARIAL ATTACKS
Heitor Guimarães, Arthur Pimentel, Anderson Avila, Tiago Falk, Institut National de la Recherche Scientifique, Canada
SLP-P30.8: Efficient Personal Voice Activity Detection With Wake Word Reference Speech
Bang Zeng, Ming Cheng, Wuhan University, Duke Kunshan University, China; Yao Tian, OPPO, China; Haifeng Liu, University of Science and Technology of China, China; Ming Li, Wuhan University, Duke Kunshan University, China
SLP-P30.9: Small-Footprint Convolutional Neural Network with reduced feature map for Voice Activity Detection
Hwabyeong Chae, Sunggu Lee, Pohang University of Science and Technology (POSTECH), Korea, Republic of
SLP-P30.10: COMPARATIVE STUDY OF TOKENIZATION ALGORITHMS FOR END-TO-END OPEN VOCABULARY KEYWORD DETECTION
Krishna Gurugubelli, Sahil Mohamed, Rajesh Krishna Krishnan Selvaraj, Samsung Research Institute Bengaluru, India
SLP-P30.11: IPHONMATCHNET: ZERO-SHOT USER-DEFINED KEYWORD SPOTTING USING IMPLICIT ACOUSTIC ECHO CANCELLATION
Yong-Hyeok Lee, Namhyun Cho, NCSOFT Corporation, Korea, Republic of
SLP-P30.12: TACOS: LEARNING TEMPORALLY STRUCTURED EMBEDDINGS FOR FEW-SHOT KEYWORD SPOTTING WITH DYNAMIC TIME WARPING
Kevin Wilkinghoff, Alessia Cornaggia-Urrigshardt, Fraunhofer FKIE, Germany