SLP-P30: Key Word Spotting and Acoustic Event Detection
Thu, 18 Apr, 16:30 - 18:30 (UTC +9)
Location: Poster Zone 3C
Session Type: Poster
Session Co-Chairs: Kartik Audhkhasi, Google and Yong Qin, Nankai University
Track: Speech and Language Processing

SLP-P30.1: DEEPCOMBOSAD: SPECTRO-TEMPORAL CORRELATION BASED SPEECH ACTIVITY DETECTION FOR NATURALISTIC AUDIO STREAMS

Aditya Joglekar, John H.L. Hansen, The University of Texas at Dallas, United States of America
 

SLP-P30.2: SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIVITY DETECTION IN ADVERSE CONDITIONS

Holger Severin Bovbjerg, Jesper Jensen, Jan Østergaard, Zheng-Hua Tan, Aalborg University, Denmark
 

SLP-P30.3: IMPROVING VISION-INSPIRED KEYWORD SPOTTING USING DYNAMIC MODULE SKIPPING IN STREAMING CONFORMER ENCODER

Alexandre Bittar, Paul Dixon, Mohammad Samragh, Kumari Nishu, Devang Naik, Apple, Switzerland
 

SLP-P30.4: MAXIMUM-ENTROPY ADVERSARIAL AUDIO AUGMENTATION FOR KEYWORD SPOTTING

Zuzhao Ye, University of California, Riverside, United States of America; Gregory Ciccarelli, Amazon Inc., United States of America; Brian Kulis, Boston University, United States of America
 

SLP-P30.5: AS-PVAD: A FRAME-WISE PERSONALIZED VOICE ACTIVITY DETECTION NETWORK WITH ATTENTIVE SCORE LOSS

Fenting Liu, Zhejiang University/Alibaba Group, China; Feifei Xiong, Yiya Hao, Kechenying Zhou, Alibaba Group, China; Chenhui Zhang, Zhejiang University, China; Jinwei Feng, Alibaba Group, China
 

SLP-P30.6: ROBUST WAKE WORD SPOTTING WITH FRAME-LEVEL CROSS-MODAL ATTENTION BASED AUDIO-VISUAL CONFORMER

Haoxu Wang, Ming Cheng, Duke Kunshan University, China; Qiang Fu, Zhejiang Laboratory, China; Ming Li, Duke Kunshan University, China
 

SLP-P30.7: VIC-KD: VARIANCE-INVARIANCE-COVARIANCE KNOWLEDGE DISTILLATION TO MAKE KEYWORD SPOTTING MORE ROBUST AGAINST ADVERSARIAL ATTACKS

Heitor Guimarães, Arthur Pimentel, Anderson Avila, Tiago Falk, Institut National de la Recherche Scientifique, Canada
 

SLP-P30.8: EFFICIENT PERSONAL VOICE ACTIVITY DETECTION WITH WAKE WORD REFERENCE SPEECH

Bang Zeng, Ming Cheng, Wuhan University, Duke Kunshan University, China; Yao Tian, OPPO, China; Haifeng Liu, University of Science and Technology of China, China; Ming Li, Wuhan University, Duke Kunshan University, China
 

SLP-P30.9: SMALL-FOOTPRINT CONVOLUTIONAL NEURAL NETWORK WITH REDUCED FEATURE MAP FOR VOICE ACTIVITY DETECTION

Hwabyeong Chae, Sunggu Lee, Pohang University of Science and Technology (POSTECH), Korea, Republic of
 

SLP-P30.10: COMPARATIVE STUDY OF TOKENIZATION ALGORITHMS FOR END-TO-END OPEN VOCABULARY KEYWORD DETECTION

Krishna Gurugubelli, Sahil Mohamed, Rajesh Krishna Krishnan Selvaraj, Samsung Research Institute Bengaluru, India
 

SLP-P30.11: IPHONMATCHNET: ZERO-SHOT USER-DEFINED KEYWORD SPOTTING USING IMPLICIT ACOUSTIC ECHO CANCELLATION

Yong-Hyeok Lee, Namhyun Cho, NCSOFT Corporation, Korea, Republic of
 

SLP-P30.12: TACOS: LEARNING TEMPORALLY STRUCTURED EMBEDDINGS FOR FEW-SHOT KEYWORD SPOTTING WITH DYNAMIC TIME WARPING

Kevin Wilkinghoff, Alessia Cornaggia-Urrigshardt, Fraunhofer FKIE, Germany