WE3.AUD.5
A SEQUENTIAL AUDIO SPECTROGRAM TRANSFORMER FOR REAL-TIME SOUND EVENT DETECTION
Takezo Ohta, University of Tsukuba, Japan; Yoshiaki Bando, National Institute of Advanced Industrial Science and Technology, Japan; Keisuke Imoto, Doshisha University, National Institute of Advanced Industrial Science and Technology, Japan; Masaki Onishi, National Institute of Advanced Industrial Science and Technology, Japan
Session:
WE3.AUD: Detection and Classification for Audio and Speech II Lecture
Track:
ASMSP - Acoustic, Speech and Music Signal Processing
Location:
Auditorium
Presentation Time:
Wed, 28 Aug, 17:30 - 17:50 France Time (UTC +2)
Session Co-Chairs:
Rainer Martin, Ruhr-Universität Bochum and Ganesh Sivaraman, Pindrop
Session WE3.AUD
WE3.AUD.1: Impact of Speech Mode in Automatic Pathological Speech Detection
Shakeel Ahmad Sheikh, Ina Kodrasi, Signal Processing for Communication Group, Idiap Research Institute, Martigny, Switzerland
WE3.AUD.2: Test-Time Adaptation for Automatic Pathological Speech Detection in Noisy Environments
Mahdi Amiri, Ina Kodrasi, Idiap Research Institute, Switzerland
WE3.AUD.3: Distributed Collaborative Anomalous Sound Detection by Embedding Sharing
Kota Dohi, Yohei Kawaguchi, Hitachi Ltd., Japan
WE3.AUD.4: Online Domain-Incremental Learning Approach to Classify Acoustic Scenes in All Locations
Manjunath Mulimani, Annamaria Mesaros, Tampere University, Finland
WE3.AUD.5: A Sequential Audio Spectrogram Transformer for Real-Time Sound Event Detection
Takezo Ohta, University of Tsukuba, Japan; Yoshiaki Bando, National Institute of Advanced Industrial Science and Technology, Japan; Keisuke Imoto, Doshisha University, National Institute of Advanced Industrial Science and Technology, Japan; Masaki Onishi, National Institute of Advanced Industrial Science and Technology, Japan