Paper ID | F-2-1.3 |
Paper Title |
SUBBAND CHANNEL SELECTION USING TEO FOR REPLAY SPOOF DETECTION IN VOICE ASSISTANTS |
Authors |
Harsh Kotta, Ankur T. Patil, Rajul Acharya, Hemant A. Patil, Dhirubhai Ambani Institute of Information and Communication Technology, India |
Session |
F-2-1: Speaker Recognition 1, Language Recognition |
Time | Wednesday, 09 December, 12:30 - 14:00 |
Presentation Time: | Wednesday, 09 December, 13:00 - 13:15
|
All times are in New Zealand Time (UTC +13) |
Topic |
Speech, Language, and Audio (SLA): |
Abstract |
Demand for Voice Assistants (VAs) has recently increased because of their convenience in accessing and controlling household devices. To keep VAs user-friendly, less strict speaker verification constraints are imposed on them, which makes VAs highly vulnerable to spoofing attacks. In this paper, the authors propose the design of a front-end countermeasure system against replay spoofing attacks for VAs that use a microphone array to capture spatial diversity. We exploit this microphone array information by proposing a novel approach to subband channel selection that uses the mathematical structure of the Teager Energy Operator (TEO). The selected subband channels are used to compute the proposed Teager Energy Cepstral Coefficients (TECCmax) feature set. With this approach, we gain a significant improvement in replay attack detection performance on VAs over the baseline feature set, i.e., Constant-Q Cepstral Coefficients (CQCC). Results indicate an absolute reduction in Equal Error Rate (EER) of 4.1% and 8.6% on the development and evaluation sets, respectively, of the ReMASC dataset. The authors also performed classifier-level fusion of GMM and LCNN-based back-end classifiers using the proposed TECCmax feature set and obtained absolute reductions of 5.98% and 10.67% on the development and evaluation sets, respectively. |
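As a rough illustration of the idea in the abstract, the sketch below applies the standard discrete-time TEO, Psi[x(n)] = x(n)^2 - x(n-1) x(n+1), to already-filtered subband signals and, for each subband, keeps the microphone channel with the largest mean Teager energy. The abstract does not spell out the selection rule, so the "maximum mean energy" criterion, the function names, and the nested-list data layout are all assumptions made for illustration; the subband filterbank and the cepstral steps are omitted.

```python
import math


def teager_energy(x):
    """Discrete TEO: psi[n] = x[n]^2 - x[n-1] * x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def mean_teager_energy(x):
    """Average Teager energy of one signal (0.0 if it is too short)."""
    psi = teager_energy(x)
    return sum(psi) / len(psi) if psi else 0.0


def select_channel_per_subband(subband_signals):
    """Pick one microphone channel per subband.

    subband_signals[c][k] is the (already filtered) signal of channel c
    in subband k. ASSUMPTION: the channel with the highest mean Teager
    energy is chosen; the paper's actual criterion may differ.
    """
    n_channels = len(subband_signals)
    n_subbands = len(subband_signals[0])
    selected = []
    for k in range(n_subbands):
        energies = [mean_teager_energy(subband_signals[c][k])
                    for c in range(n_channels)]
        selected.append(max(range(n_channels), key=lambda c: energies[c]))
    return selected


def tone(amplitude, omega, length=64):
    """Hypothetical test signal: a pure cosine standing in for a subband."""
    return [amplitude * math.cos(omega * n) for n in range(length)]


# Two channels, two subbands: channel 0 is stronger in subband 0,
# channel 1 is stronger in subband 1.
demo = [
    [tone(1.0, 0.3), tone(0.5, 0.9)],  # channel 0
    [tone(0.5, 0.3), tone(2.0, 0.9)],  # channel 1
]
selected = select_channel_per_subband(demo)
```

For a pure cosine A cos(omega n), the TEO output is the constant A^2 sin^2(omega), which is why the operator tracks both amplitude and frequency and why the louder channel wins in each subband of this toy input.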