Technical Program

Paper Detail

Paper ID	F-2-1.3
Paper Title	SUBBAND CHANNEL SELECTION USING TEO FOR REPLAY SPOOF DETECTION IN VOICE ASSISTANTS
Authors	Harsh Kotta, Ankur T. Patil, Rajul Acharya, Hemant A. Patil, Dhirubhai Ambani Institute of Information and Communication Technology, India
Session	F-2-1: Speaker Recognition 1, Language Recognition
Time	Wednesday, 09 December, 12:30 - 14:00
Presentation Time:	Wednesday, 09 December, 13:00 - 13:15
	All times are in New Zealand Time (UTC +13)
Topic	Speech, Language, and Audio (SLA):
Abstract	Demand for Voice Assistants (VAs) has recently increased because of their convenience in accessing and controlling household devices. To keep VAs user-friendly, less strict speaker verification constraints are imposed on them, which makes VAs highly vulnerable to spoofing attacks. In this paper, we propose a front-end countermeasure system against replay spoofing attacks for VAs that use a microphone array to capture spatial diversity. We exploit this microphone array information through a novel approach to subband channel selection based on the mathematical structure of the Teager Energy Operator (TEO). The selected subband channels are used to compute the proposed Teager Energy Cepstral Coefficients (TECCmax) feature set. With this approach, we obtain a significant improvement in replay attack detection performance on VAs over the baseline feature set, i.e., Constant-Q Cepstral Coefficients (CQCC). Results indicate an absolute reduction in Equal Error Rate (EER) of 4.1% and 8.6% on the development and evaluation sets, respectively, of the ReMASC dataset. We also performed classifier-level fusion of GMM- and LCNN-based back-end classifiers using the proposed TECCmax feature set and obtained absolute reductions in EER of 5.98% and 10.67% on the development and evaluation sets, respectively.
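The abstract does not spell out the selection rule, so the following is only a minimal sketch of how a TEO-based channel choice might look. It uses the standard discrete Teager Energy Operator, psi[n] = x[n]^2 - x[n-1] * x[n+1]. The rule of keeping the microphone channel with the largest mean Teager energy in a subband is an assumption suggested by the "max" in TECCmax, and the function names `teager_energy`, `mean_teager_energy`, and `select_channel` are hypothetical, not taken from the paper.

```python
import math
from typing import List, Sequence


def teager_energy(x: Sequence[float]) -> List[float]:
    """Discrete TEO on interior samples: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def mean_teager_energy(x: Sequence[float]) -> float:
    """Average Teager energy of one signal; 0.0 if fewer than 3 samples."""
    psi = teager_energy(x)
    return sum(psi) / len(psi) if psi else 0.0


def select_channel(subband_channels: Sequence[Sequence[float]]) -> int:
    """ASSUMED rule (not confirmed by the abstract): return the index of the
    microphone channel with the largest mean Teager energy in this subband."""
    energies = [mean_teager_energy(ch) for ch in subband_channels]
    return max(range(len(energies)), key=energies.__getitem__)


# Toy stand-in for one subband observed by three microphones:
# the same sinusoid at different amplitudes.
loud = [2.0 * math.cos(0.3 * n) for n in range(50)]
quiet = [0.5 * math.cos(0.3 * n) for n in range(50)]
chosen = select_channel([quiet, loud, quiet])
```

For a pure sinusoid A cos(Ωn + φ), the operator returns the constant A² sin²(Ω), so it tracks amplitude and frequency jointly. In this toy input the louder channel (index 1) is the one selected. A real front end would first split each microphone signal into subbands with a filterbank, which this sketch leaves out.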