Technical Program

Paper Detail

Paper ID	F-2-1.4
Paper Title	Design of Voice Privacy System using Linear Prediction
Authors	Priyanka Gupta, Gauri Prajapati, Shrishti Singh, Madhu Kamble, Hemant A. Patil, Dhirubhai Ambani Institute of Information and Communication Technology, India
Session	F-2-1: Speaker Recognition 1, Language Recognition
Time	Wednesday, 09 December, 12:30 - 14:00
Presentation Time:	Wednesday, 09 December, 13:15 - 13:30
	All times are in New Zealand Time (UTC +13)
Topic	Speech, Language, and Audio (SLA):
Abstract	A speaker’s identity is the most crucial information exploited (implicitly) by an Automatic Speaker Verification (ASV) system. Numerous attacks can be thwarted simultaneously if privacy preservation is exercised for a speaker’s identity. The baseline of the Voice Privacy Challenge 2020 at INTERSPEECH uses the Linear Prediction (LP) model of speech and the McAdams coefficient to achieve speaker de-identification. The baseline approach alters only the pole angles using the McAdams coefficient. However, from speech acoustics and digital resonator design, the radius of a pole is associated with various energy losses, and these energy losses implicitly carry speaker-specific information during speech production. We therefore make fine-tuned changes to both the pole angle and the pole radius, resulting in an 18.98% higher EER on the Vctk-test-com dataset and a 5% lower WER on the Libri-test dataset compared to the baseline. The higher EER indicates that our approach indeed improves privacy preservation. Furthermore, we exploit the relatively poor spectral resolution of female speech to our advantage for achieving effective anonymization. Gender-based analysis of the obtained results reveals that our approach leads to better speaker anonymization for female speakers than for male speakers.
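The pole manipulation described in the abstract can be illustrated with a minimal sketch. It assumes that the McAdams transformation raises each complex pole angle φ to the power α (φ → φ^α), and it models the radius change as a simple multiplicative scaling. The function name `shift_poles`, the parameter `radius_scale`, and the default values are hypothetical and chosen for illustration only; the abstract does not state the authors' actual radius rule or parameter settings.

```python
import numpy as np


def shift_poles(lpc, alpha=0.8, radius_scale=0.975):
    """Modify the poles of an all-pole LP filter 1/A(z).

    lpc          : LP polynomial coefficients [1, a1, ..., ap].
    alpha        : McAdams coefficient applied to the pole angles.
    radius_scale : hypothetical multiplicative change of the pole radius
                   (an assumption, not the rule used in the paper).
    Returns the new LP coefficients, the original poles, and the new poles.
    """
    poles = np.roots(lpc).astype(complex)
    new_poles = poles.copy()

    # Only complex poles are modified; real poles are left untouched.
    is_complex = np.abs(poles.imag) > 1e-8
    signs = np.sign(poles[is_complex].imag)

    # Angle change: phi -> phi**alpha, kept within [0, pi].
    angles = np.abs(np.angle(poles[is_complex]))
    new_angles = np.clip(angles ** alpha, 0.0, np.pi)

    # Radius change: scale and keep strictly inside the unit circle
    # so that the synthesis filter stays stable.
    radii = np.abs(poles[is_complex])
    new_radii = np.clip(radii * radius_scale, 0.0, 0.999)

    # Rebuild conjugate pairs from the shared radius and mirrored angle.
    new_poles[is_complex] = new_radii * np.exp(1j * signs * new_angles)

    # Conjugate-symmetric poles give real coefficients; drop rounding residue.
    new_lpc = np.real(np.poly(new_poles))
    return new_lpc, poles, new_poles


if __name__ == "__main__":
    # Toy LP polynomial built from two resonances (illustrative values).
    true_poles = np.array([0.9 * np.exp(1j * 0.5), 0.9 * np.exp(-1j * 0.5),
                           0.8 * np.exp(1j * 1.5), 0.8 * np.exp(-1j * 1.5)])
    lpc = np.real(np.poly(true_poles))
    new_lpc, old, new = shift_poles(lpc, alpha=0.8, radius_scale=0.95)
    print("old angles:", np.round(np.angle(old), 3))
    print("new angles:", np.round(np.angle(new), 3))
    print("old radii: ", np.round(np.abs(old), 3))
    print("new radii: ", np.round(np.abs(new), 3))
```

With α < 1, pole angles below 1 radian move up in frequency and angles above 1 radian move down, which shifts the resonance positions. Scaling the radius changes the resonance bandwidth, which corresponds to the energy losses mentioned in the abstract. In a full system this would be applied frame by frame to the LP coefficients before resynthesis from the residual.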