Technical Program

Paper Detail

Paper ID F-2-1.4
Paper Title Design of Voice Privacy System using Linear Prediction
Authors Priyanka Gupta, Gauri Prajapati, Shrishti Singh, Madhu Kamble, Hemant A. Patil, Dhirubhai Ambani Institute of Information and Communication Technology, India
Session F-2-1: Speaker Recognition 1, Language Recognition
Time Wednesday, 09 December, 12:30 - 14:00
Presentation Time: Wednesday, 09 December, 13:15 - 13:30
All times are in New Zealand Time (UTC +13)
Topic Speech, Language, and Audio (SLA):
Abstract A speaker’s identity is the most crucial information exploited (implicitly) by an Automatic Speaker Verification (ASV) system. Numerous attacks can be thwarted simultaneously if privacy preservation is exercised for a speaker’s identity. The baseline of the Voice Privacy Challenge 2020 at INTERSPEECH uses the Linear Prediction (LP) model of speech and the McAdams coefficient to achieve speaker de-identification. The baseline approach alters only the pole angles using the McAdams coefficient. However, from speech acoustics and digital resonator design, the radius of the poles is associated with various energy losses, and these energy losses implicitly carry speaker-specific information during speech production. To that effect, we have made fine-tuned changes to both the pole angle and the pole radius, resulting in an 18.98% higher Equal Error Rate (EER) on the Vctk-test-com dataset and a 5% lower Word Error Rate (WER) on the Libri-test dataset compared to the baseline. This means privacy preservation is indeed improved by our approach. Furthermore, we have exploited the relatively poor spectral resolution of female speakers to our advantage for achieving effective anonymization. Gender-based analysis of the obtained results reveals that our approach leads to better speaker anonymization for female speakers than for male speakers.
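
The pole manipulation described in the abstract can be illustrated with a minimal sketch. It assumes the LP coefficients of one speech frame are already available, warps the angle of each complex pole as phi -> phi^alpha (the usual McAdams-style transform), and scales the pole radius by a constant factor. The function name `modify_poles`, the parameter `radius_scale`, the 0.999 stability cap, and the default values are hypothetical choices made for illustration; the abstract does not state how the radius is adjusted, so this is not the authors' implementation.

```python
import numpy as np

def modify_poles(lp_coeffs, alpha=0.8, radius_scale=0.95):
    """Warp pole angles and scale pole radii of an all-pole LP filter.

    lp_coeffs    : LP polynomial [1, a1, ..., ap] (denominator of 1/A(z)).
    alpha        : McAdams-style exponent applied to the pole angle (illustrative default).
    radius_scale : hypothetical multiplicative factor applied to the pole radius.
    """
    poles = np.roots(lp_coeffs)

    # Only complex poles (formant-like resonances) are modified;
    # real poles are left untouched.
    is_complex = np.abs(poles.imag) > 1e-12
    radius = np.abs(poles)
    angle = np.angle(poles)

    # Angle warp: phi -> sign(phi) * |phi|**alpha keeps conjugate pairs symmetric.
    new_angle = np.where(is_complex, np.sign(angle) * np.abs(angle) ** alpha, angle)

    # Radius scaling, capped below the unit circle (assumed stability guard).
    new_radius = np.where(is_complex, np.clip(radius * radius_scale, None, 0.999), radius)

    new_poles = new_radius * np.exp(1j * new_angle)

    # Rebuild the polynomial; imaginary parts are numerical noise for conjugate pairs.
    return np.real(np.poly(new_poles))


# Usage: a single resonance with radius 0.9 at an angle of 0.5 rad.
a = np.array([1.0, -2 * 0.9 * np.cos(0.5), 0.81])
a_anon = modify_poles(a, alpha=0.8, radius_scale=0.95)
```

In a complete anonymization pipeline the modified coefficients would be used to re-synthesize each frame from its LP residual; that step, and the choice of `alpha` and `radius_scale`, are outside what the abstract specifies.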