Technical Program

Paper Detail

Paper ID	E-3-3.1
Paper Title	A Joint-Loss Approach for Speech Enhancement via Single-channel Neural Network and MVDR Beamformer
Authors	Zhi-Wei Tan, Anh H. T. Nguyen, Linh T. T. Tran, Andy W. H. Khong, Nanyang Technological University, Singapore
Session	E-3-3: Advanced Signal Processing and Machine Learning for Audio and Speech Applications
Time	Thursday, 10 December, 17:30 - 19:30
Presentation Time:	Thursday, 10 December, 17:30 - 17:45
	All times are in New Zealand Time (UTC +13)
Topic	Speech, Language, and Audio (SLA): Special Session: Advanced Signal Processing and Machine Learning for Audio and Speech Applications
Abstract	Recent developments in noise reduction involve the use of neural beamforming. While some success has been achieved, these algorithms rely solely on the gain of the beamformer to enhance the noisy signals. We propose a two-stage framework in which the first-stage neural network aims to provide a good estimate of the signal and noise to the second-stage beamformer. We also introduce an objective function that reduces the distortion of the speech component in each stage. This objective function improves the accuracy of the second-stage beamformer by enhancing the first-stage output and, in the second stage, enhances the training of the network by propagating the gradient through the beamforming operation. A parameter is introduced to control the trade-off between optimizing these two stages. Simulation results on the CHiME-3 dataset at low SNR show that the proposed algorithm is able to exploit the enhancement gains from both the neural network and the beamformer, with improvement over baseline algorithms in terms of speech distortion, quality, and intelligibility.
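
The two-stage idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the loss definition, so the mean-squared-error terms, the convex weighting by `alpha`, the function names `mvdr_weights` and `joint_loss`, and the toy data are all assumptions made for illustration. The MVDR weights use the standard textbook form w = R^{-1} d / (d^H R^{-1} d), and an oracle noise covariance stands in for the first-stage network's noise estimate.

```python
import numpy as np


def mvdr_weights(noise_cov, steering):
    """Return MVDR weights w = R^{-1} d / (d^H R^{-1} d) for one frequency bin."""
    rinv_d = np.linalg.solve(noise_cov, steering)  # R^{-1} d without an explicit inverse
    return rinv_d / (steering.conj() @ rinv_d)


def joint_loss(stage1_est, stage2_est, target, alpha):
    """Hypothetical joint loss: alpha weights stage 1, (1 - alpha) weights stage 2."""
    loss1 = np.mean(np.abs(stage1_est - target) ** 2)
    loss2 = np.mean(np.abs(stage2_est - target) ** 2)
    return alpha * loss1 + (1.0 - alpha) * loss2


# Toy data for a single frequency bin: 4 microphones, 200 frames (assumed values).
rng = np.random.default_rng(0)
n_mics, n_frames = 4, 200
steering = np.exp(-1j * np.pi * 0.3 * np.arange(n_mics))  # assumed steering vector, first entry is 1
speech = rng.standard_normal(n_frames) + 1j * rng.standard_normal(n_frames)
noise = 0.5 * (rng.standard_normal((n_mics, n_frames))
               + 1j * rng.standard_normal((n_mics, n_frames)))
mixture = steering[:, None] * speech[None, :] + noise

# Oracle noise covariance; in a real system this would come from the first-stage estimate.
noise_cov = noise @ noise.conj().T / n_frames
weights = mvdr_weights(noise_cov, steering)

stage1_est = mixture[0]                # placeholder for the first-stage network output
stage2_est = weights.conj() @ mixture  # beamformer output w^H x for every frame
loss = joint_loss(stage1_est, stage2_est, speech, alpha=0.5)
```

In this sketch `alpha = 1` trains only against the first-stage output and `alpha = 0` only against the beamformer output. Propagating gradients through the beamforming operation, as the abstract describes, would require an automatic-differentiation framework rather than plain NumPy.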