Technical Program

Paper Detail

Paper ID	F-2-3.1
Paper Title	HARMONIC STRUCTURE MASK FOR SPEECH ENHANCEMENT USING SPARSITY REGULARIZATION
Authors	Haonan Wang, Kenta Iwai, Takanobu Nishiura, Ritsumeikan University, Japan
Session	F-2-3: Speech Enhancement 2
Time	Wednesday, 09 December, 17:15 - 19:15
Presentation Time:	Wednesday, 09 December, 17:15 - 17:30
	All times are in New Zealand Time (UTC +13)
Topic	Speech, Language, and Audio (SLA):
Abstract	Harmonic structure, an important characteristic of speech signals, has been used in various speech processing applications, such as dereverberation, fundamental frequency (f0) estimation, voice activity detection (VAD), phase reconstruction, and source separation. This paper presents a harmonic structure mask for speech enhancement based on sparsity regularization via convex optimization. Specifically, we first derive a harmonic structure mask for the noisy speech from f0 and VAD estimates, then use this mask to protect the harmonic components of speech during sparsity regularization. The proposed mask benefits from the additional harmonic information, leading to better protection of harmonic components. Numerical experiments show that the proposed mask improves speech quality and intelligibility compared with previous work.
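
The two steps named in the abstract (build a mask from f0 and VAD estimates, then protect the masked bins during sparsity regularization) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a binary mask, a fixed tolerance around each harmonic, and a weighted soft-thresholding step standing in for the convex optimization, none of which is specified in the abstract. The function names and the parameters `half_width_hz` and `lam` are hypothetical.

```python
import numpy as np

def harmonic_mask(f0, vad, sr, n_fft, half_width_hz=40.0):
    """Binary mask (frames x bins): 1 near harmonics of f0 in voiced frames.

    half_width_hz is an assumed tolerance, not a value from the paper.
    """
    freqs = np.arange(n_fft // 2 + 1) * sr / n_fft  # bin centre frequencies
    mask = np.zeros((len(f0), freqs.size))
    for t, (f, voiced) in enumerate(zip(f0, vad)):
        if not voiced or f <= 0:
            continue  # unvoiced or silent frame: nothing is protected
        # index of the nearest harmonic k*f for each bin, with k >= 1
        k = np.maximum(np.round(freqs / f), 1)
        mask[t] = np.abs(freqs - k * f) <= half_width_hz
    return mask

def masked_soft_threshold(X, mask, lam):
    """Proximal step of a weighted L1 penalty on STFT coefficients X.

    Magnitudes are shrunk by lam * (1 - mask), so bins with mask == 1
    pass through unchanged; the phase of X is kept.
    """
    mag = np.abs(X)
    thr = lam * (1.0 - mask)
    gain = np.maximum(mag - thr, 0.0) / np.maximum(mag, 1e-12)
    return X * gain
```

In this sketch, bins flagged as harmonic receive a zero threshold and are left untouched, while all other bins are shrunk toward zero, which is one simple way to realize "protection" under an L1-type penalty.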