Technical Program

Paper Detail

Paper ID F-2-3.1
Paper Title HARMONIC STRUCTURE MASK FOR SPEECH ENHANCEMENT USING SPARSITY REGULARIZATION
Authors Haonan Wang, Kenta Iwai, Takanobu Nishiura, Ritsumeikan University, Japan
Session F-2-3: Speech Enhancement 2
Time Wednesday, 09 December, 17:15 - 19:15
Presentation Time: Wednesday, 09 December, 17:15 - 17:30
All times are in New Zealand Time (UTC +13)
Topic Speech, Language, and Audio (SLA)
Abstract Harmonic structure, an important characteristic of speech signals, has been used in various speech processing applications, such as dereverberation, fundamental frequency (f0) estimation, voice activity detection (VAD), phase reconstruction, and source separation. This paper presents a harmonic structure mask for speech enhancement based on sparsity regularization via convex optimization. Specifically, we first derive a harmonic structure mask of the noisy speech from f0 and VAD estimates, then use this mask to protect the harmonic components of speech during the sparsity regularization process. The proposed mask benefits from the additional harmonic information, leading to better protection of harmonic components. Numerical experiments show that the proposed mask improves speech quality and intelligibility compared to previous work.
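
The abstract does not give the formulation, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a binary time-frequency mask placed around integer multiples of f0 in frames that the VAD marks as speech, and it stands in for the convex optimization with a single soft-thresholding step (the proximal operator of a weighted L1 penalty). The function names, the `half_width_hz` tolerance, and the threshold `lam` are all hypothetical.

```python
import numpy as np


def harmonic_mask(f0, vad, n_bins, sample_rate, half_width_hz=40.0):
    """Return a (n_bins, n_frames) mask: 1 near multiples of f0 in speech frames.

    Hypothetical helper: f0 is in Hz per frame (0 for unvoiced), vad is a
    per-frame speech flag, and bins span 0 Hz to Nyquist.
    """
    f0 = np.asarray(f0, dtype=float)
    vad = np.asarray(vad, dtype=bool)
    # Centre frequency of each one-sided STFT bin.
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    mask = np.zeros((n_bins, f0.size))
    for t in range(f0.size):
        if not vad[t] or f0[t] <= 0:
            continue  # non-speech or unvoiced frame: nothing to protect
        # Nearest harmonic number for each bin (at least the fundamental).
        k = np.maximum(np.round(freqs / f0[t]), 1.0)
        mask[:, t] = np.abs(freqs - k * f0[t]) <= half_width_hz
    return mask


def masked_soft_threshold(spec, mask, lam):
    """Shrink spectrogram magnitudes by lam, except where mask == 1.

    Assumption: protection is modelled as a zero threshold on masked bins.
    The phase of every coefficient is kept unchanged.
    """
    thresh = lam * (1.0 - mask)
    mag = np.abs(spec)
    shrunk = np.maximum(mag - thresh, 0.0)
    # Scale each coefficient; bins with zero magnitude stay zero.
    scale = np.divide(shrunk, mag, out=np.zeros_like(mag), where=mag > 0)
    return spec * scale


if __name__ == "__main__":
    # Toy input: 9 bins from 0 to 800 Hz, two frames, only the first voiced.
    mask = harmonic_mask(f0=[200.0, 0.0], vad=[True, False],
                         n_bins=9, sample_rate=1600, half_width_hz=10.0)
    noisy = np.full((9, 2), 0.5 + 0.0j)
    enhanced = masked_soft_threshold(noisy, mask, lam=1.0)
    print(mask[:, 0])
    print(np.abs(enhanced[:, 0]))
```

In this toy setup the bins at 200, 400, 600, and 800 Hz in the voiced frame pass through untouched, while every other bin is shrunk toward zero. An actual sparsity-regularized solver would typically iterate such a step inside a proximal algorithm rather than apply it once.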