Technical Program

Paper Detail

Paper ID E-3-2.2
Paper Title DNN-BASED PERMUTATION SOLVER FOR FREQUENCY-DOMAIN INDEPENDENT COMPONENT ANALYSIS IN TWO-SOURCE MIXTURE CASE
Authors Shuhei Yamaji, Daichi Kitamura, National Institute of Technology, Kagawa College, Japan
Session E-3-2: Speech Separation 2, Sound source separation
Time Thursday, 10 December, 15:30 - 17:15
Presentation Time: Thursday, 10 December, 15:45 - 16:00
All times are in New Zealand Time (UTC +13)
Topic Speech, Language, and Audio (SLA):
Abstract Frequency-domain independent component analysis (FDICA) is a popular algorithm for multichannel audio source separation. The source components estimated by FDICA at each frequency must be aligned across all frequencies so that the components of the same source are grouped together. This alignment, performed as postprocessing of FDICA, is the so-called permutation problem. Although various permutation solvers have been proposed, their performance is still limited, particularly in multispeaker separation tasks in reverberant environments. To improve the performance of the permutation solver, this paper presents a new data-driven permutation solver based on deep neural networks (DNNs). In the proposed method, a DNN is trained to predict whether the input local time-frequency components belong to the same source, and the permutation problem is solved by taking majority decisions over the predicted results.
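
The majority-decision step described in the abstract can be illustrated with a minimal sketch for the two-source case. This is not the authors' implementation: the DNN is left out entirely, and its outputs are assumed to arrive as binary votes per frequency bin (1 meaning "the two components in this bin should be swapped", 0 meaning "keep as is"). The function names `majority_vote_swaps` and `apply_swaps`, the array layouts, and the tie-breaking rule are all assumptions made for illustration.

```python
import numpy as np


def majority_vote_swaps(votes):
    """Decide, per frequency bin, whether to swap the two components.

    votes: array-like of shape (n_freq, n_votes) holding 0/1 predictions
           (hypothetical stand-in for DNN outputs; 1 = "swap this bin").
    Returns a boolean array of shape (n_freq,). Ties count as "no swap"
    (an arbitrary choice for this sketch).
    """
    votes = np.asarray(votes)
    return votes.mean(axis=1) > 0.5


def apply_swaps(Y, swap):
    """Swap the two source components in the selected frequency bins.

    Y:    array of shape (2, n_freq, n_frames), the separated components.
    swap: boolean array of shape (n_freq,).
    """
    Y = np.asarray(Y)
    out = Y.copy()
    # Reverse the source axis only in the bins flagged for swapping.
    out[:, swap, :] = Y[::-1, swap, :]
    return out


# Toy data: source 0 is all ones, source 1 is all twos (4 bins, 3 frames).
aligned = np.stack([np.ones((4, 3)), 2 * np.ones((4, 3))])

# Simulate the permutation problem by swapping the sources in bins 1 and 3.
scrambled = aligned.copy()
scrambled[:, [1, 3], :] = aligned[::-1][:, [1, 3], :]

# Three hypothetical votes per bin; individual votes may be wrong.
votes = [[0, 0, 1],
         [1, 1, 0],
         [0, 0, 0],
         [1, 1, 1]]

swap = majority_vote_swaps(votes)
restored = apply_swaps(scrambled, swap)
```

In this toy case the majority vote flags bins 1 and 3 even though two individual votes are wrong, and `restored` matches `aligned`. With only two sources, each bin has just two possible orderings, which is why a single swap/keep decision per bin is enough.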