Paper ID | C-2-2.5 |
Paper Title |
Adversarial Training Using Inter/Intra-Attention Architecture for Speech Enhancement Network |
Authors |
Yosuke SUGIURA, Tetsuya SHIMAMURA, Saitama University, Japan |
Session |
C-2-2: Advanced Topics in Signal Processing & Machine Learning - Acoustic & Biomedical Applications |
Time | Wednesday, 09 December, 15:30 - 17:00 |
Presentation Time: | Wednesday, 09 December, 16:30 - 16:45 |
|
All times are in New Zealand Time (UTC +13) |
Topic |
Signal and Information Processing Theory and Methods (SIPTM): Special Session: Advanced Topics in Signal Processing & Machine Learning - Acoustic & Biomedical Applications |
Abstract |
In this paper, we propose a new adversarial training method for an end-to-end speech enhancement network. Taking advantage of the availability of paired training waveforms, we introduce a new attention module into the proposed discriminator to incorporate information from the desired waveform. Because this attention module serves as both an inter- and intra-attention mechanism, it helps the discriminator clearly distinguish the structural features of the desired waveform from those of the waveform generated by the speech enhancement network. Unlike other conditional generative adversarial networks, the proposed training architecture can minimize the adversarial loss and the distortion loss simultaneously. Through simulation experiments on speech enhancement, we show that the proposed adversarial training can provide significant performance gains. |
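The abstract does not give the architecture in detail, so the following is only a rough illustration of the two ideas it names, not the authors' implementation. It assumes that inter- and intra-attention can be obtained together by running scaled dot-product self-attention over the concatenation of the enhanced and desired feature sequences, and that the generator objective is a weighted sum of a least-squares adversarial term and an L1 distortion term. The function names, the choice of adversarial loss, and the `distortion_weight` parameter are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def inter_intra_attention(enhanced_feat, desired_feat):
    """Self-attention over the concatenation of two feature sequences.

    Both inputs have shape (frames, dim). Attending over the joined
    sequence yields intra-attention (within one waveform's frames) and
    inter-attention (across the two waveforms) in one weight matrix.
    This joint formulation is an assumption made for illustration.
    """
    tokens = np.concatenate([enhanced_feat, desired_feat], axis=0)  # (2T, dim)
    dim = tokens.shape[1]
    scores = tokens @ tokens.T / np.sqrt(dim)   # (2T, 2T) similarity scores
    weights = softmax(scores, axis=-1)          # each row sums to 1
    attended = weights @ tokens                 # (2T, dim) attended features
    return attended, weights


def generator_loss(disc_score_on_enhanced, enhanced, desired, distortion_weight=1.0):
    """Adversarial loss plus distortion loss, minimized together.

    The least-squares adversarial term and the L1 distortion term are
    assumed choices; the abstract only states that both are minimized.
    """
    adversarial = 0.5 * np.mean((disc_score_on_enhanced - 1.0) ** 2)
    distortion = np.mean(np.abs(enhanced - desired))
    return adversarial + distortion_weight * distortion


# Toy usage with random features: 4 frames, 8 feature dimensions.
rng = np.random.default_rng(0)
enhanced_feat = rng.standard_normal((4, 8))
desired_feat = rng.standard_normal((4, 8))
attended, weights = inter_intra_attention(enhanced_feat, desired_feat)
intra_block = weights[:4, :4]   # enhanced frames attending to enhanced frames
inter_block = weights[:4, 4:]   # enhanced frames attending to desired frames
```

In this sketch the upper-left and upper-right blocks of `weights` correspond to the intra- and inter-attention seen from the enhanced waveform, which is one plausible reading of how a single module could play both roles.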