SPE-P2: Speech Enhancement I: Network Architectures

Session Type: Poster
Time: Tuesday, 5 May, 11:30 - 13:30
Location: On-Demand
Session Chairs: Afsaneh Asaei, UnternehmerTUM; Timo Gerkmann, Universität Hamburg

SPE-P2.1: CP-GAN: CONTEXT PYRAMID GENERATIVE ADVERSARIAL NETWORK FOR SPEECH ENHANCEMENT
Gang Liu; Sun Yat-Sen University
Ke Gong; DarkMatter AI Research
Xiaodan Liang; Sun Yat-Sen University
Zhiguang Chen; Sun Yat-Sen University

SPE-P2.2: DENSELY CONNECTED NEURAL NETWORK WITH DILATED CONVOLUTIONS FOR REAL-TIME SPEECH ENHANCEMENT IN THE TIME DOMAIN
Ashutosh Pandey; Ohio State University
DeLiang Wang; Ohio State University

SPE-P2.3: PAN: PHONEME-AWARE NETWORK FOR MONAURAL SPEECH ENHANCEMENT
Zhihao Du; Harbin Institute of Technology
Ming Lei; Alibaba Group
Jiqing Han; Harbin Institute of Technology
Shiliang Zhang; Alibaba Group

SPE-P2.4: EFFICIENT TRAINABLE FRONT-ENDS FOR NEURAL SPEECH ENHANCEMENT
Jonah Casebeer; University of Illinois at Urbana–Champaign
Umut Isik; Amazon Web Services
Shrikant Venkataramani; University of Illinois at Urbana–Champaign
Arvindh Krishnaswamy; Amazon Web Services

SPE-P2.5: INVERTIBLE DNN-BASED NONLINEAR TIME-FREQUENCY TRANSFORM FOR SPEECH ENHANCEMENT
Daiki Takeuchi; Waseda University
Kohei Yatabe; Waseda University
Yuma Koizumi; NTT Corporation
Yasuhiro Oikawa; Waseda University
Noboru Harada; NTT Corporation

SPE-P2.6: T-GSA: TRANSFORMER WITH GAUSSIAN-WEIGHTED SELF-ATTENTION FOR SPEECH ENHANCEMENT
Jaeyoung Kim; Google
Mostafa El-Khamy; Samsung Semiconductor, Inc.
Jungwon Lee; Samsung Semiconductor, Inc.

SPE-P2.7: REDUNDANT CONVOLUTIONAL NETWORK WITH ATTENTION MECHANISM FOR MONAURAL SPEECH ENHANCEMENT
Tian Lan; University of Electronic Science and Technology of China
Yilan Lyu; University of Electronic Science and Technology of China
Guoqiang Hui; University of Electronic Science and Technology of China
Refuoe Mokhosi; University of Electronic Science and Technology of China
Sen Li; University of Electronic Science and Technology of China
Qiao Liu; University of Electronic Science and Technology of China

SPE-P2.8: RESIDUAL RECURRENT NEURAL NETWORK FOR SPEECH ENHANCEMENT
Jalal Abdulbaqi; Rutgers, The State University of New Jersey
Yue Gu; Rutgers, The State University of New Jersey
Shuhong Chen; Rutgers, The State University of New Jersey
Ivan Marsic; Rutgers, The State University of New Jersey

SPE-P2.9: 2D-TO-2D MASK ESTIMATION FOR SPEECH ENHANCEMENT BASED ON FULLY CONVOLUTIONAL NEURAL NETWORK
Yanhui Tu; University of Science and Technology of China
Jun Du; University of Science and Technology of China
Chin-Hui Lee; Georgia Institute of Technology

SPE-P2.10: SELF-SUPERVISED DENOISING AUTOENCODER WITH LINEAR REGRESSION DECODER FOR SPEECH ENHANCEMENT
Ryandhimas Edo Zezario; Academia Sinica
Tassadaq Hussain; Academia Sinica
Xugang Lu; National Institute of Information and Communications Technology (NICT)
Hsin-Min Wang; Academia Sinica
Yu Tsao; Academia Sinica

SPE-P2.11: FULLY CONVOLUTIONAL RECURRENT NETWORKS FOR SPEECH ENHANCEMENT
Maximilian Strake; Technische Universität Braunschweig
Bruno Defraene; NXP Semiconductors
Kristoff Fluyt; NXP Semiconductors
Wouter Tirry; NXP Semiconductors
Tim Fingscheidt; Technische Universität Braunschweig

SPE-P2.12: PHONETIC FEEDBACK FOR SPEECH ENHANCEMENT WITH AND WITHOUT PARALLEL SPEECH DATA
Peter Plantinga; Ohio State University
Deblin Bagchi; Ohio State University
Eric Fosler-Lussier; Ohio State University