SPE-P17: Speech Enhancement IV |
| Session Type: Poster |
| Time: Friday, 8 May, 08:00 - 10:00 |
| Location: On-Demand |
| Session Chairs: Marco Siniscalchi, Kore University of Enna and Takuya Yoshioka, Microsoft |
| SPE-P17.1: UNSUPERVISED NEURAL MASK ESTIMATOR FOR GENERALIZED EIGEN-VALUE BEAMFORMING BASED ASR |
| Rohit Kumar; Indian Institute of Science |
| Anirudh Sreeram; Indian Institute of Science |
| Anurenjan Purushothaman; Indian Institute of Science |
| Sriram Ganapathy; Indian Institute of Science |
| SPE-P17.2: SPATIAL ATTENTION FOR FAR-FIELD SPEECH RECOGNITION WITH DEEP BEAMFORMING NEURAL NETWORKS |
| Weipeng He; Idiap Research Institute |
| Lu Lu; Facebook |
| Biqiao Zhang; Facebook |
| Jay Mahadeokar; Facebook |
| Kaustubh Kalgaonkar; Facebook |
| Christian Fuegen; Facebook |
| SPE-P17.3: TENSOR-TO-VECTOR REGRESSION FOR MULTI-CHANNEL SPEECH ENHANCEMENT BASED ON TENSOR-TRAIN NETWORK |
| Jun Qi; Georgia Institute of Technology |
| Hu Hu; Georgia Institute of Technology |
| Yannan Wang; Tencent |
| Chao-Han Huck Yang; Georgia Institute of Technology |
| Marco Siniscalchi; Kore University of Enna |
| Chin-Hui Lee; Georgia Institute of Technology |
| SPE-P17.4: TRUTH-TO-ESTIMATE RATIO MASK: A POST-PROCESSING METHOD FOR SPEECH ENHANCEMENT DIRECT AT LOW SIGNAL-TO-NOISE RATIOS |
| Bohan Chen; Hong Kong University of Science and Technology Shenzhen Research Institute |
| He Wang; Hong Kong University of Science and Technology Shenzhen Research Institute |
| Yue Wei; Incus Company Limited |
| Richard H.Y. So; Hong Kong University of Science and Technology |
| SPE-P17.5: GEOMETRY CONSTRAINED PROGRESSIVE LEARNING FOR LSTM-BASED SPEECH ENHANCEMENT |
| Xin Tang; University of Science and Technology of China |
| Jun Du; University of Science and Technology of China |
| Li Chai; University of Science and Technology of China |
| Yannan Wang; Tencent Technology (Shenzhen) Company Limited |
| Qing Wang; Tencent Technology (Shenzhen) Company Limited |
| Chin-Hui Lee; Georgia Institute of Technology |
| SPE-P17.6: USING SEPARATE LOSSES FOR SPEECH AND NOISE IN MASK-BASED SPEECH ENHANCEMENT |
| Ziyi Xu; Technische Universität Braunschweig |
| Samy Elshamy; Technische Universität Braunschweig |
| Tim Fingscheidt; Technische Universität Braunschweig |
| SPE-P17.7: STABLE TRAINING OF DNN FOR SPEECH ENHANCEMENT BASED ON PERCEPTUALLY-MOTIVATED BLACK-BOX COST FUNCTION |
| Masaki Kawanaka; National Institute of Technology, Tokuyama College |
| Yuma Koizumi; NTT Corporation |
| Ryoichi Miyazaki; National Institute of Technology, Tokuyama College |
| Kohei Yatabe; Waseda University |
| SPE-P17.8: A ROBUST AUDIO-VISUAL SPEECH ENHANCEMENT MODEL |
| Wupeng Wang; Huawei Noah's Ark Lab |
| Chao Xing; Huawei Noah's Ark Lab |
| Dong Wang; Tsinghua University |
| Xiao Chen; Huawei Noah's Ark Lab |
| Fengyu Sun; Huawei Technologies Co., Ltd. |
| SPE-P17.9: ROBUST UNSUPERVISED AUDIO-VISUAL SPEECH ENHANCEMENT USING A MIXTURE OF VARIATIONAL AUTOENCODERS |
| Mostafa Sadeghi; Inria, Grenoble Alpes |
| Xavier Alameda-Pineda; Inria, Grenoble Alpes |
| SPE-P17.10: AV(SE)²: AUDIO-VISUAL SQUEEZE-EXCITE SPEECH ENHANCEMENT |
| Michael Iuzzolino; University of Colorado Boulder |
| Kazuhito Koishida; Microsoft Corporation |
| SPE-P17.11: SPECTROGRAMS FUSION WITH MINIMUM DIFFERENCE MASKS ESTIMATION FOR MONAURAL SPEECH DEREVERBERATION |
| Hao Shi; Tianjin University |
| Longbiao Wang; Tianjin University |
| Meng Ge; Tianjin University |
| Sheng Li; National Institute of Information and Communications Technology (NICT) |
| Jianwu Dang; Tianjin University |
| SPE-P17.12: A RETURN TO DEREVERBERATION IN THE FREQUENCY DOMAIN USING A JOINT LEARNING APPROACH |
| Yuying Li; Indiana University Bloomington |
| Donald S. Williamson; Indiana University |