SPE-P13: Speech Separation and Extraction III |
| Session Type: Poster |
| Time: Thursday, 7 May, 11:30 - 13:30 |
| Location: On-Demand |
| Session Chairs: Hakan Erdogan (Google) and Marc Delcroix (NTT) |
| SPE-P13.1: AN EMPIRICAL STUDY OF CONV-TASNET |
| Berkan Kadioglu; Northeastern University |
| Michael Horgan; Dolby Laboratories |
| Xiaoyu Liu; Dolby Laboratories |
| Jordi Pons; Dolby Laboratories |
| Dan Darcy; Dolby Laboratories |
| Vivek Kumar; Dolby Laboratories |
| SPE-P13.2: MASK-DEPENDENT PHASE ESTIMATION FOR MONAURAL SPEAKER SEPARATION |
| Zhaoheng Ni; Graduate Center, City University of New York |
| Michael I Mandel; Brooklyn College, City University of New York |
| SPE-P13.3: JOINT PHONEME ALIGNMENT AND TEXT-INFORMED SPEECH SEPARATION ON HIGHLY CORRUPTED SPEECH |
| Kilian Schulze-Forster; LTCI, Télécom Paris, Institut Polytechnique de Paris |
| Clement S. J. Doire; Audionamix |
| Gaël Richard; LTCI, Télécom Paris, Institut Polytechnique de Paris |
| Roland Badeau; LTCI, Télécom Paris, Institut Polytechnique de Paris |
| SPE-P13.4: SINGLE-CHANNEL SPEECH SEPARATION INTEGRATING PITCH INFORMATION BASED ON A MULTI TASK LEARNING FRAMEWORK |
| Xiang Li; Peking University |
| Rui Liu; Peking University |
| Tao Song; Peking University |
| Xihong Wu; Peking University |
| Jing Chen; Peking University |
| SPE-P13.5: CONTINUOUS SPEECH SEPARATION: DATASET AND ANALYSIS |
| Zhuo Chen; Microsoft |
| Takuya Yoshioka; Microsoft |
| Liang Lu; Microsoft |
| Tianyan Zhou; Microsoft |
| Zhong Meng; Microsoft |
| Yi Luo; Microsoft |
| Jian Wu; Microsoft |
| Xiong Xiao; Microsoft |
| Jinyu Li; Microsoft |
| SPE-P13.6: THE SOUND OF MY VOICE: SPEAKER REPRESENTATION LOSS FOR TARGET VOICE SEPARATION |
| Seongkyu Mun; Naver Corporation |
| Soyeon Choe; Naver Corporation |
| Jaesung Huh; Naver Corporation |
| Joon Son Chung; Naver Corporation |
| SPE-P13.7: SPEAKER-AWARE TARGET SPEAKER ENHANCEMENT BY JOINTLY LEARNING WITH SPEAKER EMBEDDING EXTRACTION |
| Xuan Ji; Tencent |
| Meng Yu; Tencent |
| Chunlei Zhang; Tencent |
| Dan Su; Tencent |
| Tao Yu; Tencent |
| Xiaoyu Liu; Tencent |
| Dong Yu; Tencent |
| SPE-P13.8: FAR-FIELD LOCATION GUIDED TARGET SPEECH EXTRACTION USING END-TO-END SPEECH RECOGNITION OBJECTIVES |
| Aswin Shanmugam Subramanian; Johns Hopkins University |
| Chao Weng; Tencent AI Lab
| Meng Yu; Tencent AI Lab
| Shi-Xiong Zhang; Tencent AI Lab |
| Yong Xu; Tencent AI Lab
| Shinji Watanabe; Johns Hopkins University |
| Dong Yu; Tencent AI Lab
| SPE-P13.9: A STUDY OF CHILD SPEECH EXTRACTION USING JOINT SPEECH ENHANCEMENT AND SEPARATION IN REALISTIC CONDITIONS |
| Xin Wang; University of Science and Technology of China |
| Jun Du; University of Science and Technology of China |
| Alejandrina Cristia; Laboratoire de Sciences Cognitives et Psycholinguistique |
| Lei Sun; University of Science and Technology of China |
| Chin-Hui Lee; Georgia Institute of Technology |
| SPE-P13.10: AN ANALYSIS OF SPEECH ENHANCEMENT AND RECOGNITION LOSSES IN LIMITED RESOURCES MULTI-TALKER SINGLE CHANNEL AUDIO-VISUAL ASR |
| Luca Pasa; University of Padova |
| Giovanni Morrone; University of Modena and Reggio Emilia |
| Leonardo Badino; Istituto Italiano di Tecnologia (IIT) |
| SPE-P13.11: DEEP AUDIO-VISUAL SPEECH SEPARATION WITH ATTENTION MECHANISM |
| Chenda Li; Shanghai Jiao Tong University |
| Yanmin Qian; Shanghai Jiao Tong University |
| SPE-P13.12: ENHANCING END-TO-END MULTI-CHANNEL SPEECH SEPARATION VIA SPATIAL FEATURE LEARNING |
| Rongzhi Gu; Peking University Shenzhen Graduate School |
| Shi-Xiong Zhang; Tencent AI Lab |
| Lianwu Chen; Tencent |
| Yong Xu; Tencent |
| Meng Yu; Tencent |
| Dan Su; Tencent |
| Yuexian Zou; Peking University Shenzhen Graduate School |
| Dong Yu; Tencent |