SLP-L23: Speech separation and extraction
Thu, 18 Apr, 16:30 - 18:30 (UTC +9)
Location: Room 104
Session Type: Lecture
Session Co-Chairs: Gordon Wichern, Mitsubishi Electric Research Labs (MERL) and Katerina Zmolikova, Meta
Track: Speech and Language Processing
Thu, 18 Apr, 16:30 - 16:50 (UTC +9)
 

SLP-L23.1: NEUROHEED+: IMPROVING NEURO-STEERED SPEAKER EXTRACTION WITH JOINT AUDITORY ATTENTION DETECTION

Zexu Pan, Gordon Wichern, Francois Germain, Sameer Khurana, Jonathan Le Roux, Mitsubishi Electric Research Laboratories, United States of America
Thu, 18 Apr, 16:50 - 17:10 (UTC +9)
 

SLP-L23.2: TARGET SPEECH EXTRACTION WITH PRE-TRAINED SELF-SUPERVISED LEARNING MODELS

Junyi Peng, Brno University of Technology, Czechia; Marc Delcroix, Tsubasa Ochiai, NTT Corporation, Japan; Oldřich Plchot, Brno University of Technology, Czechia; Shoko Araki, NTT Corporation, Japan; Jan Černocký, Brno University of Technology, Czechia
Thu, 18 Apr, 17:10 - 17:30 (UTC +9)
 

SLP-L23.3: AUDIO-VISUAL ACTIVE SPEAKER EXTRACTION FOR SPARSELY OVERLAPPED MULTI-TALKER SPEECH

Junjie Li, Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), China; Ruijie Tao, Zexu Pan, Meng Ge, Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore; Shuai Wang, Haizhou Li, Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), China
Thu, 18 Apr, 17:30 - 17:50 (UTC +9)
 

SLP-L23.4: AUDIOVISUAL SPEAKER SEPARATION WITH FULL- AND SUB-BAND MODELING IN THE TIME-FREQUENCY DOMAIN

Vahid Ahmadi Kalkhorani, Ohio State University, United States of America; Anurag Kumar, Ke Tan, Buye Xu, Meta Reality Labs, United States of America; DeLiang Wang, Ohio State University, United States of America
Thu, 18 Apr, 17:50 - 18:10 (UTC +9)
 

SLP-L23.5: Combining Conformer and Dual-Path-Transformer Networks for Single Channel Noisy Reverberant Speech Separation

William Ravenscroft, Stefan Goetze, Thomas Hain, The University of Sheffield, United Kingdom of Great Britain and Northern Ireland
Thu, 18 Apr, 18:10 - 18:30 (UTC +9)
 

SLP-L23.6: Generation-based Target Speech Extraction with Speech Discretization and Vocoder

Linfeng Yu, Wangyou Zhang, Chenpeng Du, Leying Zhang, Zheng Liang, Yanmin Qian, Shanghai Jiao Tong University, China