SPE-L12: Speech Separation and Extraction II: Multi-channel |
| Session Type: Lecture |
| Time: Thursday, 7 May, 11:30 - 13:30 |
| Location: On-Demand |
| Session Chair: Shoko Araki, NTT |
| SPE-L12.1: BEAM-TASNET: TIME-DOMAIN AUDIO SEPARATION NETWORK MEETS FREQUENCY-DOMAIN BEAMFORMER |
| Tsubasa Ochiai; NTT Communication Science Laboratories |
| Marc Delcroix; NTT Communication Science Laboratories |
| Rintaro Ikeshita; NTT Communication Science Laboratories |
| Keisuke Kinoshita; NTT Communication Science Laboratories |
| Tomohiro Nakatani; NTT Communication Science Laboratories |
| Shoko Araki; NTT Communication Science Laboratories |
| SPE-L12.2: ON END-TO-END MULTI-CHANNEL TIME DOMAIN SPEECH SEPARATION IN REVERBERANT ENVIRONMENTS |
| Jisi Zhang; University of Sheffield |
| Catalin Zorila; Toshiba Cambridge Research Laboratory |
| Rama Doddipatla; Toshiba Cambridge Research Laboratory |
| Jon Barker; University of Sheffield |
| SPE-L12.3: END-TO-END MICROPHONE PERMUTATION AND NUMBER INVARIANT MULTI-CHANNEL SPEECH SEPARATION |
| Yi Luo; Columbia University |
| Zhuo Chen; Microsoft |
| Nima Mesgarani; Columbia University |
| Takuya Yoshioka; Microsoft |
| SPE-L12.4: DNN-SUPPORTED MASK-BASED CONVOLUTIONAL BEAMFORMING FOR SIMULTANEOUS DENOISING, DEREVERBERATION, AND SOURCE SEPARATION |
| Tomohiro Nakatani; NTT Corporation |
| Riki Takahashi; Tsukuba University |
| Tsubasa Ochiai; NTT Corporation |
| Keisuke Kinoshita; NTT Corporation |
| Rintaro Ikeshita; NTT Corporation |
| Marc Delcroix; NTT Corporation |
| Shoko Araki; NTT Corporation |
| SPE-L12.5: REAL-TIME BINAURAL SPEECH SEPARATION WITH PRESERVED SPATIAL CUES |
| Cong Han; Columbia University |
| Yi Luo; Columbia University |
| Nima Mesgarani; Columbia University |
| SPE-L12.6: SLOGD: SPEAKER LOCATION GUIDED DEFLATION APPROACH TO SPEECH SEPARATION |
| Sunit Sivasankaran; Inria-Nancy |
| Emmanuel Vincent; Inria-Nancy |
| Dominique Fohr; Loria |