AASP-L7.5
SCALABLE AND EFFICIENT SPEECH ENHANCEMENT USING MODIFIED COLD DIFFUSION: A RESIDUAL LEARNING APPROACH
Minje Kim, University of Illinois at Urbana-Champaign, United States of America; Trausti Kristjansson, Amazon, United States of America
Session: AASP-L7: Audio Signal Restoration and Speech Enhancement Lecture
Track: Audio and Acoustic Signal Processing
Location: Room E1
Presentation Time: Wed, 17 Apr, 17:50 - 18:10 (UTC +9)
Session Co-Chairs: Timo Gerkmann, Universität Hamburg; Nobutaka Ito, University of Tokyo
Session AASP-L7
AASP-L7.1: A FLEXIBLE ONLINE FRAMEWORK FOR PROJECTION-BASED STFT PHASE RETRIEVAL
Tal Peer, Simon Welker, Johannes Kolhoff, Timo Gerkmann, Universität Hamburg, Germany
AASP-L7.2: UNRESTRICTED GLOBAL-PHASE-BIAS AWARE SINGLE-CHANNEL SPEECH ENHANCEMENT WITH CONFORMER-BASED METRIC GAN
Shiqi Zhang, Zheng Qiu, Waseda University, Japan; Daiki Takeuchi, Noboru Harada, NTT Corporation, Japan; Shoji Makino, Waseda University, Japan
AASP-L7.3: LOW-LATENCY SPEECH ENHANCEMENT VIA SPEECH TOKEN GENERATION
Huaying Xue, Xiulian Peng, Yan Lu, Microsoft, China
AASP-L7.4: A LIGHTWEIGHT HYBRID MULTI-CHANNEL SPEECH EXTRACTION SYSTEM WITH DIRECTIONAL VOICE ACTIVITY DETECTION
Tianchi Sun, Tong Lei, Nanjing University, China; Xu Zhang, Jiangsu Thingstar Information Technology Co., Ltd., China; Yuxiang Hu, Changbao Zhu, Horizon Robotics, China; Jing Lu, Nanjing University, China
AASP-L7.5: SCALABLE AND EFFICIENT SPEECH ENHANCEMENT USING MODIFIED COLD DIFFUSION: A RESIDUAL LEARNING APPROACH
Minje Kim, University of Illinois at Urbana-Champaign, United States of America; Trausti Kristjansson, Amazon, United States of America
AASP-L7.6: AudioSR: Versatile Audio Super-resolution at Scale
Haohe Liu, University of Surrey, United Kingdom of Great Britain and Northern Ireland; Ke Chen, University of California San Diego, United States of America; Qiao Tian, ByteDance, China; Wenwu Wang, Mark D. Plumbley, University of Surrey, United Kingdom of Great Britain and Northern Ireland