AASP-P26: Audio and Speech Source Separation and Signal Enhancement III
Fri, 8 May, 09:00 - 11:00
Location: Poster Area 24
Session Type: Poster
Track: Audio and Acoustic Signal Processing [AA]

AASP-P26.1: DISSECTING PERFORMANCE DEGRADATION IN AUDIO SOURCE SEPARATION UNDER SAMPLING FREQUENCY MISMATCH

Kanami Imamura, The University of Tokyo / National Institute of Advanced Industrial Science and Technology (AIST), Japan; Tomohiko Nakamura, National Institute of Advanced Industrial Science and Technology (AIST), Japan; Kohei Yatabe, Tokyo University of Agriculture and Technology, Japan; Hiroshi Saruwatari, The University of Tokyo, Japan

AASP-P26.2: NEURAL NETWORK-BASED TIME-FREQUENCY-BIN-WISE LINEAR COMBINATION OF BEAMFORMERS FOR UNDERDETERMINED TARGET SOURCE EXTRACTION

Changda Chen, Waseda University, Japan; Yichen Yang, Northwestern Polytechnical University, China; Wei Liu, Wuhan University, China; Shoji Makino, Waseda University, Japan

AASP-P26.3: Shortcut Flow Matching for Speech Enhancement: Step-Invariant flows via single stage training

Naisong Zhou, École polytechnique fédérale de Lausanne, Switzerland; Saisamarth Rajesh Phaye, Milos Cernak, Andy Pearce, Tijana Stojkovic, Logitech, Singapore; Andrea Cavallaro, École polytechnique fédérale de Lausanne, Switzerland; Andrew Harper, Logitech, United Kingdom of Great Britain and Northern Ireland

AASP-P26.4: TOWARDS REAL-TIME GENERATIVE SPEECH RESTORATION WITH FLOW-MATCHING

Tsun-An Hsieh, University of Illinois Urbana-Champaign, United States of America; Sebastian Braun, Microsoft, United States of America

AASP-P26.5: UniverSR: Unified and Versatile Audio Super-Resolution via Vocoder-Free Flow Matching

Woongjib Choi, Sangmin Lee, Hyungseob Lim, Hong-Goo Kang, Yonsei University, Korea, Republic of

AASP-P26.6: GENERALIZABILITY OF PREDICTIVE AND GENERATIVE SPEECH ENHANCEMENT MODELS TO PATHOLOGICAL SPEAKERS

Mingchi Hou, Idiap Research Institute, Switzerland; Ante Jukic, NVIDIA, United States of America; Ina Kodrasi, Idiap Research Institute, Switzerland

AASP-P26.7: CLASS-AWARE PERMUTATION-INVARIANT SIGNAL-TO-DISTORTION RATIO FOR SEMANTIC SEGMENTATION OF SOUND SCENE WITH SAME-CLASS SOURCES

Binh Thien Nguyen, Masahiro Yasuda, Daiki Takeuchi, Daisuke Niizumi, Noboru Harada, NTT, Inc., Japan

AASP-P26.8: SPATIAL COVARIANCE MATRIX RECONSTRUCTION FOR SPEECH ENHANCEMENT IN REVERBERANT MULTI-SOURCE ENVIRONMENTS

Wei Liu, Wuhan University, China; Xueqin Luo, Jilu Jin, Northwestern Polytechnical University, China; Gongping Huang, Wuhan University, China; Jingdong Chen, Northwestern Polytechnical University, China; Jacob Benesty, University of Quebec, Canada; Shoji Makino, Waseda University, Japan

AASP-P26.9: SINGLE-STEP CONTROLLABLE MUSIC BANDWIDTH EXTENSION WITH FLOW MATCHING

Carlos Hernández Oliván, Hendrik Vincent Koops, Hao Hao Tan, Elio Quinton, Universal Music Group, United Kingdom of Great Britain and Northern Ireland