AASP-P22: Audio and Speech Source Separation and Signal Enhancement II
Thu, 7 May, 16:30 - 18:30
Location: Poster Area 24
Session Type: Poster
Track: Audio and Acoustic Signal Processing [AA]

AASP-P22.2: DOMAIN PARTITIONING MEETS PARAMETER-EFFICIENT FINE-TUNING: A NOVEL METHOD FOR IMPROVED LANGUAGE-QUERIED AUDIO SOURCE SEPARATION

Yinkai Zhang, Dingbang Zhang, Tao Wang, Xinjiang University, China; Diana Rakhimova, Al-Farabi Kazakh National University, Kazakhstan; Kai Wang, Hao Huang, Xinjiang University, China

AASP-P22.3: VM-UNSSOR: Unsupervised Neural Speech Separation Enhanced by Higher-SNR Virtual Microphone Arrays

Shulin He, Zhong-Qiu Wang, Southern University of Science and Technology, China

AASP-P22.4: Do We Need EMA for Diffusion-Based Speech Enhancement? Toward a Magnitude-Preserving Network Architecture

Julius Richter, Danilo de Oliveira, Timo Gerkmann, University of Hamburg, Germany

AASP-P22.5: HAIR NOISE ANALYSIS AND MITIGATION FOR SMART GLASSES AUDIO CAPTURES

Subrata Biswas, Worcester Polytechnic Institute, United States of America; Daniel Wong, Meta Platforms, Inc., United States of America; Bashima Islam, Worcester Polytechnic Institute, United States of America; Sanjeel Parekh, Vladimir Tourbabin, Meta Platforms, Inc., United States of America

AASP-P22.6: SOUNDCOMPASS: NAVIGATING TARGET SOUND EXTRACTION WITH EFFECTIVE DIRECTIONAL CLUE INTEGRATION IN COMPLEX ACOUSTIC SCENES

Dayun Choi, Jung-Woo Choi, Korea Advanced Institute of Science and Technology (KAIST), Korea, Republic of

AASP-P22.7: UNMIXX: UNTANGLING HIGHLY CORRELATED SINGING VOICES MIXTURES

Jihoo Jung, Ji-Hoon Kim, Doyeop Kwak, Junwon Lee, Juhan Nam, Joon Son Chung, Korea Advanced Institute of Science and Technology, Korea, Republic of

AASP-P22.8: MMAUDIOSEP: TAMING VIDEO-TO-AUDIO GENERATIVE MODEL TOWARDS VIDEO/TEXT-QUERIED SOUND SEPARATION

Akira Takahashi, Shusuke Takahashi, Yuki Mitsufuji, Sony Group Corporation, Japan

AASP-P22.9: DITSE: HIGH-FIDELITY GENERATIVE SPEECH ENHANCEMENT VIA LATENT DIFFUSION TRANSFORMERS

Heitor Rodrigues Guimaraes, Institut National de la Recherche Scientifique, Canada; Jiaqi Su, Rithesh Kumar, Adobe Research, United States of America; Tiago Falk, Institut National de la Recherche Scientifique, Canada; Zeyu Jin, Adobe Research, United States of America

AASP-P22.10: SPECTRAL OR SPATIAL? LEVERAGING BOTH FOR SPEAKER EXTRACTION IN CHALLENGING DATA CONDITIONS

Aviad Eisenberg, Bar-Ilan University and OriginAI, Israel; Sharon Gannot, Bar-Ilan University, Israel; Shlomo E. Chazan, OriginAI, Israel