FR1.PA2.7
Bridging the Gap: Integrating Pre-trained Speech Enhancement and Recognition Models for Robust Speech Recognition
Kuan-Chen Wang, National Taiwan University, Taiwan; You-Jin Li, Wei-Lun Chen, Academia Sinica, Taiwan; Yu-Wen Chen, Columbia University, United States; Yi-Ching Wang, Chunghwa Telecom Co., Ltd., Taiwan; Ping-Cheng Yeh, National Taiwan University, Taiwan; Chao Zhang, Tsinghua University, China; Yu Tsao, Academia Sinica, Taiwan
Session: FR1.PA2: Enhancement, Separation and Reconstruction of Audio and Speech Poster
Track: ASMSP - Acoustic, Speech and Music Signal Processing
Location: Poster Area 2
Presentation Time: Fri, 30 Aug, 10:30 - 12:30 France Time (UTC +1)
Session Co-Chairs: Mathieu Fontaine, Télécom Paris and Emina Alickovic, Linköping University
Session FR1.PA2
FR1.PA2.1: Ray-Space constrained multichannel Nonnegative Matrix Factorization for Audio Source Separation
Antonio Jesús Muñoz-Montoro, Universidad de Jaén, Spain; Marco Olivieri, Mirco Pezzoli, Politecnico di Milano, Italy; Julio José Carabias-Orti, Universidad de Jaén, Spain; Fabio Antonacci, Augusto Sarti, Politecnico di Milano, Italy
FR1.PA2.2: CATSE: A Context-Aware Framework for Causal Target Sound Extraction
Shrishail Baligar, University of California, Merced, United States; Mikolaj Kegler, Bryce Irvin, Marko Stamenovic, Bose Corporation, United States; Shawn Newsam, University of California, Merced, United States
FR1.PA2.3: Neural Network-Based Speech Reconstruction from Undersampled STFT Magnitude Data
Wojciech Czaja, Canran Ji, Shashank Sule, Matthias Wellershoff, University of Maryland, College Park, United States
FR1.PA2.4: Pre-training Music Classification Models via Music Source Separation
Christos Garoufis, Athanasia Zlatintsi, Athena Research Center, Greece; Petros Maragos, National Technical University of Athens, Greece
FR1.PA2.5: DeePMOS-B: Deep Posterior Mean-Opinion-Score using Beta Distribution
Xinyu Liang, KTH Royal Institute of Technology, Sweden; Fredrik Cumlin, Codemill AB, Sweden; Victor Ungureanu, Chandan K. A. Reddy, Christian Schuldt, Google LLC, Switzerland; Saikat Chatterjee, KTH Royal Institute of Technology, Sweden
FR1.PA2.6: Using Speech Foundational Models In Loss Functions For Hearing Aid Speech Enhancement
Robert Sutherland, George Close, Thomas Hain, Stefan Goetze, Jon Barker, University of Sheffield, United Kingdom
FR1.PA2.7: Bridging the Gap: Integrating Pre-trained Speech Enhancement and Recognition Models for Robust Speech Recognition
Kuan-Chen Wang, National Taiwan University, Taiwan; You-Jin Li, Wei-Lun Chen, Academia Sinica, Taiwan; Yu-Wen Chen, Columbia University, United States; Yi-Ching Wang, Chunghwa Telecom Co., Ltd., Taiwan; Ping-Cheng Yeh, National Taiwan University, Taiwan; Chao Zhang, Tsinghua University, China; Yu Tsao, Academia Sinica, Taiwan
FR1.PA2.8: Alias-Free Level Crossing Sampling
Negar Riazifar, Nigel G. Stocks, University of Warwick, United Kingdom
FR1.PA2.9: Speaker and Style Disentanglement of Speech Based on Contrastive Predictive Coding Supported Factorized Variational Autoencoder
Yuying Xie, Aalborg University, Denmark; Michael Kuhlmann, Frederik Rautenberg, Paderborn University, Germany; Zheng-Hua Tan, Aalborg University, Denmark; Reinhold Haeb-Umbach, Paderborn University, Germany
FR1.PA2.10: Room Transfer Function Reconstruction Using Complex-Valued Neural Networks and Irregularly Distributed Microphones
Francesca Ronchini, Luca Comanducci, Mirco Pezzoli, Fabio Antonacci, Augusto Sarti, Politecnico di Milano, Italy