TH1.P2.3
Non-Causal to Causal SSL-Supported Transfer Learning: Towards a High-Performance Low-Latency Speech Vocoder
Renzheng Shi, Andreas Bär, Marvin Sach, Technische Universität Braunschweig, Germany; Wouter Tirry, GOODiX Technology Belgium B.V., Germany; Tim Fingscheidt, Technische Universität Braunschweig, Germany
Session: TH1.P2: Poster Session VII: Speech and audio coding, New and emerging topics in speech and audio processing, Special Session: Deep learning-based approaches to audio telepresence Poster
Track: Acoustic echo and feedback suppression
Location: Indgangsfoyer
Presentation Time: Thu, 12 Sep, 10:00 - 12:00 Central European Time (UTC +1)
Session Co-Chairs: Rainer Martin, Ruhr University Bochum and Mingsian Bai, National Tsing Hua University
Session TH1.P2
TH1.P2.1: HIGH-FIDELITY DIFFUSION-BASED AUDIO CODEC
Zhengpu Zhang, Jianyuan Feng, Yongjian Mao, Yehang Zhu, Junjie Shi, Xuzhou Ye, Shilei Liu, Derong Liu, Chuanzeng Huang, ByteDance, China
TH1.P2.2: A CROSS-DOMAIN APPROACH TO TEMPORAL ENVELOPE SHAPING IN PARAMETRIC STEREO CODING USING DEEP LEARNING
Patrick Kechichian, Akshaya Ravi, Erik Schuijers, Philips, Netherlands
TH1.P2.3: Non-Causal to Causal SSL-Supported Transfer Learning: Towards a High-Performance Low-Latency Speech Vocoder
Renzheng Shi, Andreas Bär, Marvin Sach, Technische Universität Braunschweig, Germany; Wouter Tirry, GOODiX Technology Belgium B.V., Germany; Tim Fingscheidt, Technische Universität Braunschweig, Germany
TH1.P2.4: Gaussian Flow Bridges for Audio Domain Transfer with Unpaired Data
Eloi Moliner Juanpere, Aalto University, Finland; Sebastian Braun, Hannes Gamper, Microsoft Research, United States of America
TH1.P2.5: Complexity Reduction for Classification of Musical Instruments Using Element Selection
Ryu Kato, Natsuki Ueno, Nobutaka Ono, Tokyo Metropolitan University, Japan; Ryo Matsuda, Kazunobu Kondo, Yamaha Corporation, Japan
TH1.P2.6: PAD-VC: A PROSODY-AWARE DECODER FOR ANY-TO-FEW VOICE CONVERSION
Arunava Kr Kalita, Indian Institute of Information Technology Guwahati, India; Christian Dittmar, Paolo Sani, Frank Zalkow, Fraunhofer IIS, Erlangen, Germany; Emanuël A. P. Habets, International Audio Laboratories Erlangen, Germany; Rusha Patra, Indian Institute of Information Technology Guwahati, India
TH1.P2.7: LONG-TERM CONVERSATION ANALYSIS: PRIVACY-UTILITY TRADE-OFF UNDER NOISE AND REVERBERATION
Jule Pohlhausen, Jade University of Applied Sciences, Oldenburg, Germany; Francesco Nespoli, Imperial College, London, United Kingdom of Great Britain and Northern Ireland; Jörg Bitzer, Jade University of Applied Sciences, Oldenburg, Germany
TH1.P2.8: HARMONICS TO THE RESCUE: WHY VOICED SPEECH IS NOT A WSS PROCESS
Giovanni Bologni, Delft University of Technology, Netherlands; Richard Heusdens, Netherlands Defence Academy, Netherlands; Richard C. Hendriks, Delft University of Technology, Netherlands
TH1.P2.9: DERIVATIVE FEATURES OF SHORT-TIME HOLOMORPHIC FOURIER TRANSFORM
Iori Hashimoto, Yu Morinaga, Suehiro Shimauchi, Shigeaki Aoki, Kanazawa Institute of Technology, Japan
TH1.P2.10: FEASIBILITY OF IMAGLS-BSM - ILD INFORMED BINAURAL SIGNAL MATCHING WITH ARBITRARY MICROPHONE ARRAYS
Or Berebi, Ben-Gurion University of the Negev, Israel; Zamir Ben-Hur, David Lou Alon, Meta, United States of America; Boaz Rafaely, Ben-Gurion University of the Negev, Israel
TH1.P2.11: RGI-NET: 3D ROOM GEOMETRY INFERENCE FROM ROOM IMPULSE RESPONSES WITH HIDDEN FIRST-ORDER REFLECTIONS
Inmo Yeon, Jung-Woo Choi, Korea Advanced Institute of Science and Technology (KAIST), Republic of Korea
TH1.P2.12: A tunable binaural audio telepresence system capable of balancing immersive and enhanced modes
Yicheng Hsu, Mingsian Bai, National Tsing Hua University, Taiwan
TH1.P2.13: NEURAL DIRECTIONAL FILTERING: FAR-FIELD DIRECTIVITY CONTROL WITH A SMALL MICROPHONE ARRAY
Julian Wechsler, Srikanth Raj Chetupalli, Mhd Modar Halimeh, Oliver Thiergart, Emanuël A. P. Habets, International Audio Laboratories Erlangen, Germany
TH1.P2.14: Magnitude Least-Squares based Ambisonics Estimation of Head-Worn Device Microphone Measurements for Binaural Reproduction
Amy Bastine, Lachlan Birnie, Thushara Abhayapala, Prasanga Samarasinghe, The Australian National University, Australia; Vladimir Tourbabin, Reality Labs Research, Meta, United States of America