AASP-L3: Neural Speech and Audio Coding
Wed, 6 May, 09:00 - 11:00
Location: Room 127+128
Session Type: Oral
Track: Audio and Acoustic Signal Processing [AA]
Wed, 6 May, 09:00 - 09:20

AASP-L3.1: S-PRESSO: ULTRA LOW BITRATE SOUND EFFECT COMPRESSION WITH DIFFUSION AUTOENCODERS AND OFFLINE QUANTIZATION

Zineb Lahrichi, Gaëtan Hadjeres, Sony AI, France; Gaël Richard, Geoffroy Peeters, LTCI, Télécom Paris, Institut Polytechnique de Paris, France
Wed, 6 May, 09:20 - 09:40

AASP-L3.2: ACOUSTIC TELEPORTATION VIA DISENTANGLED NEURAL AUDIO CODEC REPRESENTATIONS

Philipp Grundhuber, Mhd Modar Halimeh, Fraunhofer Institute for Integrated Circuits, Germany; Emanuël Habets, International Audio Laboratories Erlangen, Germany
Wed, 6 May, 09:40 - 10:00

AASP-L3.3: Residual Tokens Enhance Masked Autoencoders for Speech Modeling

Samir Sadok, Stéphane Lathuilière, Xavier Alameda-Pineda, INRIA, France
Wed, 6 May, 10:00 - 10:20

AASP-L3.4: Arbitrarily Settable Frame Rate Neural Speech Codec with Content Adaptive Variable Length Segmentation

Yukun Qian, Wenjie Zhang, Xuyi Zhuang, Shiyun Xu, Lianyu Zhou, Mingjiang Wang, Harbin Institute of Technology (Shenzhen), China
Wed, 6 May, 10:20 - 10:40

AASP-L3.5: Lisa: Lightweight Yet Superb Neural Speech Coding

Jiankai Huang, Junteng Zhang, Ming Lu, Xun Cao, Zhan Ma, Nanjing University, China
Wed, 6 May, 10:40 - 11:00

AASP-L3.6: SWITCHCODEC: ADAPTIVE RESIDUAL-EXPERT SPARSE QUANTIZATION FOR HIGH-FIDELITY NEURAL AUDIO CODING

Xiangbo Wang, Hangzhou Dianzi University, China; Wenbin Jiang, Jin Wang, Yubo You, Hangzhou Dianzi University, China; Sheng Fang, Hangzhou Dianzi University, China; Fei Wen, Shanghai Jiao Tong University, China