AASP-P17.2

FROM HALLUCINATION TO ARTICULATION: LANGUAGE MODEL-DRIVEN LOSSES FOR ULTRA LOW-BITRATE NEURAL SPEECH CODING

Jayeon Yi, Minje Kim, University of Illinois Urbana-Champaign, United States of America

Session:
AASP-P17: Audio and Speech Coding, Transmission, and Representations Poster

Track:
Audio and Acoustic Signal Processing [AA]

Location:
Poster Area 26

Presentation Time:
Thu, 7 May, 09:00 - 11:00

Session AASP-P17
AASP-P17.1: MixGAN-based Non-blind Bandwidth Extension for Audio Codec
Hao Guo, Shenzhen International Graduate School, Tsinghua University, Shenzhen, China; Bingyin Xia, Central Media Technology Institute, Huawei Technologies, Beijing, China; Xiao-Ping Zhang, Wenbo Ding, Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
AASP-P17.2: FROM HALLUCINATION TO ARTICULATION: LANGUAGE MODEL-DRIVEN LOSSES FOR ULTRA LOW-BITRATE NEURAL SPEECH CODING
Jayeon Yi, Minje Kim, University of Illinois Urbana-Champaign, United States of America
AASP-P17.3: Towards Evaluating Generative Audio: Insights from Neural Audio Codec Embedding Distances
Arijit Biswas, Dolby Germany GmbH, Germany; Lars Villemoes, Dolby Sweden AB, Sweden
AASP-P17.4: SALAD-VAE: Semantic Audio Compression with Language-Audio Distillation
Sebastian Braun, Hannes Gamper, Dimitra Emmanouilidou, Microsoft, United States of America
AASP-P17.5: AUDEN-VOICE: GENERAL-PURPOSE VOICE ENCODER FOR SPEECH AND LANGUAGE UNDERSTANDING
Mingyue Huo, University of Illinois Urbana-Champaign, United States of America; Wei-Cheng Tseng, University of Texas at Austin, United States of America; Yiwen Shao, Hao Zhang, Dong Yu, Tencent America, United States of America
AASP-P17.6: Identifying the Minimal and Maximal Phonetic Subspace of Speech Representations
Xingwen Han, Hao Tang, University of Edinburgh, United Kingdom of Great Britain and Northern Ireland
AASP-P17.7: ENHANCING NOISE ROBUSTNESS FOR NEURAL SPEECH CODECS THROUGH RESOURCE-EFFICIENT PROGRESSIVE QUANTIZATION PERTURBATION SIMULATION
Rui-Chen Zheng, Yang Ai, Hui-Peng Du, Li-Rong Dai, University of Science and Technology of China, China
AASP-P17.8: TESTING THE EFFICIENT CODING HYPOTHESIS BEYOND HUMANS: THE AUDITORY KERNELS OF BAT VOCALIZATIONS
Aleksandra Savova, Jorge Martinez, Dimme de Groot, Delft University of Technology, Netherlands
AASP-P17.9: Low-Bandwidth High-Fidelity Speech Transmission With Generative Latent Joint Source-Channel Coding
Guangkuan Li, Shengshi Yao, Beijing University of Posts and Telecommunications, China; Sixian Wang, Shanghai Jiao Tong University, China; Zhenyu Liu, University of Surrey, United Kingdom of Great Britain and Northern Ireland; Kai Niu, Jincheng Dai, Beijing University of Posts and Telecommunications, China
AASP-P17.10: DYNAMIC BIT-PLANE ARITHMETIC CODING METHOD FOR QUANTIZED SPECTRAL COEFFICIENTS IN USAC
Seonjae Kim, Dong-A University, Korea, Republic of; Byeongho Jo, Seungkwon Beack, Taejin Lee, Electronics and Telecommunications Research Institute (ETRI), Korea, Republic of; Dongsan Jun, Dong-A University, Korea, Republic of