SLP-P11.7

Task Vector in TTS: Toward Emotionally Expressive Dialectal Speech Synthesis

Pengchao Feng, Yao Xiao, Ziyang Ma, Zhikang Niu, Shuai Fan, Shanghai Jiao Tong University, China; Yao Li, Shanghai Aviation Electric Co., Ltd, China; Sheng Wang, Xie Chen, Shanghai Jiao Tong University, China

Session:
SLP-P11: Prosody and Expressive Speech Generation Poster

Track:
Speech and Language Processing [SL]

Location:
Poster Area 31

Presentation Time:
Tue, 5 May, 16:30 - 18:30

Session SLP-P11
SLP-P11.1: MEASURING PROSODY DIVERSITY IN ZERO-SHOT TTS: A NEW METRIC, BENCHMARK, AND EXPLORATION
Yifan Yang, Bing Han, Shanghai Jiao Tong University, China; Hui Wang, Long Zhou, Tencent, China; Wei Wang, Shanghai Jiao Tong University, China; Mingyu Cui, Xu Tan, Tencent, China; Xie Chen, Shanghai Jiao Tong University, China
SLP-P11.2: EMOTIONAL DIMENSION CONTROL IN LANGUAGE MODEL-BASED TEXT-TO-SPEECH: SPANNING A BROAD SPECTRUM OF HUMAN EMOTIONS
Kun Zhou, Alibaba Group, Singapore; You Zhang, University of Rochester, United States of America; Dianwen Ng, Shengkui Zhao, Hao Wang, Bin Ma, Alibaba Group, United States of America
SLP-P11.3: EmoShift: Lightweight Activation Steering for Enhanced Emotion-Aware Speech Synthesis
Li Zhou, Hao Jiang, The Chinese University of Hong Kong, Shenzhen, China; Junjie Li, The Hong Kong Polytechnic University, China; Tianrui Wang, Tianjin University, China; Haizhou Li, The Chinese University of Hong Kong, Shenzhen, China
SLP-P11.4: Beyond Global Emotion: Fine-Grained Emotional Speech Synthesis with Dynamic Word-Level Modulation
Sirui Wang, Andong Chen, Tiejun Zhao, Harbin Institute of Technology, China
SLP-P11.5: EMORL-TTS: REINFORCEMENT LEARNING FOR FINE-GRAINED EMOTION CONTROL IN LLM-BASED TTS
Haoxun Li, Yu Liu, Yuqing Sun, Hanlei Shi, Leyuan Qu, Taihao Li, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, China
SLP-P11.6: QFOCUS: CONTROLLABLE SYNTHESIS FOR AUTOMATED SPEECH STRESS EDITING TO DELIVER HUMAN-LIKE EMPHATIC INTENT
Jingyi Fang, Qifu Technology, China; Yufei Tang, Zhiyu Wu, Yuanzhong Zheng, Yaoxuan Wang, Haojun Fei, Qifu Technology, Inc., China
SLP-P11.7: Task Vector in TTS: Toward Emotionally Expressive Dialectal Speech Synthesis
Pengchao Feng, Yao Xiao, Ziyang Ma, Zhikang Niu, Shuai Fan, Shanghai Jiao Tong University, China; Yao Li, Shanghai Aviation Electric Co., Ltd, China; Sheng Wang, Xie Chen, Shanghai Jiao Tong University, China
SLP-P11.8: DAIEN-TTS: DISENTANGLED AUDIO INFILLING FOR ENVIRONMENT-AWARE TEXT-TO-SPEECH SYNTHESIS
Ye-Xin Lu, Yu Gu, University of Science and Technology of China, China; Kun Wei, Northwestern Polytechnical University, China; Hui-Peng Du, Yang Ai, Zhen-Hua Ling, University of Science and Technology of China, China
SLP-P11.9: Objective Evaluation of Prosody and Intelligibility in Speech Synthesis via Conditional Prediction of Discrete Tokens
Ismail Rasim Ulgen, Zongyang Du, Junchen Lu, Philipp Koehn, Berrak Sisman,
SLP-P11.10: PRESENT: ZERO-SHOT TEXT-TO-PROSODY CONTROL
Perry Lam, Singapore University of Technology and Design, Singapore; Huayun Zhang, Nancy Chen, A*STAR, Singapore; Berrak Sisman, Johns Hopkins University, United States of America; Dorien Herremans, Singapore University of Technology and Design, Singapore