Technical Program

TTS: Text-to-speech System

Session Type: Poster
Time: Sunday, December 15, 10:30 - 12:00
Location: VHS Event Centre, Level 1
Session Chair: Koichi Shinoda, Tokyo Institute of Technology
 
TTS.1: ON THE STUDY OF GENERATIVE ADVERSARIAL NETWORKS FOR CROSS-LINGUAL VOICE CONVERSION
Berrak Sisman, Mingyang Zhang, National University of Singapore, Singapore; Minghui Dong, Institute for Infocomm Research, A*STAR, Singapore; Haizhou Li, National University of Singapore, Singapore
 
TTS.2: WAVENET FACTORIZATION WITH SINGULAR VALUE DECOMPOSITION FOR VOICE CONVERSION
Hongqiang Du, Northwestern Polytechnical University, China; Xiaohai Tian, National University of Singapore, Singapore; Lei Xie, Northwestern Polytechnical University, China; Haizhou Li, National University of Singapore, Singapore
 
TTS.3: A MODULARIZED NEURAL NETWORK WITH LANGUAGE-SPECIFIC OUTPUT LAYERS FOR CROSS-LINGUAL VOICE CONVERSION
Yi Zhou, Xiaohai Tian, Emre Yilmaz, Rohan Kumar Das, Haizhou Li, National University of Singapore, Singapore
 
TTS.4: KNOWLEDGE DISTILLATION FROM BERT IN PRE-TRAINING AND FINE-TUNING FOR POLYPHONE DISAMBIGUATION
Hao Sun, Peking University, China; Xu Tan, Microsoft Research, China; Jun-Wei Gan, Sheng Zhao, Dongxu Han, Microsoft STC Asia, China; Hongzhi Liu, Peking University, China; Tao Qin, Tie-Yan Liu, Microsoft Research, China
 
TTS.5: INVESTIGATION OF SHALLOW WAVENET VOCODER WITH LAPLACIAN DISTRIBUTION OUTPUT
Patrick Lumban Tobing, Tomoki Hayashi, Tomoki Toda, Nagoya University, Japan
 
TTS.6: LEARNING HIERARCHICAL REPRESENTATIONS FOR EXPRESSIVE SPEAKING STYLE IN END-TO-END SPEECH SYNTHESIS
Xiaochun An, Northwestern Polytechnical University, China; Yuxuan Wang, ByteDance AI Lab, China; Shan Yang, Northwestern Polytechnical University, China; Zejun Ma, ByteDance AI Lab, China; Lei Xie, Northwestern Polytechnical University, China
 
TTS.7: CONTROLLING EMOTION STRENGTH WITH RELATIVE ATTRIBUTE FOR END-TO-END SPEECH SYNTHESIS
Xiaolian Zhu, Shan Yang, Geng Yang, Lei Xie, Northwestern Polytechnical University, China
 
TTS.8: BOOTSTRAPPING NON-PARALLEL VOICE CONVERSION FROM SPEAKER-ADAPTIVE TEXT-TO-SPEECH
Hieu-Thi Luong, Junichi Yamagishi, National Institute of Informatics, Japan
 
TTS.9: IMPROVING MANDARIN END-TO-END SPEECH SYNTHESIS BY SELF-ATTENTION AND LEARNABLE GAUSSIAN BIAS
Fengyu Yang, Shan Yang, Northwestern Polytechnical University, China; Pengcheng Zhu, Pengju Yan, Tongdun, China; Lei Xie, Northwestern Polytechnical University, China
 
TTS.10: TACOTRON-BASED ACOUSTIC MODEL USING PHONEME ALIGNMENT FOR PRACTICAL NEURAL TEXT-TO-SPEECH SYSTEMS
Takuma Okamoto, National Institute of Information and Communications Technology, Japan; Tomoki Toda, Nagoya University, Japan; Yoshinori Shiga, Hisashi Kawai, National Institute of Information and Communications Technology, Japan