TTS: Text-to-speech System |
Session Type: Poster |
Time: Sunday, December 15, 10:30 - 12:00 |
Location: VHS Event Centre, Level 1 |
Session Chair: Koichi Shinoda, Tokyo Institute of Technology |
|
TTS.1: ON THE STUDY OF GENERATIVE ADVERSARIAL NETWORKS FOR CROSS-LINGUAL VOICE CONVERSION |
Berrak Sisman, Mingyang Zhang, National University of Singapore, Singapore; Minghui Dong, Institute for Infocomm Research, A*STAR, Singapore; Haizhou Li, National University of Singapore, Singapore |
|
TTS.2: WAVENET FACTORIZATION WITH SINGULAR VALUE DECOMPOSITION FOR VOICE CONVERSION |
Hongqiang Du, Northwestern Polytechnical University, China; Xiaohai Tian, National University of Singapore, Singapore; Lei Xie, Northwestern Polytechnical University, China; Haizhou Li, National University of Singapore, Singapore |
|
TTS.3: A MODULARIZED NEURAL NETWORK WITH LANGUAGE-SPECIFIC OUTPUT LAYERS FOR CROSS-LINGUAL VOICE CONVERSION |
Yi Zhou, Xiaohai Tian, Emre Yilmaz, Rohan Kumar Das, Haizhou Li, National University of Singapore, Singapore |
|
TTS.4: KNOWLEDGE DISTILLATION FROM BERT IN PRE-TRAINING AND FINE-TUNING FOR POLYPHONE DISAMBIGUATION |
Hao Sun, Peking University, China; Xu Tan, Microsoft Research, China; Jun-Wei Gan, Sheng Zhao, Dongxu Han, Microsoft STC Asia, China; Hongzhi Liu, Peking University, China; Tao Qin, Tie-Yan Liu, Microsoft Research, China |
|
TTS.5: INVESTIGATION OF SHALLOW WAVENET VOCODER WITH LAPLACIAN DISTRIBUTION OUTPUT |
Patrick Lumban Tobing, Tomoki Hayashi, Tomoki Toda, Nagoya University, Japan |
|
TTS.6: LEARNING HIERARCHICAL REPRESENTATIONS FOR EXPRESSIVE SPEAKING STYLE IN END-TO-END SPEECH SYNTHESIS |
Xiaochun An, Northwestern Polytechnical University, China; Yuxuan Wang, ByteDance AI Lab, China; Shan Yang, Northwestern Polytechnical University, China; Zejun Ma, ByteDance AI Lab, China; Lei Xie, Northwestern Polytechnical University, China |
|
TTS.7: CONTROLLING EMOTION STRENGTH WITH RELATIVE ATTRIBUTE FOR END-TO-END SPEECH SYNTHESIS |
Xiaolian Zhu, Shan Yang, Geng Yang, Lei Xie, Northwestern Polytechnical University, China |
|
TTS.8: BOOTSTRAPPING NON-PARALLEL VOICE CONVERSION FROM SPEAKER-ADAPTIVE TEXT-TO-SPEECH |
Hieu-Thi Luong, Junichi Yamagishi, National Institute of Informatics, Japan |
|
TTS.9: IMPROVING MANDARIN END-TO-END SPEECH SYNTHESIS BY SELF-ATTENTION AND LEARNABLE GAUSSIAN BIAS |
Fengyu Yang, Shan Yang, Northwestern Polytechnical University, China; Pengcheng Zhu, Pengju Yan, Tongdun, China; Lei Xie, Northwestern Polytechnical University, China |
|
TTS.10: TACOTRON-BASED ACOUSTIC MODEL USING PHONEME ALIGNMENT FOR PRACTICAL NEURAL TEXT-TO-SPEECH SYSTEMS |
Takuma Okamoto, National Institute of Information and Communications Technology, Japan; Tomoki Toda, Nagoya University, Japan; Yoshinori Shiga, Hisashi Kawai, National Institute of Information and Communications Technology, Japan |
|