TH2.I: Machine Learning for Speech Synthesis II |
| Session Type: Poster |
| Time: Thursday, 7 May, 11:30 - 13:30 |
| Location: On-Demand |
| Session Chairs: Tomoki Toda, Nagoya University and Zhiyong Wu, Tsinghua University |
| |
| TH2.I.1: EFFICIENT SHALLOW WAVENET VOCODER USING MULTIPLE SAMPLES OUTPUT BASED ON LAPLACIAN DISTRIBUTION AND LINEAR PREDICTION |
| Patrick Lumban Tobing; Nagoya University |
| Yi-Chiao Wu; Nagoya University |
| Tomoki Hayashi; Nagoya University |
| Kazuhiro Kobayashi; Nagoya University |
| Tomoki Toda; Nagoya University |
| |
| TH2.I.2: FLOW-TTS: A NON-AUTOREGRESSIVE NETWORK FOR TEXT TO SPEECH BASED ON FLOW |
| Chenfeng Miao; Ping An Technology (Shenzhen) Co., Ltd. |
| Shuang Liang; Ping An Technology (Shenzhen) Co., Ltd. |
| Minchuan Chen; Ping An Technology (Shenzhen) Co., Ltd. |
| Jun Ma; Ping An Technology (Shenzhen) Co., Ltd. |
| Shaojun Wang; Ping An Technology (Shenzhen) Co., Ltd. |
| Jing Xiao; Ping An Technology (Shenzhen) Co., Ltd. |
| |
| TH2.I.3: WAVEFFJORD: FFJORD-BASED VOCODER FOR STATISTICAL PARAMETRIC SPEECH SYNTHESIS |
| Ning-Qian Wu; University of Science and Technology of China |
| Zhen-Hua Ling; University of Science and Technology of China |
| |
| TH2.I.4: IMPROVING LPCNET-BASED TEXT-TO-SPEECH WITH LINEAR PREDICTION-STRUCTURED MIXTURE DENSITY NETWORK |
| Min-Jae Hwang; Yonsei University |
| Eunwoo Song; Naver Corporation |
| Ryuichi Yamamoto; LINE Corporation |
| Frank K. Soong; Microsoft Research Asia |
| Hong-Goo Kang; Yonsei University |
| |
| TH2.I.5: DISENTANGLING TIMBRE AND SINGING STYLE WITH MULTI-SINGER SINGING SYNTHESIS SYSTEM |
| Juheon Lee; Seoul National University |
| Hyeong-Seok Choi; Seoul National University |
| Junghyun Koo; Seoul National University |
| Kyogu Lee; Seoul National University |
| |
| TH2.I.6: SEQUENCE-TO-SEQUENCE SINGING SYNTHESIS USING THE FEED-FORWARD TRANSFORMER |
| Merlijn Blaauw; Universitat Pompeu Fabra |
| Jordi Bonada; Universitat Pompeu Fabra |
| |
| TH2.I.7: KOREAN SINGING VOICE SYNTHESIS BASED ON AUTO-REGRESSIVE BOUNDARY EQUILIBRIUM GAN |
| Soonbeom Choi; Korea Advanced Institute of Science and Technology (KAIST) |
| Wonil Kim; Korea Advanced Institute of Science and Technology (KAIST) |
| Saebyul Park; Korea Advanced Institute of Science and Technology (KAIST) |
| Sangeon Yong; Korea Advanced Institute of Science and Technology (KAIST) |
| Juhan Nam; Korea Advanced Institute of Science and Technology (KAIST) |
| |
| TH2.I.8: FAST AND HIGH-QUALITY SINGING VOICE SYNTHESIS SYSTEM BASED ON CONVOLUTIONAL NEURAL NETWORKS |
| Kazuhiro Nakamura; Techno-Speech |
| Shinji Takaki; Techno-Speech |
| Kei Hashimoto; Techno-Speech |
| Keiichiro Oura; Techno-Speech |
| Yoshihiko Nankaku; Nagoya Institute of Technology |
| Keiichi Tokuda; Techno-Speech |
| |
| TH2.I.9: HYBRID NEURAL-PARAMETRIC F0 MODEL FOR SINGING SYNTHESIS |
| Jordi Bonada; Universitat Pompeu Fabra |
| Merlijn Blaauw; Universitat Pompeu Fabra |
| |
| TH2.I.10: UTTERANCE-LEVEL SEQUENTIAL MODELING FOR DEEP GAUSSIAN PROCESS BASED SPEECH SYNTHESIS USING SIMPLE RECURRENT UNIT |
| Tomoki Koriyama; University of Tokyo |
| Hiroshi Saruwatari; University of Tokyo |
| |
| TH2.I.11: EMOTIONAL SPEECH SYNTHESIS WITH RICH AND GRANULARIZED CONTROL |
| Se-Yun Um; Yonsei University |
| Sangshin Oh; Yonsei University |
| Kyungguen Byun; Yonsei University |
| Inseon Jang; Electronics and Telecommunications Research Institute (ETRI) |
| Chunghyun Ahn; Electronics and Telecommunications Research Institute (ETRI) |
| Hong-Goo Kang; Yonsei University |
| |
| TH2.I.12: TOWARDS UNSUPERVISED SPEECH RECOGNITION AND SYNTHESIS WITH QUANTIZED SPEECH REPRESENTATION LEARNING |
| Alexander H. Liu; National Taiwan University |
| Tao Tu; National Taiwan University |
| Hung-yi Lee; National Taiwan University |
| Lin-shan Lee; National Taiwan University |
| |