Technical Program

SLP-P20: Speech Synthesis II

Session Type: Poster
Time: Friday, May 17, 08:30 - 10:30
Location: Poster Area B, Ground Floor
Session Chair: Zhenhua Ling, University of Science and Technology of China
 
SLP-P20.1: INVESTIGATION OF ENHANCED TACOTRON TEXT-TO-SPEECH SYNTHESIS SYSTEMS WITH SELF-ATTENTION FOR PITCH ACCENT LANGUAGE
         Yusuke Yasuda; National Institute of Informatics
         Xin Wang; National Institute of Informatics
         Shinji Takaki; National Institute of Informatics
         Junichi Yamagishi; National Institute of Informatics
 
SLP-P20.2: ENHANCING HYBRID SELF-ATTENTION STRUCTURE WITH RELATIVE-POSITION-AWARE BIAS FOR SPEECH SYNTHESIS
         Shan Yang; Northwestern Polytechnical University
         Heng Lu; Tencent AI Lab
         Shiying Kang; Tencent AI Lab
         Lei Xie; Northwestern Polytechnical University
         Dong Yu; Tencent AI Lab
 
SLP-P20.3: WAVEFORM GENERATION FOR TEXT-TO-SPEECH SYNTHESIS USING PITCH-SYNCHRONOUS MULTI-SCALE GENERATIVE ADVERSARIAL NETWORKS
         Lauri Juvela; Aalto University
         Bajibabu Bollepalli; Aalto University
         Junichi Yamagishi; National Institute of Informatics
         Paavo Alku; Aalto University
 
SLP-P20.4: INVESTIGATING CONTEXT FEATURES HIDDEN IN END-TO-END TTS
         Kohki Mametani; Doshisha University
         Tsuneo Kato; Doshisha University
         Seiichi Yamamoto; Doshisha University
 
SLP-P20.5: CASTING TO CORPUS: SEGMENTING AND SELECTING SPONTANEOUS DIALOGUE FOR TTS WITH A CNN-LSTM SPEAKER-DEPENDENT BREATH DETECTOR
         Éva Székely; KTH Royal Institute of Technology
         Gustav Eje Henter; KTH Royal Institute of Technology
         Joakim Gustafson; KTH Royal Institute of Technology
 
SLP-P20.6: PHONEME DEPENDENT SPEAKER EMBEDDING AND MODEL FACTORIZATION FOR MULTI-SPEAKER SPEECH SYNTHESIS AND ADAPTATION
         Ruibo Fu; CASIA
         Jianhua Tao; CASIA
         Zhengqi Wen; CASIA
         Yibin Zheng; CASIA
 
SLP-P20.7: END-TO-END CODE-SWITCHED TTS WITH MIX OF MONOLINGUAL RECORDINGS
         Yuewen Cao; The Chinese University of Hong Kong
         Xixin Wu; The Chinese University of Hong Kong
         Songxiang Liu; The Chinese University of Hong Kong
         Jianwei Yu; The Chinese University of Hong Kong
         Xu Li; The Chinese University of Hong Kong
         Zhiyong Wu; Tsinghua-CUHK Joint Research Center for Media Sciences, Technologies and Systems
         Xunying Liu; The Chinese University of Hong Kong
         Helen Meng; The Chinese University of Hong Kong
 
SLP-P20.8: SEMI-SUPERVISED TRAINING FOR IMPROVING DATA EFFICIENCY IN END-TO-END SPEECH SYNTHESIS
         Yu-An Chung; Massachusetts Institute of Technology
         Yuxuan Wang; Google, Inc.
         Wei-Ning Hsu; Massachusetts Institute of Technology
         Yu Zhang; Google, Inc.
         RJ Skerry-Ryan; Google, Inc.
 
SLP-P20.9: LEARNING LATENT REPRESENTATIONS FOR STYLE CONTROL AND TRANSFER IN END-TO-END SPEECH SYNTHESIS
         Ya-Jie Zhang; University of Science and Technology of China
         Shifeng Pan; Microsoft
         Lei He; Microsoft
         Zhen-Hua Ling; University of Science and Technology of China
 
SLP-P20.10: MULTI-SPEAKER EMOTIONAL ACOUSTIC MODELING FOR CNN-BASED SPEECH SYNTHESIS
         Heejin Choi; Korea Advanced Institute of Science and Technology
         Sangjun Park; Korea Advanced Institute of Science and Technology
         Jinuk Park; Korea Advanced Institute of Science and Technology
         Minsoo Hahn; Korea Advanced Institute of Science and Technology
 
SLP-P20.11: SINGING VOICE SYNTHESIS BASED ON GENERATIVE ADVERSARIAL NETWORKS
         Yukiya Hono; Nagoya Institute of Technology
         Kei Hashimoto; Nagoya Institute of Technology
         Keiichiro Oura; Nagoya Institute of Technology
         Yoshihiko Nankaku; Nagoya Institute of Technology
         Keiichi Tokuda; Nagoya Institute of Technology
 
SLP-P20.12: ENHANCED VIRTUAL SINGERS GENERATION BY INCORPORATING SINGING DYNAMICS TO PERSONALIZED TEXT-TO-SPEECH-TO-SINGING
         Kantapon Kaewtip; Oben Inc
         Fernando Villavicencio; Oben Inc
         Fang-yu Kuo; Oben Inc
         Mark Harvilla; Oben Inc
         Iris Ouyang; Oben Inc
         Pierre Lanchantin; Oben Inc
 
SLP-P20.13: INVESTIGATIONS OF REAL-TIME GAUSSIAN FFTNET AND PARALLEL WAVENET NEURAL VOCODERS WITH SIMPLE ACOUSTIC FEATURES
         Takuma Okamoto; National Institute of Information and Communications Technology
         Tomoki Toda; Nagoya University
         Yoshinori Shiga; National Institute of Information and Communications Technology
         Hisashi Kawai; National Institute of Information and Communications Technology