SLP-L7: Text to Speech Generation - O1
Wed, 17 Apr, 08:20 - 10:20 (UTC +9)
Location: Room 103
Session Type: Lecture
Session Co-Chairs: Heiga Zen, Google and Qiong Hu, Google
Track: Speech and Language Processing
Wed, 17 Apr, 08:20 - 08:40 (UTC +9)
 

SLP-L7.1: CREATING PERSONALIZED SYNTHETIC VOICES FROM ARTICULATION IMPAIRED SPEECH USING AUGMENTED RECONSTRUCTION LOSS

Yusheng Tian, Jingyu Li, Tan Lee, The Chinese University of Hong Kong, Hong Kong
Wed, 17 Apr, 08:40 - 09:00 (UTC +9)
 

SLP-L7.2: VOICEFLOW: EFFICIENT TEXT-TO-SPEECH WITH RECTIFIED FLOW MATCHING

Yiwei Guo, Chenpeng Du, Ziyang Ma, Xie Chen, Kai Yu, Shanghai Jiao Tong University, China
Wed, 17 Apr, 09:00 - 09:20 (UTC +9)
 

SLP-L7.3: MATCHA-TTS: A FAST TTS ARCHITECTURE WITH CONDITIONAL FLOW MATCHING

Shivam Mehta, Ruibo Tu, Jonas Beskow, Éva Székely, Gustav Eje Henter, KTH Royal Institute of Technology, Sweden
Wed, 17 Apr, 09:20 - 09:40 (UTC +9)
 

SLP-L7.4: EXTENDING MULTILINGUAL SPEECH SYNTHESIS TO 100+ LANGUAGES WITHOUT TRANSCRIBED DATA

Takaaki Saeki, The University of Tokyo, Japan; Gary Wang, Nobuyuki Morioka, Isaac Elias, Kyle Kastner, Andrew Rosenberg, Bhuvana Ramabhadran, Heiga Zen, Françoise Beaufays, Hadar Shemtov, Google, United States of America
Wed, 17 Apr, 09:40 - 10:00 (UTC +9)
 

SLP-L7.5: PROMPTTTS++: CONTROLLING SPEAKER IDENTITY IN PROMPT-BASED TEXT-TO-SPEECH USING NATURAL LANGUAGE DESCRIPTIONS

Reo Shimizu, Tohoku University, Japan; Ryuichi Yamamoto, Masaya Kawamura, Yuma Shirahata, Hironori Doi, Tatsuya Komatsu, Kentaro Tachibana, LINE Corp., Japan
Wed, 17 Apr, 10:00 - 10:20 (UTC +9)
 

SLP-L7.6: VoiceLDM: Text-to-Speech with Environmental Context

Yeonghyeon Lee, Inmo Yeon, Juhan Nam, Joon Son Chung, Korea Advanced Institute of Science and Technology, Republic of Korea