SLP-P35: Text to Speech Generation - P3
Fri, 19 Apr, 08:20 - 10:20 (UTC +9)
Location: Poster Zone 3A
Session Type: Poster
Session Co-Chairs: Joon Son Chung, KAIST; Liping Chen, University of Science and Technology of China
Track: Speech and Language Processing

SLP-P35.1: TNFORMER: SINGLE-PASS MULTILINGUAL TEXT NORMALIZATION WITH A TRANSFORMER DECODER MODEL

Binbin Shen, Jie Wang, Meng Meng, Yujun Wang, Xiaomi Inc., China
 

SLP-P35.2: A UNIFIED FRONT-END FRAMEWORK FOR ENGLISH TEXT-TO-SPEECH SYNTHESIS

Zelin Ying, Chen Li, Yu Dong, Qiuqiang Kong, Qiao Tian, Yuanyuan Huo, Yuxuan Wang, ByteDance, China
 

SLP-P35.3: COLLABORATIVE WATERMARKING FOR ADVERSARIAL SPEECH SYNTHESIS

Lauri Juvela, Aalto University, Finland; Xin Wang, National Institute of Informatics, Japan
 

SLP-P35.4: MAPACHE: MASKED PARALLEL TRANSFORMER FOR ADVANCED SPEECH EDITING AND SYNTHESIS

Guillermo Cámbara, Patrick Lumban Tobing, Mikolaj Babianski, Ravichander Vipperla, Duo Wang, Ron Shmelkin, Giuseppe Coccia, Orazio Angelini, Arnaud Joly, Mateusz Lajszczak, Vincent Pollet, Amazon, Spain
 

SLP-P35.5: DIVERSITY-BASED CORE-SET SELECTION FOR TEXT-TO-SPEECH WITH LINGUISTIC AND ACOUSTIC FEATURES

Kentaro Seki, Shinnosuke Takamichi, Takaaki Saeki, Hiroshi Saruwatari, The University of Tokyo, Japan
 

SLP-P35.6: FEWER-TOKEN NEURAL SPEECH CODEC WITH TIME-INVARIANT CODES

Yong Ren, Tao Wang, Jiangyan Yi, Le Xu, Institute of Automation, Chinese Academy of Sciences, China; Jianhua Tao, Department of Automation, Tsinghua University, China; Chu Yuan Zhang, Junzuo Zhou, Institute of Automation, Chinese Academy of Sciences, China
 

SLP-P35.7: CONVNEXT-TTS AND CONVNEXT-VC: CONVNEXT-BASED FAST END-TO-END SEQUENCE-TO-SEQUENCE TEXT-TO-SPEECH AND VOICE CONVERSION

Takuma Okamoto, Yamato Ohtani, National Institute of Information and Communications Technology, Japan; Tomoki Toda, Nagoya University, Japan; Hisashi Kawai, National Institute of Information and Communications Technology, Japan
 

SLP-P35.8: SYNTHE-SEES: FACE BASED TEXT-TO-SPEECH FOR VIRTUAL SPEAKER

Jae Hyun Park, Joon-Gyu Maeng, Speech AI/NCSOFT, Korea, Republic of; TaeJun Bak, SKT, Korea, Republic of; Young-Sun Joo, Speech AI/NCSOFT, Korea, Republic of
 

SLP-P35.9: LANGUAGE-ORIENTED COMMUNICATION WITH SEMANTIC CODING AND KNOWLEDGE DISTILLATION FOR TEXT-TO-IMAGE GENERATION

Hyelin Nam, Yonsei University, Korea, Republic of; Jihong Park, Jinho Choi, Deakin University, Australia; Mehdi Bennis, Oulu University, Finland; Seong-Lyun Kim, Yonsei University, Korea, Republic of