SLP-P35.4
MAPACHE: MASKED PARALLEL TRANSFORMER FOR ADVANCED SPEECH EDITING AND SYNTHESIS
Guillermo Cámbara, Patrick Lumban Tobing, Mikolaj Babianski, Ravichander Vipperla, Duo Wang, Ron Shmelkin, Giuseppe Coccia, Orazio Angelini, Arnaud Joly, Mateusz Lajszczak, Vincent Pollet, Amazon, Spain
Session: SLP-P35: Text to Speech Generation - P3 Poster
Track: Speech and Language Processing
Location: Poster Zone 3A, Poster Board PZ-3A.4
Presentation Time: Fri, 19 Apr, 08:20 - 10:20 (UTC +9)
Session Co-Chairs: Joon Son Chung, KAIST and Liping Chen, University of Science and Technology of China
Session SLP-P35
SLP-P35.1: TNFORMER: SINGLE-PASS MULTILINGUAL TEXT NORMALIZATION WITH A TRANSFORMER DECODER MODEL
Binbin Shen, Jie Wang, Meng Meng, Yujun Wang, Xiaomi Inc., China
SLP-P35.2: A UNIFIED FRONT-END FRAMEWORK FOR ENGLISH TEXT-TO-SPEECH SYNTHESIS
Zelin Ying, Chen Li, Yu Dong, Qiuqiang Kong, Qiao Tian, Yuanyuan Huo, Yuxuan Wang, ByteDance, China
SLP-P35.3: COLLABORATIVE WATERMARKING FOR ADVERSARIAL SPEECH SYNTHESIS
Lauri Juvela, Aalto University, Finland; Xin Wang, National Institute of Informatics, Japan
SLP-P35.4: MAPACHE: MASKED PARALLEL TRANSFORMER FOR ADVANCED SPEECH EDITING AND SYNTHESIS
Guillermo Cámbara, Patrick Lumban Tobing, Mikolaj Babianski, Ravichander Vipperla, Duo Wang, Ron Shmelkin, Giuseppe Coccia, Orazio Angelini, Arnaud Joly, Mateusz Lajszczak, Vincent Pollet, Amazon, Spain
SLP-P35.5: DIVERSITY BASED CORE-SET SELECTION FOR TEXT-TO-SPEECH WITH LINGUISTIC AND ACOUSTIC FEATURES
Kentaro Seki, Shinnosuke Takamichi, Takaaki Saeki, Hiroshi Saruwatari, The University of Tokyo, Japan
SLP-P35.6: FEWER-TOKEN NEURAL SPEECH CODEC WITH TIME-INVARIANT CODES
Yong Ren, Tao Wang, Jiangyan Yi, Le Xu, Institute of Automation, Chinese Academy of Sciences, China; Jianhua Tao, Department of Automation, Tsinghua University, China; Chu Yuan Zhang, Institute of Automation, Chinese Academy of Sciences, China; Junzuo Zhou, Institute of Automation, Chinese Academy of Sciences, China
SLP-P35.7: CONVNEXT-TTS AND CONVNEXT-VC: CONVNEXT-BASED FAST END-TO-END SEQUENCE-TO-SEQUENCE TEXT-TO-SPEECH AND VOICE CONVERSION
Takuma Okamoto, Yamato Ohtani, National Institute of Information and Communications Technology, Japan; Tomoki Toda, Nagoya University, Japan; Hisashi Kawai, National Institute of Information and Communications Technology, Japan
SLP-P35.8: SYNTHE-SEES: FACE BASED TEXT-TO-SPEECH FOR VIRTUAL SPEAKER
Jae Hyun Park, Joon-Gyu Maeng, Speech AI/NCSOFT, Korea, Republic of; TaeJun Bak, SKT, Korea, Republic of; Young-Sun Joo, Speech AI/NCSOFT, Korea, Republic of
SLP-P35.9: LANGUAGE-ORIENTED COMMUNICATION WITH SEMANTIC CODING AND KNOWLEDGE DISTILLATION FOR TEXT-TO-IMAGE GENERATION
Hyelin Nam, Yonsei University, Korea, Republic of; Jihong Park, Jinho Choi, Deakin University, Australia; Mehdi Bennis, Oulu University, Finland; Seong-Lyun Kim, Yonsei University, Korea, Republic of