SPE-P19: Machine Learning for Speech Synthesis III |
Session Type: Poster |
Time: Friday, 8 May, 11:45 - 13:45 |
Location: On-Demand |
Session Chairs: Yu Zhang, Google and Mounya Elhilali, Johns Hopkins University
|
|
SPE-P19.1: END-TO-END CODE-SWITCHING TTS WITH CROSS-LINGUAL LANGUAGE MODEL |
Xuehao Zhou; National University of Singapore |
Xiaohai Tian; National University of Singapore |
Grandee Lee; National University of Singapore |
Rohan Kumar Das; National University of Singapore |
Haizhou Li; National University of Singapore |
|
SPE-P19.2: CODE-SWITCHED SPEECH SYNTHESIS USING BILINGUAL PHONETIC POSTERIORGRAM WITH ONLY MONOLINGUAL CORPORA |
Yuewen Cao; Chinese University of Hong Kong |
Songxiang Liu; Chinese University of Hong Kong |
Xixin Wu; Chinese University of Hong Kong |
Shiyin Kang; Tencent |
Peng Liu; Tencent |
Zhiyong Wu; Tsinghua University |
Xunying Liu; Chinese University of Hong Kong |
Dan Su; Tencent |
Dong Yu; Tencent |
Helen Meng; Chinese University of Hong Kong |
|
SPE-P19.3: GENERATING MULTILINGUAL VOICES USING SPEAKER SPACE TRANSLATION BASED ON BILINGUAL SPEAKER DATA |
Soumi Maiti; City University of New York |
Erik Marchi; Apple |
Alistair Conkie; Apple |
|
SPE-P19.4: SPEAKER ADAPTATION OF A MULTILINGUAL ACOUSTIC MODEL FOR CROSS-LANGUAGE SYNTHESIS |
Ivan Himawan; ObEN |
Sandesh Aryal; ObEN |
Iris Ouyang; ObEN |
Sam Kang; ObEN |
Pierre Lanchantin; ObEN |
Simon King; University of Edinburgh |
|
SPE-P19.5: SEMI-SUPERVISED SPEAKER ADAPTATION FOR END-TO-END SPEECH SYNTHESIS WITH PRETRAINED MODELS |
Katsuki Inoue; Okayama University |
Sunao Hara; Okayama University |
Masanobu Abe; Okayama University |
Tomoki Hayashi; Nagoya University |
Ryuichi Yamamoto; LINE Corporation |
Shinji Watanabe; Johns Hopkins University |
|
SPE-P19.6: BOFFIN TTS: FEW-SHOT SPEAKER ADAPTATION BY BAYESIAN OPTIMIZATION |
Henry Moss; Lancaster University |
Vatsal Aggarwal; Amazon, Inc. |
Nishant Prateek; Amazon, Inc. |
Javier Gonzalez; Amazon, Inc. |
Roberto Barra-Chicote; Amazon, Inc. |
|
SPE-P19.7: SEMI-SUPERVISED LEARNING BASED ON HIERARCHICAL GENERATIVE MODELS FOR END-TO-END SPEECH SYNTHESIS |
Takato Fujimoto; Nagoya Institute of Technology |
Shinji Takaki; Nagoya Institute of Technology |
Kei Hashimoto; Nagoya Institute of Technology |
Keiichiro Oura; Nagoya Institute of Technology |
Yoshihiko Nankaku; Nagoya Institute of Technology |
Keiichi Tokuda; Nagoya Institute of Technology |
|
SPE-P19.8: BREATHING AND SPEECH PLANNING IN SPONTANEOUS SPEECH SYNTHESIS |
Éva Székely; KTH Royal Institute of Technology |
Gustav Eje Henter; KTH Royal Institute of Technology |
Jonas Beskow; KTH Royal Institute of Technology |
Joakim Gustafson; KTH Royal Institute of Technology |
|
SPE-P19.9: ESPNET-TTS: UNIFIED, REPRODUCIBLE, AND INTEGRATABLE OPEN SOURCE END-TO-END TEXT-TO-SPEECH TOOLKIT |
Tomoki Hayashi; Nagoya University |
Ryuichi Yamamoto; LINE Corporation |
Katsuki Inoue; Okayama University |
Takenori Yoshimura; Nagoya University |
Shinji Watanabe; Johns Hopkins University |
Tomoki Toda; Nagoya University |
Kazuya Takeda; Nagoya University |
Yu Zhang; Google AI |
Xu Tan; Microsoft Research Asia |
|
SPE-P19.10: EXTRACTING UNIT EMBEDDINGS USING SEQUENCE-TO-SEQUENCE ACOUSTIC MODELS FOR UNIT SELECTION SPEECH SYNTHESIS |
Xiao Zhou; University of Science and Technology of China |
Zhen-Hua Ling; University of Science and Technology of China |
Li-Rong Dai; University of Science and Technology of China |
|
SPE-P19.11: AUDIO-ASSISTED IMAGE INPAINTING FOR TALKING FACES |
Alexandros Koumparoulis; University of Thessaly |
Gerasimos Potamianos; University of Thessaly |
Samuel Thomas; IBM |
Edmilson da Silva Morais; IBM |
|