SPE-P21: Voice Conversion |
Session Type: Poster |
Time: Friday, 8 May, 15:15 - 17:15 |
Location: On-Demand |
Session Chairs: Xunying Liu, Chinese University of Hong Kong and Greg Sell, Johns Hopkins University |
|
SPE-P21.1: ONE-SHOT VOICE CONVERSION USING STAR-GAN |
Ruobai Wang; Netease Inc. |
Yu Ding; Netease Inc. |
Lincheng Li; Netease Inc. |
Changjie Fan; Netease Inc. |
|
SPE-P21.2: ONE-SHOT VOICE CONVERSION BY VECTOR QUANTIZATION |
Da-Yi Wu; National Taiwan University |
Hung-yi Lee; National Taiwan University |
|
SPE-P21.3: NEUTRAL TO LOMBARD SPEECH CONVERSION WITH DEEP LEARNING |
Enguerrand Gentet; Groupe PSA |
Bertrand David; LTCI, Télécom Paris, Institut Polytechnique de Paris |
Sébastien Denjean; Groupe PSA |
Gaël Richard; LTCI, Télécom Paris, Institut Polytechnique de Paris |
Vincent Roussarie; Groupe PSA |
|
SPE-P21.4: END-TO-END VOICE CONVERSION VIA CROSS-MODAL KNOWLEDGE DISTILLATION FOR DYSARTHRIC SPEECH RECONSTRUCTION |
Disong Wang; Chinese University of Hong Kong |
Jianwei Yu; Chinese University of Hong Kong |
Xixin Wu; Chinese University of Hong Kong |
Songxiang Liu; Chinese University of Hong Kong |
Lifa Sun; SpeechX Limited |
Xunying Liu; Chinese University of Hong Kong |
Helen Meng; Chinese University of Hong Kong |
|
SPE-P21.5: PITCHNET: UNSUPERVISED SINGING VOICE CONVERSION WITH PITCH ADVERSARIAL NETWORK |
Chengqi Deng; Zhejiang University |
Chengzhu Yu; Tencent |
Heng Lu; Tencent |
Chao Weng; Tencent |
Dong Yu; Tencent |
|
SPE-P21.6: AN IMPROVED FRAME-UNIT-SELECTION BASED VOICE CONVERSION SYSTEM WITHOUT PARALLEL TRAINING DATA |
Feng-Long Xie; Tencent |
Xin-Hui Li; Tencent |
Bo Liu; Tencent |
Yi-Bin Zheng; Tencent |
Li Meng; Tencent |
Li Lu; Tencent |
Frank K. Soong; Microsoft Research Asia |
|
SPE-P21.7: VOICE CONVERSION WITH TRANSFORMER NETWORK |
Ruolan Liu; Samsung Research China-Beijing |
Xiao Chen; Samsung Research China-Beijing |
Xue Wen; Samsung Research China-Beijing |
|
SPE-P21.8: MSPEC-NET: MULTI-DOMAIN SPEECH CONVERSION NETWORK |
Harshit Malaviya; Dhirubhai Ambani Institute of Information and Communication Technology |
Jui Shah; Dhirubhai Ambani Institute of Information and Communication Technology |
Maitreya Patel; Dhirubhai Ambani Institute of Information and Communication Technology |
Jalansh Munshi; Dhirubhai Ambani Institute of Information and Communication Technology |
Hemant Patil; Dhirubhai Ambani Institute of Information and Communication Technology |
|
SPE-P21.9: MULTI-SPEAKER AND MULTI-DOMAIN EMOTIONAL VOICE CONVERSION USING FACTORIZED HIERARCHICAL VARIATIONAL AUTOENCODER |
Mohamed Elgaar; Humelo Inc. and Korea Advanced Institute of Science and Technology |
Jung Bae Park; Humelo Inc. and Korea Advanced Institute of Science and Technology |
Sang Wan Lee; Humelo Inc. and Korea Advanced Institute of Science and Technology, KAIST Institute for Artificial Intelligence, KAIST Center for Neuroscience-inspired Artificial Intelligence |
|
SPE-P21.10: EMOTIONAL VOICE CONVERSION USING MULTITASK LEARNING WITH TEXT-TO-SPEECH |
Tae-Ho Kim; Korea Advanced Institute of Science and Technology (KAIST) |
Sungjae Cho; Korea Advanced Institute of Science and Technology (KAIST) |
Shinkook Choi; Korea Advanced Institute of Science and Technology (KAIST) |
Sejik Park; Korea Advanced Institute of Science and Technology (KAIST) |
Soo-Young Lee; Korea Advanced Institute of Science and Technology (KAIST) |
|
SPE-P21.11: EFFECTIVE WAVENET ADAPTATION FOR VOICE CONVERSION WITH LIMITED DATA |
Hongqiang Du; Northwestern Polytechnical University |
Xiaohai Tian; National University of Singapore |
Lei Xie; Northwestern Polytechnical University |
Haizhou Li; National University of Singapore |
|
SPE-P21.12: LIFTER TRAINING AND SUB-BAND MODELING FOR COMPUTATIONALLY EFFICIENT AND HIGH-QUALITY VOICE CONVERSION USING SPECTRAL DIFFERENTIALS |
Takaaki Saeki; University of Tokyo |
Yuki Saito; University of Tokyo |
Shinnosuke Takamichi; University of Tokyo |
Hiroshi Saruwatari; University of Tokyo |
|