SLP-P17.2

PROMPTVC: FLEXIBLE STYLISTIC VOICE CONVERSION IN LATENT SPACE DRIVEN BY NATURAL LANGUAGE PROMPTS

Jixun Yao, Northwestern Polytechnical University, China; Yuguang Yang, Ximalaya Inc, China; Yi Lei, Ziqian Ning, Northwestern Polytechnical University, China; Yanni Hu, Yu Pan, Jingjing Yin, Hongbin Zhou, Heng Lu, Ximalaya Inc, China; Lei Xie, Northwestern Polytechnical University, China

Session:
SLP-P17: Voice Conversion: Singing, accent and emotion Poster

Track:
Speech and Language Processing

Location:
Poster Zone 2B
Poster Board PZ-2B.2

Presentation Time:
Thu, 18 Apr, 08:20 - 10:20 (UTC +9)

Session Chair:
Zack Hodari, Papercup
Session SLP-P17
SLP-P17.1: A STUDY ON COMBINING NON-PARALLEL AND PARALLEL METHODOLOGIES FOR MANDARIN-ENGLISH CROSS-LINGUAL VOICE CONVERSION
Chang Huai You, Minghui Dong, Institute for Infocomm Research, Singapore
SLP-P17.2: PROMPTVC: FLEXIBLE STYLISTIC VOICE CONVERSION IN LATENT SPACE DRIVEN BY NATURAL LANGUAGE PROMPTS
Jixun Yao, Northwestern Polytechnical University, China; Yuguang Yang, Ximalaya Inc, China; Yi Lei, Ziqian Ning, Northwestern Polytechnical University, China; Yanni Hu, Yu Pan, Jingjing Yin, Hongbin Zhou, Heng Lu, Ximalaya Inc, China; Lei Xie, Northwestern Polytechnical University, China
SLP-P17.3: SiFiSinger: A High-Fidelity End-to-End Singing Voice Synthesizer Based on Source-Filter Model
Jianwei Cui, University of Science and Technology of China, China; Yu Gu, Chao Weng, Tencent, China; Jie Zhang, Liping Chen, Lirong Dai, University of Science and Technology of China, China
SLP-P17.4: EXPLOITING AUDIO-VISUAL FEATURES WITH PRETRAINED AV-HUBERT FOR MULTI-MODAL DYSARTHRIC SPEECH RECONSTRUCTION
Xueyuan Chen, Yuejiao Wang, Xixin Wu, The Chinese University of Hong Kong, Hong Kong; Disong Wang, Vocal Engineering Technologies Limited, Hong Kong; Zhiyong Wu, Tsinghua University, China; Xunying Liu, Helen Meng, The Chinese University of Hong Kong, Hong Kong
SLP-P17.5: TRANSFER THE LINGUISTIC REPRESENTATIONS FROM TTS TO ACCENT CONVERSION WITH NON-PARALLEL DATA
Xi Chen, Jiakun Pei, Liumeng Xue, Mingyang Zhang, The Chinese University of Hong Kong, Shenzhen, China
SLP-P17.6: NEURAL CONCATENATIVE SINGING VOICE CONVERSION: RETHINKING CONCATENATION-BASED APPROACH FOR ONE-SHOT SINGING VOICE CONVERSION
Binzhu Sha, Tsinghua University, China; Xu Li, Tencent, China; Zhiyong Wu, Tsinghua University, China; Ying Shan, Tencent, China; Helen Meng, The Chinese University of Hong Kong, China
SLP-P17.7: PAVITS: EXPLORING PROSODY-AWARE VITS FOR END-TO-END EMOTIONAL VOICE CONVERSION
Tianhua Qi, Wenming Zheng, Cheng Lu, Yuan Zong, Hailun Lian, Southeast University, China
SLP-P17.8: VoicePAT: An Efficient Open-source Evaluation Toolkit for Voice Privacy Research
Sarina Meyer, Ngoc Thang Vu, Xiaoxiao Miao, University of Stuttgart, Germany
SLP-P17.9: Streaming ASR encoder for whisper-to-speech online voice conversion
Tseren Andzhukaev, Artem Ivanov, Anastasia Avdeeva, Aleksei Gusev, Fluenta.AI, United States