SLP-L2.6

MIDI-Voice: Expressive Zero-shot Singing Voice Synthesis via MIDI-driven Priors

Dong-Min Byun, Sang-Hoon Lee, Ji-Sang Hwang, Seong-Whan Lee, Korea University, Republic of Korea

Session:
SLP-L2: Voice Conversion I Lecture

Track:
Speech and Language Processing

Location:
Room 201

Presentation Time:
Tue, 16 Apr, 14:50 - 15:10 (UTC +9)

Session Co-Chairs:
Berrak Sisman, The University of Texas at Dallas; Junichi Yamagishi, NII, Japan
Session SLP-L2
SLP-L2.1: FIRNet: Fundamental Frequency Controllable Fast Neural Vocoder with Trainable Finite Impulse Response Filter
Yamato Ohtani, Takuma Okamoto, National Institute of Information and Communications Technology, Japan; Tomoki Toda, Nagoya University, Japan; Hisashi Kawai, National Institute of Information and Communications Technology, Japan
SLP-L2.2: DualVC 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice Conversion
Ziqian Ning, Yuepeng Jiang, Northwestern Polytechnical University, China; Pengcheng Zhu, NetEase Inc., China; Shuai Wang, The Chinese University of Hong Kong, China; Jixun Yao, Lei Xie, Northwestern Polytechnical University, China; Mengxiao Bi, NetEase Inc., China
SLP-L2.3: Unifying One-Shot Voice Conversion and Cloning with Disentangled Speech Representations
Hui Lu, Xixin Wu, Haohan Guo, The Chinese University of Hong Kong, Hong Kong; Songxiang Liu, Tencent, Hong Kong; Zhiyong Wu, Helen Meng, The Chinese University of Hong Kong, Hong Kong
SLP-L2.4: ESVC: Combining Adaptive Style Fusion and Multi-Level Feature Disentanglement for Expressive Singing Voice Conversion
Zeyu Yang, Nanjing University of Posts and Telecommunications, China; Minchuan Chen, Ping An Technology, China; Yanping Li, Nanjing University of Posts and Telecommunications, China; Wei Hu, Shaojun Wang, Jing Xiao, Ping An Technology, China; Zijian Li, Georgia Institute of Technology, United States of America
SLP-L2.5: UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit Normalization
Yuejiao Wang, Xixin Wu, The Chinese University of Hong Kong, Hong Kong; Disong Wang, Vocal Engineering Technologies Limited, Hong Kong; Lingwei Meng, Helen Meng, The Chinese University of Hong Kong, Hong Kong
SLP-L2.6: MIDI-Voice: Expressive Zero-shot Singing Voice Synthesis via MIDI-driven Priors
Dong-Min Byun, Sang-Hoon Lee, Ji-Sang Hwang, Seong-Whan Lee, Korea University, Republic of Korea