IEEE ICASSP 2024 || Seoul, Korea || 14-19 April 2024

GC-L4.1

LIMMITS’24: MULTI-SPEAKER, MULTI-LINGUAL INDIC TTS WITH VOICE CLONING

Abhayjeet Singh, Amala Nagireddi, Deekshitha G, Jesuraja Bandekar, Roopa R, Sandhya Badiger, Sathvik Udupa, Prasanta Kumar Ghosh, Indian Institute of Science, India; Hema A Murthy, Indian Institute of Technology, Madras, India; Pranaw Kumar, Centre for Development of Advanced Computing, India; Keiichi Tokuda, Nagoya Institute of Technology, Japan; Mark Hasegawa-Johnson, University of Illinois, United States of America; Philipp Olbrich, Deutsche Gesellschaft für Internationale Zusammenarbeit (GIZ), India

Session:

GC-L4: LIMMITS'24: Multi-speaker, Multi-lingual Indic TTS with Voice Cloning (Lecture)

Location:

Room 209B

Presentation Time:

Wed, 17 Apr, 13:10 - 13:30 (UTC +9)

Session Co-Chairs:

Sathvik Udupa, Indian Institute of Science (IISc) Bangalore, India and Saurabh Kumar, Indian Institute of Science (IISc) Bangalore, India

Session GC-L4

GC-L4.1: LIMMITS’24: MULTI-SPEAKER, MULTI-LINGUAL INDIC TTS WITH VOICE CLONING

Abhayjeet Singh, Amala Nagireddi, Deekshitha G, Jesuraja Bandekar, Roopa R, Sandhya Badiger, Sathvik Udupa, Prasanta Kumar Ghosh, Indian Institute of Science, India; Hema A Murthy, Indian Institute of Technology, Madras, India; Pranaw Kumar, Centre for Development of Advanced Computing, India; Keiichi Tokuda, Nagoya Institute of Technology, Japan; Mark Hasegawa-Johnson, University of Illinois, United States of America; Philipp Olbrich, Deutsche Gesellschaft für Internationale Zusammenarbeit (GIZ), India

GC-L4.2: LEVERAGING EFFECTIVE LANGUAGE AND SPEAKER CONDITIONING IN INDIC TTS FOR LIMMITS 2024 CHALLENGE

Yejin Jeon, Youngjae Kim, Gary Geunbae Lee, POSTECH, Korea, Republic of

GC-L4.3: SINGLE-STAGE TTS WITH ADAPTED VOCODER AND CROSS-ATTENTION: TALTECH SYSTEMS FOR THE LIMMITS’24 CHALLENGE

Daniil Rõbnikov, Tanel Alumäe, Tallinn University of Technology, Estonia

GC-L4.4: SCALING NVIDIA's MULTI-SPEAKER MULTI-LINGUAL TTS SYSTEMS WITH ZERO-SHOT TTS TO INDIC LANGUAGES

Akshit Arora, Rohan Badlani, Sungwon Kim, Rafael Valle, Bryan Catanzaro, NVIDIA, United States of America

GC-L4.5: THE THU-HCSI MULTI-SPEAKER MULTI-LINGUAL FEW-SHOT VOICE CLONING SYSTEM FOR LIMMITS’24 CHALLENGE

Yixuan Zhou, Shuoyi Zhou, Shun Lei, Zhiyong Wu, Tsinghua University, China; Menglin Wu, ByteDance, China

GC-L4.6: CROSS-LINGUAL TEXT-TO-SPEECH VIA HIERARCHICAL STYLE TRANSFER

Sang-Hoon Lee, Ha-Yeong Choi, Seong-Whan Lee, Korea University, Korea, Republic of


Last updated 11 April 2024.
