Paper ID | F-1-1.2 |
Paper Title |
ACOUSTIC AND TEXTUAL DATA AUGMENTATION FOR CODE-SWITCHING SPEECH RECOGNITION IN UNDER-RESOURCED LANGUAGE |
Authors |
I-Ting Hsieh, Chung-Hsien Wu, Chun-Huang Wang, National Cheng Kung University, Taiwan |
Session |
F-1-1: Emotion, Dialect, and Age Recognition |
Time | Tuesday, 08 December, 12:30 - 14:00 |
Presentation Time: | Tuesday, 08 December, 12:45 - 13:00 |
All times are in New Zealand Time (UTC +13) |
Topic |
Speech, Language, and Audio (SLA): |
Abstract |
Under-resourced and code-switching speech recognition have recently attracted research interest, resulting in several robust acoustic modeling and language modeling approaches. As Taiwanese and Mandarin are both widely used in Taiwan, and speakers frequently switch between them, this paper addresses the under-resourced and code-switching issues for this language pair. First, phone sharing between Taiwanese and Mandarin is employed for acoustic data augmentation to construct the acoustic models of the Taiwanese speech recognizer. To compensate for the lack of a Taiwanese text corpus, a Mandarin corpus is translated into a Taiwanese corpus by word-to-word translation. Moreover, additional translation rules for code-switching text are manually designed. The augmented text corpus is then used to train the code-switching language models. In the evaluation, the word error rate for code-switching speech recognition was 26.02%, which was better than that of the system trained on the pure Taiwanese corpus. |
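
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how word error rate (WER) is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the listing does not state which scoring tool or tokenization the authors used for mixed Taiwanese and Mandarin text.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Assumption: both transcripts are already tokenized by whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder tokens: one substitution and one deletion over four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions are counted, WER can exceed 100% when the hypothesis is much longer than the reference. In a corpus-level evaluation the edit counts are normally summed over all utterances before dividing by the total number of reference words, rather than averaging per-utterance rates.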