E-2-3: Speech Recognition |
All times are in New Zealand Time (UTC +13) |
| Presentation Time: Wednesday, December 9, 17:15 - 19:15 |
| E-2-3.1: PRIVACY PRESERVING ACOUSTIC MODEL TRAINING FOR SPEECH RECOGNITION |
| Yuuki Tachioka; Denso IT Laboratory |
| E-2-3.2: END-TO-END AUTOMATIC SPEECH RECOGNITION WITH DEEP MUTUAL LEARNING |
| Ryo Masumura; NTT Corporation |
| Mana Ihori; NTT Corporation |
| Akihiko Takashima; NTT Corporation |
| Tomohiro Tanaka; NTT Corporation |
| Takanori Ashihara; NTT Corporation |
| E-2-3.3: ATTENTIVE FUSION ENHANCED AUDIO-VISUAL ENCODING FOR TRANSFORMER BASED ROBUST SPEECH RECOGNITION |
| Liangfa Wei; University of Science and Technology of China |
| Jie Zhang; University of Science and Technology of China |
| Junfeng Hou; University of Science and Technology of China |
| Lirong Dai; University of Science and Technology of China |
| E-2-3.4: QUERY-BY-EXAMPLE SPOKEN TERM DETECTION USING GENERATIVE ADVERSARIAL NETWORK |
| Neil Shah; Dhirubhai Ambani Institute of Information and Communication Technology |
| Sreeraj R; Dhirubhai Ambani Institute of Information and Communication Technology |
| Maulik Madhavi; National University of Singapore |
| Nirmesh Shah; Dhirubhai Ambani Institute of Information and Communication Technology |
| Hemant Patil; Dhirubhai Ambani Institute of Information and Communication Technology |
| E-2-3.5: REDUCTION OF SPEECH DATA POSTERIORGRAMS BY COMPRESSING MAXIMUM-LIKELIHOOD STATE SEQUENCES IN QUERY BY EXAMPLE |
| Takashi Yokota; Iwate Prefectural University |
| Kazunori Kojima; Iwate Prefectural University |
| Shi-wook Lee; National Institute of Advanced Industrial Science and Technology |
| Yoshiaki Itoh; Iwate Prefectural University |
| E-2-3.6: EFFECTS OF END-TO-END ASR AND SCORE FUSION MODEL LEARNING FOR IMPROVED QUERY-BY-EXAMPLE SPOKEN TERM DETECTION |
| Takumi Kurokawa; Shizuoka University |
| Atsuhiko Kai; Shizuoka University |
| Hiroki Kondo; Shizuoka University |