SLP-P29.3
BRIDGING THE GAPS OF BOTH MODALITY AND LANGUAGE: SYNCHRONOUS BILINGUAL CTC FOR SPEECH TRANSLATION AND SPEECH RECOGNITION
Chen Xu, Harbin Engineering University, China; Xiaoqian Liu, Erfeng He, Yuhao Zhang, Northeastern University, China; Qianqian Dong, ByteDance, China; Tong Xiao, Jingbo Zhu, Northeastern University, China; Dapeng Man, Wu Yang, Harbin Engineering University, China
Session:
SLP-P29: Multimodal processing of speech Poster
Track:
Speech and Language Processing
Location:
Poster Zone 2A
Poster Board PZ-2A.3
Presentation Time:
Thu, 18 Apr, 16:30 - 18:30 (UTC +9)
Session Chair:
Kartik Audhkhasi, Google
Session SLP-P29
SLP-P29.1: MULTIMODAL SENTIMENT ANALYSIS BASED ON 3D STEREOSCOPIC ATTENTION
Jian Huang, Yuanyuan Pu, Dongming Zhou, Hang Shi, Zhengpeng Zhao, Dan Xu, Yunnan University, China; Jinde Cao, Southeast University, China
SLP-P29.2: CROSS-MODAL PARALLEL TRAINING FOR IMPROVING END-TO-END ACCENTED SPEECH RECOGNITION
Renchang Dong, Shanghai Normal University, China; Yijie Li, Dongxing Xu, Unisound AI Technology Co., Ltd., China; Yanhua Long, Shanghai Normal University, China
SLP-P29.3: BRIDGING THE GAPS OF BOTH MODALITY AND LANGUAGE: SYNCHRONOUS BILINGUAL CTC FOR SPEECH TRANSLATION AND SPEECH RECOGNITION
Chen Xu, Harbin Engineering University, China; Xiaoqian Liu, Erfeng He, Yuhao Zhang, Northeastern University, China; Qianqian Dong, ByteDance, China; Tong Xiao, Jingbo Zhu, Northeastern University, China; Dapeng Man, Wu Yang, Harbin Engineering University, China
SLP-P29.4: ZERO-SHOT INTENT CLASSIFICATION USING A SEMANTIC SIMILARITY AWARE CONTRASTIVE LOSS AND LARGE LANGUAGE MODEL
Jaejin Cho, Rakshith Srinivasa, Ching-Hua Lee, Yashas Saidutta, Chouchang Yang, Yilin Shen, Hongxia Jin, Samsung Research America, United States of America
SLP-P29.5: VOXMM: RICH TRANSCRIPTION OF CONVERSATIONS IN THE WILD
Doyeop Kwak, Jaemin Jung, Kihyun Nam, Youngjoon Jang, Korea Advanced Institute of Science and Technology, Korea, Republic of; Jee-weon Jung, Shinji Watanabe, Carnegie Mellon University, United States of America; Joon Son Chung, Korea Advanced Institute of Science and Technology, Korea, Republic of
SLP-P29.6: MULTISCALE MATCHING DRIVEN BY CROSS-MODAL SIMILARITY CONSISTENCY FOR AUDIO-TEXT RETRIEVAL
Qian Wang, Jia-Chen Gu, Zhen-Hua Ling, University of Science and Technology of China, China
SLP-P29.7: INVESTIGATING THE CLUSTERS DISCOVERED BY PRE-TRAINED AV-HUBERT
Anja Virkkunen, Aalto University, Finland; Marek Sarvaš, Brno University of Technology, Czechia; Guangpu Huang, Tamas Grosz, Mikko Kurimo, Aalto University, Finland