MLSP-P4.9

OLKAVS: AN OPEN LARGE-SCALE KOREAN AUDIO-VISUAL SPEECH DATASET

Jeongkyun Park, Jung-Wook Hwang, Sogang University, Korea, Republic of; Kwanghee Choi, Carnegie Mellon University, United States of America; Seung-Hyeon Lee, Sogang University, Korea, Republic of; Jun Hwan Ahn, Mindslab, Korea, Republic of; Rae-Hong Park, Hyung-Min Park, Sogang University, Korea, Republic of

Session:
MLSP-P4: Learning from Multimodal Data II Poster

Track:
Machine Learning for Signal Processing

Location:
Poster Zone 4C
Poster Board PZ-4C.9

Presentation Time:
Tue, 16 Apr, 13:10 - 15:10 (UTC +9)

Session Chair:
Zhiyong Wu, Tsinghua University
Session MLSP-P4
MLSP-P4.1: SYNCHFORMER: EFFICIENT SYNCHRONIZATION FROM SPARSE CUES
Vladimir Iashin, Tampere University, Finland; Weidi Xie, Shanghai Jiao Tong University, China; Esa Rahtu, Tampere University, Finland; Andrew Zisserman, University of Oxford, United Kingdom of Great Britain and Northern Ireland
MLSP-P4.2: JOINT CLASSIFICATION OF HYPERSPECTRAL AND LIDAR DATA USING CROSS-MODAL HIERARCHICAL FREQUENCY FUSION NETWORK
Zheng Zeng, Tiecheng Song, Xinran Ma, Yinghao Jiu, School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, China; Huaiyi Sun, Chongqing Academy of Science and Technology, China
MLSP-P4.3: MODALITY RE-BALANCE FOR VISUAL QUESTION ANSWERING: A CAUSAL FRAMEWORK
Xinpeng Lv, Wanrong Huang, Haotian Wang, Ruochun Jin, Xueqiong Li, Zhipeng Lin, Shuman Li, Yongquan Feng, Yuhua Tang, State Key Laboratory of High Performance Computing (HPCL), National University of Defense Technology, China
MLSP-P4.4: A FINE-GRAINED TRI-MODAL INTERACTION MODEL FOR MULTIMODAL SENTIMENT ANALYSIS
Yuxing Zhi, Junhuai Li, Huaijun Wang, Jing Chen, Ting Cao, Xi'an University of Technology, China
MLSP-P4.5: ENHANCING AUDIO-VISUAL QUESTION ANSWERING WITH MISSING MODALITY VIA TRANS-MODAL ASSOCIATIVE LEARNING
Kyu Ri Park, Youngmin Oh, Jung Uk Kim, Kyung Hee University, Korea, Republic of
MLSP-P4.6: MULTIMODAL TRANSFORMER WITH A LOW-COMPUTATIONAL-COST GUARANTEE
Sungjin Park, Edward Choi, Korea Advanced Institute of Science and Technology, Korea, Republic of
MLSP-P4.7: VISION-SENSOR ATTENTION BASED CONTINUAL MULTIMODAL EGOCENTRIC ACTIVITY RECOGNITION
Shaoxu Cheng, Chiyuan He, Kailong Chen, Linfeng Xu, Hongliang Li, Fanman Meng, Qingbo Wu, University of Electronic Science and Technology of China, China
MLSP-P4.8: MULTI-VIEW SUBSPACE CLUSTERING WITH CONSENSUS GRAPH CONTRASTIVE LEARNING
Jie Zhang, Yuan Sun, Yu Guo, Xi’an Jiaotong University, China; Zheng Wang, Feiping Nie, Northwestern Polytechnical University, China; Fei Wang, Xi’an Jiaotong University, China
MLSP-P4.9: OLKAVS: AN OPEN LARGE-SCALE KOREAN AUDIO-VISUAL SPEECH DATASET
Jeongkyun Park, Jung-Wook Hwang, Sogang University, Korea, Republic of; Kwanghee Choi, Carnegie Mellon University, United States of America; Seung-Hyeon Lee, Sogang University, Korea, Republic of; Jun Hwan Ahn, Mindslab, Korea, Republic of; Rae-Hong Park, Hyung-Min Park, Sogang University, Korea, Republic of
MLSP-P4.10: MULTI-LEVEL CONTRASTIVE LEARNING FOR HYBRID CROSS-MODAL RETRIEVAL
Yiming Zhao, Haoyu Lu, Renmin University of China, China; Shiqi Zhao, Haoran Wu, China Unicom Research Institute, China; Zhiwu Lu, Renmin University of China, China
MLSP-P4.11: RADEMACHER COMPLEXITY REGULARIZATION FOR CORRELATION-BASED MULTIVIEW REPRESENTATION LEARNING
Maurice Kuschel, Tanuj Hasija, Paderborn University, Germany; Timothy Marrinan, Pacific Northwest National Laboratory, United States of America
MLSP-P4.12: NAC: MITIGATING NOISY CORRESPONDENCE IN CROSS-MODAL MATCHING VIA NEIGHBOR AUXILIARY CORRECTOR
Yuqing Li, Tsinghua University, China; Haoming Huang, University of Chinese Academy of Sciences, China; Jian Xu, Shaolun Huang, Tsinghua University, China