MLSP-L10.3
AV-SUPERB: A MULTI-TASK EVALUATION BENCHMARK FOR AUDIO-VISUAL REPRESENTATION MODELS
Yuan Tseng, National Taiwan University, Taiwan; Layne Berry, The University of Texas at Austin, United States of America; Yi-Ting Chen, Academia Sinica, Taiwan; I-Hsiang Chiu, Hsuan-Hao Lin, Max Liu, National Taiwan University, Taiwan; Puyuan Peng, The University of Texas at Austin, United States of America; Yi-Jen Shih, The University of Texas at Austin, United States of America; Hung-Yu Wang, Haibin Wu, National Taiwan University, Taiwan; Po-Yao Huang, Meta AI, United States of America; Chun-Mao Lai, National Taiwan University, Taiwan; Shang-Wen Li, Meta AI, United States of America; David Harwath, The University of Texas at Austin, United States of America; Yu Tsao, Academia Sinica, Taiwan; Abdelrahman Mohamed, Rembrand, United States of America; Chi Luen Feng, Hung-yi Lee, National Taiwan University, Taiwan
Session: MLSP-L10: Learning from Multimodal Data I Lecture
Track: Machine Learning for Signal Processing
Location: Room E2
Presentation Time: Wed, 17 Apr, 13:50 - 14:10 (UTC +9)
Session Co-Chairs: Leibny Paola Garcia, Johns Hopkins University and Yang Liu, Meta
Session MLSP-L10
MLSP-L10.1: IMAGE RETRIEVAL WITH COMPOSED QUERY BY MULTI-SCALE MULTI-MODAL FUSION
Zelong Sun, Guoxing Yang, Zhiwu Lu, Renmin University of China, China; Hao Jiang, Guojie Zhu, Zhao Cao, Huawei Poisson Lab, China
MLSP-L10.2: BEYOND EMPIRICAL WINDOWING: AN ATTENTION-BASED APPROACH FOR TRUST PREDICTION IN AUTONOMOUS VEHICLES
Minxue Niu, University of Michigan, United States of America; Zhaobo Zheng, Honda Research Institute USA, Inc., United States of America; Kumar Akash, Honda Research Institute USA, Inc., United States of America; Teruhisa Misu, Honda Research Institute USA, Inc., United States of America
MLSP-L10.3: AV-SUPERB: A MULTI-TASK EVALUATION BENCHMARK FOR AUDIO-VISUAL REPRESENTATION MODELS
Yuan Tseng, National Taiwan University, Taiwan; Layne Berry, The University of Texas at Austin, United States of America; Yi-Ting Chen, Academia Sinica, Taiwan; I-Hsiang Chiu, Hsuan-Hao Lin, Max Liu, National Taiwan University, Taiwan; Puyuan Peng, The University of Texas at Austin, United States of America; Yi-Jen Shih, The University of Texas at Austin, United States of America; Hung-Yu Wang, Haibin Wu, National Taiwan University, Taiwan; Po-Yao Huang, Meta AI, United States of America; Chun-Mao Lai, National Taiwan University, Taiwan; Shang-Wen Li, Meta AI, United States of America; David Harwath, The University of Texas at Austin, United States of America; Yu Tsao, Academia Sinica, Taiwan; Abdelrahman Mohamed, Rembrand, United States of America; Chi Luen Feng, Hung-yi Lee, National Taiwan University, Taiwan
MLSP-L10.4: INCOMPLETE MULTI-VIEW REPRESENTATION LEARNING THROUGH ANCHOR GRAPH-BASED GCN AND INFORMATION BOTTLENECK
Zhenjiao Liu, Xiao Wang, Dalian University of Technology, China; Xiaodi Huang, Charles Sturt University, Australia; Guanlin Li, Institut Polytechnique de Paris, France; Ke Sun, Zhikui Chen, Dalian University of Technology, China
MLSP-L10.5: LANGUAGE-GUIDED FEW-SHOT SEMANTIC SEGMENTATION
Jing Wang, Alibaba Group, China; Yuang Liu, East China Normal University, China; Qiang Zhou, Fan Wang, Alibaba Group, China
MLSP-L10.6: SWEEPMM: A HIGH-QUALITY MULTIMODAL DATASET FOR SWEEPING ROBOTS IN HOME SCENARIOS FOR VISION-LANGUAGE MODEL
Weichen Xu, Xinxin Xu, Tianhao Fu, Jian Cao, Xiaoyang Xu, Yuetian Huang, Xixin Cao, Xing Zhang, Peking University, China