MMSP-L5: Multimodal Processing: Vision + Language 2
Thu, 18 Apr, 16:30 - 18:30 (UTC +9)
Location: Room 101
Session Type: Lecture
Session Co-Chairs: Gene Cheung, York University, Canada and Wei Hu, Peking University, China
Track: Multimedia Signal Processing
Thu, 18 Apr, 16:30 - 16:50 (UTC +9)
 

MMSP-L5.1: LARGE LANGUAGE MODELS AUGMENTED RATING PREDICTION IN RECOMMENDER SYSTEM

Sichun Luo, City University of Hong Kong, Hong Kong; Jiansheng Wang, Northwest A&F University, China; Aojun Zhou, The Chinese University of Hong Kong, Hong Kong; Li Ma, Tsinghua University, China; Linqi Song, City University of Hong Kong, Hong Kong
Thu, 18 Apr, 16:50 - 17:10 (UTC +9)
 

MMSP-L5.2: PROMPTING LARGE LANGUAGE MODELS WITH FINE-GRAINED VISUAL RELATIONS FROM SCENE GRAPH FOR VISUAL QUESTION ANSWERING

Jiapeng Liu, Chengyang Fang, Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences, China; Liang Li, Bing Li, Institute of Information Engineering, Chinese Academy of Sciences, China; Dayong Hu, Heilongjiang Network Space Research Center, China; Can Ma, Institute of Information Engineering, Chinese Academy of Sciences, China
Thu, 18 Apr, 17:10 - 17:30 (UTC +9)
 

MMSP-L5.3: Learning Density Regulated and Multi-view Consistent Unsigned Distance Fields

Rui Zhang, Jingyi Xu, Weidong Yang, Lipeng Ma, Fudan University, China; Menglong Chen, SimpleHPC LTD, China; Ben Fei, Fudan University, China
Thu, 18 Apr, 17:30 - 17:50 (UTC +9)
 

MMSP-L5.4: Graph-based Environment Representation for Vision-and-Language Navigation in Continuous Environments

Ting Wang, Zhejiang University, China; Zongkai Wu, Westlake University, China; Feiyu Yao, 2012 Lab, Huawei Technologies Co., Ltd, China; Donglin Wang, Westlake University, China
Thu, 18 Apr, 17:50 - 18:10 (UTC +9)
 

MMSP-L5.5: HUMAN MOTION CAPTURE DATA SEGMENTATION BASED ON ST-GCN

Xiuyun Ma, Na Lv, University of Jinan, China
Thu, 18 Apr, 18:10 - 18:30 (UTC +9)
 

MMSP-L5.6: DOES VIDEO SUMMARIZATION REQUIRE VIDEOS? QUANTIFYING THE EFFECTIVENESS OF LANGUAGE IN VIDEO SUMMARIZATION

Yoonsoo Nam, Adam Lehavi, Daniel Yang, Digbalay Bose, Swabha Swayamdipta, Shrikanth Narayanan, University of Southern California, United States of America