MMSP-P20.9
SEEING IS BELIEVING: COMPREHENSIVE SELF-REFLECTIVE EVALUATION SYSTEM FOR LARGE MULTI-MODAL MODELS
Guocheng Hu, Chaoqun Zheng, Hongjiao Guan, Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences); Shandong Provincial Key Laboratory of Computing Power Internet and Service Computing, Shandong Fundamental Research Center for Computer Science, China; Hui Cui, Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences); Shandong Provincial Key Laboratory of Industrial Network and Information System Security, Shandong Fundamental Research Center for Computer Science, China; Shiwei Wu, Evay Info Co., Ltd., China; Wenpeng Lu, Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences); Shandong Provincial Key Laboratory of Computing Power Internet and Service Computing, Shandong Fundamental Research Center for Computer Science, China
Session: MMSP-P20: Efficient Multimodal Large Language Models and Evaluation Poster
Track: Multimedia Signal Processing [MM]
Location: Poster Area 19
Presentation Time: Thu, 7 May, 16:30 - 18:30
Session MMSP-P20
MMSP-P20.1: iMathBench: Is Your Multi-modal Large Language Model Ready to Solve Mathematical Problems Embedded in Images?
Junhao Guo, Xinyi Jiang, Guoming Wang, Zhejiang University, China; Rongxing Lu, Queen's University, Canada; Siliang Tang, Zhejiang University, China
MMSP-P20.2: Trajectory-Enhanced Camera Motion Understanding for Multimodal Large Language Models
Yuanxin Liu, Sida Li, Kun Ouyang, Shicheng Li, Linli Yao, Xu Sun, Peking University, China; Weike Jin, Huawei Technologies Co., Ltd, China
MMSP-P20.3: PAR: Prompt-Aware Token Reduction Method for Efficient Large Multimodal Models
Yingen Liu, Fan Wu, Ruihui Li, Zhuo Tang, Kenli Li, Hunan University, China
MMSP-P20.4: ENRICH VISUAL FEATURES BY HOLISTIC SAMPLING AND HIERARCHICAL CONDENSING IN MULTIMODAL LARGE LANGUAGE MODELS
Yuting Bai, Harbin Institute of Technology, China; Suiwu Bai, Jianli Ran, B-AI Lab, China; Tonghua Su, Harbin Institute of Technology, China; Zixing Bai, Fudan University, China
MMSP-P20.5: MotionFusion: Fusing Motion and Saliency for Fast Video Large Language Model Inference
Chenxi Du, Ju Ren, Yaoxue Zhang, Tsinghua University, China
MMSP-P20.6: M2FNET: MULTI-LEVEL MODALITY-FUSED NETWORK FOR ROBUST FINGERPRINT AND FINGER VEIN RECOGNITION
Wenyang Miao, Xionghan Zhao, Hengyi Ren, Xing Li, Jinting Ren, Nanjing Forestry University, China
MMSP-P20.7: CHAIN-OF-CAPTION: TRAINING-FREE IMPROVEMENT OF MULTIMODAL LARGE LANGUAGE MODEL ON REFERRING EXPRESSION COMPREHENSION
Yik Lung Pang, Changjae Oh, Queen Mary University of London, United Kingdom of Great Britain and Northern Ireland
MMSP-P20.8: LaPrune: Layout-Aware Pruning for Efficient Multimodal Large Language Models
Hao Wu, Ke Lu, Xiuyuan Zhu, Yuqiu Li, Jian Xue, University of Chinese Academy of Sciences, China; Yi Liu, State Key Laboratory of Communication Content Cognition, Beijing, China
MMSP-P20.9: SEEING IS BELIEVING: COMPREHENSIVE SELF-REFLECTIVE EVALUATION SYSTEM FOR LARGE MULTI-MODAL MODELS
Guocheng Hu, Chaoqun Zheng, Hongjiao Guan, Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences); Shandong Provincial Key Laboratory of Computing Power Internet and Service Computing, Shandong Fundamental Research Center for Computer Science, China; Hui Cui, Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences); Shandong Provincial Key Laboratory of Industrial Network and Information System Security, Shandong Fundamental Research Center for Computer Science, China; Shiwei Wu, Evay Info Co., Ltd., China; Wenpeng Lu, Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences); Shandong Provincial Key Laboratory of Computing Power Internet and Service Computing, Shandong Fundamental Research Center for Computer Science, China
MMSP-P20.10: SVCF: ENABLING ZERO-SHOT CORRECTION OF REASONING STEPS IN MULTI-MODAL LARGE LANGUAGE MODELS
Boyang Jiang, University of Electronic Science and Technology of China, China; Huang Tianxi, School of Humanities and General Education, Chengdu Textile College, China; Yu Yang, Yue Zhang, Guiduo Duan, Laboratory of Intelligent Collaborative Computing, University of Electronic Science and Technology of China, China; Tao He, University of Electronic Science and Technology of China, China