MO4.L5: Multi-modal Agents for Visual Analysis and Generation
Special Session
Mon, 15 Sep, 16:30 - 18:00 Anchorage Time (UTC -8)
Location: Dena'ina: Tubughnenq 3
Session Type: Lecture
Session Co-Chairs: Qi Wang, Guizhou University and Wu Liu, University of Science and Technology of China
Track: Special Sessions
Mon, 15 Sep, 16:30 - 16:45 Anchorage Time (UTC -8)
 

MO4.L5.1: MODALITY-AWARE DIFFUSION DISTILLATION NETWORK FOR SENTIMENT ANALYSIS IN MISSING MODALITIES

Zhiyu Liang, Yutian Li, Maolin Li, Guangdong University of Technology, China; Lap-Kei Lee, Fu Lee Wang, Hong Kong Metropolitan University, China; Zhenguo Yang, Guangdong University of Technology, China
Mon, 15 Sep, 16:45 - 17:00 Anchorage Time (UTC -8)
 

MO4.L5.2: Segment-Attention Augmented Dual-Contrastive Aggregation Learning for Unsupervised Visible-Infrared Person Re-identification

Liyun Liu, Luokun He, Wenjie Qian, Xiao Wang, Wei Wang, Wuhan University of Science and Technology, China
Mon, 15 Sep, 17:00 - 17:15 Anchorage Time (UTC -8)
 

MO4.L5.3: PDD-AGENT: MULTIMODAL LARGE LANGUAGE MODEL-DRIVEN AI AGENT FOR ENHANCED PLANT DISEASE DIAGNOSIS

Lufu Qin, Xingcai Wu, Xinyu Dong, Huan Wang, Tingwei Yang, Qi Wang, Guizhou University, China
Mon, 15 Sep, 17:15 - 17:30 Anchorage Time (UTC -8)
 

MO4.L5.4: MCM: A MULTI-AGENT COLLABORATIVE MULTIMODAL FRAMEWORK FOR TRADITIONAL CHINESE MEDICINE DIAGNOSIS

Chendan Liang, Shanghai Polytechnic University, China; Zeyu Ma, Wanying Wang, Minjie Ding, Shanghai Development Center of Computer Software Technology, China; Zhiyuan Cao, Shanghai Polytechnic University, China; Mingang Chen, Shanghai Development Center of Computer Software Technology, China
Mon, 15 Sep, 17:30 - 17:45 Anchorage Time (UTC -8)
 

MO4.L5.5: TAGSIM: TOPIC-INFORMED ATTENTION GUIDED SIMILARITY METRIC FOR IMAGE CAPTION COMPARISON

Vipul Chanchlani, Vishal Himmatsinghka, Ayush Himmatsinghka, Indian Institute of Technology Kanpur, India; Jivnesh Sandhan, Indian Institute of Technology Dharwad, India; Tushar Sandhan, Indian Institute of Technology Kanpur, India
Mon, 15 Sep, 17:45 - 18:00 Anchorage Time (UTC -8)
 

MO4.L5.6: STRUCTURED INSTRUCTION PARSING AND SCENE ALIGNMENT FOR UAV VISION-LANGUAGE NAVIGATION

Liangyu Zhou, Rui Xue, Xiaoyan Luo, Beihang University, China