MO4.PB.2

Vision Language Model Interpretability with Concept Guided Decoding

Pedro H V Valois, Dipesh Satav, Rodrigo Abreu Pizzatto de Campos, Gulpi Qorik Oktagalu Pratamasunu, Kazuhiro Fukui, University of Tsukuba, Japan

Session:
MO4.PB: Image and Video Analysis, Synthesis, and Retrieval 6 Poster

Track:
[IV-ANA] Image and video analysis, synthesis, and retrieval

Location:
Poster Area B

Presentation Time:
Mon, 15 Sep, 16:30 - 18:00 Anchorage Time (UTC -8)

Session Chair:
Yannick Berthoumieu, Bordeaux University
Session MO4.PB
MO4.PB.1: VIDEO INDIVIDUAL COUNTING WITH IMPLICIT ONE-TO-MANY MATCHING
Xuhui Zhu, Huazhong University of Science and Technology, China; Jing Xu, FiberHome Telecommunication Technologies Company Limited, China; Bingjie Wang, University of Rochester, United States; Huikang Dai, FiberHome Telecommunication Technologies Company Limited, China; Hao Lu, Huazhong University of Science and Technology, China
MO4.PB.2: Vision Language Model Interpretability with Concept Guided Decoding
Pedro H V Valois, Dipesh Satav, Rodrigo Abreu Pizzatto de Campos, Gulpi Qorik Oktagalu Pratamasunu, Kazuhiro Fukui, University of Tsukuba, Japan
MO4.PB.3: CHTMAE: Cross-modal Hierarchical Temporal-spatial Masked Autoencoder Model for Micro-expression Recognition
Zhihua Xie, Haolin Chang, Guohua Miao, Jiangxi Science and Technology Normal University, China; Zhaojin Lu, Jiangxi Tellhow Animation College, Tellhow Group Co., Ltd., China
MO4.PB.4: ENHANCING MULTISCALE FEATURE REPRESENTATION FOR OBJECT-LEVEL RECOGNITION IN MASKED IMAGE MODELING
Tsatsral Amarbayasgalan, Sungjun Wang, Mooseop Kim, Chi Yoon Jeong, Electronics and Telecommunications Research Institute, Korea (South)
MO4.PB.5: CONTINUOUS ACTION UNIT INTENSITY MODELING FOR MICRO-EXPRESSION RECOGNITION
Hanyu Jiang, Jiayi Lyu, Xing Lan, Jian Xue, University of Chinese Academy of Sciences, China
MO4.PB.6: MULTI-VIEW AMODAL INSTANCE SEGMENTATION BASED ON 3D REPRESENTATION
Liu Chang, Hou Ya-Li, Chen Bingchuan, Beijing Jiaotong University, China
MO4.PB.7: ZERO-SHOT PSEUDO LABELS GENERATION USING SAM AND CLIP FOR SEMI-SUPERVISED SEMANTIC SEGMENTATION
Nagito Saito, Shintaro Ito, Koichi Ito, Takafumi Aoki, Tohoku University, Japan
MO4.PB.8: Scale-Aware Crowd Counting Network With Annotation Error Modeling
Yi-Kuan Hsieh, JunWei Hsieh, National Yang Ming Chiao Tung University, Taiwan; Xin Li, University at Albany, United States; Yu-Ming Zhang, National Central University, Taiwan; Yu-Chee Tseng, National Yang Ming Chiao Tung University, Taiwan; Ming-Ching Chang, University at Albany, United States
MO4.PB.9: Appearance Estimation and Image Segmentation via Tensor Factorization
Jeová Farias Sales Rocha Neto, Bowdoin College, United States
MO4.PB.10: Tiny-VPS: Tiny Video Panoptic Segmentation Standing on the Shoulder of Giant-VPS
Qingfeng Liu, Mostafa El-Khamy, Kee-Bong Song, Samsung Semiconductor Inc USA, United States
MO4.PB.11: Unsupervised Action Anticipation through Action Cluster Prediction
Jiuxu Chen, Nupur Thakur, Sachin Chhabra, Baoxin Li, Arizona State University School of Computing and Augmented Intelligence, United States
MO4.PB.12: DGRGaze: A difference-guided gaze estimation framework based on 6D rotation matrix representation
Xiaohao Wang, Sirui Zhao, Xinglong Mao, Yiming Zhang, Shifeng Liu, Tong Xu, Enhong Chen, University of Science and Technology of China, China
MO4.PB.13: SCL-GAN: Spatially-Correlative Lightweight GAN for Efficient and High-Fidelity Thermal-Visible Face Synthesis
Nand Kumar Yadav, Rodrigue Rizk, KC Santosh, University of South Dakota, United States; Rayeesa Mehmood, IIIT-Allahabad, India