MMSP-L1.4

TEXTUAL TOKENS CLASSIFICATION FOR MULTI-MODAL ALIGNMENT IN VISION-LANGUAGE TRACKING

Zhongjie Mao, Yucheng Wang, Xi Chen, Jia Yan, Wuhan University, China

Session:
MMSP-L1: Multimodal Processing: Vision + Language 1 Lecture

Track:
Multimedia Signal Processing

Location:
Room 201

Presentation Time:
Tue, 16 Apr, 17:30 - 17:50 (UTC +9)

Session Co-Chairs:
Jin Zeng, Tongji University, Shanghai, China; Fernando Pereira, IST, Portugal