IVMSP-L11.5
HALTINGVT: ADAPTIVE TOKEN HALTING TRANSFORMER FOR EFFICIENT VIDEO RECOGNITION
Qian Wu, Ruoxuan Cui, Yuke Li, Haoqi Zhu, NetEase, China
Session: IVMSP-L11: Action recognition Lecture
Track: Image, Video, and Multidimensional Signal Processing
Location: Room E3
Presentation Time: Thu, 18 Apr, 14:30 - 14:50 (UTC +9)
Session Chair: Nicola Conci, University of Trento
Session IVMSP-L11
IVMSP-L11.1: HIGH-ORDER TENSOR POOLING WITH ATTENTION FOR ACTION RECOGNITION
Lei Wang, ANU & Data61/CSIRO, Australia; Ke Sun, Piotr Koniusz, Data61/CSIRO & ANU, Australia
IVMSP-L11.2: Generalized Uncertainty-Based Evidential Fusion with Hybrid Multi-Head Attention for Weak-Supervised Temporal Action Localization
Yuanpeng He, School of Computer Science, Peking University, Beijing, China; Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, China; Lijian Li, Tianxiang Zhan, Department of Computer and Information Science, University of Macau, Macau, China; Wenpin Jiao, School of Computer Science, Peking University, Beijing, China; Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, China; Chi-Man Pun, Department of Computer and Information Science, University of Macau, Macau, China
IVMSP-L11.3: Differentiable Resolution Compression and Alignment for Efficient Video Classification and Retrieval
Rui Deng, Qian Wu, Yuke Li, NetEase Yidun AI Lab, China; Haoran Fu, Department of Civil Engineering, Zhejiang University, China
IVMSP-L11.4: Open-Vocabulary Skeleton Action Recognition with Diffusion Graph Convolutional Network and Pre-Trained Vision-Language Models
Chao Wei, Zhidong Deng, Tsinghua University, China
IVMSP-L11.5: HALTINGVT: ADAPTIVE TOKEN HALTING TRANSFORMER FOR EFFICIENT VIDEO RECOGNITION
Qian Wu, Ruoxuan Cui, Yuke Li, Haoqi Zhu, NetEase, China