TA2.PB.2

EXTENDING SEGMENT ANYTHING MODEL INTO AUDITORY AND TEMPORAL DIMENSIONS FOR AUDIO-VISUAL SEGMENTATION

Juhyeong Seon, Woobin Im, Sebin Lee, Jumin Lee, Sung-Eui Yoon, Korea Advanced Institute of Science and Technology, Korea, Republic of

Session:
TA2.PB: Image & Video Interpretation and Understanding - V Poster

Track:
Image and Video Analysis, Synthesis, and Retrieval

Location:
Poster Area B

Presentation Time:
Tue, 29 Oct, 10:30 - 12:00 Gulf Standard Time (UTC +4)

Session Co-Chairs:
Adrian Barbu, Florida State University; Sérgio Faria, Instituto de Telecomunicações / Politécnico de Leiria
Session TA2.PB
TA2.PB.1: Learning Temporal Cues for Fine-grained Action Recognition
Zhihao Liu, Yi Zhang, Wenhui Huang, Yan Liu, China Mobile Research Institute, China; Mengyang Pu, North China Electric Power University, China; Chao Deng, Junlan Feng, China Mobile Research Institute, China
TA2.PB.2: EXTENDING SEGMENT ANYTHING MODEL INTO AUDITORY AND TEMPORAL DIMENSIONS FOR AUDIO-VISUAL SEGMENTATION
Juhyeong Seon, Woobin Im, Sebin Lee, Jumin Lee, Sung-Eui Yoon, Korea Advanced Institute of Science and Technology, Korea, Republic of
TA2.PB.3: STAY FOCUS ON OBJECT: CROSS-DOMAIN DETECTION USING DOMAIN-INVARIANT OBJECT REPRESENTATION
Taehoon Kim, Ajou University, Korea, Republic of; Jaemin Na, Tech. Innovation Group, KT, Korea, Republic of; Joong-won Hwang, Electronics and Telecommunications Research Institute, Korea, Republic of; Wonjun Hwang, Ajou University, Korea, Republic of
TA2.PB.4: Open-Vocabulary Panoptic Segmentation Using BERT Pre-Training of Vision-Language Multiway Transformer Model
Yi-Chia Chen, Wei-Hua Li, Chu-Song Chen, National Taiwan University, Taiwan
TA2.PB.5: GABOR FEATURE NETWORK FOR TRANSFORMER-BASED BUILDING CHANGE DETECTION MODEL IN REMOTE SENSING
Priscilla Indira Osa, University of Genoa, Italy; Josiane Zerubia, Inria, France; Zoltan Kato, University of Szeged, Hungary
TA2.PB.6: LEVERAGING GENERATED IMAGE CAPTIONS FOR VISUAL COMMONSENSE REASONING
Subham Das, Chandra Sekhar C, Indian Institute of Technology Madras, India
TA2.PB.7: Motion-Lie Transformer: Geometric Attention for 3D Human Pose Motion Prediction
Mayssa Zaier, Hazem Wannous, IMT Nord Europe, France; Hassen Drira, University of Strasbourg, France; Jacques Boonaert, IMT Nord Europe, France
TA2.PB.8: PCA-UNet for Object Segmentation
Cheng Long, Sayantika Nag, Adrian Barbu, Florida State University, United States of America
TA2.PB.9: EXPLORING ATTENTION MECHANISMS IN INTEGRATION OF MULTI-MODAL INFORMATION FOR SIGN LANGUAGE RECOGNITION AND TRANSLATION
Zaber Ibn Abdul Hakim, Rasman Mubtasim Swargo, Muhammad Abdullah Adnan, Bangladesh University of Engineering and Technology (BUET), Bangladesh
TA2.PB.10: SPATIAL-CHANNEL COLLABORATED ATTENTION FOR CROSS-SCALE CROWD COUNTING
Yongpeng Chang, Zhejiang University, China; Guangchun Gao, Hangzhou City University, China