IEEE ICIP 2024 || Abu Dhabi, United Arab Emirates || 27-30 October 2024

TA2.PB.2

EXTENDING SEGMENT ANYTHING MODEL INTO AUDITORY AND TEMPORAL DIMENSIONS FOR AUDIO-VISUAL SEGMENTATION

Juhyeong Seon, Woobin Im, Sebin Lee, Jumin Lee, Sung-Eui Yoon, Korea Advanced Institute of Science and Technology, Korea, Republic of

Session:

TA2.PB: Image & Video Interpretation and Understanding - V Poster

Location:

ICC Hall 3/4: Poster Area B

Presentation Time:

Tue, 29 Oct, 10:30 - 12:00 Gulf Standard Time (UTC +4)

Session Chair:

Adrian Barbu, Florida State University

Session TA2.PB

TA2.PB.1: Learning Temporal Cues for Fine-grained Action Recognition

Zhihao Liu, Yi Zhang, Wenhui Huang, Yan Liu, China Mobile Research Institute, China; Mengyang Pu, North China Electric Power University, China; Chao Deng, Junlan Feng, China Mobile Research Institute, China

TA2.PB.2: EXTENDING SEGMENT ANYTHING MODEL INTO AUDITORY AND TEMPORAL DIMENSIONS FOR AUDIO-VISUAL SEGMENTATION

Juhyeong Seon, Woobin Im, Sebin Lee, Jumin Lee, Sung-Eui Yoon, Korea Advanced Institute of Science and Technology, Korea, Republic of

TA2.PB.3: STAY FOCUS ON OBJECT: CROSS-DOMAIN DETECTION USING DOMAIN-INVARIANT OBJECT REPRESENTATION

Taehoon Kim, Ajou University, Korea, Republic of; Jaemin Na, Tech. Innovation Group, KT, Korea, Republic of; Joong-won Hwang, Electronics and Telecommunications Research Institute, Korea, Republic of; Wonjun Hwang, Ajou University, Korea, Republic of

TA2.PB.4: Open-Vocabulary Panoptic Segmentation Using BERT Pre-Training of Vision-Language Multiway Transformer Model

Yi-Chia Chen, Wei-Hua Li, Chu-Song Chen, National Taiwan University, Taiwan

TA2.PB.5: GABOR FEATURE NETWORK FOR TRANSFORMER-BASED BUILDING CHANGE DETECTION MODEL IN REMOTE SENSING

Priscilla Indira Osa, University of Genoa, Italy; Josiane Zerubia, Inria, France; Zoltan Kato, University of Szeged, Hungary

TA2.PB.6: LEVERAGING GENERATED IMAGE CAPTIONS FOR VISUAL COMMONSENSE REASONING

Subham Das, Chandra Sekhar C, Indian Institute of Technology Madras, India

TA2.PB.7: Motion-Lie Transformer: Geometric Attention for 3D Human Pose Motion Prediction

Mayssa Zaier, Hazem Wannous, IMT Nord Europe, France; Hassen Drira, University of Strasbourg, France; Jacques Boonaert, IMT Nord Europe, France

TA2.PB.8: PCA-UNet for Object Segmentation

Cheng Long, Sayantika Nag, Adrian Barbu, Florida State University, United States of America

TA2.PB.9: EXPLORING ATTENTION MECHANISMS IN INTEGRATION OF MULTI-MODAL INFORMATION FOR SIGN LANGUAGE RECOGNITION AND TRANSLATION

Zaber Ibn Abdul Hakim, Rasman Mubtasim Swargo, Muhammad Abdullah Adnan, Bangladesh University of Engineering and Technology (BUET), Bangladesh

TA2.PB.10: SPATIAL-CHANNEL COLLABORATED ATTENTION FOR CROSS-SCALE CROWD COUNTING

Yongpeng Chang, Zhejiang University, China; Guangchun Gao, Hangzhou City University, China


Last updated 07 October 2024.
