MMSP-P16.8
OVID: Text-Guided Open-Vocabulary Dense Object Counting and Localization
Ma Hao-Yuan, Li Zhang, Qiang Minjie, Soochow University, China
Session: MMSP-P16: Visual Grounding and Open-Vocabulary Segmentation Poster
Track: Multimedia Signal Processing [MM]
Location: Poster Area 21
Presentation Time: Thu, 7 May, 09:00 - 11:00
Session MMSP-P16
MMSP-P16.1: Improving the Reasoning of Multi-Image Grounding in MLLMs via Reinforcement Learning
Bob Zhang, Xiaohongshu Inc., China; Haoran Li, University of Science and Technology of China, China; Tao Zhang, Wuhan University, China; Jianan Li, Technical University of Munich, Germany; Cilin Yan, Xikai Liu, Jiayin Cai, Xiaohongshu Inc., China; Yanbin Hao, Hefei University of Technology, China
MMSP-P16.2: COMPOSED VISUAL GROUNDING IN REMOTE SENSING IMAGES
Yuxi Sun, Sen Jia, Meng Xu, Shenzhen University, China; Baoquan Zhang, Harbin Institute of Technology, China; Jian Kang, Soochow University, China
MMSP-P16.3: Refining Open-Vocabulary Semantic Segmentation via Regional Semantics and Visual Prototypes
Aijing Yu, Institute of Information Engineering, Chinese Academy of Sciences, China; Zhengbo Wang, Lijun Sheng, University of Science and Technology of China, China; Jian Liang, NLPR & MAIS, Institute of Automation, Chinese Academy of Sciences, China; Xiaoyu Zhang, Institute of Information Engineering, Chinese Academy of Sciences, China
MMSP-P16.4: AUGMENTING IMAGE LLMS FOR DIVERSE VIDEO GROUNDING TASKS WITHOUT TRAINING
Mohan Chen, Chunguang Du, Qingqiu Li, Yuejie Zhang, Rui Feng, Fudan University, China; Tao Zhang, Shanghai University of Finance and Economics, China; Shang Gao, Deakin University, Australia
MMSP-P16.5: RGSC: Retrieve and then Generate Image-text Pairs from Semantic Concepts for Unsupervised Vision-Language Pre-training
Zhaopan Xu, Harbin Institute of Technology, China; Wangbo Zhao, National University of Singapore, Singapore; Sijie Ji, California Institute of Technology, United States of America; Panpan Zhang, National University of Singapore, Singapore; Kaipeng Zhang, Shanghai Artificial Intelligence Laboratory, China; Hongxun Yao, Harbin Institute of Technology, China
MMSP-P16.6: Zero-Shot VISUAL GROUNDING in 3D Gaussians via View Retrieval
Liwei Liao, Peking University, China; Xufeng Li, City University of Hong Kong, China; Xiaoyun Zheng, Boning Liu, Pengcheng Laboratory, China; Feng Gao, Peking University, China; Ronggang Wang, Peking University, China
MMSP-P16.7: ScaleMamba: Multi-scale Context Fusion for Training-Free Open-Vocabulary Remote Sensing Segmentation
Zhicai Huang, Mingming Chen, Xiamen Huaxia University, China; Mingqiang Huang, Xiamen University of Technology, China
MMSP-P16.8: OVID: Text-Guided Open-Vocabulary Dense Object Counting and Localization
Ma Hao-Yuan, Li Zhang, Qiang Minjie, Soochow University, China
MMSP-P16.9: OPENHIER: AN OPEN-VOCABULARY HIERARCHICAL IMAGE CLASSIFICATION FRAMEWORK
Pu Yang, Dongjing Miao, Harbin Institute of Technology, China
MMSP-P16.10: DISTRIBUTION-AWARE DATA CURATION FOR SEMANTIC SEGMENTATION VIA MIXTURE OF VMFS
Zhi Hu, Kexin Yang, Aravindkumar Vijayalingam, Zhonghuan Dai, Klass Engineering and Solutions, Singapore