SLP-P14.10
EMPOWERING VISION-LANGUAGE MODELS FOR REASONING ABILITY THROUGH LARGE LANGUAGE MODELS
Yueting Yang, Xintong Zhang, Jinan Xu, Wenjuan Han, Beijing Jiaotong University, China
Session:
SLP-P14: Multimodal processing of language Poster
Track:
Speech and Language Processing
Location:
Poster Zone 3A
Poster Board PZ-3A.10
Presentation Time:
Wed, 17 Apr, 16:30 - 18:30 (UTC +9)
Session Co-Chairs:
Jun Du, University of Science and Technology of China, and Pedro Moreno, Google
Session SLP-P14
SLP-P14.1: COOKING-CLIP: CONTEXT-AWARE LANGUAGE-IMAGE PRETRAINING FOR ZERO-SHOT RECIPE GENERATION
Lin Wang, University of Science and Technology of China, China; Haithm M. Al-Gunid, Ammar Hawbani, Yan Xiong, University of Science and Technology of China, China
SLP-P14.2: Exploring Object-centered External Knowledge for Fine-grained Video Paragraph Captioning
Guorui Yu, Yimin Hu, Yiqian Xu, Yuejie Zhang, Rui Feng, Fudan University, China; Tao Zhang, Shanghai University of Finance and Economics, China; Shang Gao, Deakin University, China
SLP-P14.3: RELATIONAL GRAPH-BRIDGED IMAGE-TEXT INTERACTION: A NOVEL METHOD FOR MULTI-MODAL RELATION EXTRACTION
Zihao Zheng, Tao He, Ming Liu, Zhongyuan Wang, Ruiji Fu, Bing Qin, Harbin Institute of Technology, China
SLP-P14.4: DIALCLIP: EMPOWERING CLIP AS MULTI-MODAL DIALOG RETRIEVER
Zhichao Yin, University of Science and Technology of China, China; Binyuan Hui, DAMO Academy, Alibaba Group, China; Min Yang, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, China; Fei Huang, Yongbin Li, DAMO Academy, Alibaba Group, China
SLP-P14.5: VECTOR QUANTIZATION KNOWLEDGE TRANSFER FOR END-TO-END TEXT IMAGE MACHINE TRANSLATION
Cong Ma, School of Artificial Intelligence, University of Chinese Academy of Sciences, China; Yaping Zhang, Yang Zhao, Yu Zhou, Chengqing Zong, Institute of Automation, Chinese Academy of Sciences, China
SLP-P14.6: EMORED: A DATASET FOR RELATION EXTRACTION IN TEXTS WITH EMOTICONS
Lingxing Kong, Zheng Ma, Jianbing Zhang, Liang He, Jiajun Chen, Nanjing University, China
SLP-P14.7: MSG-BART: Multi-granularity Scene Graph-Enhanced Encoder-Decoder Language Model for Video-grounded Dialogue Generation
Hongcheng Liu, Zhe Chen, Hui Li, Pingjie Wang, Yanfeng Wang, Yu Wang, Shanghai Jiao Tong University, China
SLP-P14.8: CAUSALME: BALANCING BI-MODALITIES IN VISUAL QUESTION ANSWERING
Chenji Lu, Ge Bai, Shilong Li, Ying Liu, Xiyan Liu, Zerong Zeng, Ruifang Liu, Beijing University of Posts and Telecommunications, China
SLP-P14.9: MHPS: MULTIMODALITY-GUIDED HIERARCHICAL POLICY SEARCH FOR KNOWLEDGE GRAPH REASONING
Chen Gao, Xugong Qin, Peng Zhang, Nanjing University of Science and Technology, China; Yongquan He, Meituan, China; Xinjian Huang, Ming Zhou, Nanjing University of Science and Technology, China; Liehuang Zhu, Beijing Institute of Technology, China; Qingfeng Tan, Guangzhou University, China
SLP-P14.10: EMPOWERING VISION-LANGUAGE MODELS FOR REASONING ABILITY THROUGH LARGE LANGUAGE MODELS
Yueting Yang, Xintong Zhang, Jinan Xu, Wenjuan Han, Beijing Jiaotong University, China
SLP-P14.11: PVCG: Prompt-based Vision-aware Classification and Generation for Multi-modal Rumor Detection
Ting Zou, Zhong Qian, Peifeng Li, Qiaoming Zhu, Soochow University, China
SLP-P14.12: LABCLIP: LABEL-ENHANCED CLIP FOR IMPROVING ZERO-SHOT TEXT CLASSIFICATION
Yongheng Zhang, Central South University, China; Peng Wang, Anhui Agricultural University, China; Qiguang Chen, Jingxuan Zhou, Central South University, China; Yongmei Wang, Anhui Agricultural University, China; Min Li, Libo Qin, Central South University, China
SLP-P14.13: CONTEXT-AWARE DUAL ATTENTION NETWORK FOR MULTIMODAL SARCASM DETECTION
Liangyi Kang, Jie Liu, Dan Ye, Zhiyang Zhou, Institute of Software, Chinese Academy of Sciences, China