MLSP-L25.6
ENHANCING RISK AWARENESS IN LLM AGENTS VIA PROBING SAFETY BOUNDARIES
Yu Jiang, Hanli Peng, Yongsen Zheng, Ziyao Liu, Kwok-Yan Lam, Chee Wei Tan, Nanyang Technological University, Singapore
Session: MLSP-L25: Trustworthy Learning for Large Language Models (Oral)
Track: Machine Learning for Signal Processing [ML]
Location: Room 112
Presentation Time: Thu, 7 May, 18:10 - 18:30
Session MLSP-L25
MLSP-L25.1: AN ENSEMBLE DEFENSE METHOD AGAINST FALSE DATA IN STRUCTURED PREFERENCE LEARNING
Mingyue Zhang, Shuyan Feng, Zhengguang Meng, Bo Liu, Zhiming Liu, School of Computer Science, Southwest University, Chongqing, China
MLSP-L25.2: BADREASONER: PLANTING TUNABLE OVERTHINKING BACKDOORS INTO LARGE REASONING MODELS FOR FUN OR PROFIT
Biao Yi, Zekun Fei, Jianing Geng, Tong Li, Lihai Nie, Zheli Liu, Nankai University, China; Yiming Li, Nanyang Technological University, Singapore
MLSP-L25.3: UNLABELED TARGET-DOMAIN CALIBRATION FOR TABULAR CLASSIFIERS UNDER LABEL SHIFT
Yuechan Li, Wuhan University, China; Jiaqi Shi, University of Science and Technology of China, China; Xiaohui Cui, Wuhan University, China
MLSP-L25.4: LANTERN: LANGUAGE MODEL ASSESSMENT ON NOISY AND TRANSFORMED TASKS FOR UNDERSTANDING ERROR AND ROBUSTNESS NUANCES
Vamsi Krishna Kodavali, Rituraj Singh, Samsung R&D Institute India-Bangalore, India
MLSP-L25.5: ENHANCING VALUE ALIGNMENT OF LLMS WITH MULTI-AGENT SYSTEM AND COMBINATORIAL FUSION
Yuanhong Wu, Fordham University, United States of America; Djallel Bouneffouf, IBM Research, United States of America; D. Frank Hsu, Fordham University, United States of America
MLSP-L25.6: ENHANCING RISK AWARENESS IN LLM AGENTS VIA PROBING SAFETY BOUNDARIES
Yu Jiang, Hanli Peng, Yongsen Zheng, Ziyao Liu, Kwok-Yan Lam, Chee Wei Tan, Nanyang Technological University, Singapore