SLP-P22.8
EMOTIONAL DAMAGE: INVESTIGATING SAFETY VULNERABILITIES OF LARGE AUDIO-LANGUAGE MODELS UNDER SPEAKER EMOTIONAL VARIATIONS
Bo-Han Feng, Chien-Feng Liu, Yu-Hsuan Li Liang, Chih-Kai Yang, National Taiwan University, Taiwan; Szu-Wei Fu, Zhehuai Chen, NVIDIA, Taiwan; Ke-Han Lu, National Taiwan University, Taiwan; Sung-Feng Huang, Chao-Han Huck Yang, Yu-Chiang Frank Wang, NVIDIA, Taiwan; Yun-Nung Chen, Hung-yi Lee, National Taiwan University, Taiwan
Session: SLP-P22: Speech LLM: Robustness & Adaptation Poster
Track: Speech and Language Processing [SL]
Location: Poster Area 30
Presentation Time: Wed, 6 May, 14:00 - 16:00
Session SLP-P22
SLP-P22.1: WHEN SILENCE MATTERS: THE IMPACT OF IRRELEVANT AUDIO ON TEXT REASONING IN LARGE AUDIO-LANGUAGE MODELS
Chen-An Li, Tzu-Han Lin, Hung-yi Lee, National Taiwan University, Taiwan
SLP-P22.2: Grey-Box Prompt Tuning with Graph Alignment for Speech-Language Models
Yuhang Lu, Guangxi Normal University, China; Linghui Meng, Southeast University, China; Li-e Wang, Xianxian Li, Feng Yu, Guangxi Normal University, China
SLP-P22.3: RCAL: Reinforced Cross-modal Alignment for Multimodal Sentiment Analysis with Sparse Visual Frames
Xinwei Song, Xinran Tao, Jiachuan Wu, Tala Khoei, Northeastern University, United States of America
SLP-P22.4: OMNI-AVSR: TOWARDS UNIFIED MULTIMODAL SPEECH RECOGNITION WITH LARGE LANGUAGE MODELS
Umberto Cappellazzo, Imperial College London, United Kingdom of Great Britain and Northern Ireland; Xubo Liu, Pingchuan Ma, Meta, United Kingdom of Great Britain and Northern Ireland; Stavros Petridis, Maja Pantic, Imperial College London, United Kingdom of Great Britain and Northern Ireland
SLP-P22.5: SLM-TTA: A Framework for Test-Time Adaptation of Generative Spoken Language Models
Yuan-Kuei Wu, National Taiwan University, Taiwan; Yang Liu, Yiteng Huang, Zhaojun Yang, Haibin Wu, Ruizhe Huang, Yi-Te Hsu, Shuyu Kong, Ming Sun, Florian Metze, Li Wan, Meta, United States of America
SLP-P22.6: ZSV2C-MLLM: Zero-Shot Visual Voice Cloning via Multimodal Large Language Models
Yanling Zhang, Linqing Wang, Shengxiang Gao, Kunming University of Science and Technology, China
SLP-P22.7: CROSS-LINGUAL INTERLEAVING FOR SPEECH LANGUAGE MODELS
Adel Moumen, Guangzhi Sun, Philip Woodland, University of Cambridge, United Kingdom of Great Britain and Northern Ireland
SLP-P22.8: EMOTIONAL DAMAGE: INVESTIGATING SAFETY VULNERABILITIES OF LARGE AUDIO-LANGUAGE MODELS UNDER SPEAKER EMOTIONAL VARIATIONS
Bo-Han Feng, Chien-Feng Liu, Yu-Hsuan Li Liang, Chih-Kai Yang, National Taiwan University, Taiwan; Szu-Wei Fu, Zhehuai Chen, NVIDIA, Taiwan; Ke-Han Lu, National Taiwan University, Taiwan; Sung-Feng Huang, Chao-Han Huck Yang, Yu-Chiang Frank Wang, NVIDIA, Taiwan; Yun-Nung Chen, Hung-yi Lee, National Taiwan University, Taiwan
SLP-P22.9: ADVANCING SPEECH UNDERSTANDING IN SPEECH-AWARE LANGUAGE MODELS WITH GRPO
Avishai Elmakies, Hagai Aronowitz, Nimrod Shabtay, Eli Schwartz, Ron Hoory, Avihu Dekel, IBM, Israel
SLP-P22.10: Multimodal LLMs as Expert Speech Annotators: Acoustic Macro-Descriptors for Parkinson's Detection
David Ortiz-Perez, University of Alicante, Spain; Catarina Botelho, Anna Pompili, Alberto Abad, INESC-ID, Portugal; Jose Garcia-Rodriguez, University of Alicante, Spain