SLP-P55: Language Generation: Inference, Compression and Optimization
Fri, 8 May, 14:00 - 16:00
Location: Poster Area 29
Session Type: Poster
Track: Speech and Language Processing [SL]

SLP-P55.1: CLUSTERING-DRIVEN MEMORY COMPRESSION FOR ON-DEVICE LARGE LANGUAGE MODELS

Ondrej Bohdal, Samsung R&D Institute UK, United Kingdom of Great Britain and Northern Ireland; Pramit Saha, University of Oxford, United Kingdom of Great Britain and Northern Ireland; Umberto Michieli, Mete Ozay, Taha Ceritli, Samsung R&D Institute UK, United Kingdom of Great Britain and Northern Ireland

SLP-P55.2: Sparsity Induction for Accurate Post-Training Pruning of Large Language Models

Minhao Jiang, Zhikai Li, Xuewen Liu, Jing Zhang, Mengjuan Chen, Qingyi Gu, Institute of Automation, Chinese Academy of Sciences, China

SLP-P55.3: ATTENTION OUTPUT PROJECTION IMPORTANCE SCORE FOR KEY-VALUE EVICTION

Jian Yuan, Shanghai Jiao Tong University, China; Ziwei He, Shanghai Innovation Institute, China; Zhouhan Lin, Bo Jiang, Shanghai Jiao Tong University, China

SLP-P55.4: INTERMITTENT SEMI-WORKING MASK: A NEW MASKING PARADIGM FOR LLMS

Haoyuan Hu, Zhejiang University, China; Mingcong Lu, Di Luo, Xinya Wu, Jiangcai Zhu, Taoye Yin, Zheng Li, Hao Wang, Shusheng Zhang, KeZun Zhang, KaiLai Shao, Chao Chen, Feng Wang, Ant Group, China

SLP-P55.5: BaldWhisper: Faster Whisper with Head Shearing and Layer Merging

Yaya Sy, Christophe Cerisara, Irina Illina, Loria, France

SLP-P55.6: DPI: EXPLOITING PARAMETER HETEROGENEITY FOR INTERFERENCE-FREE FINE-TUNING

Xiaoyu Liu, Northeastern University, United States of America; Xiaoyu Guan, University of Florida, United States of America; Di Liang, Xianjie Wu, Northeastern University, China

SLP-P55.7: CLEAN: COMPLIANT LOOPS WITH ENHANCED ADJUSTMENT FOR TRAINING-FREE UNLEARNING

Jingwen Pu, Jinyu Guo, Yuang Li, Zhaokun Wang, Wenhong Tian, University of Electronic Science and Technology of China, China