MLSP-P11.7

OCTOPUS: ENHANCING DISTRIBUTIONAL REINFORCEMENT LEARNING THROUGH REGULARIZATION

Jinyuan Zhang, University of Edinburgh, United Kingdom of Great Britain and Northern Ireland; Pengqian Yu, National University of Singapore, Singapore

Session:
MLSP-P11: Reinforcement Learning for Signal Processing and Control I Poster

Track:
Machine Learning for Signal Processing [ML]

Location:
Poster Area 6

Presentation Time:
Tue, 5 May, 16:30 - 18:30

Session MLSP-P11
MLSP-P11.1: HIERARCHICAL MARL FOR TASK ALLOCATION: DISTRIBUTED SUBTASK SELECTION WITH MUTUAL INFORMATION
Heng Zhang, Shihong Duan, Yadong Wan, University of Science and Technology Beijing, China
MLSP-P11.2: ENHANCING ONLINE RL FINE-TUNING VIA ADAPTIVE Q-FUNCTION SELECTION
Jiaqi Zhang, Yuheng Xu, Chongqing University, China; Hongfei Li, Southwest University, China; Yan Gan, Ning Wang, Chongqing University, China; Jiamou Liu, The University of Auckland, New Zealand
MLSP-P11.3: TGPO: TREE-GUIDED PREFERENCE OPTIMIZATION FOR ROBUST WEB AGENT REINFORCEMENT LEARNING
Ziyuan Chen, East China Normal University, China; Zhenghui Zhao, Zhangye Han, Miancan Liu, Yiqing Li, Xianhang Ye, Hongbo Min, Jinkui Ren, Xiantao Zhang, Alibaba Group, China; Guitao Cao, East China Normal University, China
MLSP-P11.4: EFFICIENT AND GLOBAL INTERACTION-AWARE RETRAINING-FREE TOKEN PRUNING FOR VISION TRANSFORMERS
Chenqi Shi, China University of Mining and Technology, China; Yi Tian, Monash University, Australia; Hongkun Du, Flinders University, Australia; Yuxin Li, Monash University, Australia; Ge Zhang, Independent Researcher, China; Guangzhen Yao, Yu Li, China University of Mining and Technology, China; Qiang Niu, University of Mining and Technology, China; Wenxin Zhang, China University of Mining and Technology, China; Zhenyu Yu, University Malaya, Malaysia; Randa Han, China University of Mining and Technology, China
MLSP-P11.5: One Timestep Spiking Actor Network with Adaptive Global-connected Encoding and Threshold Learning
Zhiyuan Hu, Ping He, Wanying Xu, Rong Xiao, Chenwei Tang, Jiancheng Lv, Sichuan University, China; Huajin Tang, Zhejiang University, China
MLSP-P11.6: AdaParse: A Structured Lipschitz Regularization Framework for Robust Reinforcement Learning
Xiangyu Shuai, Xinhai Chen, Binglin Wang, Qinglin Wang, Jie Liu, National University of Defense Technology, China
MLSP-P11.7: OCTOPUS: ENHANCING DISTRIBUTIONAL REINFORCEMENT LEARNING THROUGH REGULARIZATION
Jinyuan Zhang, University of Edinburgh, United Kingdom of Great Britain and Northern Ireland; Pengqian Yu, National University of Singapore, Singapore
MLSP-P11.8: DISTRIBUTIONAL PPO FOR STABLE POLICY GRADIENT OPTIMIZATION
Nan Zhou, Fan Liu, Yixin Zhou, University of Electronic Science and Technology of China, China; Jikang Liao, Kashi Institute of Electronics and Information Industry, University of Electronic Science and Technology of China, China; Fan Zhou, Guangqiang Yin, University of Electronic Science and Technology of China, China
MLSP-P11.9: SAMPLE EFFICIENT EXPERIENCE REPLAY IN NON-STATIONARY ENVIRONMENTS
Tianyang Duan, Zongyuan Zhang, Songxiao Guo, The University of Hong Kong, Hong Kong; Yuanye Zhao, Hebei University of Economics and Business, China; Zheng Lin, The University of Hong Kong, Hong Kong; Zihan Fang, Yi Liu, City University of Hong Kong, Hong Kong; Dianxin Luan, University of Edinburgh, United Kingdom of Great Britain and Northern Ireland; Dong Huang, National University of Singapore, Singapore; Heming Cui, The University of Hong Kong, Hong Kong; Yong Cui, Tsinghua University, China