MLSP-P6.9

OFFLINE REINFORCEMENT LEARNING WITH POLICY GUIDANCE AND UNCERTAINTY ESTIMATION

Lan Wu, Quan Liu, Lihua Zhang, Zhigang Huang, Soochow University, China

Session:
MLSP-P6: Reinforcement Learning II Poster

Track:
Machine Learning for Signal Processing

Location:
Poster Zone 3B
Poster Board PZ-3B.9

Presentation Time:
Tue, 16 Apr, 16:30 - 18:30 (UTC +9)

Session Chair:
Konstantinos Slavakis, Tokyo Institute of Technology
Session MLSP-P6
MLSP-P6.1: Tensor Low-rank Approximation of Finite-horizon Value Functions
Sergio Rozada, Antonio G. Marques, King Juan Carlos University, Spain
MLSP-P6.2: OFFLINE REINFORCEMENT LEARNING WITH GENERATIVE ADVERSARIAL NETWORKS AND UNCERTAINTY ESTIMATION
Lan Wu, Quan Liu, Lihua Zhang, Zhigang Huang, Soochow University, China
MLSP-P6.3: OFFLINE REINFORCEMENT LEARNING BASED ON NEXT STATE SUPERVISION
Jie Yan, Quan Liu, Lihua Zhang, Soochow University, China
MLSP-P6.4: PROXIMAL BELLMAN MAPPINGS FOR REINFORCEMENT LEARNING AND THEIR APPLICATION TO ROBUST ADAPTIVE FILTERING
Yuki Akiyama, Konstantinos Slavakis, Tokyo Institute of Technology, Japan
MLSP-P6.5: A ROBUST QUANTILE HUBER LOSS WITH INTERPRETABLE PARAMETER ADJUSTMENT IN DISTRIBUTIONAL REINFORCEMENT LEARNING
Parvin Malekzadeh, Konstantinos N. Plataniotis, Zissis Poulos, Zeyu Wang, University of Toronto, Canada
MLSP-P6.6: Graph-enhanced Hybrid Sampling for Multi-armed Bandit Recommendation
Fen Wang, Taihao Li, Wuyue Zhang, Zhejiang Lab, China; Xue Zhang, Shandong University of Science and Technology, China; Cheng Yang, Shanghai University of Electric Power, China
MLSP-P6.7: INTERPRETABLE POLICY EXTRACTION WITH NEURO-SYMBOLIC REINFORCEMENT LEARNING
Rajdeep Dutta, Institute for Infocomm Research (I2R), A*STAR, Singapore; Qincheng Wang, Nanyang Technological University, Singapore; Ankur Singh, Institute for Infocomm Research (I2R), A*STAR, Singapore; Dhruv Kumarjiguda, Nanyang Technological University, Singapore; Li Xiaoli, Senthilnath Jayavelu, Institute for Infocomm Research (I2R), A*STAR, Singapore
MLSP-P6.8: SELF-SUPERVISED REINFORCEMENT LEARNING FOR OUT-OF-DISTRIBUTION RECOVERY VIA AUXILIARY REWARD
Yufeng Xie, Yinan Wang, Han Wang, Qingshan Li, Xidian University, China
MLSP-P6.9: OFFLINE REINFORCEMENT LEARNING WITH POLICY GUIDANCE AND UNCERTAINTY ESTIMATION
Lan Wu, Quan Liu, Lihua Zhang, Zhigang Huang, Soochow University, China
MLSP-P6.10: MULTI-AGENT SPARSE INTERACTION MODELING IS AN ANOMALY DETECTION PROBLEM
Chao Li, Shaokang Dong, Nanjing University, China; Shangdong Yang, Nanjing University of Posts and Telecommunications, China; Hongye Cao, Wenbin Li, Yang Gao, Nanjing University, China
MLSP-P6.11: A New Pre-training Paradigm for Offline Multi-agent Reinforcement Learning with Suboptimal Data
Linghui Meng, Xi Zhang, Dengpeng Xing, Bo Xu, Institute of Automation, Chinese Academy of Sciences, China
MLSP-P6.12: TREND-HEURISTIC REINFORCEMENT LEARNING FRAMEWORK FOR NEWS-ORIENTED STOCK PORTFOLIO MANAGEMENT
Wei Ding, Zhennan Chen, Hanpeng Jiang, Yuanguo Lin, Fan Lin, Xiamen University, China