TP2b: Reinforcement Learning
Tue, 31 Oct, 15:30 - 17:35 PT (UTC -8)
Location: Oak Shelter
Session Type: Lecture
Session Chair: Talha Bozkus, University of Southern California
Track: Adaptive Systems, Machine Learning, Data Analytics

Tue, 31 Oct, 15:30 - 15:55 PT (UTC -8)

TP2b.1: Predictive Estimation for Reinforcement Learning with Time-Varying Reward Functions

Abolfazl Hashemi, Antesh Upadhyay, Purdue University, United States

Tue, 31 Oct, 15:55 - 16:20 PT (UTC -8)

TP2b.2: Practical Robust Reinforcement Learning Via Adjacent Uncertainty Set

Ukjo Hwang, Songnam Hong, Hanyang University, Republic of Korea

Tue, 31 Oct, 16:20 - 16:45 PT (UTC -8)

TP2b.3: A Novel Ensemble Q-Learning Algorithm for Policy Optimization in Large-Scale Networks

Talha Bozkus, Urbashi Mitra, University of Southern California, United States

Tue, 31 Oct, 16:45 - 17:10 PT (UTC -8)

TP2b.4: Reward Attack on Stochastic Bandits with Non-stationary Rewards

Chenye Yang, Guanlin Liu, Lifeng Lai, University of California, Davis, United States

Tue, 31 Oct, 17:10 - 17:35 PT (UTC -8)

TP2b.5: Multi-Agent Recurrent Deterministic Policy Gradient with Inter-Agent Communication

Joohyun Cho, Mingxi Liu, Yi Zhou, Rong-Rong Chen, University of Utah, United States