MO1.R2.1

Towards General Function Approximation in Nonstationary Reinforcement Learning

Songtao Feng, University of Florida, United States; Ming Yin, Princeton University, United States; Ruiquan Huang, Pennsylvania State University, United States; Yu-Xiang Wang, UC Santa Barbara, United States; Jing Yang, Pennsylvania State University, United States; Yingbin Liang, Ohio State University, United States

Session:
Topics in Machine Learning 1

Track:
8: Machine Learning

Location:
Ypsilon I-II-III

Presentation Time:
Mon, 8 Jul, 10:05 - 10:25

Session Chair:
Deniz Gündüz, Imperial College

Abstract
Function approximation has seen significant success in reinforcement learning (RL). Despite some progress on developing theory for nonstationary RL with function approximation under structural assumptions, existing work on nonstationary RL with general function approximation remains limited. In this work, we propose a UCB-type algorithm, LSVI-Nonstationary, following the popular least-squares value iteration (LSVI) framework. LSVI-Nonstationary features a restart mechanism and a new design of the bonus term to handle nonstationarity, and performs no worse than the existing confidence-set based algorithm SW-OPEA in [1], which has been shown to outperform existing algorithms for nonstationary linear and tabular MDPs in the small-variation-budget setting.
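The paper's LSVI-Nonstationary algorithm and its bonus design are not spelled out in this abstract. As a rough, generic illustration of the restart idea it mentions (discarding stale least-squares statistics so estimates track a drifting environment), here is a minimal sketch of a least-squares UCB rule with periodic restarts in a linear-reward setting; all names and parameter choices below are illustrative assumptions, not the paper's method.

```python
import numpy as np

def ucb_with_restarts(num_rounds, restart_period, arms, reward_fn,
                      lam=1.0, beta=1.0):
    """Generic least-squares UCB with periodic restarts (illustrative
    sketch only, not the paper's LSVI-Nonstationary).

    arms: (K, d) feature matrix; reward_fn(t, x) gives the reward of
    playing feature vector x at round t.  Every `restart_period` rounds
    the regression statistics are reset, so data gathered before an
    environment change stops biasing the estimate.
    """
    d = arms.shape[1]
    rewards = []
    for t in range(num_rounds):
        if t % restart_period == 0:          # restart: forget old data
            Lambda = lam * np.eye(d)          # regularized Gram matrix
            b = np.zeros(d)                   # feature-weighted rewards
        theta_hat = np.linalg.solve(Lambda, b)        # LS estimate
        inv = np.linalg.inv(Lambda)
        # optimism: estimated value plus an elliptical bonus term
        bonus = np.sqrt(np.einsum('ij,jk,ik->i', arms, inv, arms))
        a = int(np.argmax(arms @ theta_hat + beta * bonus))
        r = reward_fn(t, arms[a])
        Lambda += np.outer(arms[a], arms[a])  # update statistics
        b += r * arms[a]
        rewards.append(r)
    return np.array(rewards)
```

In a toy nonstationary instance where the reward parameter switches halfway through, restarting at the change point lets the learner recover immediately, whereas the non-restarting variant must wait for its confidence bonus to erode the stale estimate.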