TA4a: Safe Reinforcement Learning
Tue, 29 Oct, 08:15 - 09:55 PT (UTC -7)
Location: Nautilus
Session Type: Lecture
Track: Adaptive Systems, Machine Learning, and Data Analytics
Tue, 29 Oct, 08:15 - 08:40 PT (UTC -7)

TA4a.1: Tensor low-rank approximation of value functions in multi-task RL

Sergio Rozada, King Juan Carlos University, Spain; Santiago Paternain, Rensselaer Polytechnic Institute, United States; Juan Andrés Bazerque, University of Pittsburgh, United States; Antonio G. Marques, King Juan Carlos University, Spain
Tue, 29 Oct, 08:40 - 09:05 PT (UTC -7)

TA4a.2: Asynchronous best-response for learning Nash in constrained Markov games with an α-potential

Soham Das, Ceyhun Eksin, Texas A&M University, United States
Tue, 29 Oct, 09:05 - 09:30 PT (UTC -7)

TA4a.3: Bimodal Bandits: Max-Mean Regret Minimization

Adit Jain, Cornell University, United States; Sujay Bhatt, JP Morgan Chase and Co., United States; Vikram Krishnamurthy, Cornell University, United States; Alec Koppel, JP Morgan Chase and Co., United States