TA6a2.1: Learning Non-myopic Power Allocation in Constrained Scenarios
Arindam Chowdhury, Rice University, United States; Santiago Paternain, Rensselaer Polytechnic Institute, United States; Gunjan Verma, Ananthram Swami, DEVCOM Army Research Laboratory, United States; Santiago Segarra, Rice University, United States
TA6a2.2: Learning Safety Critics via a Non-Contractive Binary Bellman Operator
Agustin Castellano, Johns Hopkins University, United States; Hancheng Min, University of Pennsylvania, United States; Juan Andrés Bazerque, University of Pittsburgh, United States; Enrique Mallada, Johns Hopkins University, United States
TA6a2.3: Three-Way Trade-Off in Multi-Objective Learning: Optimization, Generalization and Conflict-Avoidance
Lisha Chen, Heshan Fernando, Rensselaer Polytechnic Institute, United States; Yiming Ying, State University of New York at Albany, United States; Tianyi Chen, Rensselaer Polytechnic Institute, United States