Offered: Fall 2025 (current)
- Fundamentals of Reinforcement Learning: Markov decision processes (MDPs), reward structures, exploration vs. exploitation
- Value-Based Methods: Q-learning, deep Q-networks (DQNs), double DQN, dueling DQN
- Policy-Based Methods: policy gradient methods, REINFORCE algorithm, actor-critic
- Model-Based Reinforcement Learning: planning, world models, model-predictive control (MPC)
- Multi-Agent Reinforcement Learning: cooperative and competitive agents, multi-agent learning frameworks
- Deep Reinforcement Learning: deep Q-learning, trust region policy optimization (TRPO), proximal policy optimization (PPO)
- Hierarchical Reinforcement Learning: options framework, hierarchical DQN, learning sub-policies
- Exploration Strategies: epsilon-greedy, Thompson sampling, curiosity-driven exploration
- Inverse Reinforcement Learning: inverse optimization, learning from demonstration
- Safe and Robust Reinforcement Learning: reward shaping, safe exploration, robust control
- Imitation Learning: behavior cloning, generative adversarial imitation learning (GAIL)
- Applications: robotics, autonomous vehicles, game playing, recommendation systems
- Ethical Considerations: fairness, reward specification, interpretability in RL
The core objectives of this course are to:

- Provide in-depth knowledge of advanced RL algorithms and techniques, including deep RL and multi-agent RL.
- Develop proficiency in implementing and applying Q-learning, DQN, policy gradient, and actor-critic methods.
- Understand and implement planning algorithms and world models for efficient RL.
- Gain expertise in designing and implementing RL systems for cooperative and competitive agents, and for hierarchical tasks.
- Apply deep RL algorithms such as TRPO and PPO to solve complex control problems.
- Explore methods for learning from demonstrations and inferring reward functions.
- Critically analyze ethical and safety issues in RL, including reward specification and safe exploration.
| # | Description | Weight |
|---|-------------|--------|
| 1 | To Be Added | |