Textbook
I have been working on a textbook on reinforcement learning since Spring 2021, when I taught the new Introduction to Reinforcement Learning course. The current version is from 2021; I am hoping to release a major update in 2025.
- A.M. Farahmand, Lecture Notes on Reinforcement Learning, 2021.
The textbook is introductory in the sense that it does not assume prior exposure to reinforcement learning. It is not, however, a collection of algorithms, nor does it stop at high-level intuition. Instead, it tries to build the mathematical intuition behind many important ideas and concepts often encountered in RL. We prove many basic, and sometimes not so basic, results in RL. If the proof of a result is too complicated, we prove a simplified version of it.
If you are a university instructor and wish to use the slides for your own course, please contact me.
Papers
2024
- Amin Rakhsha, Mete Kemertas, Mohammad Ghavamzadeh, and A.M. Farahmand, “Maximum Entropy Model Correction in Reinforcement Learning,” International Conference on Learning Representations (ICLR), 2024. (PDF; arXiv; OpenReview)
- Mark Bedaywi, Amin Rakhsha, and A.M. Farahmand, “PID Accelerated Temporal Difference Algorithms,” Reinforcement Learning Conference (RLC), 2024. (PDF; arXiv)
- Claas Voelcker, Tyler Kastner, Igor Gilitschenski, and A.M. Farahmand, “When does Self-Prediction help? Understanding Auxiliary Tasks in Reinforcement Learning,” Reinforcement Learning Conference (RLC), 2024. (PDF; arXiv)
- Marcel Hussing, Claas Voelcker, Igor Gilitschenski, A.M. Farahmand, and Eric Eaton, “Dissecting Deep RL with High Update Ratios: Combatting Value Overestimation and Divergence,” Reinforcement Learning Conference (RLC), 2024. (PDF; arXiv)
- Avery Ma, A.M. Farahmand, Yangchen Pan, Philip Torr, and Jindong Gu, “Improving Adversarial Transferability via Model Alignment,” European Conference on Computer Vision (ECCV), 2024. (PDF; arXiv; Code)
2023
- Tyler Kastner, Murat A. Erdogdu, and A.M. Farahmand, “Distributional Model Equivalence for Risk-Sensitive Reinforcement Learning,” Neural Information Processing Systems (NeurIPS), 2023. (PDF; OpenReview)
- Avery Ma, Yangchen Pan, and A.M. Farahmand, “Understanding the robustness difference between stochastic gradient descent and adaptive gradient methods,” Transactions on Machine Learning Research (TMLR), 2023. (PDF; OpenReview) [Featured certificate: ~3.5% of accepted papers]
- Claas A. Voelcker, Arash Ahmadian, Romina Abachi, Igor Gilitschenski, and A.M. Farahmand, “λ-AC: Learning latent decision-aware models for reinforcement learning in continuous state-spaces,” 2023. (arXiv)
- Mete Kemertas, Allan Jepson, and A.M. Farahmand, “Efficient and Accurate Optimal Transport with Mirror Descent and Conjugate Gradients,” 2023. (arXiv)
2022
- Amin Rakhsha, Andrew Wang, Mohammad Ghavamzadeh, and A.M. Farahmand, “Operator Splitting Value Iteration,” Neural Information Processing Systems (NeurIPS), 2022. (PDF; OpenReview)
- Claas A. Voelcker, Victor Liao, Animesh Garg, and A.M. Farahmand, “Value Gradient Weighted Model-Based Reinforcement Learning,” International Conference on Learning Representations (ICLR), 2022. (PDF; OpenReview)
- Guiliang Liu, Ashutosh Adhikari, A.M. Farahmand, and Pascal Poupart, “Learning Object-Oriented Dynamics for Planning from Text,” International Conference on Learning Representations (ICLR), 2022. (PDF; OpenReview; GitHub)
- Jincheng Mei, Yangchen Pan, A.M. Farahmand, Martha White, Hengshuai Yao, Mohsen Rohani, and Jun Luo, “Understanding and Mitigating the Limitations of Prioritized Replay,” Conference on Uncertainty in Artificial Intelligence (UAI), 2022. (PDF)
2020
- Yangchen Pan, Jincheng Mei, and A.M. Farahmand, “Frequency-based Search-Control in Dyna,” International Conference on Learning Representations (ICLR), 2020. (PDF)
- Yangchen Pan, Ehsan Imani, A.M. Farahmand, and Martha White, “An Implicit Function Learning Approach for Regression,” Neural Information Processing Systems (NeurIPS), 2020. (PDF on arXiv)
- Romina Abachi, Mohammad Ghavamzadeh, and A.M. Farahmand, “Policy-Aware Model Learning for Policy-Gradient Methods,” 2020. (arXiv)
- Avery Ma, Fartash Faghri, Nicolas Papernot, and A.M. Farahmand, “SOAR: Second-Order Adversarial Regularization,” 2020. (arXiv)
- Rodrigo Toro Icarte, Richard Valenzano, Toryn Klassen, Phillip Christoffersen, A.M. Farahmand, and Sheila McIlraith, “The Act of Remembering: A Study in Partially Observable Reinforcement Learning,” 2020. (arXiv)
2019
- A.M. Farahmand, “Value Function in Frequency Domain and the Characteristic Value Iteration Algorithm,” Neural Information Processing Systems (NeurIPS), 2019. (PDF; Extended Version PDF)
- Mohamed Akrout, A.M. Farahmand, Tory Jarmain, and Latif Abid, “Improving Skin Condition Classification with a Visual Symptom Checker Trained using Reinforcement Learning,” International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2019. (arXiv)
- Yangchen Pan, Hengshuai Yao, A.M. Farahmand, and Martha White, “Hill Climbing on Value Estimates for Search-control in Dyna,” International Joint Conference on Artificial Intelligence (IJCAI), 2019. (PDF)
- Marc T. Law, Jake Snell, A.M. Farahmand, Raquel Urtasun, and Richard S. Zemel, “Dimensionality Reduction for Representing the Knowledge of Probabilistic Models,” International Conference on Learning Representations (ICLR), 2019. (PDF; OpenReview)
- Mouhacine Benosman, A.M. Farahmand, and Meng Xia, “Learning-Based Iterative Modular Adaptive Control for Nonlinear Systems,” International Journal of Adaptive Control and Signal Processing, Vol. 33, No. 2, pp. 335–355, 2019. (Publisher’s Version)
2018
- A.M. Farahmand, “Iterative Value-Aware Model Learning,” Neural Information Processing Systems (NeurIPS), 2018. (PDF; Extended Version PDF)
- Yangchen Pan, A.M. Farahmand, Martha White, Saleh Nabi, Piyush Grover, and Daniel Nikovski, “Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control,” International Conference on Machine Learning (ICML), 2018. (PDF; Extended Version PDF)
2017
- A.M. Farahmand, Sepideh Pourazarm, and Daniel Nikovski, “Random Projection Filter Bank for Time Series Data,” Neural Information Processing Systems (NeurIPS), December 2017. (PDF; Extended Version PDF; NeurIPS website)
- A.M. Farahmand, André M.S. Barreto, and Daniel Nikovski, “Value-Aware Loss Function for Model-based Reinforcement Learning,” The 20th International Conference on Artificial Intelligence and Statistics (AISTATS), April 2017. (PDF; Extended Version PDF) Note: An extended abstract version of this paper appeared at EWRL 2016.
- A.M. Farahmand, Saleh Nabi, and Daniel Nikovski, “Deep Reinforcement Learning for Partial Differential Equation Control,” American Control Conference (ACC), May 2017. (PDF)
- Sepideh Pourazarm, A.M. Farahmand, and Daniel Nikovski, “Fault Detection and Prognosis of Time Series Data with Random Projection Filter Bank,” Annual Conference of the Prognostics and Health Management Society (PHM), October 2017. (PDF; PHM Version) [This is a mostly empirical version of the Random Projection Filter Bank work, geared towards fault detection and prognosis applications. See the NeurIPS paper for a more theoretical version, with proofs.]