

I have been working on a textbook on reinforcement learning since I first taught the new Introduction to Reinforcement Learning course in Spring 2021. The current version is from 2021. I am hoping to have a major update in 2025.

  • A.M. Farahmand, Lecture Notes on Reinforcement Learning, 2021.

    The textbook is introductory in the sense that it does not assume prior exposure to reinforcement learning. It is not, however, focused on being a collection of algorithms or only providing high-level intuition. Instead, it tries to build the mathematical intuition behind many important ideas and concepts often encountered in RL. We prove many basic, or sometimes not so basic, results in RL. If the proof of some result is too complicated, we prove a simplified version of it.

    If you are a university instructor and wish to use the slides for your own course, please contact me.


This list is regularly updated. You can also check my Google Scholar page.























  • A. M. Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, and Shie Mannor, “Regularized Policy Iteration,” Twenty-Second Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2008), Vancouver, Canada, December 2008. (24% acceptance rate) (PDF)
  • A. M. Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, and Shie Mannor, “Regularized Fitted Q-Iteration: Application to Bounded Resource Planning,” in Recent Advances in Reinforcement Learning, 8th European Workshop, EWRL 2008, Revised and Selected Papers, Springer, LNCS 5323, pp. 55-68, 2008. (PDF)
  • A. M. Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, and Shie Mannor, “Regularized Policy Iteration,” Eighth European Workshop on Reinforcement Learning (EWRL 2008), Villeneuve d’Ascq, France, July 2008.



  • A. M. Farahmand, Majid Nili Ahmadabadi, Caro Lucas, and Babak N. Araabi, “Hybrid Behavior Co-evolution and Structure Learning in Behavior-based Systems,” in the Proceedings of IEEE Congress on Evolutionary Computation (CEC), Vancouver, Canada, 2006. (Chosen as the best presentation of the “Evolving Learning Systems” technical session) (PDF) (Presentation: PDF, PPT)
  • A. M. Farahmand and Mohammad Javad Yazdanpanah, “Channel Assignment using Chaotic Simulated Annealing Enhanced Hopfield Neural Network,” in the Proceedings of International Joint Conference on Neural Networks (IJCNN), Vancouver, Canada, 2006. (PDF) (Presentation: PDF, PPT)
  • Mohammad G. Azar, Majid Nili Ahmadabadi, A. M. Farahmand, and Babak N. Araabi, “Learning to Coordinate Behaviors in Soft Behavior-based Systems using Reinforcement Learning,” International Joint Conference on Neural Networks (IJCNN), Vancouver, Canada, 2006.


  • A. M. Farahmand and M. J. Yazdanpanah, “Locally Optimal Takagi-Sugeno Fuzzy Controllers,” Proceedings of the 44th IEEE Conference on Decision and Control (CDC) and the European Control Conference (ECC), pp. 4095-4099, Seville, Spain, December 2005. (PDF) (Presentation: PDF, PPT)
  • M. J. Yazdanpanah, E. Madanian, and A. M. Farahmand, “Channel Assignment in Cellular Communications using a New Modification on Hopfield Networks,” Iranian Journal of Science and Technology, Transaction B: Engineering, Vol. 29, No. B4, 2005.
  • A. M. Farahmand and Majid Nili Ahmadabadi, “The Effect of Reinforcement Signal Error in Reinforcement Learning,” Computer Society of Iran Computer Conference (CSICC), 2005 (in Persian).



  • A. M. Farahmand, Roxana Akhbari, and Maryam Tajvidi, “Evolving Hidden Markov Models,” 4th Iranian Student Conference on Electrical Engineering (ISCEE), 2001 (in Persian). (PDF)


  • A. M. Farahmand and Amir Emad Mirmirani, “Distributed Genetic Algorithms,” 3rd Iranian Student Conference on Electrical Engineering (ISCEE), 2000 (in Persian).

Theses and Dissertation

Old Technical Reports (Selected)

  • A. M. Farahmand, Majid Nili Ahmadabadi, and Babak N. Araabi, “Behavior and Hierarchy Development in Behavior-based Systems using Reinforcement Learning,” Technical Report, 2005.
  • A. M. Farahmand, Caro Lucas, and Babak N. Araabi, “Chaos Control Survey,” a Technical Report for my Seminar Course, University of Tehran, 2004 (in Persian). (PDF)
  • A. M. Farahmand and Mohammad Javad Yazdanpanah, “A Class of Nonlinear Controllers for Synchronization of Chaotic Semipassive Systems,” Technical Report, University of Tehran, 2003. (PDF)
  • A. M. Farahmand, Ramin Pashai, and Ezatollah Geranpayeh, “Effect of Metallic Electrode and Buffer Layer on Dielectric Waveguides,” Technical Report of my internship period at Iran Telecommunication Research Center (ITRC), 2001.
  • A. M. Farahmand, “On Chaotic Models of Population – A Survey,” 1999 (in Persian).
  • A. M. Farahmand, “Data Compression Methods,” (This is my first technical report. I wrote it when I was still in high school. It is not of particular technical value, but it was still an achievement at the time), 1997 (in Persian).