Publications

PhD Dissertation

Regularization in Reinforcement Learning, Department of Computing Science, University of Alberta, September 2011. (Supervisors: Csaba Szepesvári and Martin Jägersand; Examining Committee: Peter Bartlett, Michael Bowling, Alexander Melnikov, Dale Schuurmans, Richard S. Sutton) (My version, which is slightly updated; U of A’s link)

Refereed Publications

2017

‣A. M. Farahmand, Sepideh Pourazarm, and Daniel Nikovski, "Random Projection Filter Bank for Time Series Data," To appear in Neural Information Processing Systems (NIPS), December 2017. (PDF; Extended Version PDF)
‣A. M. Farahmand, André M.S. Barreto, and Daniel Nikovski, "Value-Aware Loss Function for Model-based Reinforcement Learning," The 20th International Conference on Artificial Intelligence and Statistics (AISTATS), April 2017. (PDF; Extended Version PDF) Note: An extended abstract version of this paper appeared at EWRL 2016.
‣A. M. Farahmand, Saleh Nabi, and Daniel Nikovski, "Deep Reinforcement Learning for Partial Differential Equation Control," American Control Conference (ACC), May 2017. (PDF)
‣Sepideh Pourazarm, A. M. Farahmand, and Daniel Nikovski, "Fault Detection and Prognosis of Time Series Data with Random Projection Filter Bank," In the Proceedings of Annual Conference of Prognostics and Health Management Society (PHM), October 2017. (PDF; PHM Version) [This is a more empirical version geared towards fault detection and prognosis applications. See the forthcoming NIPS paper for a more theoretical version (with proofs/without proofs).]

2016

‣A. M. Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor, “Regularized Policy Iteration with Nonparametric Function Spaces,” Journal of Machine Learning Research (JMLR), Vol. 17, No. 139, 2016. (PDF) (JMLR page)
‣A. M. Farahmand, Saleh Nabi, Piyush Grover, and Daniel Nikovski, “Learning to Control Partial Differential Equations: Regularized Fitted Q-Iteration Approach,” IEEE Conference on Decision and Control (CDC), December 2016. (PDF) (IEEE page)
‣A. M. Farahmand, Daniel Nikovski, Yuji Igarashi, and Hiroki Konaka, “Truncated Approximate Dynamic Programming with Task-Dependent Terminal Value,” The 30th AAAI Conference on Artificial Intelligence (AAAI), February 2016. (PDF)
‣Mouhacine Benosman, A. M. Farahmand, and Meng Xia, “Learning-based Modular Indirect Adaptive Control for a Class of Nonlinear Systems,” American Control Conference (ACC), 2016. (PDF) (IEEE page)
‣A. M. Farahmand, André M.S. Barreto, and Daniel Nikovski, "Value-Aware Loss Function for Model Learning in Reinforcement Learning," The 13th European Workshop on Reinforcement Learning (EWRL), December 2016. (PDF) (Also see: AISTATS 2017)

2015

‣A. M. Farahmand, Doina Precup, André M.S. Barreto, Mohammad Ghavamzadeh, “Classification-based Approximate Policy Iteration,” IEEE Transactions on Automatic Control, Vol. 60, No. 11, 2015 (preprint PDF; IEEE Version).
‣Note: This paper has an extended version with additional discussions and experiments, which might be easier to read: Copy on arXiv or here, which is slightly more up to date.
‣De-An Huang, A. M. Farahmand, Kris M. Kitani, and J. Andrew Bagnell, “Approximate MaxEnt Inverse Optimal Control and its Application for Mental Simulation of Human Interactions,” In the Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI), Jan. 2015. (PDF; Extended Version PDF)
‣De-An Huang, A. M. Farahmand, Kris M. Kitani, and J. Andrew Bagnell, “Approximate MaxEnt Inverse Optimal Control,” The 2nd Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM), June 2015. (PDF)
‣J. Andrew Bagnell and A. M. Farahmand, “Learning Positive Functions in a Hilbert Space,” NIPS Workshop on Optimization for Machine Learning, December 2015. (PDF) (Presentation: Poster)

2014

‣Philip Bachman, A. M. Farahmand, and Doina Precup, “Sample-based Approximate Regularization,” International Conference on Machine Learning (ICML), 2014. (PDF; Extended Version PDF; Code on GitHub)
‣A. M. Farahmand, Doina Precup, André M.S. Barreto, Mohammad Ghavamzadeh, “Classification-based Approximate Policy Iteration: Experiments and Extended Discussions,” 2014 (copy on arXiv or here, which is slightly more up to date).
‣Note: A shorter version of this paper is published in IEEE Transactions on Automatic Control, 2015 (preprint PDF; IEEE Version).

2013

‣Beomjoon Kim, A. M. Farahmand, Joelle Pineau, and Doina Precup, “Learning from Limited Demonstrations,” In the Proceedings of Advances in Neural Information Processing Systems (NIPS-26), 2013. (PDF; Supplementary material)
‣Mahdi Milani Fard, Yuri Grinberg, A. M. Farahmand, Joelle Pineau, Doina Precup, “Bellman Error Based Feature Generation using Random Projections on Sparse Spaces,” In the Proceedings of Advances in Neural Information Processing Systems (NIPS-26), 2013. (PDF; Supplementary material)
‣Beomjoon Kim, A. M. Farahmand, Joelle Pineau, and Doina Precup, “Approximate Policy Iteration with Demonstration Data,” The 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making, 2013. (PDF)
‣A. M. Farahmand, Doina Precup, André M.S. Barreto, and Mohammad Ghavamzadeh, “CAPI: Generalized Classification-based Approximate Policy Iteration,” The 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM), 2013. (PDF)

2012

‣A. M. Farahmand and Csaba Szepesvári, “Regularized Least-Squares Regression: Learning from a β-mixing Sequence,” Journal of Statistical Planning and Inference (JSPI), Volume 142, Issue 2, February 2012. (Preprint - PDF; JSPI’s version)
‣A. M. Farahmand and Doina Precup, “Value Pursuit Iteration,” In the Proceedings of Advances in Neural Information Processing Systems (NIPS-25), 2012. (PDF; Extended Version PDF)
‣A. M. Farahmand, Doina Precup, and Mohammad Ghavamzadeh, “Generalized Classification-based Approximate Policy Iteration,” Tenth European Workshop on Reinforcement Learning (EWRL 2012), Edinburgh, Scotland, June 2012. (PDF)

2011

‣A. M. Farahmand, “Action-Gap Phenomenon in Reinforcement Learning,” In the Proceedings of Advances in Neural Information Processing Systems (NIPS-24), 2011. (PDF)
‣A. M. Farahmand and Csaba Szepesvári, “Model Selection in Reinforcement Learning,” Machine Learning Journal, Vol. 85, No. 3, Springer, 2011. (PDF; MLJ’s version)
‣A. M. Farahmand and Csaba Szepesvári, “BErMin: A Model Selection Algorithm for Reinforcement Learning Problems,” NIPS Workshop on New Frontiers in Model Order Selection, December 2011. (PDF; video recording of the presentation) [This is a four-page summary of our Machine Learning Journal paper.]

2010

‣A. M. Farahmand, Rémi Munos, Csaba Szepesvári, “Error Propagation for Approximate Policy and Value Iteration,” Advances in Neural Information Processing Systems (NIPS-23), 2010. (PDF; Extended Version PDF)
‣A. M. Farahmand, Majid Nili Ahmadabadi, Babak N. Araabi, Caro Lucas, “Interaction of Culture-based Learning and Cooperative Co-evolution and its Application to Automatic Behavior-based System Design,” IEEE Transactions on Evolutionary Computation, Vol. 14, No. 1, pp. 23-57, 2010. (Preprint - PDF; IEEE’s version)
‣Azad Shademan, A. M. Farahmand, and Martin Jägersand, “Robust Jacobian Estimation for Uncalibrated Visual Servoing,” Accepted for publication in the Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Anchorage, Alaska, USA, May 2010. (PDF)

2009

‣A. M. Farahmand, Azad Shademan, Martin Jägersand, and Csaba Szepesvári, “Model-based and Model-free Reinforcement Learning for Visual Servoing,” In the Proceedings of the International Conference on Robotics and Automation (ICRA), Kobe, Japan, May 2009. (PDF; IEEE’s version) (Presentation: PDF)
‣A. M. Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, and Shie Mannor, “Regularized Fitted Q-Iteration for Planning in Continuous-Space Markovian Decision Problems,” In the Proceedings of the American Control Conference (ACC), St. Louis, Missouri, USA, June 2009. (PDF; IEEE’s version)
‣Azad Shademan, A. M. Farahmand, Martin Jägersand, “Towards Learning Robotic Reaching and Pointing: An Uncalibrated Visual Servoing Approach,” Sixth Canadian Conference on Computer and Robot Vision (CRV), Kelowna, British Columbia, Canada, 2009.
‣A. M. Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, and Shie Mannor, “Regularization in Reinforcement Learning,” Multidisciplinary Symposium on Reinforcement Learning (MSRL-2009), Montreal, QC, Canada, June 2009. (PDF)
‣A. M. Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, and Shie Mannor, “Robot Learning with Regularized Reinforcement Learning,” Workshop on Regression in Robotics: Approaches and Applications, Robotics: Science and Systems Conference (RSS-2009), Seattle, WA, June 2009. (PDF)

2008

‣A. M. Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, and Shie Mannor, “Regularized Policy Iteration,” Twenty-Second Annual Conference on Advances in Neural Information Processing Systems (NIPS-2008), Vancouver, Canada, December 2008. (24% acceptance rate) (PDF)
‣A. M. Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, and Shie Mannor, “Regularized Fitted Q-Iteration: Application to Bounded Resource Planning,” in Recent Advances in Reinforcement Learning, 8th European Workshop, EWRL 2008, Revised and Selected Papers, Springer, LNCS 5323, pp. 55-68, 2008. (PDF)
‣A. M. Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, and Shie Mannor, “Regularized Policy Iteration,” Eighth European Workshop on Reinforcement Learning (EWRL 2008), Villeneuve d'Ascq, France, July 2008.

2007

‣A. M. Farahmand, Csaba Szepesvári, and Jean-Yves Audibert, "Manifold-Adaptive Dimension Estimation," International Conference on Machine Learning (ICML), 2007. (PDF; ACM’s version) (Presentation given at the conference: PDF, VideoLecture’s recorded presentation)
‣A. M. Farahmand, Azad Shademan, and Martin Jägersand, "Global Visual-Motor Estimation for Uncalibrated Visual Servoing," IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2007. (PDF; IEEE’s version)
‣A. M. Farahmand, Csaba Szepesvári, and Jean-Yves Audibert, "Towards Manifold-Adaptive Learning," NIPS Workshop on Topology Learning, Whistler, Canada, 2007 (PDF)

2006

‣A. M. Farahmand, Majid Nili Ahmadabadi, Caro Lucas, and Babak N. Araabi, “Hybrid Behavior Co-evolution and Structure Learning in Behavior-based Systems,” In the Proceedings of IEEE Congress on Evolutionary Computation (CEC), Vancouver, Canada, 2006. (Chosen as the best presentation of the “Evolving Learning Systems” technical session) (PDF) (Presentation: PDF, PPT)
‣A. M. Farahmand and Mohammad Javad Yazdanpanah, “Channel Assignment using Chaotic Simulated Annealing Enhanced Hopfield Neural Network,” In the Proceedings of International Joint Conference on Neural Networks (IJCNN), Vancouver, Canada, 2006. (PDF) (Presentation: PDF, PPT)
‣Mohammad G. Azar, Majid Nili Ahmadabadi, A. M. Farahmand, and Babak N. Araabi, “Learning to Coordinate Behaviors in Soft Behavior-based Systems using Reinforcement Learning,” International Joint Conference on Neural Networks (IJCNN), Vancouver, Canada, 2006.

2005

‣A. M. Farahmand and M. J. Yazdanpanah, “Locally Optimal Takagi-Sugeno Fuzzy Controllers,” Proceedings of the 44th IEEE Conference on Decision and Control (CDC) and the European Control Conference (ECC), pp. 4095-4099, Seville, Spain, December 2005. (PDF) (Presentation: PDF, PPT)
‣M. J. Yazdanpanah, E. Madanian, and A. M. Farahmand, “Channel Assignment in Cellular Communications using a New Modification on Hopfield Networks,” Iranian Journal of Science and Technology, Transaction B: Engineering, Vol. 29, No. B4, 2005.
‣A. M. Farahmand and Majid Nili Ahmadabadi, "The Effect of Reinforcement Signal Error in Reinforcement Learning," Computer Society of Iran Computer Conference (CSICC), 2005 (in Persian).

2004

‣A. M. Farahmand, Majid Nili Ahmadabadi, and Babak N. Araabi, "Behavior Hierarchy Learning in a Behavior-based System using Reinforcement Learning," Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS2004), Sendai, Japan, 2004. (PDF) (Presentation: PDF, PPT)
‣A. M. Farahmand and Caro Lucas, "Fuzzy Neural Network Implementation of Q(λ) for Mobile Robots," WSEAS Transactions on Systems, Issue 1, Vol. 3, Jan. 2004.

2001

‣A. M. Farahmand, Roxana Akhbari, and Maryam Tajvidi, "Evolving Hidden Markov Models," 4th Iranian Student Conference on Electrical Engineering (ISCEE), 2001 (in Persian).

2000

‣A. M. Farahmand and Amir Emad Mirmirani, "Distributed Genetic Algorithms," 3rd Iranian Student Conference on Electrical Engineering (ISCEE), 2000 (in Persian).

Thesis

‣PhD Dissertation: Regularization in Reinforcement Learning, Department of Computing Science, University of Alberta, September 2011. (Supervisors: Csaba Szepesvári and Martin Jägersand; Examining Committee: Peter Bartlett, Michael Bowling, Alexander Melnikov, Dale Schuurmans, Richard S. Sutton) (My version, which is slightly updated; U of A’s link)
‣MS Thesis: Learning and Evolution in Hierarchical Behavior-based Systems, M.S. Thesis, University of Tehran, 2005 (in Persian) (PDF). (Advisors: Majid Nili Ahmadabadi, Babak N. Araabi, Caro Lucas) (Examining Committee: Babak Moshiri and Alireza Fatehi)
‣BS Thesis: Calculating Resonant Frequencies of a Metallic Cavity using Finite Element Method, BSEE Thesis, K. N. Toosi University of Technology, 2002 (in Persian) (PDF). (Advisor: Mohammad-Sadegh Abrishamian) (Examining Committee: Manouchehr Kamyab and Mohsen Aboutorab)

Submitted/Working papers

‣A. M. Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, and Shie Mannor, “Regularized Policy Iteration for Nonparametric Function Spaces,” Submitted to JMLR, January 2013.
‣A. M. Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, and Shie Mannor, “Regularized Fitted Q-Iteration Algorithm,” Working paper.
‣A. M. Farahmand, Csaba Szepesvári, and Jean-Yves Audibert, “Nearest Neighborhood Methods for Manifold-Adaptive Dimension Estimation and Regression,” Working paper. [We never ended up submitting this anywhere, even though it was almost ready to go. We may post it as a technical report somewhere.]

Technical Reports (Selected)

‣A. M. Farahmand, Majid Nili Ahmadabadi, and Babak N. Araabi, "Behavior and Hierarchy Development in Behavior-based Systems using Reinforcement Learning," Technical Report, 2005.
‣A. M. Farahmand, Caro Lucas, and Babak N. Araabi, "Chaos Control Survey," a Technical Report for my Seminar Course, University of Tehran, 2004 (in Persian/Farsi). (PDF)
‣A. M. Farahmand and Mohammad Javad Yazdanpanah, "A Class of Nonlinear Controllers for Synchronization of Chaotic Semipassive Systems," Technical Report, University of Tehran, 2003.
‣A. M. Farahmand, Ramin Pashai, and Ezatollah Geranpayeh, "Effect of Metallic Electrode and Buffer Layer on Dielectric Waveguides," Technical Report of my internship period at Iran Telecommunication Research Center (ITRC), 2001.
‣A. M. Farahmand, "On Chaotic Models of Population - A Survey," 1999 (In Persian/Farsi).
‣A. M. Farahmand, "Data Compression Methods," 1997 (in Persian/Farsi). (This is my first technical report. I wrote it when I was still in high school. It is not of particular technical value, but it was still an achievement at that time.)