(2024 Fall) I joined Polytechnique Montréal as an associate professor and Mila as a core academic member.
(2024 Winter) I received an Ontario Early Researcher Award (ERA) for work on Accelerated Reinforcement Learning Algorithms.
(2024 Winter) I taught the undergraduate course on Neural Networks and Deep Learning at DCS, U of T. This was my last course at the University of Toronto.
(2022-2024) I did not keep this page updated from Fall 2022 until 2024. Many things have happened! Check out our papers instead!
(2022 Fall) Tyler Kastner joined my team and Murat Erdogdu’s as a PhD student. Welcome!
(2022 Summer) Farnam Mansouri completed his MSc on risk-aware RL and joined the University of Waterloo afterwards. Stay tuned for some of his interesting results!
(2021 Spring & Fall) Allen Bao, an MScAC student, joined my lab in the Spring and, in collaboration with AMD, worked on Gameplay Test Automation with Reinforcement Learning. He graduated in the Fall. Congratulations! He is currently a Senior Software Development Engineer (ML) at AMD.
(2021 Summer) Dr. Yangchen Pan defended his PhD! He is my first PhD graduate, and I am very proud of him. He is currently a Departmental Lecturer at the University of Oxford. Congratulations on both achievements!
(2021-2023) I did not update this page between Spring 2021 and Spring 2023. I have retroactively added a few important news items above.
(2020 Fall) Amin Rakhsha, Claas Voelcker, and Farnam Mansouri joined my group as graduate students. Welcome to the team!
(2020) I served as an area chair for NeurIPS 2020, ICLR 2021, and ECML 2020.
(2020 Spring) Yangchen has defended his candidacy proposal. He is now a PhD Candidate!
(2020 Winter) I taught Introduction to Machine Learning (CSC311) along with Emad Andrews. All the course material, as well as recordings of the last few lectures, is available here.
(2020 Winter) Romina Abachi has defended her MSc thesis Policy-Aware Model Learning for Policy Gradient Methods. Congratulations!
(2020 Winter) Frequency-based Search-Control in Dyna is accepted at the International Conference on Learning Representations (ICLR), 2020. Joint work with Yangchen and Jincheng. Summary: How can we make a model-based RL agent pay more attention to difficult regions of the state space? Populate its search-control queue with samples from high-frequency regions of the value function.
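One rough way to picture this idea, not the paper's exact algorithm: regions where the value function has high-frequency content tend to have a large gradient, so one can hill-climb the squared gradient norm of a learned value estimate and push the visited states into the search-control queue. A minimal PyTorch sketch, where `value_net`, the step size, and the step count are all hypothetical:

```python
import torch

def frequency_search_control(value_net, s0, n_steps=20, lr=0.05):
    # Hypothetical sketch: ascend g(s) = ||dV/ds||^2, a rough proxy for
    # high-frequency regions of the value estimate, collecting the visited
    # states for Dyna's search-control queue.
    # value_net should use a smooth activation (e.g., tanh) so the second
    # derivatives used below are not identically zero.
    s = s0.clone().requires_grad_(True)
    queue = []
    for _ in range(n_steps):
        v = value_net(s).sum()
        (dv,) = torch.autograd.grad(v, s, create_graph=True)  # dV/ds
        g = dv.pow(2).sum()                                    # ||dV/ds||^2
        (ascent,) = torch.autograd.grad(g, s)                  # direction increasing g
        s = (s + lr * ascent).detach().requires_grad_(True)
        queue.append(s.detach().clone())
    return queue
```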
(2019 Fall) I served as an area chair/meta-reviewer for the International Conference on Learning Representations (ICLR 2020) and Artificial Intelligence and Statistics (AISTATS 2020).
(2019 Fall) Value Function in Frequency Domain and the Characteristic Value Iteration Algorithm is accepted at Advances in Neural Information Processing Systems (NeurIPS), 2019. Summary: How can an RL agent represent the uncertainty of returns? Represent the Fourier transform of the return distribution, and do all manipulations in the frequency domain.
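A sketch of the idea in my own words, not a verbatim statement from the paper: if the random return satisfies the distributional Bellman equation $Z(s) \overset{D}{=} R + \gamma Z(S')$ and, as is standard in distributional RL, the immediate reward and the future return are conditionally independent given the transition, then taking characteristic functions $\varphi_{Z(s)}(\omega) = \mathbb{E}[e^{i\omega Z(s)}]$ turns the discount into a rescaling of the frequency variable:

$$\varphi_{Z(s)}(\omega) \;=\; \mathbb{E}\!\left[\, e^{i\omega R}\, \varphi_{Z(S')}(\gamma\omega) \,\middle|\, S = s \,\right],$$

so a value-iteration-like fixed-point update can be run entirely in the frequency domain.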
(2019 Fall) Best reviewer award, NeurIPS 2019. Thanks to the organizers and the area chairs who nominated me.
(2019 Summer) Hill Climbing on Value Estimates for Search-control in Dyna is accepted at the International Joint Conference on Artificial Intelligence (IJCAI). Joint work with Yangchen, Hengshuai, and Martha. Summary: What initial imaginary/hypothetical states should an agent use for model-based RL planning? Should they be real experienced samples? Maybe not! A better way is to hill-climb the value function estimate and use the samples along the trajectory.
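A minimal sketch of this search-control idea (with the same caveats as the earlier snippet: `value_net` and the constants are hypothetical, and this is not the paper's exact procedure): instead of starting imagined rollouts only from experienced states, ascend the value estimate itself and queue the states along the trajectory.

```python
import torch

def hill_climb_value(value_net, s0, n_steps=20, lr=0.05):
    # Hypothetical sketch: gradient ascent on the value estimate itself;
    # the states along the ascent trajectory seed Dyna-style planning.
    s = s0.clone().requires_grad_(True)
    queue = []
    for _ in range(n_steps):
        v = value_net(s).sum()
        (dv,) = torch.autograd.grad(v, s)   # dV/ds
        s = (s + lr * dv).detach().requires_grad_(True)
        queue.append(s.detach().clone())
    return queue
```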
(2019 Spring) Selected as a top 5% reviewer for ICML. Thanks to the organizers for the recognition and the area chairs for the nomination.
(2019 Spring) I served as an area chair for the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD 2019).
(2019 Spring) I became a Canada CIFAR AI Chair at the Vector Institute (🇨🇦 🤖💺). Thanks to the Vector Institute for the nomination and to all my wonderful references for supporting my application.
(2018 Fall) Iterative Value-Aware Model Learning is accepted at Neural Information Processing Systems (NeurIPS), December 2018. (Short Version PDF; Extended Version PDF) Summary: How can we learn a good model for model-based RL that incorporates the underlying decision problem? Make the model value-aware and benefit from the structure of the planner. The extended version includes a relatively detailed review of model-based RL, in addition to proofs and more discussion.
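A sketch of the iterative scheme, as I would paraphrase it: at iteration $k$, the model is fit so that it gets the expected next-state values right under the current value estimate $\hat V_k$, and the planner then uses that model to produce $\hat V_{k+1}$:

$$\hat{\mathcal{P}}_{k+1} \in \operatorname*{argmin}_{\hat{\mathcal{P}}} \; \mathbb{E}_{(S,A)\sim\nu}\!\left[ \left( \int \big( \mathcal{P} - \hat{\mathcal{P}} \big)(\mathrm{d}s' \mid S, A)\, \hat V_k(s') \right)^{\!2} \right].$$

In practice, the expectation under the true dynamics $\mathcal{P}$ is replaced by the observed next-state sample.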
(2018 May) Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control is accepted at ICML, Sweden, 2018. Joint work with Yangchen, Martha, Saleh, Piyush, and Daniel (paper; extended version; arXiv). Summary: To control PDEs, we sometimes need to use infinite dimensional action spaces. How can we deal with them?
(2018 May) Best Reviewers Award at ICLR, 2018. Thanks to the organizers and the area chairs for the recognition.
(2018 February) After three wonderful years at MERL in Cambridge, USA, I am excited to announce that I am going to join the Vector Institute in Toronto, Canada as a Vector Faculty/Research Scientist. I already miss MERL (a wonderful industrial research lab) as well as Cambridge and Boston (lovely and charming cities), but I am looking forward to this new chapter of my career at Vector and to life in Toronto. And I am quite happy to be back in Canada 🇨🇦.
(2017 Summer/Fall) Two papers on Random Projection Filter Bank (RPFB) were accepted and presented: one at NIPS 2017 (short version; extended version with proofs and more detail) and another at the PHM (Prognostics and Health Management) conference, 2017. Joint work with Sepideh and Daniel. Summary: To extract features from a time series, project it onto the span of randomly generated stable dynamical filters. Similar to Random Kitchen Sinks, but for dynamical systems.
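A minimal NumPy sketch of the flavor of the idea, with illustrative (hypothetical) names and constants: draw random stable one-pole filters, run the series through each, and stack the outputs as features.

```python
import numpy as np

def rpfb_features(x, n_filters=32, seed=0):
    # Hypothetical sketch of the RPFB idea: pass the time series x through
    # randomly generated stable first-order (one-pole) filters and use the
    # filter outputs as features.
    rng = np.random.default_rng(seed)
    radii = rng.uniform(0.0, 0.95, n_filters)    # |pole| < 1 => stable filter
    angles = rng.uniform(0.0, np.pi, n_filters)
    poles = radii * np.exp(1j * angles)
    y = np.zeros((len(x), n_filters), dtype=complex)
    state = np.zeros(n_filters, dtype=complex)
    for t, x_t in enumerate(x):
        # One-pole recursion, for each filter: y_t = p * y_{t-1} + x_t.
        state = poles * state + x_t
        y[t] = state
    # Real and imaginary parts give 2 * n_filters real-valued features.
    return np.concatenate([y.real, y.imag], axis=1)
```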
(2017 April) Value-Aware Loss Function for Model-based Reinforcement Learning is published at AISTATS 2017. Joint work with Andre and Daniel. Summary: A good model for prediction is not necessarily a good model for model-based RL, because pure prediction ignores the decision problem. How can we incorporate the decision problem into model learning?
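Roughly, and again paraphrasing rather than quoting the paper: since the value function is not known at model-learning time, the value-aware loss guards against the worst case over a value function class $\mathcal{F}$:

$$\mathcal{L}(\hat{\mathcal{P}}) = \mathbb{E}_{(S,A)\sim\nu}\!\left[ \sup_{V \in \mathcal{F}} \left( \int \big( \mathcal{P} - \hat{\mathcal{P}} \big)(\mathrm{d}s' \mid S, A)\, V(s') \right)^{\!2} \right],$$

so the model is penalized only for errors that change expected next-state values, which is what the planner actually consumes.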
(2016) Regularized Policy Iteration with Nonparametric Function Spaces is published at the Journal of Machine Learning Research (JMLR), 2016. Joint work with Csaba, Mohammad, and Shie. Summary: Regularized Least-Squares Temporal Difference (LSTD) learning is introduced and analyzed. The method is minimax optimal over a large class of nonparametric function spaces.
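For flavor, here is a simplified finite-dimensional ($\ell_2$-regularized) analogue; the paper itself works with nonparametric function spaces such as RKHSs rather than this parametric form. With feature matrices $\Phi$ (states) and $\Phi'$ (next states), rewards $r$, discount $\gamma$, and ridge parameter $\lambda$:

$$A = \tfrac{1}{n}\Phi^{\top}(\Phi - \gamma \Phi'), \qquad b = \tfrac{1}{n}\Phi^{\top} r, \qquad \hat\theta = (A + \lambda I)^{-1} b.$$

The regularizer keeps the LSTD fixed-point solution well-conditioned when the function space is rich.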