Fabien Pesquerel

PhD student in Reinforcement Learning


A06 office, Building A,

40 Halley Avenue,

Villeneuve d'Ascq, France

For the lay reader, reinforcement learning is about learning to observe in order to observe to learn, so as to achieve some goal as efficiently as possible.

I am a PhD student in the SCOOL research team (previously SequeL) at INRIA and a Reinforcement Learning teacher at Polytechnique, Lille University and CentraleSupélec.

You can reach me at first_name.last_name@institution.fr.

My research interests lie in sequential learning, sequential decision making and mathematical models of information processing. Because I care deeply about real-world implications and applications, I attach great importance to theoretical guarantees. Some applications that I am interested in are bioinformatics, computational neuroscience, physics, and gymnastics.

I particularly focus on theoretically optimal reinforcement learning algorithms that are also numerically competitive.

My PhD thesis is about optimality and structures in reinforcement learning. I am working on it with Odalric-Ambrym Maillard, thanks to a grant from École Normale Supérieure de Paris.

news

Oct 16, 2022 My paper, IMED-RL: Regret optimal learning of ergodic Markov decision processes, has been accepted at NeurIPS. See you there!
Aug 28, 2022 I finally took the time to set up this academic webpage.

selected publications

  1. Fast Asymptotically Optimal Algorithms for Non Parametric Stochastic Bandits
    Rémy Degenne, Dorian Baudry, and Odalric-Ambrym Maillard
    In Thirty-Seventh Conference on Neural Information Processing Systems 2023
  2. Logarithmic Regret in Communicating MDPs: Leveraging Known Dynamics with Bandits
    Odalric-Ambrym Maillard, Hassan Saber, and Mohammad Sadegh Talebi
    In Asian Conference on Machine Learning 2023
  3. IMED-RL: Regret optimal learning of ergodic Markov decision processes
    Fabien Pesquerel and Odalric-Ambrym Maillard
    In Thirty-Sixth Conference on Neural Information Processing Systems 2022
  4. Stochastic bandits with groups of similar arms
    Fabien Pesquerel, Hassan Saber, and Odalric-Ambrym Maillard
    In Advances in Neural Information Processing Systems 2021