Mathieu Petitbois

Publications

Google Scholar

Offline Reinforcement Learning of High-Quality Behaviors Under Robust Style Alignment

Mathieu Petitbois, Rémy Portelas, Sylvain Lamprier

ICML 2026 (Spotlight)

Agents in the Wild: Safety, Security, and Beyond Workshop at ICLR 2026

Principled Design for Trustworthy AI Workshop at ICLR 2026

Conference Paper Workshop Paper Website Code

QPHIL: Quantizing Planner for Hierarchical Implicit Q-Learning

Mathieu Petitbois*, Alexi Canesse*, Ludovic Denoyer, Sylvain Lamprier, Rémy Portelas

*Equal contribution.

IJCNN 2025

7th Robot Learning Workshop at ICLR 2025

Conference Paper Workshop Paper Website Code

Offline Learning of Controllable Diverse Behaviors

Mathieu Petitbois, Ludovic Denoyer, Sylvain Lamprier, Rémy Portelas

Generative Models for Robot Learning Workshop at ICLR 2025

Workshop Paper Website Code

Experience

Ubisoft La Forge

Apr. 2022 – May 2026

Bordeaux, France

Research Assistant

June 2023 – May 2026

Developed and integrated generative models (Transformers, Diffusion, VAEs, VLMs) to bridge generative modeling and sequential decision-making in complex environments (Godot, Atari, MuJoCo, Newton Physics). Helped integrate research outcomes into game development through scientific consulting, and presented findings at international conferences.

Research Engineer

Oct. 2022 – May 2023

Designed and implemented a library to extract behavioral diversity from player data, enabling the learning of diverse, controllable policies for behavior personalization.

Research Intern

Apr. 2022 – Sep. 2022

Implemented and evaluated offline and online RL methods in simulated environments using MuJoCo and Panda3D. Demonstrated that offline RL improves sample efficiency and reduces training time given sufficient data, contributing to a company-wide shift from online RL to data-driven offline RL for policy learning in games.

ENSTA Paris, U2IS

June 2021 – Aug. 2021

Research Intern · Palaiseau, France

Reimplemented the model-based approach from the World Models paper (Ha & Schmidhuber, NeurIPS 2018) in PyTorch.

Education

Ubisoft La Forge & Université d'Angers, LERIA

June 2023 – May 2026

PhD in Deep Reinforcement Learning for Stylized Behavior Modeling · Bordeaux & Angers, France

Supervised by Ludovic Denoyer, Sylvain Lamprier, and Rémy Portelas.

Institut Polytechnique de Paris

Sep. 2021 – Sep. 2022

M.Sc. in Data Science, Highest Honors · Palaiseau, France

Main courses: Deep Learning, Reinforcement Learning, Computer Vision.

ENSTA Paris

Sep. 2019 – Sep. 2022

Master's Degree in Engineering · Palaiseau, France

Main courses: Dynamic Programming, Logic Programming, Linear Programming, Genetic Algorithms, Probability Theory, Optimization, Applied Statistics, Statistical Learning, Control Theory, Bayesian Filtering.
