Benchmarks for reinforcement learning in mixed-autonomy traffic

David Schonenberg Deep Learning, Reinforcement Learning

We release new benchmarks for the use of deep reinforcement learning (RL) to design controllers for mixed-autonomy traffic, in which connected and autonomous vehicles (CAVs) interact with human drivers and infrastructure. Benchmarks such as MuJoCo and the Arcade Learning Environment have spurred new research by letting researchers compare their results directly, so that they can focus on algorithmic improvements and control techniques rather than system design. To promote similar advances in traffic control via RL, we propose four benchmarks, based on three new traffic scenarios, that illustrate distinct RL problems with applications to mixed-autonomy traffic. We provide an introduction to each control problem, an overview of its Markov decision process (MDP) structure, and preliminary performance results from commonly used RL algorithms. For reproducibility, the benchmarks, reference implementations, and tutorials are available at

Authors: Eugene Vinitsky, Aboudy Kreidieh, Luc Le Flem, Nishant Kheterpal, Kathy Jang, Cathy Wu, Richard Liaw, Eric Liang, Alexandre Bayen