Optimal control of an environment is a common task in industrial applications. Continuous control of power plants and robot arms, as well as planning with discrete actions, e.g., for navigation or routing, are just two of the many examples.
Much of the theory has a long history and is well established. However, with the advances in deep reinforcement learning (RL) and automatic differentiation in recent years, many new results and connections have been uncovered. As a consequence, the lines between control theory and RL, in particular model-based RL, grow blurrier each year.
In appliedAI’s training on planning and control, we introduce the main concepts of the theory using the same terminology as in RL. Classical algorithms are covered alongside modern developments in the field. A special focus is placed on the many ways in which a problem can be formulated as a control or planning task.
As the content is aimed at practitioners, robustness, safety, sample efficiency and an analysis of the gap between simulation and reality (sim2real) play a central role. To that end, we will apply our algorithms to a real robot, such as a Franka Emika arm.
Participants will learn about:
- Basics of decision-making: environments, trajectories, actors and rewards
- Typical and less-typical control problems
- Planning and classical control
- From simulation to reality
- Robustness and distribution shift in decision processes
- Relation between control, planning and reinforcement learning
- Gym environments
- Classical control algorithms: PID, LQR, iLQR and others
- Planning with Monte Carlo Tree Search (MCTS) and Model Predictive Control (MPC)
- Comparison with search algorithms such as grid search and simplex search
- Connections between MCTS and reinforcement learning
- Learning models for planning (model-based methods)
- Differentiable simulators and planning with analytic gradients
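To give a flavor of the classical controllers on the list, here is a minimal sketch of a discrete PID controller tracking a setpoint on a toy first-order plant. The gains and the plant are illustrative placeholders, not material from the training itself:

```python
class PID:
    """Minimal discrete PID controller (illustrative gains, not tuned)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def control(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        # Avoid a derivative kick on the very first call.
        derivative = 0.0 if self.prev_error is None else \
            (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Toy plant x' = -x + u, integrated with forward Euler.
pid = PID(kp=5.0, ki=5.0, kd=0.1, dt=0.01)
x = 0.0
for _ in range(500):
    u = pid.control(1.0, x)
    x += 0.01 * (-x + u)
```

After the loop, `x` has settled close to the setpoint 1.0; the integral term is what removes the steady-state error that a pure P controller would leave.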
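The LQR mentioned above can likewise be sketched in a few lines: iterating the discrete-time Riccati equation to a fixed point yields the stationary feedback gain. The double-integrator example below is a standard toy system, chosen here purely for illustration:

```python
import numpy as np

def lqr_gain(A, B, Q, R, iters=500):
    """Iterate the discrete-time Riccati equation and return the
    stationary feedback gain K for the control law u = -K x."""
    P = Q.copy()
    for _ in range(iters):
        BtP = B.T @ P
        # K = (R + B^T P B)^{-1} B^T P A
        K = np.linalg.solve(R + BtP @ B, BtP @ A)
        # Riccati update: P <- Q + A^T P (A - B K)
        P = Q + A.T @ P @ (A - B @ K)
    return K


# Toy example: a double integrator discretized with step dt.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
K = lqr_gain(A, B, Q=np.eye(2), R=np.array([[0.1]]))
```

The resulting closed-loop matrix `A - B K` is stable (all eigenvalues inside the unit circle), which is the defining property of the infinite-horizon LQR solution.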
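Model predictive control, in its simplest form, can be sketched as random shooting: sample candidate action sequences, roll each through a model, and execute only the first action of the cheapest one, then replan. The scalar dynamics and cost below are toy placeholders, not part of the course material:

```python
import random

def mpc_random_shooting(dynamics, cost, state, horizon=10,
                        samples=100, bounds=(-1.0, 1.0)):
    """Return the first action of the lowest-cost sampled sequence."""
    best_cost, best_first = float("inf"), 0.0
    for _ in range(samples):
        seq = [random.uniform(*bounds) for _ in range(horizon)]
        s, total = state, 0.0
        for a in seq:
            s = dynamics(s, a)      # roll the model forward
            total += cost(s, a)     # accumulate predicted cost
        if total < best_cost:
            best_cost, best_first = total, seq[0]
    return best_first


# Closed loop on a toy scalar system: drive s towards 0.
random.seed(0)
dynamics = lambda s, a: s + 0.1 * a
cost = lambda s, a: s**2 + 0.01 * a**2
s = 1.0
for _ in range(50):
    s = dynamics(s, mpc_random_shooting(dynamics, cost, s))
```

Replanning at every step is what makes this MPC rather than open-loop planning; more sophisticated variants replace the random sampling with CEM or gradient-based optimization.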