State representations hold the promise of simplifying control, allowing a reinforcement learning (RL) agent to solve a task more quickly and to generalize better to new tasks. While such a representation can be learned in a multi-task setting, doing so requires manually constructing a suitable task distribution, which is an onerous requirement. Instead, we propose to learn a representation that encodes as few bits of the input as possible, subject to the constraint that the agent is still able to solve the task. This essentially amounts to placing “blinkers” on our agent, with the aim of ignoring spurious attributes of the state. Formally, we adopt the information bottleneck (IB) as a measure of representational complexity, and augment the standard RL objective with a lower bound.
The Information Bottleneck in RL