
Hierarchical Reinforcement Learning Algorithms

Hierarchical Reinforcement Learning decomposes complex tasks into simpler subtasks using temporal abstraction and multi-level decision making.

Hierarchical Reinforcement Learning (HRL) extends traditional reinforcement learning by decomposing complex tasks into simpler subtasks or options. This hierarchical structure enables agents to learn more efficiently by reusing learned skills and operating at multiple levels of abstraction, from high-level strategic planning to low-level control execution.

Unlike traditional RL, where agents learn a single flat policy, HRL introduces temporal abstraction through options: temporally extended actions that execute over multiple time steps. This lets agents operate at different time scales and reuse learned behaviors across tasks, making HRL particularly powerful for complex, long-horizon problems.
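To make the idea concrete, here is a minimal sketch of an option as a triple of initiation set, intra-option policy, and termination condition; the corridor example and all names are illustrative, not a library API.

```python
import random

class Option:
    """A temporally extended action: initiation set, intra-option policy,
    and a termination function giving P(terminate | state)."""

    def __init__(self, initiation_set, policy, termination_prob):
        self.initiation_set = initiation_set      # states where the option may start
        self.policy = policy                      # maps state -> primitive action
        self.termination_prob = termination_prob  # maps state -> P(terminate)

    def can_start(self, state):
        return state in self.initiation_set

    def act(self, state):
        return self.policy(state)

    def should_terminate(self, state, rng=random):
        return rng.random() < self.termination_prob(state)

# Toy option on a 1-D corridor: "walk right until reaching position 5".
walk_right = Option(
    initiation_set=set(range(5)),                 # may start anywhere left of 5
    policy=lambda s: +1,                          # always step right
    termination_prob=lambda s: 1.0 if s >= 5 else 0.0,
)
```

A higher-level policy would choose among such options rather than among primitive actions, committing to each one until its termination condition fires.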

Overview

Key Characteristics:

  • Temporal Abstraction: Actions that operate over different time scales, from high-level strategy to low-level control

  • Skill Reuse: Learned behaviors that can be applied to new tasks and environments

  • Multi-level Decision Making: Hierarchical structure from strategy to control with different abstraction levels

  • Option Policies: Temporally extended actions over multiple time steps with initiation and termination conditions

Common Applications:

  • Robotics: manipulation tasks, navigation, autonomous vehicles, humanoid robots

  • Games: real-time strategy games, chess, poker, multi-player games

  • Multi-agent systems: swarm robotics, distributed systems, cooperative games

  • Infrastructure: industrial automation, smart grids, traffic management

  • Natural language processing: dialogue systems, text generation, language understanding

  • Computer vision: object recognition, scene understanding, visual navigation

Key Concepts

  • Options: Temporally extended actions that can be executed over multiple time steps

  • Hierarchy: Multiple levels of decision-making, from high-level strategy to low-level control

  • Skill Reuse: Learned behaviors that can be applied to new tasks and environments

  • Temporal Abstraction: Actions that operate over different time scales

  • Subgoal Decomposition: Breaking complex tasks into manageable subtasks

  • Policy Hierarchies: Layered policies operating at different abstraction levels

  • Initiation Set: The states in which an option can be started

  • Termination Function: The probability of an option terminating in each state

Complexity Analysis

Time: O(n²) to O(n³); Space: O(n) to O(n²)

Complexity depends on hierarchy depth, option complexity, and the size of the state-action space.

State-of-the-Art HRL Algorithms

Options Framework

Temporal abstraction through options (closed-loop policies with initiation/termination conditions)

  • Formal mathematical framework with convergence guarantees
  • Semi-Markov decision process framework
  • Flexible option discovery methods
  • Best for discrete, well-defined subtasks
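Under the semi-Markov view above, the higher level treats one option execution as a single decision whose outcome is a discounted return accumulated over several primitive steps. A minimal sketch of that execution loop, with an illustrative corridor environment standing in for a real one:

```python
def execute_option(state, policy, terminates, step_fn, gamma=0.99, max_steps=100):
    """Run one option to termination (SMDP-style).

    Returns (next_state, discounted_return, duration) -- exactly the
    quantities a higher-level SMDP value update would consume.
    """
    ret, discount = 0.0, 1.0
    for k in range(max_steps):
        action = policy(state)                 # intra-option policy
        state, reward = step_fn(state, action) # one primitive environment step
        ret += discount * reward
        discount *= gamma
        if terminates(state):                  # option's termination condition
            return state, ret, k + 1
    return state, ret, max_steps

# Toy corridor: states 0..5, reward 1 on reaching 5; option "walk right".
step = lambda s, a: (min(s + a, 5), 1.0 if min(s + a, 5) == 5 else 0.0)
s, G, tau = execute_option(0, lambda s: 1, lambda s: s == 5, step)
```

Here the option lasts `tau` primitive steps, and the discounted return `G` is what the high-level learner would back up for choosing this option, which is why options fit naturally into the SMDP framework.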

Feudal RL

Manager-worker architecture with explicit goal-setting hierarchy

  • End-to-end differentiable manager-worker networks
  • Automatic goal discovery in latent space
  • Handles long-term dependencies
  • Best for long-horizon continuous control
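The manager-worker split can be sketched as two policies on different timescales: the manager re-emits a subgoal every few steps, and the worker is rewarded intrinsically for progress toward it. This is a toy stand-in (greedy 1-D policies, distance-based intrinsic reward), not the FeUdal Networks architecture itself.

```python
def intrinsic_reward(state, next_state, goal):
    """Reward the worker for reducing distance to the manager's subgoal."""
    dist = lambda s: abs(goal - s)
    return dist(state) - dist(next_state)      # positive when moving closer

def feudal_rollout(start, final_goal, horizon=3, max_steps=20):
    state, rewards = start, []
    for t in range(max_steps):
        if t % horizon == 0:                   # manager acts on a slower timescale
            subgoal = min(state + horizon, final_goal)
        action = 1 if subgoal > state else 0   # greedy worker stand-in
        next_state = state + action
        rewards.append(intrinsic_reward(state, next_state, subgoal))
        state = next_state
        if state == final_goal:
            break
    return state, rewards
```

The key design point is that the worker never sees the final task reward directly; it optimizes the manager's intrinsic signal, which keeps credit assignment short-horizon at the low level.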

HIRO

Goal-conditioned policies with off-policy correction for sample efficiency

  • State-of-the-art sample efficiency
  • Off-policy learning at both levels with goal relabeling
  • TD3-style target smoothing
  • Best for sample-limited robotic tasks
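HIRO's off-policy correction can be sketched as goal relabeling: when replaying an old high-level transition, the stored goal may no longer explain what the current low-level policy would do, so it is replaced by the candidate goal that maximizes the likelihood of the stored low-level actions. The scoring function and candidates below are illustrative stand-ins for the paper's procedure.

```python
def relabel_goal(states, actions, candidate_goals, action_logprob):
    """Pick the candidate goal that best explains the stored low-level
    action sequence under the current low-level policy."""
    def score(goal):
        return sum(action_logprob(s, goal, a) for s, a in zip(states, actions))
    return max(candidate_goals, key=score)

# Toy example: the stored low-level actions all step right (+1), so a goal
# ahead of every visited state explains them best.
logp = lambda s, g, a: 0.0 if a == (1 if g > s else -1) else -10.0
best = relabel_goal([0, 1, 2], [1, 1, 1], candidate_goals=[-2, 3],
                    action_logprob=logp)
```

Relabeling is what makes the high-level replay buffer usable off-policy even as the low-level policy keeps changing during training.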

Types of Hierarchies

Temporal Hierarchy:

  • High-level: Strategic decisions and goal setting
  • Mid-level: Skill selection and coordination
  • Low-level: Primitive actions and control

Spatial Hierarchy:

  • Global: Environment-wide planning
  • Regional: Local area navigation
  • Local: Immediate obstacle avoidance

Functional Hierarchy:

  • Planning: Long-term strategy
  • Navigation: Path finding and movement
  • Control: Actuator commands
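A functional hierarchy like the one above amounts to nested policies, each refining the decision of the layer above. A minimal sketch with three illustrative stand-in policies (the room names, routes, and command format are invented for the example):

```python
def plan(state):                          # planning: pick a long-term target
    return "kitchen" if state["battery"] > 20 else "dock"

def navigate(state, target):              # navigation: pick the next waypoint
    route = {"kitchen": ["hall", "kitchen"], "dock": ["hall", "dock"]}
    remaining = [w for w in route[target] if w not in state["visited"]]
    return remaining[0] if remaining else target

def control(state, waypoint):             # control: emit a primitive command
    return ("move_to", waypoint)

def hierarchical_step(state):
    target = plan(state)                  # high level: strategy
    waypoint = navigate(state, target)    # mid level: path finding
    return control(state, waypoint)       # low level: actuator command

cmd = hierarchical_step({"battery": 80, "visited": []})
```

Each layer can be replaced or retrained independently, which is the practical payoff of separating planning, navigation, and control.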

Hierarchy Construction Methods

Predefined Hierarchies:

  • Human-designed task decomposition
  • Fixed skill libraries
  • Structured learning objectives

Learned Hierarchies:

  • Automatic task decomposition
  • Dynamic skill discovery
  • Adaptive abstraction levels

Hybrid Approaches:

  • Combine predefined and learned components
  • Incremental hierarchy construction
  • Skill refinement over time

Option Components

  1. Initiation Set: States where the option can be started
  2. Policy: How to behave while the option is executing
  3. Termination Function: When to stop executing the option
  4. Reward Function: How rewards are distributed during option execution

Option Properties:

  • Temporal Abstraction: Options can last multiple time steps

  • Reusability: The same option can be used in different contexts

  • Composability: Options can be combined to form complex behaviors
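The four components can be bundled into a single structure, and composability then falls out naturally: if one option terminates inside another's initiation set, the two chain into a longer skill. A minimal sketch on a 1-D corridor, with a deterministic termination test and an invented pseudo-reward that signals subgoal completion:

```python
from dataclasses import dataclass
from typing import Callable, Set

@dataclass
class Option:
    initiation: Set[int]                   # 1. states where the option may start
    policy: Callable[[int], int]           # 2. intra-option behavior
    terminate: Callable[[int], bool]       # 3. termination test
    pseudo_reward: Callable[[int], float]  # 4. reward during option execution

def run(option, state):
    assert state in option.initiation, "option started outside its initiation set"
    total = 0.0
    while not option.terminate(state):
        state += option.policy(state)
        total += option.pseudo_reward(state)
    return state, total

go_to_3 = Option({0, 1, 2}, lambda s: 1, lambda s: s >= 3,
                 lambda s: 1.0 if s == 3 else 0.0)
go_to_5 = Option({3, 4}, lambda s: 1, lambda s: s >= 5,
                 lambda s: 1.0 if s == 5 else 0.0)

# Composability: go_to_3 terminates at state 3, which lies in go_to_5's
# initiation set, so the two chain into one longer behavior.
s, r1 = run(go_to_3, 0)   # 0 -> 3
s, r2 = run(go_to_5, s)   # 3 -> 5
```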

Comparison Table

This algorithm family is currently in development; a comparison table will be added once implementations are available.

Algorithms in This Family

Algorithm implementations are being developed. Check back soon for updates.

Implementation Status

All algorithms in this family are planned for implementation.

Related Topics:

  • RL: HRL extends traditional reinforcement learning with hierarchical structure

  • Multi-Agent: HRL can be applied to multi-agent coordination and cooperation

  • Planning: HRL often incorporates planning algorithms for task decomposition

  • Neural Networks: Deep HRL combines hierarchical structure with neural networks

References

  1. Bacon, P.-L., Harb, J., & Precup, D. (2017). The Option-Critic Architecture. In Proceedings of the AAAI Conference on Artificial Intelligence. AAAI Press.

  2. Vezhnevets, A. S., Osindero, S., Schaul, T., Heess, N., Jaderberg, M., Silver, D., & Kavukcuoglu, K. (2017). FeUdal Networks for Hierarchical Reinforcement Learning. In Proceedings of the 34th International Conference on Machine Learning. PMLR.

Tags

  • Hierarchical RL: Reinforcement learning with hierarchical structure

  • Reinforcement Learning: Machine learning algorithms that learn through interaction

  • Algorithms: General algorithmic concepts and implementations