
Feudal Networks (FuN)

A hierarchical reinforcement learning algorithm that implements a manager-worker architecture for temporal abstraction and goal-based learning.

Family: Hierarchical Reinforcement Learning | Status: 📋 Planned


Overview

Feudal Networks (FuN) is a hierarchical reinforcement learning algorithm that implements a manager-worker architecture for temporal abstraction. The algorithm consists of two neural networks: a manager that operates at a high level and sets abstract goals, and a worker that operates at a low level and executes actions to achieve these goals.

This hierarchical approach enables the agent to solve complex, long-horizon tasks by breaking them down into manageable subproblems. The manager learns to set useful goals, while the worker learns to achieve specific goals efficiently. Feudal Networks are particularly powerful in domains where tasks have natural hierarchical structure, such as robotics manipulation, navigation, and game playing.

Mathematical Formulation


Problem Definition

Given:

  • State space: S
  • Goal space: G
  • Action space: A
  • Manager policy: π_m(g_t|s_t)
  • Worker policy: π_w(a_t|s_t, g_t)
  • Reward function: R(s,a,s')

Find hierarchical policies that maximize expected cumulative reward:

π_h(a_t|s_t) = ∑_{g_t} π_m(g_t|s_t) · π_w(a_t|s_t, g_t)
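For a discrete goal space, this marginalization can be computed directly. A minimal numeric sketch (the goal and action counts, and all probability values, are illustrative assumptions):

```python
import numpy as np

# Illustrative sizes: 4 discrete goals, 3 discrete actions.
pi_m = np.array([0.1, 0.2, 0.3, 0.4])        # manager: pi_m(g | s_t), sums to 1
pi_w = np.array([[0.7, 0.2, 0.1],            # worker: pi_w(a | s_t, g), one row per goal
                 [0.1, 0.8, 0.1],
                 [0.3, 0.3, 0.4],
                 [0.2, 0.5, 0.3]])

# pi_h(a | s_t) = sum_g pi_m(g | s_t) * pi_w(a | s_t, g)
pi_h = pi_m @ pi_w
print(pi_h, pi_h.sum())  # a valid action distribution; the sum is 1.0
```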

Key Equations

Manager-Worker Architecture

π_m(g_t|s_t) = softmax(f_m(s_t))

Manager selects goals using neural network f_m

Worker Policy

π_w(a_t|s_t, g_t) = softmax(f_w(s_t, g_t))

Worker executes actions given state and goal

Hierarchical Value Function

V_h(s_t) = E_{g_t ~ π_m}[V_w(s_t, g_t)]

Value function decomposes into manager and worker components
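A minimal PyTorch sketch of these two networks, assuming discrete goal and action spaces and simple MLPs for f_m and f_w (all layer sizes and class names are illustrative, not a reference implementation; note that the original paper uses continuous directional goals rather than a softmax over discrete goals):

```python
import torch
import torch.nn as nn

class Manager(nn.Module):
    """pi_m(g_t | s_t) = softmax(f_m(s_t)) over a discrete goal space."""
    def __init__(self, state_dim: int, num_goals: int, hidden: int = 64):
        super().__init__()
        self.f_m = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(), nn.Linear(hidden, num_goals)
        )

    def forward(self, state: torch.Tensor) -> torch.distributions.Categorical:
        return torch.distributions.Categorical(logits=self.f_m(state))

class Worker(nn.Module):
    """pi_w(a_t | s_t, g_t) = softmax(f_w(s_t, g_t)); the goal enters as an embedding."""
    def __init__(self, state_dim: int, num_goals: int, num_actions: int, hidden: int = 64):
        super().__init__()
        self.goal_embed = nn.Embedding(num_goals, hidden)
        self.f_w = nn.Sequential(
            nn.Linear(state_dim + hidden, hidden), nn.ReLU(), nn.Linear(hidden, num_actions)
        )

    def forward(self, state: torch.Tensor, goal: torch.Tensor) -> torch.distributions.Categorical:
        x = torch.cat([state, self.goal_embed(goal)], dim=-1)
        return torch.distributions.Categorical(logits=self.f_w(x))
```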


Key Properties


  • Manager-Worker Architecture: Clear separation of high-level planning and low-level execution

  • Temporal Abstraction: Manager operates over longer time horizons than the worker

  • Goal-Based Learning: Worker learns to achieve abstract goals set by the manager

  • Hierarchical Learning: Both networks learn simultaneously with different objectives

Implementation Approaches


Standard FuN implementation with separate manager and worker networks (an interaction-loop sketch follows the complexity summary below).

Complexity:

  • Time: O(batch_size × (manager_params + worker_params))
  • Space: O(batch_size × (state_size + goal_size))
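A hedged sketch of the interaction loop built on the networks above, showing the temporal abstraction: the manager re-samples a goal only every c steps (the goal horizon) while the worker acts at every step. It assumes a classic Gym-style `env` interface and the `Manager`/`Worker` modules sketched earlier, and elides the policy-gradient updates:

```python
import torch

def rollout(env, manager, worker, horizon_c: int = 10, max_steps: int = 1000):
    """Collect one episode, re-sampling the manager's goal every horizon_c steps."""
    state = torch.as_tensor(env.reset(), dtype=torch.float32)
    goal = None
    for t in range(max_steps):
        if t % horizon_c == 0:                   # temporal abstraction:
            goal = manager(state).sample()       # manager acts every c steps
        action = worker(state, goal).sample()    # worker acts at every step
        next_state, reward, done, _ = env.step(action.item())
        # ... store (state, goal, action, reward) for the two learning updates ...
        state = torch.as_tensor(next_state, dtype=torch.float32)
        if done:
            break
```

The choice of horizon_c controls how much temporal abstraction the manager provides: too small and the hierarchy collapses into a flat policy, too large and the worker acts on stale goals.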

Advantages

  • Clear separation of high-level planning and low-level execution

  • Temporal abstraction enables learning at different time scales

  • Goal-based learning allows for skill reuse

  • Hierarchical structure improves sample efficiency

Disadvantages

  • Requires careful coordination between manager and worker

  • Goal achievement detection can be challenging (see the intrinsic-reward sketch after this list)

  • Two networks increase complexity and training time

  • Goal horizon parameter needs careful tuning
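On goal-achievement detection: in the original FuN paper the manager emits a directional goal in a learned latent state space, and the worker is trained with an intrinsic reward measuring the cosine similarity between its latent-state movement and that direction. A one-step simplification of that idea (the function name and single-step lookback are illustrative; the paper averages over the past c steps):

```python
import torch
import torch.nn.functional as F

def intrinsic_reward(state_prev: torch.Tensor,
                     state_now: torch.Tensor,
                     goal_dir: torch.Tensor) -> torch.Tensor:
    """Reward the worker for moving through latent state space in the
    direction the manager requested (cosine similarity of the state
    transition with the goal direction)."""
    delta = state_now - state_prev
    return F.cosine_similarity(delta, goal_dir, dim=-1)
```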

Complete Implementation

The full implementation with error handling, comprehensive testing, and additional variants is available in the source code.

Complexity Analysis


Time & Space Complexity Comparison

| Approach  | Time Complexity | Space Complexity | Notes |
|-----------|-----------------|------------------|-------|
| Basic FuN | O(batch_size × (manager_params + worker_params)) | O(batch_size × (state_size + goal_size)) | Two-network architecture requires coordination and careful training |

Use Cases & Applications

๐ŸŒ Ask ChatGPT about Applications

Application Categories

Robotics and Control

  • Robot Manipulation: Complex manipulation tasks with hierarchical goals

  • Autonomous Navigation: Multi-level navigation planning and execution

  • Industrial Automation: Process optimization with temporal abstraction

  • Swarm Robotics: Coordinated multi-agent behavior with hierarchical control

Game AI and Entertainment

  • Strategy Games: Multi-level decision making and planning

  • Open-World Games: Complex task decomposition and execution

  • Simulation Games: Resource management with hierarchical objectives

  • Virtual Environments: NPC behavior with long-term objectives

Real-World Applications

  • Autonomous Vehicles: Multi-level driving behavior and navigation

  • Healthcare: Treatment planning with hierarchical objectives

  • Finance: Portfolio management with temporal abstraction

  • Network Control: Traffic management with hierarchical policies

Educational Value

  • Manager-Worker Architecture: Understanding hierarchical control systems

  • Goal-Based Learning: Learning to set and achieve abstract goals

  • Temporal Abstraction: Understanding different time scales in learning

  • Transfer Learning: Learning reusable worker skills across tasks

References & Further Reading

:material-library: Core Papers

  • Vezhnevets, A. S., Osindero, S., Schaul, T., Heess, N., Jaderberg, M., Silver, D., & Kavukcuoglu, K. (2017). "FeUdal Networks for Hierarchical Reinforcement Learning." ICML. The original Feudal Networks paper introducing the manager-worker architecture.

:material-book: Hierarchical RL Textbooks

  • Sutton, R. S., & Barto, A. G. (2018). "Reinforcement Learning: An Introduction" (2nd ed.). MIT Press. Comprehensive introduction to reinforcement learning, including hierarchical methods.
  • Dayan, P., & Hinton, G. E. (1993). "Feudal Reinforcement Learning." NeurIPS. Foundational work on hierarchical reinforcement learning.

:material-web: Online Resources

  • Wikipedia article on Feudal Networks
  • OpenAI Spinning Up tutorial on hierarchical RL

:material-code-tags: Implementation & Practice

  • PyTorch deep learning framework documentation
  • RL environments for testing algorithms
  • High-quality RL algorithm implementations

Interactive Learning

Try implementing the different approaches yourself! This progression will give you deep insight into the algorithm's principles and applications.

Pro Tip: Start with the simplest implementation and gradually work your way up to more complex variants.

Related Algorithms in Hierarchical Reinforcement Learning:

  • Hierarchical Q-Learning - Extends traditional Q-Learning to handle temporal abstraction and hierarchical task decomposition with multi-level Q-functions.

  • Hierarchical Task Networks (HTNs) - A hierarchical reinforcement learning approach that decomposes complex tasks into hierarchical structures of subtasks for planning and execution.

  • Option-Critic - A hierarchical reinforcement learning algorithm that learns options (temporally extended actions) end-to-end using policy gradient methods.

  • Hierarchical Actor-Critic (HAC) - An advanced hierarchical reinforcement learning algorithm that extends the actor-critic framework with temporal abstraction and hierarchical structure.

  • Hierarchical Policy Gradient - Extends traditional policy gradient methods to handle temporal abstraction and hierarchical task decomposition with multi-level policies.