Feudal Networks (FuN)
A hierarchical reinforcement learning algorithm that implements a manager-worker architecture for temporal abstraction and goal-based learning.
Family: Hierarchical Reinforcement Learning · Status: Planned
Overview
Feudal Networks (FuN) is a hierarchical reinforcement learning algorithm that implements a manager-worker
architecture for temporal abstraction. The algorithm consists of two neural networks: a manager that operates at a high level and sets abstract goals, and a worker that operates at a low level and executes actions to achieve these goals.
This hierarchical approach enables the agent to solve complex, long-horizon tasks by breaking them down into manageable subproblems. The manager learns to set useful goals, while the worker learns to achieve specific goals efficiently. Feudal Networks are particularly powerful in domains where tasks have natural hierarchical structure, such as robotics manipulation, navigation, and game playing.
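The interaction pattern described above can be sketched as a simple control loop: the manager re-decides only every `c` steps, while the worker acts at every step. This is an illustrative sketch, not the API of the linked implementation; the environment, goal, and policy signatures are invented here for the example.

```python
def run_episode(env_step, manager_goal, worker_act, initial_state,
                horizon_c=10, max_steps=100):
    """Sketch of the FuN control loop: the manager emits a new goal every
    `horizon_c` steps, while the worker acts at every single step."""
    state, total_reward, goal = initial_state, 0.0, None
    for t in range(max_steps):
        if t % horizon_c == 0:            # temporal abstraction: manager acts rarely
            goal = manager_goal(state)    # abstract goal held fixed for c steps
        action = worker_act(state, goal)  # primitive action conditioned on the goal
        state, reward, done = env_step(state, action)
        total_reward += reward
        if done:
            break
    return total_reward

# Toy 1-D chain: walk from position 0 to position 5.
def env_step(state, action):
    nxt = state + action
    return nxt, (1.0 if nxt == 5 else 0.0), nxt == 5

reward = run_episode(env_step,
                     manager_goal=lambda s: 1 if s < 5 else -1,  # goal = direction
                     worker_act=lambda s, g: g,                  # worker follows it
                     initial_state=0)
```

Even in this toy setup, the division of labor is visible: the manager only chooses a direction, and the worker turns that direction into step-by-step actions.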
Mathematical Formulation
Problem Definition
Given:
- State space: S
- Goal space: G
- Action space: A
- Manager policy: π_m(g_t|s_t)
- Worker policy: π_w(a_t|s_t, g_t)
- Reward function: R(s,a,s')
Find hierarchical policies that maximize expected cumulative reward:
π_h(a_t|s_t) = Σ_{g_t} π_m(g_t|s_t) · π_w(a_t|s_t, g_t)
Key Equations
Manager-Worker Architecture
π_m(g_t|s_t) = softmax(f_m(s_t))
Manager selects goals using neural network f_m
Worker Policy
π_w(a_t|s_t, g_t) = softmax(f_w(s_t, g_t))
Worker executes actions given state and goal
Hierarchical Value Function
V_h(s_t) = E_{g_t ~ π_m}[V_w(s_t, g_t)]
Value function decomposes into manager and worker components
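These three definitions can be checked numerically with a minimal sketch, using hand-picked toy score functions in place of the neural networks f_m and f_w (all names, shapes, and values here are illustrative assumptions, not the project's actual API):

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def manager_policy(f_m, state, goals):
    # pi_m(g | s) = softmax(f_m(s)) over the goal space
    return softmax([f_m(state, g) for g in goals])

def worker_policy(f_w, state, goal, actions):
    # pi_w(a | s, g) = softmax(f_w(s, g)) over the action space
    return softmax([f_w(state, goal, a) for a in actions])

def hierarchical_policy(f_m, f_w, state, goals, actions):
    # pi_h(a | s) = sum_g pi_m(g | s) * pi_w(a | s, g)
    pm = manager_policy(f_m, state, goals)
    return [sum(pm[i] * worker_policy(f_w, state, g, actions)[a_idx]
                for i, g in enumerate(goals))
            for a_idx in range(len(actions))]

def hierarchical_value(f_m, V_w, state, goals):
    # V_h(s) = E_{g ~ pi_m}[V_w(s, g)]
    pm = manager_policy(f_m, state, goals)
    return sum(p * V_w(state, g) for p, g in zip(pm, goals))
```

Because π_h marginalizes the worker policy over the manager's goal distribution, it is itself a valid distribution over actions (its probabilities sum to one).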
Key Properties
- Manager-Worker Architecture: Clear separation of high-level planning and low-level execution
- Temporal Abstraction: Manager operates over longer time horizons than worker
- Goal-Based Learning: Worker learns to achieve abstract goals set by manager
- Hierarchical Learning: Both networks learn simultaneously with different objectives
Implementation Approaches
Standard FuN implementation with manager and worker networks
Complexity:
- Time: O(batch_size × (manager_params + worker_params))
- Space: O(batch_size × (state_size + goal_size))
Advantages
- Clear separation of high-level planning and low-level execution
- Temporal abstraction enables learning at different time scales
- Goal-based learning allows for skill reuse
- Hierarchical structure improves sample efficiency
Disadvantages
- Requires careful coordination between manager and worker
- Goal achievement detection can be challenging
- Two networks increase complexity and training time
- Goal horizon parameter needs careful tuning
Complete Implementation
The full implementation with error handling, comprehensive testing, and additional variants is available in the source code:
- Main implementation with manager-worker architecture: src/algokit/hierarchical_rl/feudal_networks.py
- Comprehensive test suite including convergence tests: tests/unit/hierarchical_rl/test_feudal_networks.py
Complexity Analysis
Time & Space Complexity Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Basic FuN | O(batch_size × (manager_params + worker_params)) | O(batch_size × (state_size + goal_size)) | Two-network architecture requires coordination and careful training |
Use Cases & Applications
Application Categories
Robotics and Control
- Robot Manipulation: Complex manipulation tasks with hierarchical goals
- Autonomous Navigation: Multi-level navigation planning and execution
- Industrial Automation: Process optimization with temporal abstraction
- Swarm Robotics: Coordinated multi-agent behavior with hierarchical control
Game AI and Entertainment
- Strategy Games: Multi-level decision making and planning
- Open-World Games: Complex task decomposition and execution
- Simulation Games: Resource management with hierarchical objectives
- Virtual Environments: NPC behavior with long-term objectives
Real-World Applications
- Autonomous Vehicles: Multi-level driving behavior and navigation
- Healthcare: Treatment planning with hierarchical objectives
- Finance: Portfolio management with temporal abstraction
- Network Control: Traffic management with hierarchical policies
Educational Value
- Manager-Worker Architecture: Understanding hierarchical control systems
- Goal-Based Learning: Learning to set and achieve abstract goals
- Temporal Abstraction: Understanding different time scales in learning
- Transfer Learning: Learning reusable worker skills across tasks
Interactive Learning
Try implementing the different approaches yourself! This progression will give you deep insight into the algorithm's principles and applications.
Pro Tip: Start with the simplest implementation and gradually work your way up to more complex variants.
Navigation
Related Algorithms in Hierarchical Reinforcement Learning:
- Hierarchical Q-Learning - Extends traditional Q-Learning to handle temporal abstraction and hierarchical task decomposition with multi-level Q-functions.
- Hierarchical Task Networks (HTNs) - A hierarchical reinforcement learning approach that decomposes complex tasks into hierarchical structures of subtasks for planning and execution.
- Option-Critic - A hierarchical reinforcement learning algorithm that learns options (temporally extended actions) end-to-end using policy gradient methods.
- Hierarchical Actor-Critic (HAC) - An advanced hierarchical reinforcement learning algorithm that extends the actor-critic framework with temporal abstraction and hierarchical structure.
- Hierarchical Policy Gradient - Extends traditional policy gradient methods to handle temporal abstraction and hierarchical task decomposition with multi-level policies.