Approximate planning for Bayesian hierarchical reinforcement learning

Vien Ngo, Hung Quoc Ngo, Sungyoung Lee, TaeChoong Chung

Research output: Contribution to journal › Article › peer-review

9 Citations (Scopus)


In this paper, we propose to use hierarchical action decomposition to make Bayesian model-based reinforcement learning more efficient and feasible for larger problems. We formulate Bayesian hierarchical reinforcement learning as a partially observable semi-Markov decision process (POSMDP). The main POSMDP task is partitioned into a hierarchy of POSMDP subtasks. Each subtask may consist of only primitive actions or may hierarchically call other subtasks' policies, since the policies of lower-level subtasks serve as macro-actions in higher-level subtasks. A solution under this hierarchical action decomposition is obtained by solving lower-level subtasks first, then higher-level ones. Because each formulated POSMDP has a continuous state space, we sample from a prior belief to build an approximate model for each of them, then solve each using a recently introduced Monte Carlo Value Iteration with Macro-Actions solver. We name this method Monte Carlo Bayesian Hierarchical Reinforcement Learning. Simulation results show that our algorithm, by exploiting the action hierarchy, performs significantly better than flat Bayesian reinforcement learning in terms of both reward and, especially, solving time, by at least an order of magnitude.
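The bottom-up scheme described above, where each solved lower-level subtask policy becomes a macro-action available to its parent, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the task structure, the `solve_subtask` stand-in for the POSMDP solver, and all names are hypothetical.

```python
# Sketch of bottom-up hierarchical solving: lower-level subtasks are
# solved first, and each solved subtask's policy is exposed to its
# parent as a macro-action. All names here are illustrative.

def solve_subtask(subtask, solved_children):
    """Stand-in for solving one subtask's POSMDP (the paper uses Monte
    Carlo Value Iteration with Macro-Actions). The subtask's action set
    is its primitive actions plus the macro-actions formed by the
    policies of its already-solved children."""
    actions = list(subtask["primitive_actions"])
    actions += [f"macro:{name}" for name in solved_children
                if name in subtask["children"]]
    # Here the "policy" is represented only by its available action set.
    return {"task": subtask["name"], "actions": actions}

def solve_hierarchy(subtasks, root):
    """Recursively solve children before parents, so every macro-action
    a parent needs already exists when the parent is solved."""
    solved = {}

    def solve(name):
        for child in subtasks[name]["children"]:
            if child not in solved:
                solve(child)
        solved[name] = solve_subtask(subtasks[name], solved)

    solve(root)
    return solved

# Toy two-level hierarchy: "navigate" can invoke the solved "move"
# policy as a macro-action alongside its own primitive actions.
tasks = {
    "move": {"name": "move",
             "primitive_actions": ["left", "right"],
             "children": []},
    "navigate": {"name": "navigate",
                 "primitive_actions": ["noop"],
                 "children": ["move"]},
}
policies = solve_hierarchy(tasks, "navigate")
```

In this toy run, `policies["navigate"]["actions"]` contains both the primitive `"noop"` and the macro-action `"macro:move"`, mirroring how higher-level subtasks treat lower-level policies as single temporally extended actions.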

Original language: English
Pages (from-to): 808-819
Number of pages: 12
Journal: Applied Intelligence
Issue number: 3
Early online date: 20 Jul 2014
Publication status: Published - Oct 2014

