Monte Carlo Bayesian hierarchical reinforcement learning

Vien Anh Ngo, Hung Ngo, Ertel Wolfgang

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

8 Citations (Scopus)


In this paper, we propose using hierarchical action decomposition to make Bayesian model-based reinforcement learning (BRL) more efficient and feasible in practice. We formulate Bayesian hierarchical reinforcement learning as a partially observable semi-Markov decision process (POSMDP). The main POSMDP task is partitioned into a hierarchy of POSMDP subtasks; lower-level subtasks are solved first, followed by higher-level ones. For each POSMDP, we sample from a prior belief to build an approximate model, then solve it using the Monte Carlo Value Iteration with Macro-Actions solver. Experimental results show that our algorithm performs significantly better than flat BRL in terms of both reward and, especially, solving time, by at least one order of magnitude.
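Two steps of the approach described above, sampling models from a prior belief and solving subtasks from the bottom of the hierarchy upward, can be illustrated with a minimal sketch. It is not the authors' implementation and does not include the Monte Carlo Value Iteration solver. The Dirichlet form of the prior, the two-state problem, the task names, and the helpers `sample_model` and `bottom_up_order` are all assumptions made for illustration.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sample_model(prior_counts, rng):
    """Sample one transition model from the prior belief (assumed Dirichlet).

    prior_counts[s][a] holds pseudo-counts over next states, so the result
    model[s][a] is a probability distribution over next states.
    """
    return [[sample_dirichlet(alphas, rng) for alphas in per_state]
            for per_state in prior_counts]


def bottom_up_order(children, root):
    """Post-order traversal: each subtask is listed after all of its children."""
    order = []

    def visit(task):
        for child in children.get(task, []):
            visit(child)
        order.append(task)

    visit(root)
    return order


rng = random.Random(0)

# Hypothetical problem: 2 states, 2 actions, uniform prior pseudo-counts.
prior_counts = [[[1.0, 1.0], [1.0, 1.0]],
                [[1.0, 1.0], [1.0, 1.0]]]
models = [sample_model(prior_counts, rng) for _ in range(5)]

# Hypothetical task hierarchy; leaves would be primitive actions.
children = {"root": ["navigate", "pickup"], "navigate": ["north", "south"]}
order = bottom_up_order(children, "root")
print(order)
```

In this sketch, `order` lists each subtask only after the subtasks it depends on, which is the order in which a solver would be applied to the sampled models.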
Original language: English
Title of host publication: International Conference on Autonomous Agents and Multi-Agent Systems, AAMAS '14, Paris, France, May 5-9, 2014
Subtitle of host publication: AAMAS
Number of pages: 2
Publication status: Published - 2014

