We consider learning and planning in relational MDPs when object existence is uncertain and new objects may appear or disappear depending on previous actions or properties of other objects. Optimal policies must actively discover objects to achieve a goal; planning in such domains generally amounts to a POMDP problem, where the belief is about the existence and properties of potential not-yet-discovered objects. We propose a computationally efficient extension of model-based relational RL methods that approximates these beliefs using discrete uncertainty predicates. In this formulation, the belief update is learned using probabilistic rules, and planning in the approximated belief space can be achieved using an extension of existing planners. We prove that the learned belief update rules encode an approximation of the exact belief updates of a POMDP formulation and demonstrate experimentally that the proposed approach successfully learns a set of relational rules appropriate for solving such problems.
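As a rough illustration of the idea of approximating a belief about object existence with discrete uncertainty predicates, the sketch below performs an exact Bayes update of the probability that a hidden object exists and then maps that probability onto a small set of symbolic levels. This is not the authors' method: in the paper the belief update is learned as probabilistic relational rules, whereas here the update is computed directly. The thresholds, the level names, and the observation probabilities are all hypothetical and chosen only for the example.

```python
from bisect import bisect_right

# Hypothetical discretization: thresholds and predicate names are illustrative only.
THRESHOLDS = [0.1, 0.9]
LEVELS = ["unlikely", "uncertain", "likely"]


def bayes_update(prior, p_obs_if_exists, p_obs_if_absent):
    """Exact posterior that a hidden object exists after one observation."""
    evidence = prior * p_obs_if_exists + (1.0 - prior) * p_obs_if_absent
    if evidence == 0.0:
        raise ValueError("observation has zero probability under the model")
    return prior * p_obs_if_exists / evidence


def uncertainty_predicate(belief):
    """Map a continuous existence belief onto a discrete symbolic level."""
    return LEVELS[bisect_right(THRESHOLDS, belief)]


# Assumed scenario: the agent looks into a container twice and sees nothing.
# A present object is missed with probability 0.2; an absent one is never seen.
belief = 0.5
history = [uncertainty_predicate(belief)]
for _ in range(2):
    belief = bayes_update(belief, p_obs_if_exists=0.2, p_obs_if_absent=1.0)
    history.append(uncertainty_predicate(belief))
```

A planner working in the approximated belief space would reason over the symbolic levels in `history` rather than over the continuous value in `belief`, which is what keeps the state description relational and finite.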
|Title of host publication||The 31st International Conference on Machine Learning|
|Subtitle of host publication||ICML|
|Number of pages||9|
|Publication status||Published - 2014|
|Event||International Conference on Machine Learning - Beijing, China|
Duration: 21 Jun 2014 → 26 Jun 2014
|Conference||International Conference on Machine Learning|
|Period||21/06/2014 → 26/06/2014|
Fingerprint
Dive into the research topics of 'Model-Based Relational RL When Object Existence is Partially Observable'. Together they form a unique fingerprint.
- School of Electronics, Electrical Engineering and Computer Science - Visiting Scholar