Learning via human feedback in continuous state and action spaces

Vien Ngo, Wolfgang Ertel, TaeChoong Chung

Research output: Contribution to journal › Article › peer-review

18 Citations (Scopus)

Abstract

This paper considers the problem of extending Training an Agent Manually via Evaluative Reinforcement (TAMER) to continuous state and action spaces. The TAMER framework enables a non-technical human to train an agent through a natural form of feedback, either negative or positive. Its advantages have been shown on tasks in which agents are trained with human feedback alone or with human feedback combined with environment rewards. However, these methods were originally designed for problems with discrete states and actions, or with continuous states and discrete actions. This paper proposes an extension of TAMER, called ACTAMER, that allows both continuous states and actions. The new framework can use any general function approximator to model the human trainer’s feedback signal. Moreover, the combination of ACTAMER with reinforcement learning is investigated and evaluated in two settings: sequential and simultaneous. Our experimental results demonstrate that the proposed method successfully allows a human to train an agent in two continuous state-action domains: Mountain Car and Cart-pole (balancing).
Original language: English
Pages (from-to): 267-278
Number of pages: 12
Journal: Applied Intelligence
Volume: 39
Issue number: 2
Publication status: Published - 2013
