Upper confidence weighted learning for efficient exploration in multiclass prediction with binary feedback

Hung Ngo, Matthew Luciw, Ngo Anh Vien, Jürgen Schmidhuber

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

We introduce a novel algorithm called Upper Confidence Weighted Learning (UCWL) for online multiclass learning from binary feedback. UCWL combines the Upper Confidence Bound (UCB) framework with the Soft Confidence Weighted (SCW) online learning scheme. UCWL achieves state-of-the-art performance (especially on noisy and nonseparable data) at low computational cost. Estimated confidence intervals are used for informed exploration, which enables faster learning than either uninformed exploration or no exploration at all. The targeted application setting is human-robot interaction (HRI), in which a robot learns to classify its observations while a human teaches it by providing only binary feedback (e.g., right/wrong). Results from an HRI experiment and two benchmark datasets show that UCWL outperforms other algorithms in the online binary feedback setting and, surprisingly, sometimes even beats state-of-the-art algorithms that receive full feedback, while UCWL receives only binary feedback on the same data.
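The abstract does not give the UCWL update rules, so the following is only a minimal sketch of the general idea it describes: pick the class whose upper confidence bound on the score is highest, then learn from a right/wrong signal for that one class. It assumes one linear scorer per class with a diagonal covariance, and it substitutes an AROW-style confidence-weighted update (a simpler relative of SCW) for the SCW update used in the paper. The names `UpperConfidenceSketch`, `run`, `eta`, and `r` are hypothetical and do not come from the paper.

```python
import math


class UpperConfidenceSketch:
    """Hypothetical sketch: one linear scorer per class, diagonal uncertainty."""

    def __init__(self, n_classes, n_features, eta=1.0, r=1.0):
        self.eta = eta  # exploration weight on the confidence width (assumed)
        self.r = r      # AROW-style regularizer (assumed value)
        self.mu = [[0.0] * n_features for _ in range(n_classes)]   # weight means
        self.var = [[1.0] * n_features for _ in range(n_classes)]  # diagonal variances

    def _score(self, k, x):
        # Upper confidence bound: mean score plus scaled standard deviation.
        mean = sum(m * xi for m, xi in zip(self.mu[k], x))
        width = math.sqrt(sum(v * xi * xi for v, xi in zip(self.var[k], x)))
        return mean + self.eta * width

    def predict(self, x):
        # Informed exploration: uncertain classes get a larger bonus.
        return max(range(len(self.mu)), key=lambda k: self._score(k, x))

    def update(self, k, x, correct):
        # Binary feedback only tells us about the predicted class k.
        y = 1.0 if correct else -1.0
        mu, var = self.mu[k], self.var[k]
        margin = y * sum(m * xi for m, xi in zip(mu, x))
        if margin >= 1.0:
            return  # no hinge loss, so no update (as in AROW)
        v = sum(s * xi * xi for s, xi in zip(var, x))
        beta = 1.0 / (v + self.r)
        alpha = (1.0 - margin) * beta
        for j, xi in enumerate(x):
            mu[j] += alpha * y * var[j] * xi            # move the mean
            var[j] -= beta * var[j] * var[j] * xi * xi  # shrink the variance


def run(learner, stream):
    """Feed (features, true_label) pairs; the learner sees only right/wrong."""
    mistakes = []
    for x, label in stream:
        guess = learner.predict(x)
        correct = guess == label
        learner.update(guess, x, correct)
        mistakes.append(not correct)
    return mistakes
```

Because only the predicted class is updated, a class that was tried and marked wrong on an input keeps a low score there, while untried classes keep their full confidence bonus. That is what steers the learner toward classes it has not yet ruled out.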

Original language: English
Title of host publication: IJCAI 2013 - Proceedings of the 23rd International Joint Conference on Artificial Intelligence
Pages: 2488-2494
Number of pages: 7
Publication status: Published - 01 Dec 2013
Externally published: Yes
Event: 23rd International Joint Conference on Artificial Intelligence, IJCAI 2013 - Beijing, China
Duration: 03 Aug 2013 to 09 Aug 2013

Conference

Conference: 23rd International Joint Conference on Artificial Intelligence, IJCAI 2013
Country: China
City: Beijing
Period: 03/08/2013 to 09/08/2013

ASJC Scopus subject areas

  • Artificial Intelligence

  • Cite this

    Ngo, H., Luciw, M., Vien, N. A., & Schmidhuber, J. (2013). Upper confidence weighted learning for efficient exploration in multiclass prediction with binary feedback. In IJCAI 2013 - Proceedings of the 23rd International Joint Conference on Artificial Intelligence (pp. 2488-2494).