Extraction of reward-related feature space using correlation-based and reward-based learning methods

Poramate Manoonpong*, Florentin Wörgötter, Jun Morimoto

*Corresponding author for this work

Publication: Chapter in book/report/conference proceeding; Conference contribution in proceedings; Research; peer-reviewed

Abstract

This article presents a novel learning paradigm that extracts a reward-related, low-dimensional state space by combining correlation-based learning, such as Input Correlation Learning (ICO learning), with reward-based learning, such as Reinforcement Learning (RL). Because ICO learning can quickly find a correlation between a state and an unwanted condition (e.g., failure), we use it to extract a low-dimensional feature space in which a failure-avoidance policy can be found. The extracted feature space is then used as a prior for RL. If a proper feature space can be extracted for a given task, the policy model can be simple and the policy can be improved easily. The performance of this learning paradigm is evaluated in simulation on a cart-pole system. The results show that the proposed method enhances the feature extraction process so that it finds a proper feature space for a pole-balancing policy. That is, it allows a policy to stabilize the pole effectively over a larger domain of initial conditions than using only ICO learning or only RL without any prior knowledge.
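As a rough illustration of the correlation-based component only, ICO learning is commonly written as a weight update in which each predictive input is correlated with the time derivative of a reflex (failure) signal: Δw_j = μ · u_j · du_0/dt. The sketch below applies that rule to hypothetical toy signals. The function names, signal timings, learning rate, and the absence of input filtering are all assumptions made for illustration; this is not the cart-pole setup or the implementation evaluated in the paper.

```python
def ico_update(weights, predictive_inputs, reflex_now, reflex_prev,
               learning_rate=0.1, dt=1.0):
    """One ICO learning step: dw_j = mu * u_j * d(u_0)/dt.

    weights and predictive_inputs are lists of equal length;
    reflex_now and reflex_prev are the reflex (failure) signal u_0 at the
    current and previous time step.
    """
    d_reflex = (reflex_now - reflex_prev) / dt  # finite-difference derivative
    return [w + learning_rate * u * d_reflex
            for w, u in zip(weights, predictive_inputs)]


def run_toy_trial(steps=50, learning_rate=0.1):
    """Hypothetical toy trial (assumed signals, not from the paper).

    u1 is active for t in [10, 20) and therefore precedes the reflex signal,
    which is active for t in [15, 20). u2 is unrelated and stays at zero.
    """
    weights = [0.0, 0.0]
    reflex_prev = 0.0
    for t in range(steps):
        u1 = 1.0 if 10 <= t < 20 else 0.0
        u2 = 0.0
        reflex_now = 1.0 if 15 <= t < 20 else 0.0
        weights = ico_update(weights, [u1, u2], reflex_now, reflex_prev,
                             learning_rate)
        reflex_prev = reflex_now
    return weights


if __name__ == "__main__":
    print(run_toy_trial())
```

In this toy trial only the weight of the input that is active when the failure signal switches on grows, while the weight of the unrelated input stays at zero. That is the sense in which a correlation-based rule can pick out the inputs relevant to avoiding failure before a reward-based method refines the policy.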

Original language: English
Title: Neural Information Processing: Theory and Algorithms - 17th International Conference, ICONIP 2010, Proceedings
Number of pages: 8
Publication date: 21 Dec 2010
Edition: PART 1
Pages: 414-421
ISBN (Print): 3642175368, 9783642175367
DOI
Status: Published - 21 Dec 2010
Event: 17th International Conference on Neural Information Processing, ICONIP 2010 - Sydney, NSW, Australia
Duration: 22 Nov 2010 to 25 Nov 2010

Conference

Conference: 17th International Conference on Neural Information Processing, ICONIP 2010
Country: Australia
City: Sydney, NSW
Period: 22/11/2010 to 25/11/2010
Sponsor: Asia Pacific Neural Network Assembly (APNNA)
Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 6443 LNCS
ISSN: 0302-9743
