Extraction of reward-related feature space using correlation-based and reward-based learning methods

Poramate Manoonpong*, Florentin Wörgötter, Jun Morimoto

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

The purpose of this article is to present a novel learning paradigm that extracts a reward-related low-dimensional state space by combining correlation-based learning, such as Input Correlation Learning (ICO learning), with reward-based learning, such as Reinforcement Learning (RL). Since ICO learning can quickly find a correlation between a state and an unwanted condition (e.g., failure), we use it to extract a low-dimensional feature space in which a failure avoidance policy can be found. The extracted feature space is then used as a prior for RL. If a proper feature space can be extracted for a given task, the policy model can remain simple and the policy can be easily improved. The performance of this learning paradigm is evaluated through simulation of a cart-pole system. The results show that the proposed method enhances the feature extraction process and finds a proper feature space for a pole balancing policy. That is, it allows the policy to stabilize the pole over a larger domain of initial conditions than using only ICO learning or only RL without any prior knowledge.
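The ICO learning rule mentioned in the abstract updates each synaptic weight in proportion to the correlation between its predictive input and the temporal derivative of the reflex (unwanted-condition) signal. A minimal discrete-time sketch of that idea (the function name and toy signals are illustrative, not taken from the paper):

```python
def ico_update(weights, predictive_inputs, reflex_now, reflex_prev, rate=0.01):
    """One ICO learning step: each weight changes in proportion to the
    correlation between its predictive input and the temporal derivative
    of the reflex (unwanted-condition) signal."""
    d_reflex = reflex_now - reflex_prev  # discrete derivative of the reflex signal
    return [w + rate * x * d_reflex for w, x in zip(weights, predictive_inputs)]

# Toy run: input 0 is active before the reflex fires, so its weight grows;
# input 1 is never active, so its weight stays at zero.
weights = [0.0, 0.0]
reflex = [0.0, 0.0, 0.5, 1.0, 1.0]      # unwanted condition rises at t = 2
inputs = [[1.0, 0.0]] * len(reflex)     # predictive inputs at each time step
for t in range(1, len(reflex)):
    weights = ico_update(weights, inputs[t], reflex[t], reflex[t - 1], rate=0.1)
```

Because the update is driven by the reflex derivative, weights converge once the predictive inputs anticipate the unwanted condition, which is why the method can quickly isolate the few state dimensions relevant to failure avoidance.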

Original language: English
Title of host publication: Neural Information Processing: Theory and Algorithms - 17th International Conference, ICONIP 2010, Proceedings
Number of pages: 8
Publication date: 21. Dec 2010
Edition: PART 1
Pages: 414-421
ISBN (Print): 3642175368, 9783642175367
Publication status: Published - 21. Dec 2010
Event: 17th International Conference on Neural Information Processing, ICONIP 2010 - Sydney, NSW, Australia
Duration: 22. Nov 2010 - 25. Nov 2010

Conference

Conference: 17th International Conference on Neural Information Processing, ICONIP 2010
Country/Territory: Australia
City: Sydney, NSW
Period: 22/11/2010 - 25/11/2010
Sponsor: Asia Pacific Neural Network Assembly (APNNA)
Series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 6443 LNCS
ISSN: 0302-9743

Keywords

  • Neural control
  • Pole balancing
  • Reinforcement learning
  • Sequential combination
  • Unsupervised learning
