The purpose of this article is to present a novel learning paradigm that extracts reward-related low-dimensional state space by combining correlation-based learning like Input Correlation Learning (ICO learning) and reward-based learning like Reinforcement Learning (RL). Since ICO learning can quickly find a correlation between a state and an unwanted condition (e.g., failure), we use it to extract low-dimensional feature space in which we can find a failure avoidance policy. Then, the extracted feature space is used as a prior for RL. If we can extract proper feature space for a given task, a model of the policy can be simple and the policy can be easily improved. The performance of this learning paradigm is evaluated through simulation of a cart-pole system. As a result, we show that the proposed method can enhance the feature extraction process to find the proper feature space for a pole balancing policy. That is it allows a policy to effectively stabilize the pole in the largest domain of initial conditions compared to only using ICO learning or only using RL without any prior knowledge.