This paper presents a reinforcement learning algorithm that learns control policies quickly from a limited amount of training data by combining the strengths of model-based and model-free algorithms. Expert demonstrations are used to initialize the algorithm: from them, it learns a Gaussian process model and a policy that behaves similarly to the expert. The policy is then improved using the Bi-population Covariance Matrix Adaptation Evolution Strategy (BIPOP-CMA-ES), a black-box optimizer that exploits the learned model. Finally, the policy parameters obtained from BIPOP-CMA-ES are refined by a model-free reinforcement learning algorithm. Scalable Variational Gaussian Processes are used in the model to accommodate high-dimensional state spaces and larger amounts of data; in addition, autoencoders are used to reduce the dimensionality of the parameter space searched by BIPOP-CMA-ES. The algorithm is tested on a cart-pole system as well as on a higher-dimensional industrial peg-in-hole task, and is compared to state-of-the-art model-free and model-based algorithms. The proposed algorithm solves the peg-in-hole task faster than previous algorithms.
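The first two stages described in the abstract (initialization from demonstrations, then black-box policy improvement on the learned model) can be illustrated with a deliberately simplified sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the 1-D plant `true_step`, the quadratic cost, a least-squares linear model standing in for the Gaussian process, and a basic elitist evolution strategy standing in for BIPOP-CMA-ES. The final model-free refinement stage and the autoencoder are omitted.

```python
import random


def true_step(x, u):
    # Hypothetical 1-D plant, used only to generate demonstration data.
    return 0.9 * x + 0.5 * u


def collect_demos(expert_gain, n_rollouts=5, horizon=20, seed=0):
    # Roll out a noisy linear "expert" u = -expert_gain * x on the plant.
    rng = random.Random(seed)
    data = []
    for _ in range(n_rollouts):
        x = rng.uniform(-1.0, 1.0)
        for _ in range(horizon):
            u = -expert_gain * x + rng.gauss(0.0, 0.05)
            x_next = true_step(x, u)
            data.append((x, u, x_next))
            x = x_next
    return data


def fit_linear_model(data):
    # Least-squares fit of x_next = a*x + b*u (stand-in for the GP model).
    sxx = sum(x * x for x, u, y in data)
    sxu = sum(x * u for x, u, y in data)
    suu = sum(u * u for x, u, y in data)
    sxy = sum(x * y for x, u, y in data)
    suy = sum(u * y for x, u, y in data)
    det = sxx * suu - sxu * sxu
    a = (sxy * suu - suy * sxu) / det
    b = (suy * sxx - sxy * sxu) / det
    return a, b


def clone_policy(data):
    # Behaviour cloning: least-squares gain k for the policy u = -k * x.
    return -sum(x * u for x, u, _ in data) / sum(x * x for x, u, _ in data)


def model_cost(k, model, x0=1.0, horizon=20):
    # Quadratic cost of the policy u = -k * x, simulated on the learned model.
    a, b = model
    x, cost = x0, 0.0
    for _ in range(horizon):
        u = -k * x
        cost += x * x + 0.1 * u * u
        x = a * x + b * u
    return cost


def improve_on_model(k_init, model, generations=30, popsize=8, seed=1):
    # Simple elitist evolution strategy (NOT BIPOP-CMA-ES) over the gain k.
    rng = random.Random(seed)
    mean, sigma = k_init, 0.3
    best_k, best_cost = k_init, model_cost(k_init, model)
    for _ in range(generations):
        candidates = [mean + sigma * rng.gauss(0.0, 1.0) for _ in range(popsize)]
        candidates.sort(key=lambda k: model_cost(k, model))
        top_cost = model_cost(candidates[0], model)
        if top_cost < best_cost:
            best_k, best_cost = candidates[0], top_cost
        elite = candidates[: popsize // 2]
        mean = sum(elite) / len(elite)  # recombine the better half
        sigma *= 0.9                    # fixed step-size decay
    return best_k


demos = collect_demos(expert_gain=0.5)
model = fit_linear_model(demos)
k_init = clone_policy(demos)
k_opt = improve_on_model(k_init, model)
print("model:", model)
print("cost of cloned policy:", model_cost(k_init, model))
print("cost after ES on model:", model_cost(k_opt, model))
```

Because the optimizer only ever queries the learned model, no further interaction with the plant is needed at this stage. That is the source of the data efficiency the abstract refers to. In the paper's setting, the resulting parameters would then be passed to a model-free algorithm for refinement on the real system.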
|Title||2023 European Control Conference (ECC)|
|Status||Published - 2023|
|Event||2023 European Control Conference, ECC 2023 - Bucharest, Romania|
Duration: 13 Jun 2023 → 16 Jun 2023
|Conference||2023 European Control Conference, ECC 2023|
|Period||13/06/2023 → 16/06/2023|