Expert Initialized Hybrid Model-Based and Model-Free Reinforcement Learning

Jeppe Langaa*, Christoffer Sloth

*Corresponding author

Publication: Chapter in book/report/conference proceeding; Conference contribution in proceedings; Research; Peer-reviewed

Abstract

This paper presents a reinforcement learning algorithm that enables fast learning of control policies from a limited amount of training data by combining the strengths of model-based and model-free algorithms. The algorithm is initialized from expert demonstrations, which are used to learn a Gaussian process model and a policy that behaves similarly to the expert. The policy is subsequently improved using the Bi-population Covariance Matrix Adaptation Evolution Strategy (BIPOP-CMA-ES), a black-box optimizer that exploits the learned model. Finally, the policy parameters obtained from BIPOP-CMA-ES are refined by a model-free reinforcement learning algorithm. Scalable Variational Gaussian Processes are used in the model to allow high-dimensional state spaces and larger amounts of data; in addition, autoencoders are used for dimensionality reduction of the parameter space in BIPOP-CMA-ES. The algorithm is tested on a cart-pole system as well as on a higher-dimensional industrial peg-in-hole task and is compared to state-of-the-art model-free and model-based algorithms. The proposed algorithm solves the peg-in-hole task faster than previous algorithms.
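The staged structure described in the abstract (expert initialization, model-based improvement, model-free refinement) can be illustrated with a deliberately simplified sketch. This is not the authors' implementation: it assumes a hypothetical scalar linear plant, replaces the Gaussian process model with a least-squares fit, replaces both BIPOP-CMA-ES and the model-free reinforcement learning step with a greedy random search, and omits the autoencoder entirely. All names and constants (`A_TRUE`, `B_TRUE`, `rollout_cost`, `hill_climb`, `run`) are made up for illustration.

```python
import random

# Hypothetical scalar plant x' = A*x + B*u (an assumption, not from the paper).
A_TRUE, B_TRUE = 1.0, 0.5


def step_real(x, u):
    """One step of the 'real' system."""
    return A_TRUE * x + B_TRUE * u


def rollout_cost(gain, step, x0=1.0, horizon=20):
    """Quadratic cost of the linear policy u = -gain * x under dynamics `step`."""
    x, cost = x0, 0.0
    for _ in range(horizon):
        u = -gain * x
        cost += x * x + 0.1 * u * u
        x = step(x, u)
    return cost


def hill_climb(gain, step, rng, iters, sigma):
    """Greedy random search (stand-in optimizer): keep a candidate only if it lowers the cost."""
    best = rollout_cost(gain, step)
    for _ in range(iters):
        cand = gain + rng.gauss(0.0, sigma)
        c = rollout_cost(cand, step)
        if c < best:
            gain, best = cand, c
    return gain


def run(seed=0):
    rng = random.Random(seed)

    # Stage 1: expert demonstrations (expert gain 0.5 plus small action noise).
    data, x = [], 1.0
    for _ in range(50):
        u = -0.5 * x + rng.gauss(0.0, 0.1)
        x_next = step_real(x, u)
        data.append((x, u, x_next))
        x = x_next

    # Stage 2a: behavior cloning of a linear policy u = -gain * x.
    sxx = sum(x * x for x, _, _ in data)
    sxu = sum(x * u for x, u, _ in data)
    gain0 = -sxu / sxx

    # Stage 2b: least-squares dynamics model x' ~ a*x + b*u (stand-in for the GP).
    suu = sum(u * u for _, u, _ in data)
    sxy = sum(x * y for x, _, y in data)
    suy = sum(u * y for _, u, y in data)
    det = sxx * suu - sxu * sxu
    a = (sxy * suu - sxu * suy) / det
    b = (sxx * suy - sxu * sxy) / det

    def model(x, u):
        return a * x + b * u

    # Stage 3: improve the policy using only the learned model.
    gain1 = hill_climb(gain0, model, rng, iters=200, sigma=0.2)

    # Stage 4: refine with a few rollouts on the real system.
    gain2 = hill_climb(gain1, step_real, rng, iters=20, sigma=0.05)
    return gain0, gain1, gain2, (a, b)


if __name__ == "__main__":
    g0, g1, g2, (a, b) = run()
    print("cloned gain:", g0, "model-improved gain:", g1, "refined gain:", g2)
    print("fitted model:", a, b)
```

The point of the sketch is the division of labor: most optimization iterations are spent on the cheap learned model, and only a small number of rollouts touch the real system.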
Original language: English
Title: 2023 European Control Conference (ECC)
Number of pages: 6
Publisher: IEEE
Publication date: 2023
ISBN (Electronic): 978-3-907144-08-4
DOI
Status: Published - 2023
Event: 2023 European Control Conference, ECC 2023 - Bucharest, Romania
Duration: 13 Jun 2023 to 16 Jun 2023

Conference

Conference: 2023 European Control Conference, ECC 2023
Country/Territory: Romania
City: Bucharest
Period: 13/06/2023 to 16/06/2023

  • PIRAT

    Sloth, C. (Project participant)

    01/10/2019 to 01/10/2023

    Projects: Project; Research
