Expert Initialized Reinforcement Learning with Application to Robotic Assembly

Jeppe Langaa*, Christoffer Sloth*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

This paper investigates the advantages and limitations of actor-critic reinforcement learning algorithms in an industrial setting. We compare Cycle of Learning, Deep Deterministic Policy Gradient, and Twin Delayed Deep Deterministic Policy Gradient with respect to their performance in simulation and on a real robot setup. Furthermore, we emphasize the importance and potential of combining demonstrated expert behavior with actor-critic reinforcement learning, used together with an admittance controller, to solve an industrial assembly task. Cycle of Learning and Twin Delayed Deep Deterministic Policy Gradient proved equally usable in simulation, while Cycle of Learning performed best in the real-world application because its behavior cloning loss enables the agent to learn rapidly. The results also demonstrate that an admittance controller is necessary to transfer the learned behavior to a real robot.
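As a rough illustration of the admittance controller mentioned in the abstract, the following minimal sketch simulates a one-degree-of-freedom virtual mass-damper-spring that turns a measured contact force into a compliant position command. It is not the authors' implementation: the class name `Admittance1D`, the helper `settle`, the gain values, and the assumption that a learned policy would supply the reference position `x_ref` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Admittance1D:
    """Hypothetical 1-DOF admittance law: M*a + D*v + K*(x - x_ref) = f_ext."""

    mass: float = 1.0          # virtual mass [kg] (assumed value)
    damping: float = 40.0      # virtual damping [N*s/m] (assumed value)
    stiffness: float = 400.0   # virtual stiffness [N/m] (assumed value)
    x: float = 0.0             # compliant position command [m]
    v: float = 0.0             # compliant velocity [m/s]

    def step(self, x_ref: float, f_ext: float, dt: float) -> float:
        """Advance the virtual dynamics by dt and return the position command."""
        a = (f_ext
             - self.damping * self.v
             - self.stiffness * (self.x - x_ref)) / self.mass
        self.v += a * dt        # semi-implicit Euler: update velocity first
        self.x += self.v * dt   # then position, using the updated velocity
        return self.x


def settle(ctrl: Admittance1D, x_ref: float, f_ext: float,
           dt: float = 0.001, steps: int = 5000) -> float:
    """Hold the reference and external force constant and run the controller."""
    for _ in range(steps):
        ctrl.step(x_ref, f_ext, dt)
    return ctrl.x


if __name__ == "__main__":
    # A constant contact force pushes the command away from the reference
    # by f_ext / stiffness once the virtual dynamics have settled.
    print(settle(Admittance1D(), x_ref=0.1, f_ext=-10.0))
```

Under these assumed gains the virtual system is critically damped, so the position command yields to contact forces without oscillating. This kind of compliance is one plausible reason an admittance layer helps when moving a policy from simulation to hardware.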

Original language: English
Title of host publication: 2022 IEEE 18th International Conference on Automation Science and Engineering (CASE)
Publisher: IEEE Computer Society
Publication date: 2022
Pages: 1405-1410
ISBN (Electronic): 9781665490429
DOIs
Publication status: Published - 2022
Event: 18th IEEE International Conference on Automation Science and Engineering, CASE 2022 - Mexico City, Mexico
Duration: 20 Aug 2022 to 24 Aug 2022

Conference

Conference: 18th IEEE International Conference on Automation Science and Engineering, CASE 2022
Country/Territory: Mexico
City: Mexico City
Period: 20/08/2022 to 24/08/2022
Series: Proceedings - IEEE International Conference on Automation Science and Engineering
Volume: 2022-August
ISSN: 2161-8070
