TY - JOUR
T1 - A new benchmark for pose estimation with ground truth from virtual reality
AU - Schlette, Christian
AU - Buch, Anders Glent
AU - Aksoy, Eren Erdal
AU - Steil, Thomas
AU - Papon, Jérémie
AU - Savarimuthu, Thiusius Rajeeth
AU - Wörgötter, Florentin
AU - Krüger, Norbert
AU - Roßmann, Jürgen
PY - 2014
Y1 - 2014
AB - The development of programming paradigms for industrial assembly is currently receiving fresh impetus from approaches based on human demonstration and programming-by-demonstration. Major low- and mid-level prerequisites for machine vision and learning in these intelligent robotic applications are pose estimation, stereo reconstruction and action recognition. As a basis for the machine vision and learning involved, pose estimation is used to derive object positions and orientations and thus target frames for robot execution. Our contribution introduces and applies a novel benchmark for typical multi-sensor setups and algorithms in the field of demonstration-based automated assembly. The benchmark platform is equipped with a multi-sensor setup consisting of stereo cameras and depth scanning devices. The dimensions and capabilities of the platform were chosen to reflect typical manual assembly tasks. Following the eRobotics methodology, a simulatable 3D representation of this platform was modelled in virtual reality. Based on a detailed camera and sensor simulation, we generated a set of benchmark images and point clouds with controlled levels of noise as well as ground truth data such as object positions and time stamps. We demonstrate the application of the benchmark by evaluating our latest developments in pose estimation, stereo reconstruction and action recognition, and we publish the benchmark data for objective comparison of sensor setups and algorithms in industry.
KW - Industrial Assembly
KW - Machine Vision
KW - Machine Learning
KW - Virtual Reality
UR - http://www.scopus.com/inward/record.url?scp=84911006617&partnerID=8YFLogxK
DO - 10.1007/s11740-014-0552-0
M3 - Journal article
AN - SCOPUS:84911006617
SN - 0944-6524
VL - 8
SP - 745
EP - 754
JO - Production Engineering
JF - Production Engineering
IS - 6
ER -