Learning Visual Representations for Perception-Action Systems

Justus Piater, Sebastien Jodogne, Renaud Detry, Dirk Kraft, Norbert Krüger, Oliver Kroemer, Jan Peters

Publication: Contribution to journal › Journal article › Research › peer review

Abstract

We discuss vision as a sensory modality for systems that effect actions in response to perceptions. While the internal representations informed by vision may be arbitrarily complex, we argue that in many cases it is advantageous to link them rather directly to action via learned mappings. These arguments are illustrated by two examples of our own work. First, our RLVC algorithm performs reinforcement learning directly on the visual input space. To make this very large space manageable, RLVC interleaves the reinforcement learner with a supervised classification algorithm that seeks to split perceptual states so as to reduce perceptual aliasing. This results in an adaptive discretization of the perceptual space based on the presence or absence of visual features. Its extension RLJC also handles continuous action spaces. In contrast to the minimalistic visual representations produced by RLVC and RLJC, our second method learns structural object models for robust object detection and pose estimation by probabilistic inference. To these models, the method associates grasp experiences autonomously learned by trial and error. These experiences form a nonparametric representation of grasp success likelihoods over gripper poses, which we call a grasp density. Thus, object detection in a novel scene simultaneously produces suitable grasping options.
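To make the RLVC idea more concrete, the sketch below shows one hypothetical way to interleave tabular reinforcement learning with feature-based splitting of perceptual states. It is an illustrative reconstruction, not the authors' implementation: the environment interface (`env.reset()` returning a set of detected visual features, `env.step()` returning features, reward, and a done flag), the split criterion based on Bellman-residual variance, and all names are assumptions made for the example.

```python
# Hypothetical sketch of the RLVC idea: Q-learning over an adaptive
# discretization of the visual input space; perceptual states (tree leaves)
# whose Bellman residuals suggest aliasing are split on the visual feature
# that best separates those residuals.
import random
from collections import defaultdict

class PerceptualTree:
    """Each leaf is a perceptual state defined by presence/absence of features."""
    def __init__(self):
        self.children = {}      # node -> (feature, child_if_absent, child_if_present)
        self.next_id = 1

    def state(self, features):
        node = 0
        while node in self.children:
            feat, absent, present = self.children[node]
            node = present if feat in features else absent
        return node

    def split(self, leaf, feature):
        self.children[leaf] = (feature, self.next_id, self.next_id + 1)
        self.next_id += 2

def best_separating_feature(data, all_features):
    """Pick the feature whose presence/absence best explains the residuals."""
    best, best_gap = None, 0.0
    for f in all_features:
        with_f = [e for e, obs in data if f in obs]
        without = [e for e, obs in data if f not in obs]
        if not with_f or not without:
            continue
        gap = abs(sum(with_f) / len(with_f) - sum(without) / len(without))
        if gap > best_gap:
            best, best_gap = f, gap
    return best

def rlvc(env, actions, all_features, episodes=200, alpha=0.1, gamma=0.95,
         eps=0.1, split_threshold=1.0):
    """env.reset() -> feature set; env.step(a) -> (feature set, reward, done)."""
    tree, Q = PerceptualTree(), defaultdict(float)
    residuals = defaultdict(list)   # (state, action) -> [(Bellman residual, obs)]

    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            s = tree.state(obs)
            a = (random.choice(actions) if random.random() < eps
                 else max(actions, key=lambda b: Q[s, b]))
            obs2, r, done = env.step(a)
            s2 = tree.state(obs2)
            target = r + (0.0 if done else gamma * max(Q[s2, b] for b in actions))
            residuals[s, a].append((target - Q[s, a], obs))
            Q[s, a] += alpha * (target - Q[s, a])
            obs = obs2

        # Split perceptual states whose residual variance indicates aliasing.
        for (s, a), data in list(residuals.items()):
            errs = [e for e, _ in data]
            if len(errs) < 20:
                continue
            var = sum(e * e for e in errs) / len(errs)
            if var > split_threshold and s not in tree.children:
                feat = best_separating_feature(data, all_features)
                if feat is not None:
                    tree.split(s, feat)        # refine the discretization
                    residuals.pop((s, a))      # statistics restart for the new leaves
    return tree, Q
```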
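Similarly, the following is a minimal, hypothetical sketch of a grasp density as a kernel density estimate over gripper poses, again not the authors' code: successful grasps collected by trial and error serve as kernel centres, with an assumed Gaussian position kernel and a quaternion-based orientation kernel; the class and parameter names (`GraspDensity`, `sigma_pos`, `kappa_rot`) are made up for the example.

```python
# Hypothetical grasp-density sketch: a nonparametric (kernel) estimate of
# grasp-success likelihood over 6-DOF gripper poses, represented as
# 7-tuples (x, y, z, qw, qx, qy, qz) with a unit quaternion.
import math
import random

class GraspDensity:
    def __init__(self, sigma_pos=0.01, kappa_rot=20.0):
        self.centres = []            # poses of grasps that succeeded
        self.sigma_pos = sigma_pos   # position bandwidth (metres), assumed value
        self.kappa_rot = kappa_rot   # orientation concentration, assumed value

    def add_success(self, pose):
        """Record one grasp pose that succeeded during autonomous exploration."""
        self.centres.append(tuple(pose))

    def _kernel(self, pose, centre):
        # Isotropic Gaussian kernel on the gripper position.
        d2 = sum((p - c) ** 2 for p, c in zip(pose[:3], centre[:3]))
        k_pos = math.exp(-0.5 * d2 / self.sigma_pos ** 2)
        # von Mises-Fisher-like kernel on the quaternion (antipodally symmetric).
        dot = abs(sum(p * c for p, c in zip(pose[3:], centre[3:])))
        k_rot = math.exp(self.kappa_rot * (dot - 1.0))
        return k_pos * k_rot

    def likelihood(self, pose):
        """Unnormalised grasp-success likelihood at a gripper pose."""
        if not self.centres:
            return 0.0
        return sum(self._kernel(pose, c) for c in self.centres) / len(self.centres)

    def sample(self):
        """Draw a candidate grasp: pick a kernel centre and perturb its position."""
        c = random.choice(self.centres)
        return tuple(v + random.gauss(0.0, self.sigma_pos) for v in c[:3]) + tuple(c[3:])
```

In this reading, the density is expressed in the object's frame; once the object is detected and its pose estimated in a new scene, the stored grasp poses are transformed into the scene, so detection directly yields grasping options, as the abstract describes.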
Original language: English
Journal: International Journal of Robotics Research
Volume: 30
Issue number: 3
Pages (from-to): 294-307
ISSN: 0278-3649
DOI: 10.1177/0278364910382464
Status: Published - 2011

Fingerprint

Grippers
Reinforcement learning
Object Detection
Reinforcement
Probabilistic Inference
Grasping
Supervised Classification
Pose Estimation
Aliasing
Trial and error
Object Model
Structural Model
Classification Algorithm
Reinforcement Learning
Modality
Likelihood
Discretization
Learning
Vision
Perception

Cite this

Piater, Justus; Jodogne, Sebastien; Detry, Renaud; Kraft, Dirk; Krüger, Norbert; Kroemer, Oliver; Peters, Jan. Learning Visual Representations for Perception-Action Systems. In: International Journal of Robotics Research. 2011; Vol. 30, No. 3, pp. 294-307.
@article{3f8b76f09e4911df846a000ea68e967b,
title = "Learning Visual Representations for Perception-Action Systems",
abstract = "We discuss vision as a sensory modality for systems that effect actions in response to perceptions. While the internal representations informed by vision may be arbitrarily complex, we argue that in many cases it is advantageous to link them rather directly to action via learned mappings. These arguments are illustrated by two examples of our own work. First, our RLVC algorithm performs reinforcement learning directly on the visual input space. To make this very large space manageable, RLVC interleaves the reinforcement learner with a supervised classification algorithm that seeks to split perceptual states so as to reduce perceptual aliasing. This results in an adaptive discretization of the perceptual space based on the presence or absence of visual features. Its extension RLJC also handles continuous action spaces. In contrast to the minimalistic visual representations produced by RLVC and RLJC, our second method learns structural object models for robust object detection and pose estimation by probabilistic inference. To these models, the method associates grasp experiences autonomously learned by trial and error. These experiences form a nonparametric representation of grasp success likelihoods over gripper poses, which we call a grasp density. Thus, object detection in a novel scene simultaneously produces suitable grasping options.",
author = "Justus Piater and Sebastien Jodogne and Renaud Detry and Dirk Kraft and Norbert Kr{\"u}ger and Oliver Kroemer and Jan Peters",
year = "2011",
doi = "10.1177/0278364910382464",
language = "English",
volume = "30",
pages = "294--307",
journal = "International Journal of Robotics Research",
issn = "0278-3649",
publisher = "SAGE Publications",
number = "3",

}


TY - JOUR

T1 - Learning Visual Representations for Perception-Action Systems

AU - Piater, Justus

AU - Jodogne, Sebastien

AU - Detry, Renaud

AU - Kraft, Dirk

AU - Krüger, Norbert

AU - Kroemer, Oliver

AU - Peters, Jan

PY - 2011

Y1 - 2011

N2 - We discuss vision as a sensory modality for systems that effect actions in response to perceptions. While the internal representations informed by vision may be arbitrarily complex, we argue that in many cases it is advantageous to link them rather directly to action via learned mappings. These arguments are illustrated by two examples of our own work. First, our RLVC algorithm performs reinforcement learning directly on the visual input space. To make this very large space manageable, RLVC interleaves the reinforcement learner with a supervised classification algorithm that seeks to split perceptual states so as to reduce perceptual aliasing. This results in an adaptive discretization of the perceptual space based on the presence or absence of visual features. Its extension RLJC also handles continuous action spaces. In contrast to the minimalistic visual representations produced by RLVC and RLJC, our second method learns structural object models for robust object detection and pose estimation by probabilistic inference. To these models, the method associates grasp experiences autonomously learned by trial and error. These experiences form a nonparametric representation of grasp success likelihoods over gripper poses, which we call a grasp density. Thus, object detection in a novel scene simultaneously produces suitable grasping options.

AB - We discuss vision as a sensory modality for systems that effect actions in response to perceptions. While the internal representations informed by vision may be arbitrarily complex, we argue that in many cases it is advantageous to link them rather directly to action via learned mappings. These arguments are illustrated by two examples of our own work. First, our RLVC algorithm performs reinforcement learning directly on the visual input space. To make this very large space manageable, RLVC interleaves the reinforcement learner with a supervised classification algorithm that seeks to split perceptual states so as to reduce perceptual aliasing. This results in an adaptive discretization of the perceptual space based on the presence or absence of visual features. Its extension RLJC also handles continuous action spaces. In contrast to the minimalistic visual representations produced by RLVC and RLJC, our second method learns structural object models for robust object detection and pose estimation by probabilistic inference. To these models, the method associates grasp experiences autonomously learned by trial and error. These experiences form a nonparametric representation of grasp success likelihoods over gripper poses, which we call a grasp density. Thus, object detection in a novel scene simultaneously produces suitable grasping options.

U2 - 10.1177/0278364910382464

DO - 10.1177/0278364910382464

M3 - Journal article

VL - 30

SP - 294

EP - 307

JO - International Journal of Robotics Research

JF - International Journal of Robotics Research

SN - 0278-3649

IS - 3

ER -