Exploiting Higher Order and Multi-modal Features for 3D Object Detection

Lilita Kiforenko

Research output: Book/anthology/thesis/report; Ph.D. thesis; Research

Abstract

Object detection and pose estimation are fundamental tasks in computer vision. They are used in applications across many domains (e.g. robotics, surveillance and medical treatment). Many methods exist for solving these problems. However, a method that works well for one type of object in a specific scenario may fail in another scenario or with different objects, so more robust solutions are required. A typical approach is to design robust feature descriptors, which encode information describing an object's visual appearance, such as shape, colour and texture.

This thesis focuses on robust object detection and pose estimation of rigid objects using 3D information. The main contributions of the thesis are novel feature descriptors together with object detection and pose estimation algorithms.

The initial work introduces a feature descriptor that combines edge categorisation with a local multi-modal histogram descriptor in order to detect objects with little or no texture or surface variation. The presented edge descriptor is compared with a state-of-the-art method and outperforms it. The second work presents an approach for robust detection of multiple objects that combines feature descriptors capturing both surface and edge information. It reports quantitative results comparing the performance of the combined descriptors with that of the individual methods. The results show a significant performance gain for the proposed method.

The third work presents a performance evaluation of Point Pair Features (PPFs). The PPF is a successful feature descriptor that was introduced more than a decade ago but is still considered state of the art, and improvements to it continue to be published. The evaluation is performed on seven publicly available datasets; it compares PPFs with other widely used methods and also investigates the space of possible point pair relations for building PPFs. The overall results show that performance depends on the dataset used but that, in general, PPFs are more descriptive than local feature descriptors.
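
For orientation, a widely used four-dimensional formulation of a point pair feature describes two oriented points by the distance between them and three angles: each normal against the connecting vector, and the two normals against each other. The sketch below illustrates that common formulation only; the abstract does not state which point pair relations the thesis evaluates, and the function and parameter names are hypothetical.

```python
import math

def _sub(a, b):
    """Component-wise difference a - b of two 3D vectors."""
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))

def _norm(a):
    """Euclidean length of a 3D vector."""
    return math.sqrt(_dot(a, a))

def _angle(a, b):
    """Angle in radians between two vectors, in [0, pi]."""
    na, nb = _norm(a), _norm(b)
    if na == 0.0 or nb == 0.0:
        return 0.0  # assumption: treat degenerate vectors as zero angle
    cos_ab = max(-1.0, min(1.0, _dot(a, b) / (na * nb)))  # clamp rounding error
    return math.acos(cos_ab)

def point_pair_feature(p1, n1, p2, n2):
    """Common 4D PPF: (distance, angle(n1, d), angle(n2, d), angle(n1, n2)).

    p1, p2 are 3D points; n1, n2 are their surface normals; d = p2 - p1.
    """
    d = _sub(p2, p1)
    return (_norm(d), _angle(n1, d), _angle(n2, d), _angle(n1, n2))
```

Because the feature uses only a distance and angles, it does not change when the same rigid transformation is applied to both points and their normals, which is what makes such features usable for pose estimation.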

The fourth work presents a novel third-order feature descriptor called the Point Triplet Feature (PTF), which is an extension of the second-order PPF. The work focuses on how to design a third-order feature and how to handle the increased complexity that comes with a cubic number of features. The method was evaluated on two datasets and compared with the PPF and other methods. The overall results show that third-order features are more robust than second-order features, especially in highly occluded scenes, but at the cost of increased computation time.
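
To make the growth from a quadratic to a cubic number of features concrete, the following sketch counts unordered point pairs and point triplets for a given number of points. It is plain combinatorics, not the thesis's method: whether the thesis counts ordered or unordered combinations, and how it reduces the number of triplets, is not stated in the abstract. The function name is hypothetical.

```python
from math import comb

def feature_counts(n_points):
    """Return (pairs, triplets): unordered 2- and 3-point combinations."""
    pairs = comb(n_points, 2)     # grows quadratically with n_points
    triplets = comb(n_points, 3)  # grows cubically with n_points
    return pairs, triplets

if __name__ == "__main__":
    for n in (100, 1000):
        pairs, triplets = feature_counts(n)
        print(n, pairs, triplets)
    # prints:
    # 100 4950 161700
    # 1000 499500 166167000
```

With 1000 points there are already several hundred times more triplets than pairs, which is consistent with the increased computation time reported for third-order features.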

Finally, the last chapter presents a practical application of the proposed feature descriptors in a robotic cell, which was used in the European ACAT project.

Original language: English
Publisher: Syddansk Universitet. Det Tekniske Fakultet
Number of pages: 114
Publication status: Published - 2017

Fingerprint

Robotics
Textures
Computer vision
Color
Object detection
Costs

Note re. dissertation

Degree awarded on 02-02-2017

Cite this

Kiforenko, L. (2017). Exploiting Higher Order and Multi-modal Features for 3D Object Detection. Syddansk Universitet. Det Tekniske Fakultet.
@phdthesis{47ed25bb6a9e407c8ea6b235dea35ff7,
title = "Exploiting Higher Order and Multi-modal Features for 3D Object Detection",
author = "Lilita Kiforenko",
year = "2017",
language = "English",
publisher = "Syddansk Universitet. Det Tekniske Fakultet",
address = "Denmark",

}
