Abstract
Solid protocols for benchmarking local feature detectors and descriptors were introduced by Mikolajczyk et al. [1,2]. These detectors and descriptors are popular tools in object class matching, but the wide-baseline setting of the benchmarks does not correspond to class-level matching, where appearance variation can be large. We extend the benchmarks to the class matching setting and evaluate state-of-the-art detectors and descriptors on Caltech and ImageNet classes. Our experiments provide important findings with regard to object class matching: (1) the original SIFT is still the best descriptor; (2) dense sampling outperforms interest point detectors by a clear margin; (3) detectors perform moderately well, but descriptors' performance collapses; (4) using multiple, even a few, best matches instead of the single best one has a significant effect on performance; (5) object pose variation degrades dense sampling performance, while the best detector (Hessian-affine) is unaffected. The performance of the best detector-descriptor pair is verified in the application of unsupervised visual class alignment, where state-of-the-art results are achieved. The findings help to improve existing detectors and descriptors, for which the framework provides an automatic validation tool.
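To illustrate findings (2) and (4), the sketch below computes SIFT descriptors on a dense grid rather than at detected interest points, and retains the k best matches per descriptor instead of only the single best. It is a minimal sketch using OpenCV's SIFT; the image filenames, grid step, patch size, and k = 3 are hypothetical choices for illustration, not the paper's exact evaluation protocol.

```python
import cv2

def dense_keypoints(image, step=8, size=8):
    """Place keypoints on a regular grid (dense sampling)."""
    h, w = image.shape[:2]
    return [cv2.KeyPoint(float(x), float(y), float(size))
            for y in range(step // 2, h, step)
            for x in range(step // 2, w, step)]

# Hypothetical images of two instances of the same object class.
img1 = cv2.imread("class_instance_a.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("class_instance_b.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()

# Dense sampling: a fixed grid replaces an interest point detector.
kp1 = dense_keypoints(img1)
kp2 = dense_keypoints(img2)
_, des1 = sift.compute(img1, kp1)
_, des2 = sift.compute(img2, kp2)

# Keep the k best matches per query descriptor, not just the single best.
matcher = cv2.BFMatcher(cv2.NORM_L2)
knn = matcher.knnMatch(des1, des2, k=3)
matches = [m for candidates in knn for m in candidates]
print(f"{len(matches)} candidate correspondences")
```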
| Original language | English |
| --- | --- |
| Journal | Neurocomputing |
| Volume | 184 |
| Issue number | C |
| Pages (from-to) | 3-12 |
| ISSN | 0925-2312 |
| DOIs | |
| Publication status | Published - 2016 |
| Event | Robust Local Descriptors for Computer Vision, Singapore. Duration: 1 Nov 2014 → 5 Nov 2016 |
Conference
| Conference | Robust Local Descriptors for Computer Vision |
| --- | --- |
| Country/Territory | Singapore |
| Period | 01/11/2014 → 05/11/2016 |
Keywords
- BRIEF
- Interest point
- Local descriptor
- Local detector
- SIFT
- SURF