Visual Correspondence Hallucination

ICLR 2022

Abstract

Given a pair of partially overlapping source and target images and a keypoint in the source image, the keypoint's correspondent in the target image can be either visible, occluded or outside the field of view. Local feature matching methods are only able to identify the correspondent's location when it is visible, while humans can also hallucinate its location when it is occluded or outside the field of view through geometric reasoning. In this paper, we bridge this gap by training a network to output a peaked probability distribution over the correspondent's location, regardless of whether this correspondent is visible, occluded, or outside the field of view. We experimentally demonstrate that this network is indeed able to hallucinate correspondences on pairs of images captured in scenes that were not seen at training time. We also apply this network to an absolute camera pose estimation problem and find it is significantly more robust than state-of-the-art local feature matching-based competitors.


Overview

Our network, called NeurHal, takes as input a pair of partially overlapping source/target images and a set of keypoints in the source image, and outputs for each keypoint a probability distribution over its correspondent’s location in the target image plane. When the correspondent is visible in the target image, its location can be identified. Otherwise, its location must be hallucinated. Two types of hallucination tasks can be distinguished: 1) if the point is occluded, its location has to be inpainted; 2) if the point is outside the field of view, its location needs to be outpainted. We show the probability distributions predicted by our network on test images from ScanNet and Megadepth.
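To make the idea of a distribution over the target image plane concrete, here is a minimal NumPy sketch. It is not NeurHal itself: the score grid, the padding size `pad`, and the helper names `correspondence_map` and `locate_correspondent` are all assumptions made for illustration. The sketch only shows how a map defined on a grid larger than the target image can place its mode outside the image borders, which is the outpainting case. Telling a visible correspondent from an occluded one would need more than the location of the mode.

```python
import numpy as np

def correspondence_map(logits):
    """Turn raw scores over the extended target plane into a probability map (2D softmax)."""
    shifted = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(shifted)
    return probs / probs.sum()

def locate_correspondent(probs, pad):
    """Return the mode of the map in target-image coordinates and whether it lies inside the image.

    pad is the (hypothetical) number of grid cells added on each side of the image.
    """
    rows, cols = probs.shape
    row, col = np.unravel_index(np.argmax(probs), probs.shape)
    x, y = int(col) - pad, int(row) - pad  # shift so that (0, 0) is the top-left image pixel
    inside = (0 <= x < cols - 2 * pad) and (0 <= y < rows - 2 * pad)
    return (x, y), bool(inside)

# Toy example: a 6x8 target image surrounded by 3 cells of padding on every side.
h, w, pad = 6, 8, 3
logits = np.zeros((h + 2 * pad, w + 2 * pad))
logits[1, 2] = 10.0  # made-up peak located in the padded region, outside the image
probs = correspondence_map(logits)
xy, inside = locate_correspondent(probs, pad)
print(xy, inside)
```

In this toy case the mode falls at negative image coordinates, so the correspondent is predicted outside the field of view.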


Qualitative Results

To illustrate the ability of NeurHal to perform visual correspondence hallucination, we display correspondence maps output by NeurHal on validation image pairs: (top row) outpainting examples, (bottom row) inpainting examples. In the source image, the red dot marks the keypoint. In the target image and in the (negative log) correspondence map, the red dot marks the ground-truth location of the keypoint’s correspondent. The dashed rectangles show the borders of the target images.

We use NeurHal to perform NRE-based camera pose estimation on low-overlap images. More qualitative and quantitative results are available in the paper.
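As a rough illustration of how correspondence maps can drive pose estimation, the sketch below scores a candidate camera pose by summing negative log-probabilities at the reprojected 3D points. It assumes that NRE stands for Neural Reprojection Error, which this page does not spell out, and it is a simplified stand-in rather than the estimator used in the paper: it uses a pinhole camera, nearest-cell lookup, and no robust handling of outliers. The function names, the padding `pad`, and all numeric values are hypothetical.

```python
import numpy as np

def project(points_3d, R, t, K):
    """Project 3D points (N, 3) to pixel coordinates (N, 2) with a pinhole camera.

    Assumes every point lies in front of the camera (positive depth).
    """
    cam = points_3d @ R.T + t
    uv = cam @ K.T
    return uv[:, :2] / uv[:, 2:3]

def reprojection_cost(neg_log_maps, points_3d, R, t, K, pad):
    """Sum the negative log-probabilities found at the reprojected locations.

    neg_log_maps has shape (N, H + 2*pad, W + 2*pad): one map per 3D point,
    defined on the target image plane extended by pad cells on each side.
    """
    uv = np.rint(project(points_3d, R, t, K)).astype(int) + pad
    n, rows, cols = neg_log_maps.shape
    c = np.clip(uv[:, 0], 0, cols - 1)  # clamp to the extended grid
    r = np.clip(uv[:, 1], 0, rows - 1)
    return float(neg_log_maps[np.arange(n), r, c].sum())

# Toy setup: 6x8 image, 3 cells of padding, two 3D points (one reprojects outside the image).
K = np.array([[10.0, 0.0, 4.0], [0.0, 10.0, 3.0], [0.0, 0.0, 1.0]])
points = np.array([[0.0, 0.0, 1.0], [-0.6, 0.0, 1.0]])
pad = 3
maps = np.full((2, 6 + 2 * pad, 8 + 2 * pad), 5.0)  # made-up negative log values
maps[0, 3 + pad, 4 + pad] = 0.1   # peak at pixel (4, 3), inside the image
maps[1, 3 + pad, -2 + pad] = 0.1  # peak at pixel (-2, 3), outside the image

R = np.eye(3)
cost_true = reprojection_cost(maps, points, R, np.zeros(3), K, pad)
cost_wrong = reprojection_cost(maps, points, R, np.array([0.3, 0.0, 0.0]), K, pad)
print(cost_true, cost_wrong)
```

The pose that sends both points onto the peaks of their maps gets the lower cost, including for the point whose correspondent lies outside the target image. A pose estimator would search for the pose minimizing such a cost.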


To cite our paper:

@inproceedings{germain2021NeurHal,
title = {Visual Correspondence Hallucination: Towards Geometric Reasoning},
author = {Hugo Germain and Vincent Lepetit and Guillaume Bourmaud},
booktitle = {arXiv Preprint},
year = {2021}
}