DeepSFM: Robust Deep Iterative Refinement for Structure From Motion.

Abstract

Structure from Motion (SfM) is a fundamental computer vision problem that deep learning has not yet handled well. One promising solution is to build explicit structural constraints, e.g. a 3D cost volume, into the neural network. Obtaining accurate camera poses from images alone can be challenging, especially under complicated environmental factors. Existing methods usually assume accurate camera poses from ground truth (GT) or from other methods, which is unrealistic in practice and often requires additional sensors. In this work, we design a physics-driven architecture, namely DeepSFM, inspired by traditional Bundle Adjustment (BA), which consists of two cost-volume-based architectures that iteratively refine depth and pose. The explicit constraints on both depth and pose, when combined with the learning components, bring together the merits of traditional BA and emerging deep learning technology. To speed up learning and inference, we apply Gated Recurrent Unit (GRU)-based depth and pose update modules with coarse-to-fine cost volumes in the iterative refinement. In addition, with the extended residual depth prediction module, our model can be adapted to dynamic scenes effectively. Extensive experiments on various datasets show that our model achieves state-of-the-art performance with superior robustness against challenging inputs.
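To make the idea of a depth cost volume concrete, the following is a minimal sketch of a plane-sweep photometric cost volume in NumPy. It is an illustration only and not the DeepSFM implementation: the abstract does not specify the matching cost or sampling scheme, so the raw-intensity absolute difference, nearest-neighbour sampling, the function name `depth_cost_volume`, and the synthetic scene (a fronto-parallel plane with a sideways camera shift) are all assumptions made for this example.

```python
import numpy as np

def depth_cost_volume(ref, src, K, R, t, depths):
    """Toy plane-sweep cost volume (hypothetical helper, not from the paper).

    For each depth hypothesis, every reference pixel is back-projected,
    moved into the source camera with (R, t), re-projected, and compared
    with the source image by absolute intensity difference.
    Returns an array of shape (len(depths), H, W); invalid samples are inf.
    """
    H, W = ref.shape
    v, u = np.mgrid[0:H, 0:W]
    pix = np.stack([u.ravel(), v.ravel(), np.ones(H * W)])  # 3 x N homogeneous pixels
    rays = np.linalg.inv(K) @ pix                           # unit-depth viewing rays
    ref_flat = ref.ravel()
    cost = np.full((len(depths), H, W), np.inf)
    for i, d in enumerate(depths):
        pts = R @ (rays * d) + t[:, None]                   # 3D points in the source frame
        proj = K @ pts
        us = np.rint(proj[0] / proj[2]).astype(int)         # nearest-neighbour sampling
        vs = np.rint(proj[1] / proj[2]).astype(int)
        valid = (proj[2] > 0) & (us >= 0) & (us < W) & (vs >= 0) & (vs < H)
        c = np.full(H * W, np.inf)
        c[valid] = np.abs(ref_flat[valid] - src[vs[valid], us[valid]])
        cost[i] = c.reshape(H, W)
    return cost

# Synthetic scene (assumed): a textured plane at depth 2.0, camera shifted along x.
rng = np.random.default_rng(0)
H, W = 16, 48
ref = rng.random((H, W))
K = np.array([[100.0, 0.0, 24.0],
              [0.0, 100.0, 8.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.1, 0.0, 0.0])
# With focal length 100 and baseline 0.1, depth 2.0 gives a 5-pixel shift.
src = np.roll(ref, 5, axis=1)

depths = np.array([1.0, 2.0, 5.0, 10.0])
cost = depth_cost_volume(ref, src, K, R, t, depths)
depth_map = depths[np.argmin(cost, axis=0)]                 # winner-take-all depth
```

In this sketch the pose is taken as known. A pose cost volume can be built in the same spirit by holding depth fixed and sweeping pose hypotheses instead, and alternating between the two sweeps is the kind of iterative depth and pose refinement the abstract describes.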
