Traditionally, 3D indoor scene reconstruction from posed images happens in two phases: per-image depth estimation, followed by depth merging and surface reconstruction. Recently, a family of methods has emerged that perform reconstruction directly in a final 3D volumetric feature space. While these methods have shown impressive reconstruction results, they rely on expensive 3D convolutional layers, limiting their application in resource-constrained environments. In this work, we instead go back to the traditional route, and show how focusing on high-quality multi-view depth prediction leads to highly accurate 3D reconstructions using simple off-the-shelf depth fusion. We propose a simple state-of-the-art multi-view depth estimator with two main contributions: 1) a carefully-designed 2D CNN which utilizes strong image priors alongside a plane-sweep feature volume and geometric losses, combined with 2) the integration of keyframe and geometric metadata into the cost volume, which allows informed depth plane scoring. Our method achieves a significant lead over the current state-of-the-art for depth estimation, and close or better results for 3D reconstruction on ScanNet and 7-Scenes, while still allowing online, real-time, low-memory reconstruction.
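To make the plane-sweep construction concrete, here is a minimal PyTorch sketch of warping source-view features onto fronto-parallel depth planes in the reference view and scoring each plane with a feature dot product. This is an illustrative sketch, not the paper's implementation; the function name, the shared-intrinsics assumption, and the scaled dot-product score are all assumptions.

import torch
import torch.nn.functional as F

def plane_sweep_scores(ref_feat, src_feat, K, R, t, depths):
    """Score fronto-parallel depth planes by warping source features.

    ref_feat, src_feat: (B, C, H, W) feature maps (shared intrinsics assumed).
    K:      (B, 3, 3) camera intrinsics at feature resolution.
    R, t:   (B, 3, 3) rotation and (B, 3, 1) translation, reference -> source.
    depths: (D,) depth hypotheses in metres.
    Returns a (B, D, H, W) volume of per-plane matching scores.
    """
    B, C, H, W = ref_feat.shape
    dev, dt = ref_feat.device, ref_feat.dtype

    # Reference pixel grid in homogeneous coordinates, shape (3, H*W).
    ys, xs = torch.meshgrid(torch.arange(H, device=dev, dtype=dt),
                            torch.arange(W, device=dev, dtype=dt),
                            indexing="ij")
    pix = torch.stack([xs.reshape(-1), ys.reshape(-1),
                       torch.ones(H * W, device=dev, dtype=dt)])

    K_inv = torch.inverse(K)
    scores = []
    for d in depths.tolist():
        # Back-project reference pixels to depth d, then map into the source view.
        cam = (K_inv @ pix.unsqueeze(0)) * d   # (B, 3, HW) points in ref camera
        proj = K @ (R @ cam + t)               # (B, 3, HW) projected into source
        uv = proj[:, :2] / proj[:, 2:3].clamp(min=1e-6)

        # Normalise pixel coordinates to [-1, 1] for grid_sample.
        u = 2.0 * uv[:, 0] / (W - 1) - 1.0
        v = 2.0 * uv[:, 1] / (H - 1) - 1.0
        grid = torch.stack([u, v], dim=-1).view(B, H, W, 2)
        warped = F.grid_sample(src_feat, grid, mode="bilinear",
                               padding_mode="zeros", align_corners=True)

        # One matching score per pixel and plane: a scaled feature dot product.
        scores.append((ref_feat * warped).sum(dim=1) / C ** 0.5)
    return torch.stack(scores, dim=1)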
SimpleRecon is fast: at batch size one, it runs at 70ms per frame. This makes accurate reconstruction via fast depth fusion possible!
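Because the network outputs ordinary depth maps, fusion can use any off-the-shelf integrator. Below is a minimal sketch using Open3D's ScalableTSDFVolume; the voxel size, truncation distance, intrinsics, and the helper name integrate_frame are illustrative, not the exact settings used in the paper.

import numpy as np
import open3d as o3d

# Standard TSDF volume; voxel size and truncation are typical indoor settings.
volume = o3d.pipelines.integration.ScalableTSDFVolume(
    voxel_length=0.04,
    sdf_trunc=0.12,
    color_type=o3d.pipelines.integration.TSDFVolumeColorType.RGB8)

# Illustrative pinhole intrinsics (e.g. a 640x480 sensor).
intrinsic = o3d.camera.PinholeCameraIntrinsic(
    width=640, height=480, fx=525.0, fy=525.0, cx=319.5, cy=239.5)

def integrate_frame(color_np, depth_np, world_T_cam):
    """Fuse one predicted depth map (float32 metres) into the TSDF volume."""
    rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
        o3d.geometry.Image(color_np),          # (H, W, 3) uint8 image
        o3d.geometry.Image(depth_np),          # (H, W) float32 depth
        depth_scale=1.0,                       # depth already in metres
        depth_trunc=3.0,                       # ignore depths beyond 3 m
        convert_rgb_to_intensity=False)
    # Open3D expects the extrinsic as a world -> camera transform.
    volume.integrate(rgbd, intrinsic, np.linalg.inv(world_T_cam))

# After all keyframes are integrated:
# mesh = volume.extract_triangle_mesh()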
Our key contribution is the injection of cheaply-available metadata into the feature volume. Each volumetric cell is then reduced in parallel by an MLP into a feature map, which is fed to a 2D cost volume encoder-decoder. We also use an image encoder to enforce a strong image prior, helping the cost volume encoder-decoder propagate and correct depth estimates across the frame.
Metadata insertion: typical MVS systems predict depth from warped features or differences between features, e.g. dot products. We additionally include cheaply-available metadata for improved performance. Indices are omitted for clarity.
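A minimal PyTorch sketch of this idea: each cell's feature vector (reference features, warped source features, their dot product, and the metadata) is reduced in parallel by a shared MLP into one score per depth plane, giving a 2D feature map for the cost volume encoder-decoder. The module name, channel sizes, and the specific metadata fields listed here are assumptions, not the paper's exact configuration.

import torch
import torch.nn as nn

class MetadataCostVolume(nn.Module):
    def __init__(self, feat_dim=32, meta_dim=8, hidden=64):
        super().__init__()
        # Shared MLP applied independently to every (pixel, plane) cell.
        in_dim = 2 * feat_dim + 1 + meta_dim  # ref + warped + dot + metadata
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, 1))             # one score per cell

    def forward(self, ref_feat, warped_feat, metadata):
        """
        ref_feat:    (B, C, H, W) reference-view features.
        warped_feat: (B, D, C, H, W) source features warped to D depth planes.
        metadata:    (B, D, M, H, W) per-cell metadata, e.g. depth-plane value,
                     ray directions, relative pose distance (assumed fields).
        Returns a (B, D, H, W) feature map for the 2D encoder-decoder.
        """
        B, D, C, H, W = warped_feat.shape
        ref = ref_feat.unsqueeze(1).expand(B, D, C, H, W)
        dot = (ref * warped_feat).sum(dim=2, keepdim=True)  # (B, D, 1, H, W)

        cell = torch.cat([ref, warped_feat, dot, metadata], dim=2)
        # Move channels last so the MLP runs over every cell in parallel.
        cell = cell.permute(0, 1, 3, 4, 2)      # (B, D, H, W, in_dim)
        return self.mlp(cell).squeeze(-1)       # (B, D, H, W)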
If you find this work useful for your research, please cite:
@inproceedings{sayed2022simplerecon,
  title={SimpleRecon: 3D Reconstruction Without 3D Convolutions},
  author={Sayed, Mohamed and Gibson, John and Watson, Jamie and Prisacariu, Victor and Firman, Michael and Godard, Cl{\'e}ment},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2022},
}
We thank Aljaž Božič of TransformerFusion, Jiaming Sun of NeuralRecon, and Arda Düzçeker of DeepVideoMVS for quickly providing useful information to help with baselines and for making their codebases readily available, especially on short notice.
The tuple generation scripts make heavy use of a modified version of DeepVideoMVS's keyframe buffer (thanks again, Arda and co!).
The PyTorch point cloud fusion module is borrowed from 3DVNet's repo. Thanks Alexander Rich!
We'd also like to thank Niantic's infrastructure team for quick actions when we needed them. Thanks folks!
Mohamed is funded by a Microsoft Research PhD Scholarship (MRL 2018-085).
This webpage was in part inspired by this template.