(Liu 2020) Learned Descriptor

Note: I’m only reading this paper for the intro to SLAM/SfM

Abstract

Problem: 3D reconstruction has subpar performance when dealing with endoscopic videos, partly due to local descriptors …

Introduction

Correspondence estimation: match between 2D points in the image and their corresponding 3D locations (see registration)
Correspondence estimation is needed by SfM, SLAM, …
SfM + SLAM combination has been shown to be effective for surgical navigation in endoscopy – simultaneous estimation of
- sparse 3D structure of the observed scene
- camera trajectory
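
To make the link between the two estimated quantities concrete for myself: both the 3D structure and the camera pose are usually judged by how well projected 3D points land on their observed 2D points (the reprojection error). The sketch below is only an illustration under strong assumptions, not the paper's method: a pinhole camera with no rotation and no lens distortion, and made-up intrinsics (`focal_length`, `principal_point`). All function names are hypothetical.

```python
import math


def project(point_3d, camera_position, focal_length=500.0,
            principal_point=(320.0, 240.0)):
    """Project a 3D world point into an unrotated pinhole camera looking down +Z.

    The intrinsics are illustrative values, not taken from the paper.
    """
    x = point_3d[0] - camera_position[0]
    y = point_3d[1] - camera_position[1]
    z = point_3d[2] - camera_position[2]
    if z <= 0:
        raise ValueError("point is behind the camera")
    u = focal_length * x / z + principal_point[0]
    v = focal_length * y / z + principal_point[1]
    return (u, v)


def reprojection_error(observed_2d, point_3d, camera_position):
    """Pixel distance between an observed 2D point and the projected 3D point."""
    u, v = project(point_3d, camera_position)
    return math.hypot(u - observed_2d[0], v - observed_2d[1])


# One 2D-3D correspondence: a scene point and where it is seen in the image.
point = (0.1, -0.05, 2.0)
true_camera = (0.0, 0.0, 0.0)
observation = project(point, true_camera)

# The correct camera position explains the observation exactly;
# a camera estimate that is off by 2 cm along X does not.
error_true = reprojection_error(observation, point, true_camera)
error_wrong = reprojection_error(observation, point, (0.02, 0.0, 0.0))
```

A wrong correspondence would inflate this error in the same way as a wrong pose, which is why matching quality matters to both SfM and SLAM.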

Complementarity of SfM + SLAM

Good camera tracking requires dense 3D reconstruction

SLAM	SfM
good for real-time applications	limited to offline estimation (due to the global optimisation used in the bundle adjustment)
usually limited to local optimisation (due to computational constraints)	prioritises high density and accuracy for the sparse 3D structure
prone to drift errors when there is no loop closure
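
A toy illustration of the drift point in the SLAM column. This is not a SLAM system, just 2D dead reckoning under assumed values: the camera truly moves in a straight line along X, and every frame-to-frame estimate adds a small constant heading error (`heading_bias`, a made-up number). Without any constraint that ties the current pose back to an earlier one (which is what a loop closure provides), the error keeps growing.

```python
import math


def integrate_odometry(num_steps, step_length, heading_bias):
    """Chain relative motions; each step adds a small heading error (radians)."""
    x, y, heading = 0.0, 0.0, 0.0
    positions = []
    for _ in range(num_steps):
        heading += heading_bias          # local estimation error accumulates
        x += step_length * math.cos(heading)
        y += step_length * math.sin(heading)
        positions.append((x, y))
    return positions


def drift(positions, step_length):
    """Distance of each estimated position from the true straight-line position."""
    return [
        math.hypot(x - step_length * (i + 1), y)
        for i, (x, y) in enumerate(positions)
    ]


estimated = integrate_odometry(num_steps=100, step_length=0.01, heading_bias=0.002)
errors = drift(estimated, step_length=0.01)

# The position error after 100 frames is much larger than after 1 or 50 frames.
print(errors[0], errors[49], errors[-1])
```

The global optimisation in SfM's bundle adjustment adjusts all poses and points jointly, which is one reason it does not suffer from this kind of accumulation in the same way, at the price of being offline.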

Example

SfM only pipeline: COLMAP
SLAM only pipeline: ORB-SLAM

Challenges for correspondence estimation in endoscopic video

Tissue deformation (violates static scene assumption in the pipelines)
Textures in endoscopy
- often smooth and repetitive
- sparse matching with local descriptors is in this case prone to error
- possible workarounds:
  - adding textures
    - Widya: use of dye to manually texturise the surface (this improves matching performance of the descriptors)
    - Qiu: project patterns onto the surface
  - methods to work with texture-scarce surfaces
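
Why repetitive texture hurts sparse matching, as a minimal sketch. The descriptors here are made-up 3-element vectors (real local descriptors are much longer), and the ambiguity check is the standard nearest-neighbour ratio test, which I am using for illustration and which is not the method proposed in the paper. The threshold `max_ratio=0.8` is an assumed value.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_with_ratio_test(query, candidates, max_ratio=0.8):
    """Return the index of the nearest candidate, or None if the match is ambiguous.

    A match is ambiguous when the best distance is not clearly smaller than
    the second-best distance. Needs at least two candidates.
    """
    distances = sorted((euclidean(query, c), i) for i, c in enumerate(candidates))
    best_dist, best_idx = distances[0]
    second_dist = distances[1][0]
    if second_dist == 0.0 or best_dist / second_dist > max_ratio:
        return None
    return best_idx


# Well-textured surface: candidate descriptors differ a lot from each other.
distinctive = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
match_textured = match_with_ratio_test((0.9, 0.1, 0.0), distinctive)

# Smooth, repetitive surface: many locations yield nearly identical descriptors.
repetitive = [(0.5, 0.5, 0.0), (0.5, 0.51, 0.0), (0.49, 0.5, 0.0)]
match_repetitive = match_with_ratio_test((0.5, 0.505, 0.0), repetitive)
```

In the first case the nearest neighbour is clearly the best, so a match is returned. In the second case two candidates are about equally close, so the match is rejected. On endoscopic tissue this leaves very few usable sparse matches, which is what the workarounds above try to address.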