Lamarca 2020 DefSLAM
URL: http://arxiv.org/abs/1908.08918
Authors: Lamarca et al.
Code: http://github.com/UZ-SLAMLab/DefSLAM
Results (video): http://www.youtube.com/watch?v=6mmhD2_t6Gs
Summary
- First monocular SLAM that handles deformable environments in real time
- Most other SLAM implementations assume rigidity
- Main techniques used (techniques for monocular non-rigid scenes):
  - isometric shape from template (SfT)
  - non-rigid structure from motion (NRSfM)
- Main principle: computation in two parallel threads (see DefSLAM framework)
  - Deformation tracking [front end]
  - Deformation mapping [back end]
- The map from the mapping thread defines the shape-at-rest template used by deformation tracking
- Validation: comparison against ORB-SLAM (rigid)
- Assumes isometric deformation
- Future work: relocalisation (see kidnapped robot problem), loop closure for robustness
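
The two-thread principle can be pictured with a toy sketch. This is not the DefSLAM implementation: `run_two_threads`, the queue hand-off and the `template_kf` field are hypothetical names, and the SfT/NRSfM computations are replaced by placeholder comments. It only illustrates tracking reading the latest template while mapping refines it in parallel, with a new keyframe handed over whenever mapping has finished its previous run.

```python
import queue
import threading


def run_two_threads(num_frames):
    """Toy front end / back end split; all names here are illustrative."""
    keyframes = queue.Queue()          # tracking -> mapping hand-off
    lock = threading.Lock()
    state = {"template_kf": None}      # keyframe the current template is based on
    mapping_idle = threading.Event()
    mapping_idle.set()
    tracked = []                       # (frame index, template keyframe used)

    def mapping():
        while True:
            kf = keyframes.get()
            if kf is None:             # sentinel: tracking has finished
                break
            # Placeholder for the NRSfM surface estimation on keyframe kf.
            with lock:
                state["template_kf"] = kf
            mapping_idle.set()

    def tracking():
        for t in range(num_frames):
            with lock:
                template_kf = state["template_kf"]
            # Placeholder for SfT: camera pose + template deformation for frame t.
            tracked.append((t, template_kf))
            if mapping_idle.is_set():  # mapping finished its last run
                mapping_idle.clear()
                keyframes.put(t)       # frame t becomes the next keyframe
        keyframes.put(None)

    threads = [threading.Thread(target=mapping),
               threading.Thread(target=tracking)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return tracked, state["template_kf"]
```

Which frames become keyframes depends on thread timing, so two runs of `run_two_threads(20)` may differ; only the ordering guarantees (a frame never uses a template built from a later keyframe) are fixed.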
Contents
- Introduction
- Initialisation of monocular SLAM
- Most SLAM algorithms exploit the rigidity assumption!
- NRSfM and SfT
- DefSLAM framework
- DefSLAM and discontinuous areas (classical datasets)
- in visual SLAM
- Existing deformable visual SLAM
  - mostly using RGB-D or stereo cameras
  - the methods mentioned optimise the whole map, whereas DefSLAM only optimises the currently observed map zone (local zone)
- Rigid methods used in (semi-)deformable environments
  - assume negligible deformation
  - circumvent deformable situations by excluding any deformable regions from the map
In DefSLAM, energy-based SfT and isometric NRSfM are used.
Notation
| Symbol / term | Meaning |
|---|---|
| Map | Set of 3D map points |
|  | j-th map point in frame t |
|  | camera pose in frame t |
| Sk | surface observed in keyframe k |
| Tk | template (has map points embedded into it), initially based on keyframe k (deformations can occur after k); covers already explored areas |
|  | local zone (area currently being viewed) |
| Ik | image (of keyframe k) |
|  | feature keypoint in image |
|  | warp between keyframes k and k*; image transformation from Ik to Ik* |
|  | embedding of an image point onto the scene surface; transforms a 2D image point into a point on the 3D surface |
| Anchor keyframe [of a map point] | keyframe in which the map point is initialised; after each new KF is processed, one of the anchor KFs is selected as the reference KF |
| Reference keyframe | defines the template used by deformation tracking |
Tracking
Components: map/template, local zone, camera pose
Stages of tracking:
- Data association
- Camera pose estimation
- Template deformation
- New keyframe selection: as soon as mapping finishes one run
  - If the new KF covers a new map region: it becomes an anchor KF and the reference KF, and a new template is created
  - If the new KF covers a known region: it is a regular KF, the most covisible anchor KF is selected as the reference KF, and the template is refined (deformed)
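
The anchor / reference keyframe rule can be sketched as follows. This is a minimal illustration, not code from the DefSLAM repository: `process_new_keyframe` and `KeyframeDecision` are hypothetical names, "new region" is assumed to mean "the keyframe observes map points that no existing anchor KF has initialised", and covisibility is assumed to be the number of shared map points.

```python
from dataclasses import dataclass


@dataclass
class KeyframeDecision:
    is_anchor: bool        # True if the KF initialises new map points
    reference_kf: int      # KF whose template the tracking will use
    template_action: str   # "create" or "refine"


def process_new_keyframe(kf_id, observed_points, anchors):
    """Decide the role of a new keyframe.

    observed_points: set of map point ids seen in the new KF.
    anchors: dict {anchor KF id -> set of map point ids it initialised};
             updated in place when the new KF becomes an anchor.
    """
    known = set().union(*anchors.values())
    new_points = set(observed_points) - known
    if new_points or not anchors:
        # Assumed criterion for "new region": unseen map points.
        anchors[kf_id] = new_points
        return KeyframeDecision(True, kf_id, "create")
    # Known region: pick the anchor KF sharing the most map points.
    best = max(anchors, key=lambda a: len(anchors[a] & set(observed_points)))
    return KeyframeDecision(False, best, "refine")
```

For example, with anchors `{0: {1, 2, 3}, 2: {4, 5}}`, a keyframe observing points `{3, 4, 5}` shares one point with KF 0 and two with KF 2, so KF 2 is chosen as the reference and its template is refined.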
Template in DefSLAM
Pinhole camera projection function
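
As a reminder of what such a projection function computes, here is a minimal sketch of the standard pinhole model. The function name `project_pinhole` and the parameter names `fx`, `fy`, `cx`, `cy` (focal lengths and principal point, in pixels) are illustrative and not taken from the DefSLAM code; lens distortion is ignored.

```python
def project_pinhole(point_cam, fx, fy, cx, cy):
    """Project a 3D point given in camera coordinates to pixel coordinates."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    u = fx * x / z + cx   # horizontal pixel coordinate
    v = fy * y / z + cy   # vertical pixel coordinate
    return (u, v)
```

A point on the optical axis, e.g. `(0, 0, 2)`, projects to the principal point `(cx, cy)` regardless of its depth.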
Tracking optimisation in DefSLAM
Data association in DefSLAM
Keyframe selection
A new keyframe is selected as soon as the mapping thread finishes its latest estimation, i.e. one new keyframe at the end of each deformation mapping run.
Mapping
In a nutshell:
- recovers observed map as a surface for the next keyframe k
- the surface Sk is the shape-at-rest of template Tk for the next frames
Mapping step-by-step in DefSLAM
[Non-Rigid Guided Matching (b/w KFs) in DefSLAM](non-rigid guided-matching-(b_w-kfs)-in-defslam.md)
Validation results
In the experiments, DefSLAM was run sequentially in a single thread (for repeatability)!
