SlideSLAM: Sparse, Lightweight, Decentralized Metric-Semantic SLAM for Multirobot Navigation

Abstract

This article develops a real-time decentralized metric-semantic simultaneous localization and mapping (SLAM) algorithm that enables a heterogeneous robot team to collaboratively construct object-based metric-semantic maps. The proposed framework integrates a data-driven front-end for instance segmentation, from either RGBD cameras or light detection and ranging (LiDAR), with a custom back-end for optimizing robot trajectories and object landmarks in the map. To allow multiple robots to merge their information, we design semantics-driven place recognition algorithms that leverage the informativeness and viewpoint invariance of the object-level metric-semantic map for inter-robot loop closure detection. A communication module is designed to track each robot’s observations and those of other robots whenever communication links are available. The framework supports real-time, decentralized operation onboard the robots and has been integrated with three types of aerial and ground platforms. We validate its effectiveness through experiments in both indoor and outdoor environments, as well as benchmarks on public datasets and comparisons with existing methods. The framework is open-sourced and suitable for both single-agent and multirobot real-time metric-semantic SLAM applications.

https://ieeexplore.ieee.org/abstract/document/11230622

Fig. 1.

System diagram. Our system takes in data streams from each robot’s onboard sensors, which can be either an RGBD camera or a LiDAR, and performs instance segmentation to extract semantic object features. In parallel, low-level odometry, either VIO [19] or LIO [20], provides relative-motion estimates between consecutive key poses. Next, the metric-semantic SLAM algorithm takes in these semantic observations and relative-motion estimates, and constructs a factor graph consisting of both robot pose nodes and object landmark nodes. Meanwhile, our multirobot communication module (see Fig. 3) opportunistically leverages connectivity to share lightweight semantic observations among robots in a decentralized way. Based on this shared information, our metric-semantic place recognition algorithm checks for possible inter-robot loop closures at a fixed rate. Once a loop closure is detected, the resulting transformation between the pair of robots is used to transform all observations into each robot’s reference frame. These observations are then added to each robot’s own factor graph, forming a merged metric-semantic map. Note that the entire perception-action loop runs in a decentralized manner onboard each robot. Beyond the obvious differences in control algorithms, the planning modules and front-end processing algorithms also differ across robot platforms, to accommodate differences in sensing modalities (RGBD and LiDAR), operating environments (indoor, urban, and forest), and traversal modes (ground and aerial). However, the core metric-semantic SLAM framework remains the same.
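The loop-closure step in the caption amounts to estimating a rigid transform between two robots' object maps and re-expressing one robot's observations in the other's reference frame. The following is a minimal 2D sketch of that alignment step, assuming object landmarks whose semantic labels already give the correspondences; the paper's actual place recognition and data association are more involved, and `estimate_se2`, `apply_se2`, and the example maps are illustrative names and values, not from the released code.

```python
import math

def estimate_se2(src, dst):
    """Least-squares SE(2) transform (theta, tx, ty) mapping src points onto
    dst, via the closed-form 2D Kabsch/Umeyama solution on matched pairs."""
    n = len(src)
    # centroids of each point set
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n
    cdy = sum(p[1] for p in dst) / n
    # accumulate the two cross-covariance terms that determine the rotation
    a = b = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        sxc, syc = sx - csx, sy - csy
        dxc, dyc = dx - cdx, dy - cdy
        a += sxc * dxc + syc * dyc
        b += sxc * dyc - syc * dxc
    theta = math.atan2(b, a)
    c, s = math.cos(theta), math.sin(theta)
    # translation aligns the rotated source centroid with the target centroid
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, tx, ty

def apply_se2(T, p):
    """Express point p (given in the source frame) in the target frame."""
    theta, tx, ty = T
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1] + tx, s * p[0] + c * p[1] + ty)

# Hypothetical example: robot B re-observes three objects that robot A
# already mapped; labels give the correspondences, and positions are in
# each robot's own odometry frame.
map_a = {"chair": (1.0, 0.0), "table": (3.0, 1.0), "door": (0.0, 2.0)}
map_b = {"chair": (2.0, 0.0), "table": (1.0, 2.0), "door": (0.0, -1.0)}
labels = sorted(map_a)
T_ab = estimate_se2([map_a[l] for l in labels], [map_b[l] for l in labels])
# T_ab re-expresses any of robot A's observations in robot B's frame,
# so A's map can be added into B's factor graph.
```

Once this transform is available, each of robot A's shared semantic observations can be passed through `apply_se2` before being inserted as a landmark measurement in robot B's factor graph, which is the map-merging step the caption describes.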