Introduction
In the near future, robot teams will perform cooperative tasks in a multitude of application scenarios, ranging from exploration of subterranean environments, to search-and-rescue missions in hazardous settings, to human assistance in houses, airports, factory floors, and malls, to mention a few.
A key requirement for cooperative exploration and navigation in an initially unknown environment is to build a map model of the environment as the robots explore it. Recent work has proposed 3D Scene Graphs as an expressive hierarchical model of complex environments [1], [2], [3], [4], [5], [6]: a 3D Scene Graph organizes spatial and semantic information, including objects, structures (e.g., walls), places (i.e., free-space locations the robot can reach), rooms, and buildings into a graph with multiple layers corresponding to different levels of abstraction. 3D Scene Graphs provide a user-friendly model of the scene that can support the execution of high-level instructions by a human. Also, they capture traversability between places, rooms, and buildings that can be used for path planning.
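To make the layered structure concrete, the following Python sketch stores a small 3D Scene Graph as a weighted graph together with a parent map. The class name `SceneGraph`, the layer constants, the default cross-layer weight, and the toy node names are assumptions made for illustration and are not taken from [1]-[6].

```python
from dataclasses import dataclass, field

# Layer indices are an assumption of this sketch (0 = finest layer).
PLACES, ROOMS, BUILDINGS = 0, 1, 2


@dataclass
class SceneGraph:
    layer: dict = field(default_factory=dict)   # node -> layer index
    parent: dict = field(default_factory=dict)  # node -> parent in the layer above
    adj: dict = field(default_factory=dict)     # node -> {neighbor: edge weight}

    def add_edge(self, u, v, weight):
        # Undirected weighted edge (intra-layer or cross-layer).
        self.adj.setdefault(u, {})[v] = weight
        self.adj.setdefault(v, {})[u] = weight

    def add_node(self, node, layer, parent=None, weight=1.0):
        self.layer[node] = layer
        self.adj.setdefault(node, {})
        if parent is not None:
            self.parent[node] = parent
            self.add_edge(node, parent, weight)  # cross-layer edge to the parent

    def children(self, node):
        return sorted(c for c, p in self.parent.items() if p == node)

    def nodes_in_layer(self, layer):
        return sorted(n for n, l in self.layer.items() if l == layer)


# Toy environment: one building, two rooms, three places.
g = SceneGraph()
g.add_node("B1", BUILDINGS)
g.add_node("R1", ROOMS, parent="B1")
g.add_node("R2", ROOMS, parent="B1")
for place, room in [("p1", "R1"), ("p2", "R1"), ("p3", "R2")]:
    g.add_node(place, PLACES, parent=room)
g.add_edge("p1", "p2", 1.5)  # traversability between places
g.add_edge("p2", "p3", 2.0)
g.add_edge("R1", "R2", 4.0)  # traversability between rooms

print(g.children("R1"), g.nodes_in_layer(ROOMS))
```

Keeping intra-layer and cross-layer edges in the same adjacency map lets a planner search across abstraction levels with a single shortest-path routine.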
To scale up from single- to multi-robot systems and to longer missions and larger environments, a key challenge is to share the map information among the robots to support cooperation. For instance, the robots may exchange partial maps such that a robot can navigate within a portion of the environment mapped by another robot. However, the potentially high volume of data to be transferred over a shared wireless channel easily saturates the available bandwidth, degrading team performance. This holds especially when the wireless channel is also used to transmit other information in the field, such as images or place recognition information for localization and map reconstruction, which further limits the bandwidth available for transmitting map information in a timely manner [7], [8], [9], [10]. The challenge of information sharing is particularly relevant when the map is modeled as a 3D Scene Graph, since these are rich and potentially large models if all nodes and edges are retained. On the other hand, 3D Scene Graphs also provide opportunities for compression: for instance, the robots may exchange information about rooms in the environment rather than sharing fine-grained traversability information encoded by the place layer; similarly, for a large-scale scene, the robot may just specify a sequence of buildings to be traversed, abstracting away geometric information at lower levels. This is similar to what humans do: when instructing a person on how to reach a location in a building, we would specify a sequence of rooms and landmarks (e.g., objects or structures) rather than a detailed metric map or a precise path.
Therefore, the question we address in this letter is: how can we compress a 3D Scene Graph to retain relevant information the robots can use for navigation while meeting a communication budget constraint, expressed as the maximum size of the map the robots can transmit? Besides multi-robot communication, task-driven map compression can play a role in long-term autonomy under resource constraints, where the robots might suffer memory limitations and retain only key portions of a large map. Such a compression is also useful when it is desirable for the robots to share essential information under privacy considerations by sending only task-relevant data [11].
Related work: Graph compression is an active area of research in mathematics, computer science, and telecommunications, where it finds applications to, e.g., vehicle and packet routing [12], [13], [14], and 3D point cloud compression [15], [16], [17].
A prominent body of work simplifies a graph by carefully pruning it to retain relevant information. For example, references [13], [18] find efficient representations of huge web and communication networks by heuristically selecting a few key elements, while the work [19] prunes graphs while preserving connectivity among nodes. Within the discrete mathematics literature, graph compression has been studied with a focus on ensuring low distortion (or stretch) of inter-node distances. For example, spanning trees and Steiner trees are the smallest subgraphs maintaining connectivity in undirected graphs [20], [21]. Graph spanners remove a subset of edges while allowing for a user-defined maximum distortion of shortest paths [22], [23], [24]. A special case is given by distance preservers [25], which prune graphs but keep the distances between specified node pairs unaltered. Emulators replace a large number of edges with a few strategic ones to ensure small stretch of distances [26].
Related work in robotics focuses on graph compression to speed up path planning and decision-making. Silver et al. [27] use Graph Neural Networks to detect key nodes by learning heuristic importance scores. Agia et al. [28] propose an algorithm that exploits the 3D Scene Graph hierarchy to prune nodes and edges not relevant to the robotic task. Targeting a related application domain, Tian et al. [29] study computation and communication efficiency of multi-robot loop closure, providing a strategy to share a limited number of visual features in multi-robot SLAM, while Denniston et al. [10] introduce a graph-based method to prune the multi-robot loop closures in order to save on processing time. Larsson et al. [30], [31], [32] propose algorithms to build hierarchical abstractions of tree-structured representations, for instance enabling fast planning on occupancy grid maps at progressively increasing resolution.
Novel contribution: In this letter, we tackle the challenging problem of efficiently sharing 3D Scene Graphs for navigation under hard communication constraints. We propose two greedy algorithms, BUD-Lite and TOD-Lite (collectively referred to as D-Lite), that leverage graph spanners to prune nodes and edges from a 3D Scene Graph while minimizing the distortion of the shortest paths between locations of interest (terminal nodes, see Fig. 1). Compared to the literature, our algorithms (i) are designed to retain navigation-relevant information, (ii) leverage the hierarchical structure of the 3D Scene Graph for compression, and (iii) enforce a user-specified size of the compressed 3D Scene Graph. Our algorithms are computationally efficient and apply to general 3D Scene Graphs. In contrast, related works are either restricted to trees or involve mixed-integer programming [30], [31]. Other pruning strategies do not directly target path planning tasks [28]. Finally, most works tailored to real-time compression do not allow for hard communication constraints [28], [30]. The effectiveness of our algorithms is validated through realistic simulated experiments. We show that the proposed method meets hard communication constraints without excessively impacting navigation performance. For example, navigation time on the compressed graph increases by at most 8% after compressing the 3D Scene Graph to 1.6% of its size.
Fig. 1. 3D Scene Graph of an environment (left) and compressed version produced by D-Lite (right). The purple circles mark the terminal nodes: D-Lite approximately preserves shortest-path distances between those locations of interest.
Navigation-Oriented Scene Graph Compression
Motivating scenario: We consider a multi-robot team exploring an unknown environment. Each robot navigates to gather information and builds a 3D Scene Graph (DSG)
Navigation-oriented query: We assume that the querying robot
Communication constraints: Data sharing among robots occurs over a common wireless channel. Because of resource constraints of wireless communication, such as limited bandwidth, robot
Pruning 3D Scene Graphs: Assuming navigation-oriented queries, the relevant information reduces to nodes and edges describing efficient paths robot
However, transmitting all nodes in the shortest paths may violate the communication constraint (see Fig. 6): this can happen with many terminals or if shortest paths have little overlap. Hence, heavier pruning of the DSG might be needed to make communication feasible. This means that information useful for path planning will be partially unavailable to the querying robot's planner. In other words, because the DSG
Fig. 3. Illustration of the BUD-Lite procedure with source
Initial (left) and final DSG (right). Terminal nodes (A, B, C, and D) are in blue, place nodes in red, and the room node in green.
Fig. 5. Illustration of the TOD-Lite expansion procedure with one source
Fig. 6. Comparison of distortion (top row) and number of nodes after compression (bottom row) for BUD-Lite and TOD-Lite against computing the shortest paths (SP) and pruning all nodes that are not on them. The dotted lines mark the communication budget.
Problem formulation: For the querying robot
\begin{align*}
\min _{\mathcal {G}^{\prime }\subseteq \mathcal {G}} \quad& {\beta } \tag{1a}\\
\mathrm{s.t.} \quad& {d_{\mathcal {G}^{\prime }}(s,t)}{\leq d_{\mathcal {G}}(s,t) + \beta W_{\max}^{\mathcal {G}}(s,t)\;\,}{\forall (s,t)\in \mathcal {P}} \tag{1b}\\
& |\mathcal {V}_{\mathcal {G}^{\prime }}|{\leq {B}}, \tag{1c}
\end{align*}
Problem (1) can be solved by means of integer linear programming (ILP), see the technical report [33, Appendix A]. However, the runtime complexity of ILP solvers is subject to combinatorial explosion, making this approach impractical for online operation. Hence, we propose greedy algorithms that require lighter-weight computation, based on graph spanners. Background about these tools is given in [33, Section III].
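To illustrate how constraints (1b) and (1c) interact, the Python sketch below evaluates a candidate compressed graph: it returns the smallest $\beta$ for which (1b) holds, or `None` if the candidate violates the budget (1c) or disconnects a terminal pair. Two points are assumptions of this sketch and are not confirmed by the text above: $W_{\max}^{\mathcal{G}}(s,t)$ is taken to be the largest edge weight along a shortest path from $s$ to $t$ in $\mathcal{G}$, and $\mathcal{G}^{\prime}$ is taken to be the subgraph induced by a set of retained nodes. The function names are hypothetical.

```python
import heapq


def dijkstra(adj, source):
    """Return (dist, pred) for shortest paths from `source`."""
    dist, pred, heap = {source: 0.0}, {}, [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, {}).items():
            if d + w < dist.get(v, float("inf")):
                dist[v], pred[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return dist, pred


def distance_and_max_edge(adj, s, t):
    """Shortest-path length and the largest edge weight on that path (assumed W_max)."""
    dist, pred = dijkstra(adj, s)
    w_max, v = 0.0, t
    while v != s:
        w_max = max(w_max, adj[pred[v]][v])
        v = pred[v]
    return dist[t], w_max


def smallest_beta(adj, keep, pairs, budget):
    """Smallest beta satisfying (1b) for the subgraph induced by `keep`, else None."""
    if len(keep) > budget:
        return None  # violates the node budget (1c)
    sub = {u: {v: w for v, w in nbrs.items() if v in keep}
           for u, nbrs in adj.items() if u in keep}
    beta = 0.0
    for s, t in pairs:  # each pair is assumed to have s != t
        d_full, w_max = distance_and_max_edge(adj, s, t)
        d_sub = dijkstra(sub, s)[0].get(t)
        if d_sub is None:
            return None  # terminals are disconnected in the candidate
        beta = max(beta, (d_sub - d_full) / w_max)
    return beta


# Toy graph: a-b-c is the shortest route (length 2), a-d-c is a detour (length 4).
adj = {
    "a": {"b": 1.0, "d": 2.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 2.0},
    "d": {"a": 2.0, "c": 2.0},
}
pairs = [("a", "c")]
beta_exact = smallest_beta(adj, {"a", "b", "c"}, pairs, budget=3)
beta_detour = smallest_beta(adj, {"a", "d", "c"}, pairs, budget=3)
beta_over = smallest_beta(adj, {"a", "b", "c", "d"}, pairs, budget=3)
print(beta_exact, beta_detour, beta_over)
```

Enumerating all node subsets with such a check would recover the optimum on tiny graphs, but the number of subsets grows exponentially, which is consistent with the need for greedy alternatives.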
3D Scene Graph Compression Algorithms
We propose D-Lite, a compression method for DSGs to meet communication constraints with attention to navigation efficiency. We design two versions of D-Lite, which are initialized with a spanner of the full DSG (Section III-B) and tackle the compression problem from opposite perspectives.
The first algorithm, BUD-Lite (Section III-C), performs progressive bottom-up compression of the spanner computed during initialization, exploiting the DSG abstraction hierarchy. In contrast, the second algorithm, TOD-Lite (Section III-D), works top-down expanding nodes with the spanner as a target.
A. Intuition and the Role of the 3D Scene Graph Hierarchy
Assume we want to design a greedy procedure that removes nodes and edges in
The discussion above suggests a simple way to compress the DSG: nodes in a layer can be progressively replaced by their parent nodes in the layer above. Every time we replace nodes with more “abstract” ones (rooms, buildings), the length of the paths passing through those nodes increases, indicating longer navigation. Hence, we can opportunistically select which nodes to “abstract away” so as to achieve a small stretch in the paths between terminals. Alternatively, we can start with a coarse representation (including only the highest abstraction level) and expand it to reduce the stretch of the paths. We present these two greedy strategies below and initialize both procedures by computing a spanner of the given DSG, as explained next.
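The effect of abstracting nodes away can be checked on a toy example. In the Python sketch below, two intermediate place nodes are dropped so that the path between the terminals must pass through their parent room via cross-layer edges. The graph, its weights, and the helper names are assumptions chosen for illustration; this is not the authors' procedure, only the intuition behind it.

```python
import heapq


def shortest_distance(adj, s, t):
    """Length of a shortest path from s to t (Dijkstra), or inf if unreachable."""
    dist, heap = {s: 0.0}, [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == t:
            return d
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return float("inf")


def remove_nodes(adj, removed):
    """Copy of the graph without the nodes in `removed` and their edges."""
    return {u: {v: w for v, w in nbrs.items() if v not in removed}
            for u, nbrs in adj.items() if u not in removed}


# Places s, p1, p2, t lie in room R; every place has a cross-layer edge to R.
adj = {
    "s": {"p1": 1.0, "R": 2.0},
    "p1": {"s": 1.0, "p2": 1.0, "R": 2.0},
    "p2": {"p1": 1.0, "t": 1.0, "R": 2.0},
    "t": {"p2": 1.0, "R": 2.0},
    "R": {"s": 2.0, "p1": 2.0, "p2": 2.0, "t": 2.0},
}

fine = shortest_distance(adj, "s", "t")        # path s-p1-p2-t in the place layer
coarse_graph = remove_nodes(adj, {"p1", "p2"})  # abstract p1, p2 into R
coarse = shortest_distance(coarse_graph, "s", "t")  # path s-R-t
print(fine, coarse, len(adj), len(coarse_graph))
```

Here the graph shrinks from five nodes to three at the price of one extra unit of path length, which is the trade-off the greedy strategies exploit.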
B. Building a DSG Spanner
The literature provides several algorithms to produce spanners of an input graph given a user-specified stretch on the distance between terminals. The spanner need not meet our budget constraint, hence we use it just as initialization for D-Lite. We adapt the algorithm in [24, Section 5] to build a spanner of the full DSG with additive path stretch. The procedure initializes the spanner with a random selection of edges: to exploit the DSG hierarchy, we modify the original algorithm by manually adding cross-layer edges during the initialization. Also, once the spanner is built, we retain only nodes and edges relevant for navigation by removing all nodes that are not traversed by shortest paths between terminals in the spanner just built. This greatly reduces the graph to be compressed, making our compression strategies based on hierarchical abstractions more efficient. We call this subroutine
C. BUD-Lite: A Bottom-Up Compression Algorithm
The idea behind our first algorithm (BUD-Lite, short for Bottom-Up D-Lite) is to iteratively compress the DSG spanner produced by
To gain intuition, consider Fig. 3, which illustrates three steps of BUD-Lite on a toy DSG. Dashed edges and light-colored nodes are part of the full DSG
We formally introduce the compression procedure in Algorithm 1. The compressed graph is initialized as the DSG spanner
Algorithm 1: BUD-Lite.
Performance bound: We now provide an analytical bound on the worst-case stretch that is incurred by every shortest path after running BUD-Lite. First, we provide two definitions that are instrumental to the understanding of the bound.
Definition 1 (Ancestor):
The (
In words, the ancestors of node
Definition 2 (Diameter):
For any node
\begin{equation*}
\text{diam}_{\mathcal {G}}(n) \doteq \max \left\lbrace |\mathcal {G}{c_{1}}{c_{2}}| : c_{1}, c_{2} \in C_{\mathcal {G}}(n) \right\rbrace, \tag{2}
\end{equation*}
In words, the diameter of a node describes how “large” the node is when expanded into its children in the layer below.
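As an illustration of Definition 2, the Python sketch below computes the diameter of a node from the graph connecting its children. It assumes that the quantity inside the maximum in (2) is the number of nodes on a weighted shortest path between two children, and that this path is searched within the children's own layer; both are interpretations, and the function names are hypothetical.

```python
import heapq
from itertools import combinations


def shortest_path(adj, s, t):
    """Node list of a weighted shortest path from s to t (t assumed reachable)."""
    dist, pred, heap = {s: 0.0}, {}, [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == t:
            break
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v], pred[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    path = [t]
    while path[-1] != s:
        path.append(pred[path[-1]])
    return path[::-1]


def diameter(adj, children):
    """Largest node count of a shortest path between two children (1 for a single child)."""
    return max((len(shortest_path(adj, c1, c2))
                for c1, c2 in combinations(children, 2)), default=1)


# Three places of one room: the direct edge p1-p3 is longer than the route via p2.
layer_adj = {
    "p1": {"p2": 1.0, "p3": 5.0},
    "p2": {"p1": 1.0, "p3": 1.0},
    "p3": {"p2": 1.0, "p1": 5.0},
}
room_diameter = diameter(layer_adj, ["p1", "p2", "p3"])
print(room_diameter)
```

In this toy room the farthest pair of children is joined by a three-node shortest path, so replacing the room by its children can add up to three nodes along a path through it.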
We now assume the following bounds on quantities associated with the original DSG
Assumption 1 (DSG bounds):
For any layer
\begin{align*}
W_{\text{max}}^{i} &\doteq \max \left\lbrace W^{\mathcal {G}}(m,n):m,n\in \mathcal {L}_{i}^{\mathcal {G}} \right\rbrace,\\
W_{\text{max}}^{i-1,i}&\doteq \max \left\lbrace W^{\mathcal {G}}(m,n):m\in \mathcal {L}_{i-1}^{\mathcal {G}},n\in \mathcal {L}_{i}^{\mathcal {G}} \right\rbrace,\\
u_{\text{min}}^{i} &\doteq \min \left\lbrace \left|\mathcal {G}{a_{\mathcal {G}}^{i}(s)}{a_{\mathcal {G}}^{i}(t)}\right| : (s,t)\in \mathcal {P}\right\rbrace,\\
\text{diam}_{\text{min}}^{i} &\doteq \min \left\lbrace \text{diam}_{\mathcal {G}}(n) : n \in \mathcal {L}_{i}^{\mathcal {G}} \right\rbrace. \tag{3}
\end{align*}
$W_{\text{max}}^{i}$ is the maximum weight of edges in layer $\mathcal {L}_{i}^{\mathcal {G}}$; $W_{\text{max}}^{i-1,i}$ is the maximum weight of cross-layer edges between layers $\mathcal {L}_{i-1}^{\mathcal {G}}$ and $\mathcal {L}_{i}^{\mathcal {G}}$; $u_{\text{min}}^{i}$ is the minimum cardinality of a shortest path between the $i$th ancestors of every two connected terminals; $\text{diam}_{\text{min}}^{i}$ is the minimum diameter of nodes in layer $\mathcal {L}_{i}^{\mathcal {G}}$.
Equipped with the definition above, we can bound the distortion on the compressed DSG
Proposition 1 (Worst-case BUD-Lite stretch):
After
\begin{align*}
d_{\mathcal {G}^{\prime }}(s,t) \;\le& \; 2\sum _{i=1}^{\ell _{\text{max}}}W_{\text{max}}^{i-1,i} + \left(u_{\text{min}}^{\ell _{\text{max}}-1}-\alpha _{k}\text{diam}_{\text{min}}^{\ell _{\text{max}}}\right)W_{\text{max}}^{\ell _{\text{max}}-1} \\
& + \alpha _{k}W_{\text{max}}^{\ell _{\text{max}}}, \quad \forall (s,t)\in \mathcal {P},\tag{4}
\end{align*}
\begin{align*}
\alpha _{k}\doteq& \lceil {\frac{k}{|\mathcal {P}|}-\sum _{i=\ell _{0}}^{\ell _{\text{max}}-1}u_{\text{min}}^{i}}\rceil,\tag{5}\\
\ell _{\text{max}}\doteq& \max \left\lbrace \ell :k>|\mathcal {P}|\sum _{i=\ell _{0}}^{\ell -1}u_{\text{min}}^{i} \right\rbrace,\tag{6}\\
\ell _{0}\doteq& \max \left\lbrace \ell :\min _{(s,t)\in \mathcal {P}}\left(d_{\mathcal {G}}(s,t)+\beta W_{\max}^{\mathcal {G}}(s,t)\right)\right. \\
&\qquad\qquad \geq \left. 2\sum _{i=1}^{\ell }W_{\text{min}}^{i-1,i} + u_{\text{min}}^{\ell }W_{\text{min}}^{\ell }\right\rbrace. \tag{7}
\end{align*}
Proof:
See the technical report [33, Appendix D].
In words,
D. TOD-Lite: A Top-Down Expansion Algorithm
This section presents our second greedy algorithm. Symmetrically to the bottom-up approach of Algorithm 1, the idea behind TOD-Lite (short for TOp-down D-Lite) is to exploit the DSG hierarchy by expanding node children to iteratively increase spatial granularity of the compressed graph (Fig. 5).
The idea of TOD-Lite is depicted with a toy example in Fig. 5, where room nodes
We formally describe TOD-Lite in Algorithm 2. During initialization, Algorithm 2 builds a spanner
The main phase is an iterative top-down expansion through the hierarchical spanner
Expanding nodes gradually restores the geometric granularity of the DSG spanner, because a spatially coarse representation (e.g., room node) is replaced by a group of nodes with fine resolution (e.g., place nodes). This expansion comes at the price of heavier communication burden. Nonetheless, using the hierarchical spanner allows us to narrow the expansion procedure to a small set of navigation-relevant nodes, both saving runtime and helping meet communication constraints.
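The budget-limited expansion described above can be sketched in a few lines of Python. The sketch starts from a coarse set of nodes and replaces a parent by all of its children whenever the resulting node count stays within the budget. The function name, the toy rooms, and in particular the caller-supplied expansion order are assumptions of this sketch; it does not reproduce the selection rule of Algorithm 2.

```python
def expand_top_down(coarse_nodes, children, budget, order):
    """Greedily replace parents by their children while staying within `budget` nodes.

    `order` is a hypothetical priority list of parents to try; `children`
    maps each parent to the (navigation-relevant) nodes it expands into.
    """
    current = set(coarse_nodes)
    for parent in order:
        kids = children.get(parent, [])
        if parent not in current or not kids:
            continue
        # The parent is removed and all of its children are added at once.
        new_size = len(current) - 1 + len(kids)
        if new_size <= budget:
            current.remove(parent)
            current.update(kids)
    return current


# Three rooms with 4, 2, and 1 relevant places, and a budget of 6 nodes.
children = {
    "R1": ["a1", "a2", "a3", "a4"],
    "R2": ["b1", "b2"],
    "R3": ["c1"],
}
rooms = {"R1", "R2", "R3"}
large_first = expand_top_down(rooms, children, budget=6, order=["R1", "R2", "R3"])
small_first = expand_top_down(rooms, children, budget=6, order=["R2", "R3", "R1"])
print(sorted(large_first), sorted(small_first))
```

Because all children of a node are restored together, expanding the largest room first uses up the budget and leaves `R2` coarse, whereas expanding the small rooms first leaves `R1` coarse and two units of budget unused.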
Algorithm 2: TOD-Lite.
Note that, with enough communication resources, TOD-Lite would exactly output the target spanner
E. Discussion: BUD-Lite vs. TOD-Lite
BUD-Lite compresses the DSG in a more granular fashion compared to TOD-Lite: that is, it adds distortion to paths more slowly, because it compresses a limited portion of one path at a time. On the other hand, the expansion strategy of TOD-Lite restores all children of a parent node at once. This difference makes BUD-Lite generally slower but able to reach a final graph size closer to the budget, whereas TOD-Lite is typically faster but retains fewer nodes and leads to more distorted paths.
Those differences make the two strategies suited to different scenarios. For instance, a map that includes both large and small rooms may cause TOD-Lite to get stuck after expanding the nodes with the largest number of children, while the path-wise compression of BUD-Lite is less sensitive to heterogeneous maps. On the other hand, to compress a large but homogeneous map with many relevant locations, one may use TOD-Lite to favor compression speed against a slightly worse result.
Experiments
This section shows that our method retains information for efficient navigation while meeting the communication budget constraint. We also show that the algorithms run in real time.
A. Experimental Setup
Besides benchmarking D-Lite against the solution to (1) (label: “Optimum”) found via integer linear programming (ILP), we also adapt and compare the compression strategy introduced in [30] (label: “IB”), as discussed below.
Q-Tree search adaptation: The compression approach in [30] builds on the Information Bottleneck (IB) [34]. This approach aims to find a compact representation
\begin{equation*}
\min _{p(T|X)} I(T; X) - \beta I(T; Y), \tag{8}
\end{equation*}
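For discrete variables, the objective in (8) can be evaluated directly. The Python sketch below follows the standard IB setting, in which $T$ depends on $Y$ only through $X$, and computes $I(T;X) - \beta I(T;Y)$ in bits for a given encoder $p(T|X)$. The function names and the toy distributions are assumptions made for illustration and are unrelated to the Q-tree construction of [30].

```python
from math import log2


def mutual_information(joint):
    """I(A;B) in bits for a joint distribution given as joint[a][b]."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    return sum(p * log2(p / (pa[a] * pb[b]))
               for a, row in enumerate(joint)
               for b, p in enumerate(row) if p > 0)


def ib_objective(p_x, p_y_given_x, p_t_given_x, beta):
    """Value of I(T;X) - beta * I(T;Y), assuming the Markov chain T - X - Y."""
    n_x, n_t, n_y = len(p_x), len(p_t_given_x[0]), len(p_y_given_x[0])
    joint_tx = [[p_x[x] * p_t_given_x[x][t] for x in range(n_x)]
                for t in range(n_t)]
    joint_ty = [[sum(p_x[x] * p_t_given_x[x][t] * p_y_given_x[x][y]
                     for x in range(n_x))
                 for y in range(n_y)]
                for t in range(n_t)]
    return mutual_information(joint_tx) - beta * mutual_information(joint_ty)


# X is uniform over four cells; Y tells whether the cell is in the first or second half.
p_x = [0.25, 0.25, 0.25, 0.25]
p_y_given_x = [[1, 0], [1, 0], [0, 1], [0, 1]]

identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # T = X
merged = [[1, 0], [1, 0], [0, 1], [0, 1]]  # T merges cells that share the same Y

obj_identity = ib_objective(p_x, p_y_given_x, identity, beta=2.0)
obj_merged = ib_objective(p_x, p_y_given_x, merged, beta=2.0)
print(obj_identity, obj_merged)
```

Both encoders retain the full bit of information about $Y$, but the merged encoder spends one bit on $X$ instead of two, so it attains the lower objective value.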
To adapt this approach to navigation-oriented DSG compression (since the Q-tree does not encode connectivity within a layer of the scene graph), we define a uniform distribution
Simulator: We showcase the online operation of D-Lite in the Office environment of the uHumans2 simulator (Fig. 1) [35], with 4 scenarios featuring different distances between navigation goal and starting position of the robot.
The queried robot
Upon receiving the compressed DSG, robot
B. Results and Discussion
Comparison with baselines: The results on the four scenarios are documented in Table I. We show the compression time (label: “Comp”), the nominal (label: “Nom”, computed from the compressed DSG) and simulated (label: “Mis”, computed as the actual time
The combinatorial nature of problem (1) makes the ILP solver impractical in robotic applications: for the long runs, the calculation of Optimum did not finish within an hour.
In all scenarios, robot
Ablation study: We compare distortion and number of nodes on the shortest paths between terminal nodes, for increasing number of terminal nodes and increasing budget constraints in Fig. 6. The shortest paths are optimal in terms of navigation performance (no distortion, top row), but easily violate the communication constraints (exceeding the budget, bottom row). On the other hand, BUD-Lite and TOD-Lite trade off the path lengths between the terminal nodes to meet the budget constraint, and as we relax the latter, the distortion decreases. For the case with a budget of 150 nodes (last column), BUD-Lite and TOD-Lite obtain the same results, since the initial spanner already satisfies the budget constraint.
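The behavior of the SP baseline can be reproduced on a toy graph. The Python sketch below keeps only the nodes lying on shortest paths between terminal pairs and compares the resulting node count with a budget. The graph, the budget value, and the function names are assumptions chosen for illustration and do not correspond to the experiments in Fig. 6.

```python
import heapq


def shortest_path(adj, s, t):
    """Node list of a weighted shortest path from s to t (t assumed reachable)."""
    dist, pred, heap = {s: 0.0}, {}, [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == t:
            break
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v], pred[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    path = [t]
    while path[-1] != s:
        path.append(pred[path[-1]])
    return path[::-1]


def shortest_path_pruning(adj, pairs):
    """Nodes retained by SP: the union of shortest paths between terminal pairs."""
    keep = set()
    for s, t in pairs:
        keep.update(shortest_path(adj, s, t))
    return keep


# Three terminals A, B, C, each two hops away from a common hub h.
edges = [("A", "a1"), ("a1", "h"), ("B", "b1"), ("b1", "h"), ("C", "c1"), ("c1", "h")]
adj = {}
for u, v in edges:
    adj.setdefault(u, {})[v] = 1.0
    adj.setdefault(v, {})[u] = 1.0

budget = 5
keep_one = shortest_path_pruning(adj, [("A", "B")])
keep_all = shortest_path_pruning(adj, [("A", "B"), ("A", "C"), ("B", "C")])
print(len(keep_one) <= budget, len(keep_all) <= budget)
```

With a single terminal pair the retained path fits the budget of five nodes, while adding a third terminal brings the union to seven nodes, so further pruning with some path distortion becomes unavoidable.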
A study on the runtime of D-Lite and a comparison of BUD-Lite against the bound (4) are provided in [33, Section V-B].
Conclusion
Motivated by collaborative multi-robot exploration, we proposed a method to compress 3D Scene Graphs under communication constraints. Our algorithms can accommodate a sharp node budget while retaining navigation performance. Realistic simulations validate the effectiveness of our approach.