Generation of Paths in a Maze using a Deep Network without Learning

Trajectory or path planning is a fundamental issue in a wide variety of applications. Here we show that it is possible to solve path planning for multiple start- and end-points highly efficiently with a network that consists only of max pooling layers, for which no network training is needed. In contrast to competing approaches, very large mazes containing more than half a billion nodes with dense obstacle configurations and several thousand path end-points can be solved this way in a very short time on parallel hardware.

problem. The task of SSSP is to find shortest paths in a graph between a vertex (node) and all other vertices such that the sum of the edges' weights is minimised. Classical approaches to solve SSSP are the Breadth First Search algorithm (BFS) [19], Dijkstra's algorithm [7], and the Bellman-Ford algorithm [3], [10]. BFS is suitable for unweighted graphs, i.e., it is a uniform-cost search, and the obtained solution is optimal with respect to the number of nodes to travel from the source node to all other nodes. Dijkstra's algorithm finds shortest paths in weighted graphs with positive weights, whereas the Bellman-Ford algorithm can also deal with graphs with negative weights; however, it is slower than Dijkstra's algorithm. Another algorithm, which finds shortest paths between all pairs of nodes in a graph, is the Floyd-Warshall algorithm [9]; however, running Dijkstra's algorithm for each node is a better choice for sparse graphs. The advantage of all these methods is that they do not require learning and are parameter-free; however, they can be computationally expensive.
Recently, a new approach, which also does not require learning, has been proposed by [8]. This method utilises GPU OpenGL shaders and is based on cone rasterization from sources and obstacle vertices. It can generate optimal path maps for multiple sources and outperforms other GPU-based approaches [17], [18], [27], [14], [11]. However, as we will show later, this method is not well suited to several cases, because it leads to relatively long computation times.
Another class of algorithms is based on artificial neural networks, ranging from bio-inspired approaches to deep learning methods. In the bio-inspired approaches [12], [13], [5], [28], [20], [24], the environment is represented by a network with inhibitory (= obstacles) and excitatory (= free spaces) neurons arranged on a grid. Here, activity is propagated from the source neuron to the closest neurons, and this procedure is repeated for many iterations until the activity has spread across the whole network. Afterwards, shortest paths can be found by following activity gradients. In principle, these networks are similar to the wavefront propagation algorithm [6], which is a special case of the BFS algorithm. Although most of these approaches (except [24]) do not require network training, they are not parameter-free. Also, as we will show later, the disadvantage of such algorithms is that the activity decreases exponentially [28], so that large mazes (e.g., larger than 500 × 500) cannot be solved due to numerical precision problems.
Moreover, these approaches require relatively large data sets (e.g., thousands of samples) as well as training to optimise network parameters.
In this paper we present a novel deep network that consists only of max pooling layers (oMAP). The proposed method generates activity maps for single as well as multiple sources, and for single as well as multiple targets. It does not require training data, so no learning is necessary, and shortest paths are found by following the activity gradient. Furthermore, this approach can process very large environments on standard GPUs in a very short time.

Input
The oMAP algorithm uses two binary images of size m × n as input: an environment map $I_e$ and a source map $I_s$ with s ≥ 1 sources.
For the source map, we set $I_s(i, j) = 1$ at all source locations; otherwise we set $I_s(i, j) = 0$.
The environment map $I_e$ represents an obstacle map, where we set $I_e(i, j) = 0$ if a grid cell is free (no obstacle) and $I_e(i, j) = -\mathrm{maxint}$ if grid cell (i, j) contains an obstacle. The choice of the numerically most negative integer (−maxint) is motivated by the algorithm for generating the activity map, described next.

Activity Map Generation
The oMAP network consists of L identical max pooling layers $l_i$ (see Fig. 1) with only one type of filter of size 3 × 3. We specifically use this filter size in order to pass activity only to the nearest grid cells (similar to the wavefront expansion algorithm [6]). With larger filters, activity could also propagate to grid cells that are separated by obstacles.
We start with the source map $I_s$ and perform max pooling. Then we sum the resulting map with the input maps $I_e$ and $I_s$ into one intermediate-layer map. Note that the largest resulting grid cell value v after this operation will be v = n + 1 (see example in Fig. 2), where n is the index of the current layer $l_n$ (not the grid dimension). All source cells will obtain this value, hence $v_{source} = n + 1$. Obstacle cells, on the other hand, obtain values that remain negative and follow $v_{obst} < -\mathrm{maxint} + n + 1 < 0$.
Then we pass the resulting map through the standard rectified linear transfer function (ReLU). Because $v_{obst} < 0$, all grid values at obstacle locations remain at zero. This is repeated for all layers until the last layer, which produces the final activity map O of size m × n as output.
Note that (different from, e.g., convolutional nets) we do not have any tunable weights, thus, no training is required.
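The layer stack described above can be sketched in a few lines of plain Python. This is a minimal, unoptimised illustration and not the Tensorflow/Keras implementation used for the experiments; the names `omap_layer` and `omap` and the constant `OBSTACLE` (a large negative integer standing in for −maxint) are assumptions made for this example.

```python
OBSTACLE = -10**9  # assumed stand-in for -maxint; must lie far below -(L + 1)


def omap_layer(act, env, src):
    """One oMAP layer: 3x3 max pooling (stride 1), add env and src, then ReLU."""
    rows, cols = len(act), len(act[0])

    def pooled(i, j):
        # Max over the in-bounds 3x3 window. Activities are never negative,
        # so this matches max pooling with zero padding.
        return max(act[a][b]
                   for a in range(max(0, i - 1), min(rows, i + 2))
                   for b in range(max(0, j - 1), min(cols, j + 2)))

    return [[max(0, pooled(i, j) + env[i][j] + src[i][j]) for j in range(cols)]
            for i in range(rows)]


def omap(env, src, num_layers):
    """Apply num_layers identical layers, starting from the source map."""
    act = [row[:] for row in src]
    for _ in range(num_layers):
        act = omap_layer(act, env, src)
    return act


# 5 x 5 example: one source in the centre, one obstacle directly above it.
env = [[0] * 5 for _ in range(5)]
env[1][2] = OBSTACLE
src = [[0] * 5 for _ in range(5)]
src[2][2] = 1

activity = omap(env, src, num_layers=2)
for row in activity:
    print(row)
```

After two layers the source cell holds the value 3, its free neighbours hold 2, and the obstacle cell stays at 0, in line with v = n + 1 at the source.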
A graphical visualisation of the process of activity map generation using oMAP is shown in Fig. 2. Here we used a maze of size 9 × 9 and placed one source in the middle. We show the obstacle map (black grid cells denote obstacles), the input activity (source map) and the activity of each max pooling layer after the Add/ReLU operation.
As we can see, after the first layer (see output $l_1$) activity is propagated from the source only to its neighbouring grid cells (except obstacles), which obtain values of 1, while the activity at the source cell is increased by one to a value of 2. From layer to layer, activity in the network grows and propagates to grid cells increasingly distant from the source. In this particular example, map generation is complete after nine layers.
Determining the number of layers L and algorithmic complexity: oMAP does not have any tunable variables, and the number of layers L, which is the only free parameter, can be unequivocally determined.
It is identical to the maximal path length in an environment, which depends on the location of the source(s), the size of the environment and the distribution of obstacles. In general, however, the longest path is a priori unknown, but the structure of the oMAP algorithm allows determining L at run-time. The algorithm can be run recursively, adding layer after layer until the activity map no longer contains any zero values (except at the obstacles). Thus, L can be set using this procedure (see footnote 2).
The complexity of our algorithm is O(N × L) where N = m × n is the number of grid cells in the map (corresponds to the number of max operations per layer) and L is the number of layers.
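The run-time determination of L described above can be illustrated as follows. This is a minimal plain-Python sketch: the helper names, the `max_layers` cap and the early exit for free cells that no source can reach are assumptions added for the example, and `OBSTACLE` again stands in for −maxint.

```python
OBSTACLE = -10**9  # assumed stand-in for -maxint


def omap_layer(act, env, src):
    """One layer: 3x3 max pooling (stride 1), add env and src, then ReLU."""
    rows, cols = len(act), len(act[0])

    def pooled(i, j):
        return max(act[a][b]
                   for a in range(max(0, i - 1), min(rows, i + 2))
                   for b in range(max(0, j - 1), min(cols, j + 2)))

    return [[max(0, pooled(i, j) + env[i][j] + src[i][j]) for j in range(cols)]
            for i in range(rows)]


def count_free_zeros(act, env):
    """Number of non-obstacle cells that activity has not reached yet."""
    return sum(1 for i, row in enumerate(act) for j, v in enumerate(row)
               if v == 0 and env[i][j] == 0)


def grow_until_filled(env, src, max_layers=100_000):
    """Add layers until no free cell is zero; returns (L, activity map)."""
    act = [row[:] for row in src]
    zeros, num_layers = count_free_zeros(act, env), 0
    while zeros > 0 and num_layers < max_layers:
        nxt = omap_layer(act, env, src)
        nxt_zeros = count_free_zeros(nxt, env)
        if nxt_zeros == zeros:
            break  # assumption: no progress means the rest is unreachable
        act, zeros, num_layers = nxt, nxt_zeros, num_layers + 1
    return num_layers, act


# 5 x 5 grid: source in the top-left corner, wall in column 2 with a gap
# in the bottom row.
env = [[0] * 5 for _ in range(5)]
for i in range(4):
    env[i][2] = OBSTACLE
src = [[0] * 5 for _ in range(5)]
src[0][0] = 1

num_layers, activity = grow_until_filled(env, src)
print(num_layers)
```

Each added layer costs one pass over all N grid cells, which is where the O(N × L) complexity comes from.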

Path Reconstruction
A path from any given target location to the source (or closest source) can be found from the generated activity map O by following the activity gradient; i.e., we start from the chosen target location and select a neighbouring grid cell out of its eight neighbors with maximum value and repeat this until reaching the source.
Note that there can be more than one neighboring cell with maximal value. In such a case, we choose the next cell randomly, and we refer to this method as simple path reconstruction. Note that, due to the single-step forward propagation by the max pooling method described above, it does not matter which cell is chosen, because following any of the resulting gradients yields the same number of steps back to the source.
Using map $l_9$ from Fig. 2, it is easy to see that this method can create all possible paths from any target back to the source in the middle.
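Simple path reconstruction can be sketched as a greedy ascent over the eight neighbours. The function name `simple_path`, the optional `rng` argument and the hand-written activity map (an obstacle-free 5 × 5 grid with a central source after two layers) are assumptions for illustration only.

```python
import random


def simple_path(activity, target, rng=random):
    """Follow the activity gradient over all eight neighbours up to a source."""
    rows, cols = len(activity), len(activity[0])
    i, j = target
    if activity[i][j] <= 0:
        raise ValueError("target is an obstacle or was not reached by any source")
    path = [(i, j)]
    while True:
        neighbours = [(a, b)
                      for a in range(max(0, i - 1), min(rows, i + 2))
                      for b in range(max(0, j - 1), min(cols, j + 2))
                      if (a, b) != (i, j)]
        best = max((activity[a][b] for a, b in neighbours), default=0)
        if best <= activity[i][j]:
            return path  # local maximum of the activity map: a source cell
        # Ties are broken randomly, as in simple path reconstruction.
        i, j = rng.choice([c for c in neighbours if activity[c[0]][c[1]] == best])
        path.append((i, j))


activity = [
    [1, 1, 1, 1, 1],
    [1, 2, 2, 2, 1],
    [1, 2, 3, 2, 1],
    [1, 2, 2, 2, 1],
    [1, 1, 1, 1, 1],
]
path = simple_path(activity, (0, 2), rng=random.Random(0))
print(path)
```

From the target (0, 2) three neighbours share the maximal value 2; whichever one is drawn, the source is reached after two steps.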
Note that this method is similar to the wavefront expansion algorithm [6], which is a special case of Breadth First Search (BFS, [19]) and, thus, always renders optimal paths with respect to the number of steps.
However, paths obtained by simple path reconstruction will not necessarily be optimal with respect to Euclidean distance. This is because all transitions (horizontal, vertical, diagonal) are weighted equally, amounting to a uniform-cost search. To improve on this, we propose the Euclidean path reconstruction method, described next.
We first find a path, similar to above, by choosing a neighboring cell with maximum value, but now we only consider horizontal and vertical neighboring cells ('Manhattan' transitions). Thus, given the current path position {$P_x(t) = i$, $P_y(t) = j$}, the next step of the path is defined by

$$\bigl(P_x(t+1),\, P_y(t+1)\bigr) = \arg\max_{(i', j') \in \{(i \pm 1,\, j),\ (i,\, j \pm 1)\}} O(i', j').$$

We stop this path search as soon as the (nearest) source is reached. Then we straighten the path by removing [...], where k is the number of points in the path. Note that the complexity of the path reconstruction procedure is O(k). The obtained path is now shortest with respect to the Euclidean distance.
Footnote 2: In the Appendix we will show that running oMAP recursively is very slow because of the CPU-GPU handshake required in every iteration. Thus, for practical purposes, we did not run oMAP in recursive mode and instead determined L by estimating path complexity prima vista.
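The first stage of Euclidean path reconstruction, gradient following restricted to Manhattan transitions, might look like the sketch below. Several details are assumptions and are not taken from the text: the function name, skipping already visited cells on activity plateaus, the tie-breaking order, and detecting a source as a cell holding the global maximum of the map. The straightening step is not included.

```python
def manhattan_gradient_path(activity, target):
    """Greedy ascent using only horizontal and vertical moves."""
    rows, cols = len(activity), len(activity[0])
    peak = max(max(row) for row in activity)  # every source cell holds this value
    i, j = target
    if activity[i][j] <= 0:
        raise ValueError("target is an obstacle or was not reached by any source")
    path, visited = [(i, j)], {(i, j)}
    while activity[i][j] < peak:
        # Assumption: never step down and never revisit a cell, so that
        # plateaus of equal activity cannot cause cycles.
        candidates = [(a, b)
                      for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                      if 0 <= a < rows and 0 <= b < cols
                      and (a, b) not in visited
                      and activity[a][b] >= activity[i][j]]
        if not candidates:
            raise RuntimeError("no Manhattan step available at %r" % ((i, j),))
        i, j = max(candidates, key=lambda c: activity[c[0]][c[1]])
        path.append((i, j))
        visited.add((i, j))
    return path


activity = [
    [1, 1, 1, 1, 1],
    [1, 2, 2, 2, 1],
    [1, 2, 3, 2, 1],
    [1, 2, 2, 2, 1],
    [1, 1, 1, 1, 1],
]
path = manhattan_gradient_path(activity, (0, 0))
print(path)
```

On this obstacle-free map the corner target reaches the central source in four Manhattan steps, twice as many as the two diagonal steps that simple path reconstruction would take.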
oMAP was implemented using the Tensorflow and Keras APIs. We used a PC with an Intel Xeon Silver 4114 CPU (2.2 GHz) and an NVIDIA GTX 1080 Ti or NVIDIA Titan V GPU.

Generation of Activity Maps and Path Reconstruction for oMAP and other Algorithms
We compared our approach to several of the above-mentioned algorithms that do not require learning, i.e., Dijkstra's algorithm [7], a biologically inspired neural network (shunting model) [28] and a state-of-the-art algorithm based on OpenGL shaders [8]. We assessed path optimality (shortest paths with respect to the Euclidean distance) as well as run-time. For the run-time comparison against OpenGL shaders we used the benchmark maps as in [8].
In Fig. 3, activity maps are shown normalised between zero (dark blue) and maximum (dark red). We obtain a circular pattern when using Dijkstra (see panel A1) and a square pattern with oMAP (B1). This is due to the fact that Dijkstra uses a non-uniform cost (horizontal/vertical moves have a cost of 1 and diagonal moves have a cost of √2), while for oMAP we use a uniform cost (all moves have a cost of 1).
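The two contour shapes can be reproduced for an obstacle-free grid with the standard closed-form distances for eight-connected moves. The function names are illustrative assumptions; the formulas hold only when no obstacles lie between the two cells.

```python
import math


def uniform_cost_distance(dx, dy):
    """Number of steps when all eight moves cost 1 (Chebyshev distance)."""
    return max(abs(dx), abs(dy))


def sqrt2_diagonal_distance(dx, dy):
    """Path cost when straight moves cost 1 and diagonal moves cost sqrt(2)."""
    long_side, short_side = max(abs(dx), abs(dy)), min(abs(dx), abs(dy))
    return (long_side - short_side) + math.sqrt(2) * short_side


straight = (5, 0)  # cell on a horizontal ray from the source
diagonal = (5, 5)  # cell on a diagonal ray from the source

print(uniform_cost_distance(*straight), uniform_cost_distance(*diagonal))
print(sqrt2_diagonal_distance(*straight), sqrt2_diagonal_distance(*diagonal))
```

Under uniform cost both cells are equally far from the source, so equal-activity contours are squares; with √2 diagonals the diagonal cell is farther away, which pulls the contour corners inwards and gives the rounder pattern.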
In the case of Dijkstra's algorithm (panel A2), paths were reconstructed from the visited nodes list [7] and are of optimal Euclidean length, in spite of their 'wiggly' appearance. On the other hand, maps generated by oMAP combined with simple path reconstruction (magenta paths in B2, D2) lead to paths that are optimal with respect to the number of steps, but not with respect to Euclidean distance. Euclidean path reconstruction (blue paths) solves this issue. Note that for this, one first follows the gradient only along Manhattan transitions (before straightening). This will usually not lead to the same grid cell selection as for simple path reconstruction. Hence, blue and magenta paths are independent of each other. This can be seen when comparing the reconstructed paths (see, for example, panel B2) from simple path reconstruction (magenta) with those from Euclidean path reconstruction (blue). The latter renders optimal paths, which are straighter than the ones from Dijkstra. For complex mazes, too, Euclidean reconstruction renders optimal paths (C, D), in this case identical to Dijkstra.
An example of activity map generation by oMAP using three and nine sources is shown in Fig. 4. For this, we used a map from the Moving AI benchmark [25] with a resolution of 767 × 881. We used 500 layers and 350 layers in the case of three and nine sources, respectively. The results demonstrate that fewer layers are needed as the number of sources increases, since activity propagates from all sources at the same time and therefore fills the complete map sooner.
oMAP versus Shunting Model: Conceptually, our approach, as already discussed in the Introduction, is similar to the path finding method using a biologically inspired neural network (shunting model, [28]). Thus, we also compare our approach to this method. Results obtained on a map with a u-shaped obstacle (a common benchmark for the evaluation of path finding methods; reproduced from [28]) are shown in Fig. 5. The central disadvantage of the shunting model is that large environments (e.g., above 500 × 500) cannot be addressed.
Either neuronal activity drops very quickly when moving away from the source (see panel A) and soon reaches 'numerical zero', or, for quite long run-times, activity can indeed be propagated to more distant locations, but this easily leads to activity plateaus due to the nature of the model (see panel B). The authors comment on their model parameters [28], but even after an extended search of the parameter space we were not able to arrive at a shunting model that could solve environments above 500 × 500. Thus, the shunting model is only applicable to relatively small environments. Moreover, the authors of that study state [28] that their method generates optimal paths, which is not always the case. The paths in Fig. 5 A, B are non-optimal, because the activity of each neuron is computed as a weighted average of the activity of its nearest neighboring neurons, which may lead to sub-optimal paths in environments with obstacles. In contrast, activity decreases linearly from the source when using oMAP, which allows generating activity maps for extremely large environments, given a large enough L. Numerical precision problems do not exist, because we operate with integers that grow at most to the longest path length. As a consequence, the resulting paths are always optimal (Fig. 5 C).
oMAP versus OpenGL Shaders: Possibly the most powerful currently existing method that uses a wavefront propagation algorithm is described by [8] and employs OpenGL shaders in a GPU implementation. This method will always produce optimal paths. Therefore, we chose to compare it to oMAP with respect to the run-time of both algorithms (Fig. 6). Note, however, that such across-implementation comparisons have to be taken with a grain of salt, because we cannot know how efficient the foreign implementation was.
For this, we used maps of size 1,000 × 1,000 with an increasing number of obstacles as in [8] (see obstacle configurations on the oMAP activity maps in Fig. 6, left). Note that in our case we set the source in the bottom-left corner, which is the worst case with respect to computation time. Results demonstrate that the run-time of the OpenGL shaders method depends on the number of obstacles in the scene and that this method slows down non-[...]
[...] the reconstructed shortest paths for eight customers to the closest taxi cabs, respectively. We used a network with 160 layers and it took only 20 ms on an NVIDIA GTX 1080 Ti to generate the activity map (resolution 332 × 709).
The reconstructed paths (blue trajectories) show that the closest taxi cabs and the shortest paths were found for all eight customers.

General Run-time Evaluation, Limitations and Practical Considerations
Run-time: Run-time evaluations evidently depend on the hardware used. Still, it is of interest to document where we stand given currently existing state-of-the-art hardware. In the following we will therefore show that oMAP achieves remarkable performance when tested on an NVIDIA GTX 1080 Ti GPU. We define the linear grid size as n and consider here square grids with $n^2$ grid cells ("nodes"). We used empty maps without obstacles, because for oMAP the number of operations does not depend on the number of obstacles. Panel A in Fig. 8 shows that run-time increases linearly with the number of nodes $n^2$ when keeping the number of layers constant. Similarly, linear growth is also observed when keeping the grid size n constant and increasing L (Fig. 8 B).
From the above it is clear that more layers are needed when the grid gets bigger, because this leads to longer paths, for which the network has to be enlarged. The shortest possible longest-path length is 0.5n (empty square grid, source in the middle). In panel C we thus consider the number of nodes $n^2$ together with a variable number of layers L, and we let L depend on the grid size to approximate the fact that paths are longer in larger grids. As expected from panels A and B, these curves now linearly follow $n^3$, with slopes that also increase linearly with increasing L (for slope values see figure caption). From all this, an approximate equation for estimating the run-time for different grid sizes and different numbers of layers can be derived as

$$t_r(n, L) \approx 3.0751 \times 10^{-10}\, L\, n^2,$$

with $t_r$ in seconds, consistent with the run-times reported below.
Limitations and Practical Considerations: The above run-time estimate holds on an NVIDIA GTX 1080 Ti GPU as long as oMAP runs essentially in forward mode without (too many) iterations i. Note, however, that the overhead of having to pass information iteratively back to the start of a new batch of layers will remain tiny if this happens just a few times.
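The fitted run-time estimate can be evaluated directly. The function name and the keyword `coeff` are assumptions for this sketch; the coefficient is the one given above and applies only to the NVIDIA GTX 1080 Ti measurements, with the result interpreted in seconds.

```python
def estimate_runtime_s(n, num_layers, coeff=3.0751e-10):
    """Approximate run-time t_r(n, L) = coeff * L * n^2 for an n x n grid."""
    return coeff * num_layers * n ** 2


# Square grid with n = 1725 and L = 2n layers.
n = 1725
t_small = estimate_runtime_s(n, 2 * n)
print(round(t_small, 2))
```

For n = 1725 and L = 2n the estimate comes out at a little over 3 s; doubling n at fixed L quadruples the estimate, while doubling L at fixed n doubles it.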
As stated above, the number of required layers L depends on the longest path. The theoretically existing longest path in any n × m grid is given as

$$\max(n, m) \times \mathrm{int}\bigl((\min(n, m) + 1)/2\bigr) + \mathrm{int}\bigl(\min(n, m)/2\bigr),$$

which is a path that meanders back and forth between two interleaved comb-like obstacle rows. In a square grid, this number can be approximated by $n^2/2$, and this number would have to be matched by L. From our experiments with real maps we found, however, that this is a highly unrealistic situation and usually 1.5n < L < 2n suffices.
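The worst-case expression above is easy to check numerically against the $n^2/2$ approximation. The function name is an assumption; `//` implements the int(·) truncation for positive integers.

```python
def longest_possible_path(n, m):
    """Theoretical longest path in an n x m grid with comb-like obstacles."""
    long_side, short_side = max(n, m), min(n, m)
    return long_side * ((short_side + 1) // 2) + short_side // 2


n = 1000
worst_case = longest_possible_path(n, n)
approximation = n ** 2 // 2
print(worst_case, approximation)
```

For n = 1000 the exact expression and the approximation differ by only n/2 = 500 steps, i.e. by about 0.1 %.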
For example, on this architecture, we could run oMAP in one forward pass for a remarkable linear grid size of $n_{max}$ = 26,000, equivalent to 676,000,000 nodes (see panel 1 in Fig. 9), with a maximal layer number of 3,450 (on an NVIDIA GTX 1080 Ti GPU). This would correspond to a panel of 3.25 × 3.25 tiles when using 64 Megapixel images as individual "maps" with every pixel a grid cell. Run-time for this case was 692.3 s.
Interestingly, for this system $L_{max}$ does not depend on the grid size.
Note that the number of required layers L will decrease, and so does the run-time, as soon as more than one source exists. Figure 9 shows an example of a grid with n = 26,000 and 5,000 sources. This required 650 layers and took only 147.4 s to run.
Some more examples that show the power and the limitations of oMAP are: if we assume L = 2n, then we can run systems with n = 1,725 in one forward pass in ≈ 3.1 s. Under the same assumption (L = 2n) we would need L = 52,000 for the maximal possible grid with n = 26,000. This would require i = 15 full iterations and a few more layers for iteration 16, resulting in a run-time of $t_r$ ≈ 10,800 s (3 hours).
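The iteration count in the last example follows from splitting the required layers into batches of at most $L_{max}$ = 3,450 layers per forward pass. The helper below is a small sketch of that arithmetic; its name and signature are assumptions.

```python
def split_into_passes(required_layers, max_layers_per_pass):
    """Return (number of full passes, layers left for a final partial pass)."""
    return divmod(required_layers, max_layers_per_pass)


n = 26_000
full_passes, remaining_layers = split_into_passes(2 * n, 3_450)
print(full_passes, remaining_layers)
```

With L = 2n = 52,000 this gives 15 full passes and 250 remaining layers for a sixteenth pass.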

CONCLUSION
We have presented a deep network for path finding in grid-like environments that does not need learning and runs very fast even for large environments with complex paths.It outperforms competing network approaches by a large margin and is easy to implement on standard GPUs, because of its simple structure.There are no free parameters except the number of layers L for which, however, efficient approximations exist.
For example, in a square grid, the maximal path length (worst case) can be approximated by $n^2/2$, which would have to be matched by L. However, realistically we found that n < L < 2n was most often sufficient. The extreme case with highly meandering paths in Fig. 3 D, which is unlikely to exist in practical, map-like situations, needed L ≈ 4.5n.
Furthermore, L decreases when more sources are introduced. Thus, the proposed oMAP approach has, in particular, high potential for multi-source multi-target applications in large environments.

Fig. 1. Network architecture and algorithmic process. The network consists of many stacked identical max pooling layers with one filter of size 3 × 3, stride 1 × 1 and zero padding (no sub-sampling). Here, for graphical reasons, a wider filter is shown. The network receives a source map and an environment map (with obstacles, black) as inputs. The algorithm consists only of repeated max pooling, adding, and rectified-linear (ReLU) operations as shown in the figure.

Fig. 2. Illustration of the generation of an activity map using oMAP. Black grid cells in the Environment Map stand for cell values of −maxint and represent obstacles. The brown cell with value 1 in the Source Map denotes the source location. Activity at each layer after the Add/ReLU operation (see Fig. 1) is shown.

oMAP versus Dijkstra: First, we show qualitative results on the generation of activity maps with single and multiple sources using oMAP compared to Dijkstra's algorithm. Examples of activity map generation and path reconstruction are shown in Fig. 3, in panels A, C for Dijkstra and in panels B, D for oMAP. Here we used two artificial environments, a relatively simple map with four obstacles (reproduced from [8], panels A and B) and a complicated maze map (generated automatically, panels C and D). We show normalised activity maps.

Fig. 3. A, C) Dijkstra's algorithm versus B, D) oMAP. Activity maps (index 1) and reconstructed trajectories (index 2) are shown. Grid size is 100 × 100 in cases A, B, and 101 × 101 in cases C, D. For oMAP we used 100 layers for B and 450 layers for D. Blue trajectories represent optimal (shortest) paths following Euclidean path reconstruction and magenta trajectories represent non-optimal paths following simple path reconstruction. Green and red dots represent start- and end-points, respectively.

Fig. 5. A, B) Biologically inspired neural network [28] versus C) oMAP with 100 layers. Grid size is 100 × 100. Green and red dots represent start- and end-points, respectively. A 3D plot has been used to show the structure of the gradients more clearly.

Fig. 7. Multi-source multi-target path finding problem in a real taxi scenario. Top: Section of a Berlin city map (332 × 709) with disks showing taxi positions obtained on March 3, 2020 at 2:19 pm, taken from a screenshot of the on-line application available at https://www.taxi.de. Middle: Multi-source activity map using oMAP (160 layers). Bottom: Shortest paths (blue lines) from customers (green dots) to taxis (red dots) are shown. Customer positions were defined manually.

Fig. 9. Example of a huge grid with n = 26,000, corresponding to 676,000,000 nodes. Panel 1 shows the full grid with 5,000 sources and panels 2 to 4 show magnifications to make the obstacles visible (panel 4). Colors encode activity as in Figure 2, blue = small and red = large values.