A Python Multiprocessing Approach for Fast Geostatistical Simulations of Subglacial Topography

Realistically rough stochastic realizations of subglacial bed topography are crucial for improving our understanding of basal processes and quantifying the uncertainty that bed topography contributes to sea level rise projections. This can be achieved with sequential Gaussian simulation (SGS), which generates multiple nonunique realizations of geological phenomena that sample the uncertainty space. However, SGS is very CPU intensive, with a computational complexity of O($Nk^3$), where $N$ is the number of grid cells to simulate and $k$ is the number of neighboring points used for conditioning. This complexity makes SGS prohibitively time-consuming to implement at ice sheet scales or fine resolutions. To reduce the time cost, we implement and test a multiprocess version of SGS using Python's multiprocessing module. By parallelizing the calculation of the kriging weights used in SGS, we achieve a speedup of 9.5 running on 16 processors for an $N$ of 128,097. This speedup, as well as the benefit of adding processors, increases with $N$. This speed improvement makes SGS viable for large-scale topography mapping and ensemble ice sheet modeling. Additionally, we have made our code repository and user tutorials publicly available (GitHub and Zenodo) so that others can use our multiprocess implementation of SGS on different datasets.


Subglacial topography is a key parameter in ice sheet models used to make sea level rise projections. 1 The bed topography of the Greenland and Antarctic ice sheets has been extensively surveyed using airborne ice-penetrating radar. 2 However, there remain large gaps in the measurements that must be interpolated. This interpolation is often performed using kriging, spline, or mass conservation 3 approaches, which all solve for the optimal bed elevation value at each spatial coordinate. These methods are deterministic, meaning that they produce a single solution or interpolation. One major drawback of the deterministic approach is that it cannot sample the parameter space, making it difficult to determine how uncertainty in basal conditions propagates through ice sheet models. Furthermore, these methods produce bed estimates that are smoother than the observed topography, which may bias interpretations of basal sliding processes.
The aforementioned issues can be resolved with geostatistical simulation, which is used to generate multiple realizations of topography that retain the roughness observed in radar measurements. Geostatistical simulation has previously been used to quantify uncertainty in hydrological and ice sheet models 1 and to investigate basal motion. 4 The ability of geostatistical simulation to create ensembles of equiprobable topographic realizations could be particularly advantageous for running ensemble ice sheet models to quantify uncertainty.
One of the most widely used geostatistical simulation methods is sequential Gaussian simulation (SGS), which treats spatial phenomena as a Gaussian process governed by spatial covariances. 5 These simulations are conditional, meaning that they exactly match existing measurements. SGS is the stochastic version of the kriging algorithm, which uses the weighted average of nearby measurements to estimate the value at an intermediate coordinate. These weights are determined by a covariance function, which describes the variability of measurements as a function of their separation distance. SGS is implemented by 1) initializing a random order in which each grid cell will be simulated, 2) visiting a grid cell and using kriging to compute the mean and variance, 3) randomly sampling from the distribution described by the kriging mean and variance (this becomes the simulated value at that grid cell), 4) updating the conditioning data with the newly simulated value, and 5) repeating steps 2-4 until each grid cell has been visited and simulated. Each grid cell is visited and simulated sequentially to ensure that each value accounts for previously simulated values. See MacKie et al. 6 for detailed information on kriging and SGS.
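The five steps above can be illustrated with a minimal, self-contained 1-D sketch. This is a toy example, not the GStatSim implementation: the exponential covariance model, a zero prior mean for simple kriging, a plain nearest-neighbor search in place of the octant search, and a small nugget for numerical stability are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(42)

def exp_cov(h, sill=1.0, eff_range=10.0):
    # Exponential covariance: C(h) = sill * exp(-3|h| / effective range)
    return sill * np.exp(-3.0 * np.abs(h) / eff_range)

def sgs_1d(x_obs, z_obs, x_grid, k=8):
    """Toy 1-D sequential Gaussian simulation with simple kriging."""
    x_cond = list(x_obs)
    z_cond = list(z_obs)
    sim = np.full(len(x_grid), np.nan)
    order = rng.permutation(len(x_grid))          # 1) random simulation path
    for idx in order:
        x0 = float(x_grid[idx])
        xc = np.asarray(x_cond)
        zc = np.asarray(z_cond)
        near = np.argsort(np.abs(xc - x0))[:k]    # nearest-neighbor search
        xn, zn = xc[near], zc[near]
        C = exp_cov(xn[:, None] - xn[None, :]) + 1e-10 * np.eye(len(xn))
        c0 = exp_cov(xn - x0)
        lam = np.linalg.solve(C, c0)              # 2) solve for kriging weights
        mu = float(lam @ zn)                      #    kriging mean
        var = max(float(exp_cov(0.0) - lam @ c0), 0.0)  # kriging variance
        sim[idx] = rng.normal(mu, np.sqrt(var))   # 3) draw the simulated value
        x_cond.append(x0)                         # 4) update conditioning data
        z_cond.append(float(sim[idx]))
    return sim                                    # 5) repeat until grid is full
```

Because kriging is an exact interpolator, grid cells that coincide with conditioning data reproduce the measured values, while cells far from data receive larger kriging variances and, therefore, rougher simulated values.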
While SGS software packages have historically been proprietary and are predominantly used in oil and gas exploration, 7 recent developments in open source geostatistics software in Python (e.g., SciKit-GStat 8 and GStatSim 6 ) have improved the availability of these methods in cryosphere research. Despite these advances in accessibility, SGS remains difficult to use for large-scale ice sheet applications because its sequential nature makes it extremely computationally expensive. Specifically, generating S topographic realizations for a grid with N grid cells while using a maximum of k neighboring points to calculate the kriging weights is an O($SNk^3$)-type problem. 9 This growth in runtime with grid size makes SGS prohibitively computationally expensive for large-scale ice sheet problems, making it difficult to realize the potential of geostatistical simulation.
To address this computational issue, previous studies 9 have used a constant simulation path approach, where the grid cells are simulated in the same order for each realization to save computational time. However, this approach can lead to the underestimation of uncertainty. Alternatively, Nunes and Almeida 10 presented a parallelization strategy for SGS where the kriging weights are calculated in parallel. However, their implementation was tested on only four cores, is only compatible with the Windows operating system, and does not account for nonstationarity or variability in spatial statistics. As such, this approach is not well suited for the simulation of nonstationary subglacial topography, and its scalability has not been fully tested.
In this article, we implemented a Python multiprocess version of SGS following the multiprocessing approach of Nunes and Almeida, 10 which is designed to accommodate nonstationarity in subglacial topography. We tested the performance on up to 16 cores. To enhance the accessibility of this method, we have made our code repository readily available on GitHub a and Zenodo. b In this repository, we provide a script and user tutorial for generating simulations of subglacial topography. This script accepts the conditioning data, resolution, coordinate bounds, and number of realizations as inputs and then outputs topographic realizations. Here, we describe the parallelization process in more detail, quantify the improvement in model performance, and discuss the features of our tool.

SGS Implementation
We implemented a modified version of SGS using methods available in GStatSim v1.0.5, 6 which was specifically designed for simulating subglacial topography. The topographic roughness is quantified using a covariance function, also known as the variogram, which measures the covariance between pairs of data points as a function of their separation distance. Rather than fitting one variogram to the entire region, the data are broken into different spatial clusters based on the density of measurements (see MacKie et al. 6 for details). This allows the topographic roughness to vary within a realization, which is important for capturing complex subglacial conditions or nonstationarity in the variogram statistics. Then, a variogram model is fit to the experimental variogram for the data in each cluster. The variograms are modeled using the exponential variogram function in the SciKit-GStat software package, 8 which is the most appropriate model for subglacial topography. 6 The most computationally expensive step in this algorithm is the nearest neighbor octant search, which finds the k neighboring points to the grid cell being simulated such that they are evenly divided among the eight octants of the coordinate plane (see MacKie et al. 6 for details). This process is designed to reduce bias in irregularly sampled data and the subsequent calculation of the kriging weights. While SGS can be used with any type of kriging (ordinary, universal, cokriging, etc.), we use simple kriging in this study. Let $u_1, u_2, \ldots, u_k$ be the locations of the neighbors, $u_0$ be the location of the current grid cell, and $C_{m,n}$ be the covariance

a https://github.com/GatorGlaciology/SGS-topography-parallelization
b https://doi.org/10.5281/zenodo.7627029

COMPUTATIONAL MODELING OF ICE SHEETS AND GLACIERS
between $u_m$ and $u_n$. The kriging weights $\lambda$ are determined by solving the following system:

$$\begin{bmatrix} C_{1,1} & C_{1,2} & C_{1,3} \\ C_{2,1} & C_{2,2} & C_{2,3} \\ C_{3,1} & C_{3,2} & C_{3,3} \end{bmatrix} \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \end{bmatrix} = \begin{bmatrix} C_{1,0} \\ C_{2,0} \\ C_{3,0} \end{bmatrix}, \quad (1)$$

assuming $k = 3$. In this study, we use a $k$ of 50. The matrix on the left describes the covariance between the conditioning data, and the term on the right is the covariance between the conditioning data and $u_0$.
Note that the covariance calculation depends only on the relative locations of the $k$ neighbors, not their values. Let $Z$ be the variable that is being estimated. The kriging mean $Z^{*}(u_0)$ is determined by computing the weighted sum of the elevation values,

$$Z^{*}(u_0) = \sum_{a=1}^{k} \lambda_a Z(u_a), \quad (2)$$

and the kriging variance $\sigma_E^2(u_0)$ is defined as

$$\sigma_E^2(u_0) = C(0) - \sum_{a=1}^{k} \lambda_a C_{a,0}, \quad (3)$$

where $C(0)$ is the variance of data in the current variogram cluster. After these calculations, the simulated bed elevation value is determined by sampling from a normal distribution defined by the kriging mean $Z^{*}(u_0)$ and variance $\sigma_E^2(u_0)$. This simulated value is then added to the conditioning data. The process repeats until all grid cells are simulated.
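The nearest neighbor octant search described in this section can be sketched as follows. This is a hypothetical reduction, not GStatSim's routine: the angular binning, the `k // 8` quota per octant, and the random 2-D test geometry are illustrative assumptions.

```python
import numpy as np

def octant_search(coords, u0, k=16):
    """Split the plane around u0 into 8 angular octants and take the closest
    k // 8 conditioning points from each, reducing directional sampling bias."""
    d = coords - u0
    dist = np.hypot(d[:, 0], d[:, 1])
    angle = np.arctan2(d[:, 1], d[:, 0])                   # in (-pi, pi]
    octant = ((angle + np.pi) // (np.pi / 4.0)).astype(int) % 8
    per_octant = max(k // 8, 1)
    chosen = []
    for o in range(8):
        members = np.where(octant == o)[0]
        closest = members[np.argsort(dist[members])][:per_octant]
        chosen.extend(closest.tolist())
    return np.array(chosen)
```

A plain k-nearest search would cluster all neighbors on the densely sampled side of a flight line; the per-octant quota forces the conditioning set to surround $u_0$ whenever data exist in every direction.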

Multiprocessing Strategy
We improve the performance of the previously described SGS methodology by parallelizing key aspects of the workflow following the approach of Nunes and Almeida, 10 described in Figure 1. First, we parallelize the process of fitting variograms to the data in each spatial cluster. This is straightforward because the variogram of each cluster can be calculated independently. Parallelizing SGS itself is more challenging because previously simulated points are added to the conditioning data, making each iteration dependent on the previous iterations. This is demonstrated in (2) when some $Z(u_a)$ are elevation values simulated during a previous iteration. As such, the traditional SGS approach fails the parallelization requirement of independence between processes.
The parallelization approach from Nunes and Almeida 10 solves this problem by isolating the steps of the algorithm where the kriging weights $\lambda$ are computed. Recall that the nearest neighbor and kriging weight calculations depend only on the variogram parameters and the locations of the conditioning data; the values of the conditioning data themselves are not needed. As such, we can encapsulate the calculation of the kriging weights (1) in a function executed in parallel and prepare the parameters required for (2) and (3), which are faster to compute. To achieve this, we first define a random simulation order in which each grid cell is visited. Then, for each grid cell in the simulation path, the nearest neighbors are found, and the kriging weights are computed in parallel. These k nearest neighbors include the original conditioning data and any coordinates that will be simulated prior to $u_0$. The grid cell indices and weights of the neighbors are stored for the later calculation of the kriging mean and variance. For each additional stochastic realization, a new random simulation path is generated, and the process repeats.
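The core of this strategy can be sketched with Python's multiprocessing module. This is a minimal illustration rather than our released implementation: the `cov` model, the 1-D neighbor geometry, and the pool size are assumptions, but it shows the key property that the weight solves depend only on locations and therefore map cleanly onto a process pool.

```python
import numpy as np
from multiprocessing import Pool

def cov(h):
    # Assumed exponential covariance (sill 1, effective range 10).
    return np.exp(-3.0 * np.abs(h) / 10.0)

def weights_for_cell(task):
    """Kriging weights for one grid cell; needs neighbor *locations* only."""
    u0, neigh = task
    C = cov(neigh[:, None] - neigh[None, :]) + 1e-10 * np.eye(len(neigh))
    c0 = cov(neigh - u0)
    return np.linalg.solve(C, c0)

def parallel_weights(cells, neighbour_sets, processes=2):
    """Distribute the independent per-cell weight solves across a pool."""
    with Pool(processes) as pool:
        return pool.map(weights_for_cell, list(zip(cells, neighbour_sets)))
```

The stored weights are then combined serially with the conditioning values via (2) and (3), which preserves the sequential dependence of SGS while moving the expensive linear solves off the critical path.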

Speed Comparison
All simulations were tested on a 2021 Apple Mac Studio M1 Ultra with 20 cores. The Python multiprocessing module is used to distribute independent Python processes and data across a specified number of cores. We use bed elevation measurements for a 150 × 150-km region in northwest Greenland from the Center for Remote Sensing of Ice Sheets. 11 For visualization purposes, we also generate simulations for Pine Island Glacier (PIG) in West Antarctica using data from Frémand et al. 2 We compared the performance of our parallelized algorithm with 12 processors to a single processor by running simulations with varying job sizes and recording the execution time. The number of simulation points decays approximately geometrically with resolution, so, to evenly sample job sizes, we varied the resolution geometrically from 200 m to 1200 m. Speedup for this comparison is defined as the serial algorithm's execution time divided by the parallel algorithm's execution time. Job size is defined by the number of simulated elevation values and is, therefore, a function of resolution. During multiprocess runs, we recorded both the total execution time (including both serial and parallel components) and the time spent on calculations executed in parallel for later analysis.
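The speedup measurement can be expressed as a small timing harness. `measure_speedup` is a hypothetical helper for illustration, not part of our released script, and wall-clock timing of this kind is sensitive to system load.

```python
import time

def measure_speedup(serial_fn, parallel_fn):
    """Speedup = serial wall-clock time / parallel wall-clock time."""
    t0 = time.perf_counter()
    serial_fn()
    t_serial = time.perf_counter() - t0

    t0 = time.perf_counter()
    parallel_fn()
    t_parallel = time.perf_counter() - t0

    return t_serial / t_parallel
```

In practice, each run should be repeated several times and averaged to smooth out operating system noise.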

Performance Analysis
We ran additional simulations specifically to perform scalability testing. Scalability is an algorithm's ability to increase its speedup with the number of processors. To perform this test, we ran two trials with the Greenland dataset at resolutions of 300 m and 1000 m and sequentially increased the number of processors from one to 16. For these trials, speedup $S$ is defined as

$$S = \frac{T(1)}{T(p)}, \quad (4)$$

where $T(1)$ is the execution time of our parallel algorithm with one processor, and $T(p)$ is the execution time of the parallel algorithm with $p$ processors. To interpret our results, the ideal speedup line (displayed in Figure 2) was used as a baseline for assessing scalability. However, ideal speedup is often unattainable in practice. 12 Ideal speedup is defined under the assumption that the execution time for a parallel algorithm with $p$ processors should follow

$$T(p) = \frac{T(1)}{p} \quad (5)$$

and, therefore, attain a speedup of

$$S = \frac{T(1)}{T(1)/p} = p. \quad (6)$$

This metric does not account for the extra time required for parallel algorithms to perform process synchronization, known as parallel overhead. Taking the time for parallel overhead $T_o$ into consideration, the execution time for algorithms that are strictly parallel can be defined by

$$T(p) = \frac{T(1)}{p} + T_o. \quad (7)$$

We also used Amdahl's law to analyze the performance of our multiprocess implementation. Amdahl's law states that the performance of a parallel algorithm is limited by the fraction of time spent on serial computation. 13 As such, the maximum theoretical speedup $S_T$ of a parallel algorithm with $p$ processors is

$$S_T = \frac{1}{f + \frac{1 - f}{p}}, \quad (8)$$

where $f$ is the proportion of time spent performing serial calculations. Additionally, we compared the empirical speedup to the idealized speedup line. According to Amdahl's law, the difference $D$ between the ideal speedup and the maximum theoretical speedup is given by

$$D = p - S_T = p - \frac{1}{f + \frac{1 - f}{p}}. \quad (9)$$

Notice that, as the number of processors $p$ increases, $D$ is expected to increase. This is because, as $p$ increases, the maximum theoretical speedup described by Amdahl's law becomes increasingly limited by the proportion of time performing serial calculations [$f$ in (8)].
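Equations (8) and (9) are simple to evaluate directly; the helpers below are a sketch for exploring the bound, with the function names and the example serial fraction chosen for illustration.

```python
def amdahl_speedup(f, p):
    """Maximum theoretical speedup S_T from (8): serial fraction f, p processors."""
    return 1.0 / (f + (1.0 - f) / p)

def ideal_gap(f, p):
    """D from (9): gap between the ideal speedup p and the Amdahl bound."""
    return p - amdahl_speedup(f, p)
```

For illustration, a serial fraction of 5.2% caps the 16-processor speedup at about 1/(0.052 + 0.948/16) ≈ 9.0, and the gap D grows as more processors are added.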

Visualization of Topographic Realizations
We visualized multiple realizations produced by our parallelized SGS algorithm using the PyVista library in Python. 14 The conditioning data were plotted to display the spatial distribution of the ice-penetrating radar measurements from the northwest Greenland and PIG data sets. From these data, we generated two SGS realizations for both locations. The Greenland simulation has a resolution of 400 m, and the PIG simulation has a resolution of 500 m. For comparison, we plotted the topography of BedMachine 3 for both locations at the same resolution. BedMachine was derived using deterministic interpolation methods, including mass conservation and spline interpolation.

RESULTS
The runtimes for different simulation sizes using one versus 12 processors are shown in Figure 3(a), and the resulting speedup is shown in Figure 3(b). The smallest speedup of 3.2 occurs at a 1,200-m resolution, which produces the fewest simulation points, at 5022. This job took approximately 24 s on one processor and 7.6 s on 12 processors. The greatest speedup of 8.2 occurred at a 200-m resolution, which yields the largest number of simulation points, at 306,811. This simulation took 1 hour 3 min on one processor and 7 min 43 s on 12 processors. For the 12-processor runs, we find that, as the job size increased from 5022 to 306,811 simulated points, the proportion of time performing serial calculations monotonically decreased from 24.9% to 5.2%.

Figure 2 displays the results from our scalability testing, with speedup defined by (4). The 300-m runs simulated 128,097 missing points, and the 1000-m runs simulated 7899 missing points. The speedup for the 300-m-resolution runs is greater than that for the 1000-m runs. The 300-m-resolution simulation has a speedup of 9.5 when using 16 processors. Figure 4 provides a visualization of the topographic realizations generated by our multiprocess SGS algorithm, along with the BedMachine elevation model. 3

We provide our parallelized SGS algorithm as an open source tool for users to run accelerated simulations on their own data sets. Our Python script and user tutorial are hosted on GitHub and Zenodo. The Python script provides a command line interface for users to enter a comma-separated value file (in polar stereographic coordinates) and specify the bounding coordinates, grid resolution, and number of simulations to run. For each realization, we return a color map visualization of the modeled topography and the associated simulation data. The user tutorial is a Jupyter notebook that also enables the user to provide their own data; however, its intent is to provide a more detailed understanding of our multiprocess implementation. The notebook also provides an interactive visualization of the simulated data using PyVista. 14

DISCUSSION
Our results show that multiprocessing performance increases with job size. This phenomenon can be explained by Amdahl's law because job size has a direct effect on the proportion of time performing serial calculations [$f$ in (8)]. With the number of processors held constant, we found that, as the job size increases, $f$ decreases monotonically. Note that, in (8), $S_T$ decreases as $f$ increases; therefore, a decreasing $f$ as the job size increases results in an increase in speedup.
Our multiprocess algorithm significantly reduces the time cost barrier to generating multiple high-resolution simulations. These results point toward a feasible path for simulating entire ice sheets. While Figure 2 shows that there are diminishing returns when increasing the number of processors, this effect can be explained by the increasing difference between the maximum theoretical speedup and the ideal speedup as the number of processors increases [see (9)]. Furthermore, this effect is lessened for larger simulations, as demonstrated by Figure 3. This suggests that, for ice-sheet-scale simulations, which would involve simulating tens of millions of grid cells, a larger number of processors may be effective. The speed of this method could be further improved by performing the parallel component of this algorithm on a GPU. Future work is needed to test our algorithm's scalability using high-performance computing clusters and GPU acceleration.
Our code repository, with our multiprocessing Python script and user tutorials, is publicly available on GitHub and Zenodo. This open source software will make it easy to incorporate geostatistical simulation into various ice sheet investigations. For example, this tool could be used to generate ensemble realizations of subglacial hydrological models, 15 which can be used to quantify uncertainty in the locations of subglacial flow paths.

Additionally, simulated topography can be used in ice sheet models to investigate basal sliding processes 4 or quantify uncertainty in ice sheet models with respect to uncertainty in bed topography. 1 The case studies by Law et al. 4 and Wernecke et al. 1 used a small number of SGS topographic realizations with relatively small spatial domains. Our SGS speed improvements, coupled with advances in ice-sheet-modeling methods, could help facilitate the implementation of large ensembles of ice sheet models at regional or ice sheet scales.

CONCLUSION
The geostatistical simulation of realistically rough bed topography is imperative for accurately characterizing basal processes and quantifying uncertainty in ice sheet models. However, the inherently sequential nature of SGS makes this algorithm difficult to use at large scales or for large ensembles. Our multiprocessing implementation significantly reduces the time cost associated with generating simulations. Furthermore, by making this tool open source with well-documented user tutorials, we have reduced the barrier to spinning up SGS models for use with ice sheet workflows. This will enable SGS to be integrated with a variety of ice sheet analyses, including uncertainty quantification in ice sheet models.

FIGURE 2. Speedup, defined by (4), with respect to the number of processors.

FIGURE 3. (a) Total execution time of the parallel and serial SGS simulations. (b) Speedup defined by (4). The x-axis uses a log scale, and 12 processors were used to obtain these data.

FIGURE 4. Ice-penetrating radar measurements, deterministic interpolation using BedMachine, and parallelized SGS interpolations for (a) northwest Greenland and (b) Pine Island Glacier, with arrows indicating the direction of ice flow.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

Digital Object Identifier 10.1109/MCSE.2023.3317773

Date of publication 22 September 2023; date of current version 25 October 2023.