Spherical K-means and Elbow method optimizations with Fisher statistics for 3D stochastic DFN from Virtual Outcrop Models

Fracture modeling plays a valuable role to understand the fluid flow in carbonate reservoirs. For this, the fracture characterization to generate Discrete Fracture Networks (DFNs) can take advantage of analogue outcrops through Virtual Outcrop Models (VOMs), acquired by Unmanned Aerial Vehicles (UAV) and digital photogrammetry. The stochastic DFN generation is an important step in reservoir modeling as it brings more representative data to the process and has long been studied. However, optimizations concerning automatizing some of the steps necessary to its generation like data clustering are still open to advancements. In this sense, this work aims the fracture data clustering and the definition of the number of clusters when gathering data for the stochastic process, developing an Elbow method for spherical data and a balanced K-means, both based on Fisher statistics. For this, we interpreted fracture planes in a VOM that recreates a carbonate reservoir analogue from the Jandaíra Formation, in the Northeast, Brazil. As result, we show a workflow for immersive fracture interpretation alongside a 3D stochastic DFN model with fracture intensity of 22.57m-1 for cell sizes of 1m3. Regarding the clustering balance, our method achieved a lower standard deviation between sets while maintaining the Fisher values greater to obtain fracture sets with lower dispersion. Additionally, the Elbow method implementation proved a beneficial step to the workflow as it reduced the interpretation bias of family clusters. These results alongside the proposed workflow bring a better understanding of the outcrop geometry while offering data scalability for reservoir modeling.


I. INTRODUCTION
Carbonate reservoirs (e.g., Pre-salt of the South Atlantic Margin offshore Brazil) host important hydrocarbon reserves. The carbonates present intrinsic characteristics of porosity and permeability that regulate fluid storage and fluid flow in reservoirs [1]- [3]. These characteristics are impacted by diagenetic processes, dissolution, and tectonic events that affect the reservoirs, generating fractures and discontinuities [4]. Thus, the fracture attributes contribute to the estimation of the geomechanical behavior of carbonate rocks necessary for hydrocarbon extraction.
Fractures can be defined as a result of a rupture or crack in the rock body that generates a physical discontinuity [5]. They often appear as a connected network of fractures, usually represented in numeric terms as Discrete Fractures Networks (DFNs) [6].
Identifying this kind of features is also an important task when analyzing impact flow on geothermal reservoirs [7], hydrocarbon-bearing formations [8], groundwater aquifers [9], leakage of CO 2 storage [10] and nuclear waste [11] broadening the knowledge over these formations allowing proper exploration and preemptive actions.
Specifically, deep reservoirs, like the Brazil´s offshore pre-salt reservoirs (6km below sea level), suffer from the limitations in spatial resolution of the seismic assessment and with the sparsity of well log data [12]. These limitations in reservoir assessment lead us to the study of analogue outcrops, which are exposed rock formations with characteristics of composition and geometry similar to the reservoirs studied [13].
The fracture network investigation in analogue outcrops is carried out by direct measurements like the scanline method, where information such as length, strike and dip, and aperture are acquired [14], [15], usually by an analyst in situ. Due to access restrictions and limited spatial coverage, remote sensing techniques, to obtain the Virtual Outcrop models (VOMs), are widely used to obtain a larger set of geometric information about fractures on the outcrop. In this case, the interpretation of fractures can be made directly on the computer screen.
Either in situ or using VOMs, the fracture network information also integrate roughness, connectivity, and spacing, which are is modeled in deterministic and stochastic DFN models [6]. In the deterministic DFN model, attributes such as density and intensity are generally estimated by the P xy system [16]. The fracture intensity is obtained by the number of fractures that cross a scan line (P 10 ), by the sum of the fracture lengths, divided by the area measure of the sample region (P 21 ), and by the sum of the areas of fractures contained in a volume (P 32 ) expressed in meters (m −1 ). A linear correlation can be estimated observing the indexes P 10 , P 21 , and P 32 , allowing the estimation of the volumetric P 32 from 1D and 2D data [12], [17]. Such correlations are supported by a constant C, based on the parameters of fracture orientation, radius size distribution and orientation of the sampling window or scan line [18].
To estimate the P 32 index, the dimension of the sample to be measured must be three-dimensional, in which orientation measurements (strike/dip) are performed. This information can be incorporated into the stochastic DFN modeling and scaled to the reservoir models. In these models, the quality of the simulation is directly related to the collection of information from the fracture network [17]. The orientation parameters, needed to create the DFN models, must consider their mean values and dispersion values obtained with Fisher distribution statistics [19], [20], and the quantity of sets decision will influence how the fracture are clustered and how the Fisher statistics will be propagated to the stochastic model.
Works that have generated 3D stochastic DFN models from strike/dip data relied on visual interpretation of fracture sets in diagrams and statistics generated by 2D software like the FracPaQ [21], [22] and FRACMAN [23] to identify the best quantity of sets. With some works [24], [25] reducing the quantity of sets arbitrarily applying metrics to identify and agglutinate co-planar sets, also using the Fisher concentration to highlight the possible number of clusters in the stereonet. Additionally, some works have proposed modifications to the clusterization method based on the Kmeans algorithm, employing Particle Swarm Optimization and Fuzzy clustering [25], [26], also optimizing the cluster initial positions. However, the quantity of cluster is still set arbitrarily.
In this regard, we propose two optimizations to the 3D stochastic DFN generation workflow by: adapting the Elbow method to the definition of fracture sets using the spherical K-means and Fisher statistics; and making a balance check to the clusters obtained from the K-means algorithm to verify if the process should be restarted with new center positions or not. These strategies aims to obtain better input data to the Fisher statistics definitions necessary to the stochastic generation modeling.
For this, we use the software Mosis XP [27], [28], to interpret fracture planes in a VOM obtained of an outcrop analogue located in the Jandaíra Formation, in Brazilian Northeast. From this data, our proposed workflow is applied and validated generating a 3D stochastic DFN and estimating the volumetric fracture intensity for the model cells. This region is well studied in geological terms [29], [30], however no study regarding outcrop geometry and generation of 3D stochastic was carried. This work is presented as follows: the proposed method, statistics and computations to extract 3D data are presented in Section 2; the interpretation scenario and experimental setting are presented in Section 3; followed by the results and comparisons in Section 4; accompanied by the discussion and finding conclusion.

II. 3D STOCHASTIC DFN MODELING
The proposed workflow to determine the volumetric intensity integrates the 3D stochastic DFN generation from VOM interpretation of fracture planes, as illustrated in Figure 1.
The workflow starts with the fracture plane points acquisition from VOM. The rest of the workflow describes the data clustering and statistics for each possible set, and finishes with the 3D fracture generation and the volumetric 3D DFN. This workflow is better described in the following sections.

A. PROCESSING FRACTURE PLANE DATA
The interpreted plane data extracted from the VOM is composed of 3D point coordinates (xyz), in which the positional and directional information will be computed. Directional and positional information from points placed in 3D space necessary for fracture characterization can be obtained by basic geometry operations. As the first plane information, the centroid is calculated to obtain the central position of a group of points, given by where N is the quantity of points that compose the plane and {z 0 , z 1 , ...z N } are the points coordinates. The next plane information is the plane equation and its components that are given by where A, B, C, and D are the equation terms (with D representing a positional term) [25]. These terms are computed with the principal component analysis (PCA) technique [31] that fits the point data to its eigenvalues and eigenvectors, where the third eigenvector gives the plane normal vector N (A, B, C) usually normalized in the form n(a, b, c). The positional term D in the plane equation is given by After obtaining the plane equation terms, the strike and dip angle are calculated in terms of α and β angles given by and where q é 0 • , 180 • or 360 • according to the quadrant that the resultant vector of a and b points [32]. These equations give us the strike and dip in radians, that are later converted to degrees for visualization in the stereonet and other statistics. Depending on the quadrant, we must correct the strike direction considering the a and b values, where, if a and b are greater than zero the strike is 360 − deg(β) and if a is greater than zero and b is lesser than zero the strike is 180 + deg(β). Also, considering the "righthand rule" [33] we consider opposite strike directions as the same angle, computing directions for the first and the second quadrant only.

B. FISHER STATISTICS
The strike and dip data obtained from the 3D interpretations are the main data for the stochastic plane generation. However, this data is reduced to spherical statistics information computed with Von Mises and Fisher [34] statistics for each fracture set. The Fisher statistics [35] are spherical statistics that indicate the dispersion and distribution of point positions in a sphere. For the Fisher statistics, the mean strike and dip for a set are observed in terms of the normal vector of each plane represented in polar coordinates ϕ and θ. Where ϕ (latitude) is given by 90 − dip, and θ (longitude) is given by the angle rotated from the prime meridian using the bearing angle.
The mean direction of Fisher is given by the sum of the direction cosines [35] This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication. where l i = cos(ϕ i )cos(θ i ), m i = cos(ϕ i )sin(θ i ), n i = sin(ϕ i ), and i is an individual plane of the N observations. After computing the mean direction cosines, they are converted in unit vectors dividing by the length of the dip mean vector R given by and the unit vectors are given bŷ The mean directional unit vectors are converted to mean strike and dip are then given by for dip angle, and The index of dispersion of the Fisher distribution k is given by where k values close to 0 indicate not coplanar planes and positive values to ∞ indicate strongly coplanar planes. In terms of distribution, the Fisher statistic [36] is given by

C. SPHERICAL DATA CLUSTERING
The basic K-means algorithm clusters objects in sets of K defined clusters, with random centroid clusters initially set in the first iteration. The objects are then associated with the closest cluster center given a distance metric. After each iteration, the centroid position of the cluster is recalculated. The algorithm converges when no significant change in centroid positions is observed.
To obtain the optimal number of clusters for the K-means clustering the most common algorithms used are the Elbow method and the Silhouette method [37]. For the Elbow method algorithm, the optimal number of sets is given by the Elbow method in which a set of different numbers of clusters are tested evaluating the squared sum of distances of samples and clusters centers. Using a line plot as a reference, the desired number of clusters is identified by the inflection point in the graph. Our implementations modify the K-means and the Elbow method to use the Fisher k statistics to obtain optimized clusters sets.
For the K-means algorithm we consider the spherical K-means which uses the angular distance between two strike/dip positions as polar coordinates (ϕ/θ) to measure the clusters and centers proximity. We use an additional step that verifies if the generated clusters have optimal Fisher k values, considering that are no set with Fisher k values below the average Fisher k minus the standard deviation of all sets. This approach guarantees that we have balanced clusters ( Figure  2b), instead of some clusters with low dispersion and others with high dispersion (Figure 2a). If this condition is not met, the K-means algorithm is restarted with new random centers. These steps are detailed in Algorithm 1.  For the Elbow method algorithm we consider the normalized squared mean of Fisher k values of all sets evaluated in each K-means K values evaluated. After running all different K-means K values clustering and after obtaining the established Elbow metric values, an additional step is carried to compute the optimal number of clusters systematically, establishing a line between the first and last values obtained in the Elbow method. This line is used as reference to compute the most distance point from this line, as the graph inflection point. These steps are detailed in Algorithm 2.

D. SPACING
After estimating the sets and directional statistics the spacing estimation can be carried [38], [39]. The spacing, assuming that two planes are parallel, e.g. have equal values for the a, b, and c, is given by where d 1 and d 2 are two parallel discontinuities (planes). For non-parallel planes, a virtual scanline can be used to estimate the distance between planes that cross the designated line, where the average, minimum and maximum spacing can be computed [40]. Get random K lat/lon positions as initial set centers 4: while moving_centers ̸ = F alse do 5: Compute distances between centers and measures 6: Associate measures to closest centers 7: Change center positions to new centers position in each group 8: if distance(centers, new_centers) ≤ tolerance then 9: moving_centers ← F alse Compute Fisher_k_values for each group 15: if F isher_k_values ≤ (max(F isher_k) * 0.25) then 16: optimal_F isher_k ← F alse 17: else 18: optimal_F isher_k ← T rue 19: end if 20: end while 21: Output: centers

E. 3D FRACTURE GENERATION AND STOCHASTIC DFN MODELING
With Fisher statistic parameters and spacing for each set, we can generate the stochastic DFN by estimating random planes that tend to follow the same Fisher distribution of the original sets when applying stochastic techniques.
This process starts by establishing the quantity of fracture planes to be generated for each fracture set. Our implementation considers the mean plane sizes to determine the quantity of fracture layers to be generated, as the height (z axis) of the model is divided by the mean fracture size.
In each layer, the quantity (n) of fracture planes to be generated in each set is given by the width (x axis) times the depth (y axis) of the model divided by the set spacing.
After getting the quantity of fractures for each layer (n), we generate two lists of random uniform values R 1 and R 2 of size n. To generate the new fracture data we apply these random values in the Fisher distribution sampling function given by and where k is the Fisher dispersion, and λ = exp(−2k). Which are rotated using the rotation matrix given a initial θ 0 and ϕ 0 . x i = i 17: The rotation matrix used to obtain the final sample values θ ′ and ϕ ′ is given by The random generated plane directions are then converted to their normal vector information to be placed in the stochastic DFN following a uniform random positions for each layer.
To estimate the volumetric fracture intensity P 32 we consider a sample volume with the same size and position to the stochastic DFN model, while this volume is subdivided in cells of equal sizes. In each cell, we compute the fracture plane intersections with the cell faces to obtain the vertices that make the fracture portion polygon inside the cell to compute its area. VOLUME 4, 2022 These intersections are obtained using the Möller-Trumbore ray-triangle intersection algorithm [41] considering the cell faces and fracture planes as triangle pairs and the rays as triangle vertices. Fracture plane vertices inside the cells are identified using the Delaunay triangulation using the cell edges as reference.
The polygon areas using the extracted vertices are computed subdividing the polygon using the Delaunay triangulation and estimating the area of triangles that fit the polygon. All polygon areas inside a sample cell are summed and then divided by the volume to compute P 32 for each cell.
The generated algorithm and scripts presented are packaged in a application named vizfrac3D which will generate the following products as presented in Figure 3. This chart illustrates what a user could expect in terms of input and output to our script in an easier way, complementing the main workflow presented in Figure 1.

A. STUDY AREA AND GEOLOGICAL SETTINGS
For this study we used a ravine within the Lajedo Soledade ( Figure 4B), located in Apodi municipality, in the Rio Grande do Norte state, Brazil. This outcrop presents rocks within the southeast portion of the Potiguar basin [42], [43], precisely within the Jandaíra formation ( Figure  4A), deposited between Turonian and Campanian stages (ca. 93.9-83.6 Ma) [29], [44]. This formation is characterized by the presence of a monotonous and homogeneous succession of carbonate rocks predominantly classified as mudstones, grainstones, and packstones, composed of grains of primary carbonate minerals (calcite) with strong posterior transformation of primary minerals for dolomite [30], [46], [47]. Grainstones are grain-supported rocks and have a relatively small fraction (<1%) of clay-like grain material (<20µm), while packstone is a supported rock by grains, but with a clay fraction ratio >1%, and mudstones are not supported by grains and are predominantly formed by fragments of grain size <20µm [48]. The rocks have a minority fossiliferous content of bivalves, mollusks, and algae remain [46].
The Jandaíra Formation is locally preserved in an expansion that exceeds 600m in thickness with a slight bedding dip (< 5 degrees) to NNE at the regional level [47], [49]. This late Cretaceous formation comprises a fracture network that is predominantly in the N-S; E-W directions, and secondarily NE-SW and NW-SE directions [44], [47], in both cases with sub-vertical dips associated to the tectonic evolution of the Potiguar basin. This fractured system is characterized by vertical or sub-vertical joints and veins. Horizontal fractures and fracture parallel to the Jandaira formation bedding also occur. These structures are possibly related to the lithostatic pressure relief process in the Potiguar Basin region [44], [50].
presence of calcite veins with N-S, NNE-SSW and E-W directions are reported by [50], while stylolites oriented predominantly in E-W and NNW-SSE directions are found by [47]. Additionally, there is formation of circular to subcircular subsidence structures, such as sinkholes and rebate rings, given by the dissolution and patent structuring of the Jandaíra Formation, which sustains high plateaus of regional character with main directions NE-SW [44], [47], [50].
Given the geological characterization and the extensive areas of naturally fractured carbonate outcrops present in the Jandaíra Formation, this region is of great interest for geological studies focused on hydrocarbon reservoirs, like the pre-salt reservoirs, in the Brazilian offshore. The Lajedo Soledade outcrop, in particular, has NE-SW faults and E-W fractures, which control its drainage system [29]. This outcrop presents an important process of karstification resulting from the dissolution of carbonate rocks by meteoric waters, as described in [29], which led to the creation of cavities such as the ravine used for this study. The karst dissolution process in the ravine, as well as the characterization of stylolites and primary discotinuities are best described in [30], based on field and laboratory analyses. However, a more comprehensible geometric analysis of these features is yet to be seen, and this is one key motivation for this work.

B. VOM ACQUISITION
For the UAV image acquisition in the ravine, we used a DJI Mavic 2 Pro embedded with a Hasselblad 20 Megapixels camera. The ravine represents 7,000m 2 of the outcrop, in which 376 photos were taken with a camera tilt of 45 • , in a flight lasting 14 minutes and height of 3m for subsequent photogrammetric reconstruction and generation of the textured 3D model.
The photogrammetric reconstruction follows the flow ( Figure 5) proposed by the software Agisoft Metashape ®1version 1.6.2, which employs algorithms of SfM [51], [52] identifying key points (Scale-invariant Feature Transform -SIFT algorithm) in image pairs, and through bundle adjustment estimate camera location to generate the point cloud, whereafter the Multi-View Stereo algorithm [53], [54] is used to create the dense point cloud. The dense point cloud is triangulated generating a 3D mesh that will be texturized generating the VOM.

C. FRACTURE INTERPRETATION IN IMMERSIVE ENVIRONMENT
We used the software Mosis XP designed for geological interpretation in an immersive virtual reality (iVR) system with a head-mounted display (HMD) to extract the fracture planes. This software integrates the Mosis 2 suite, developed by the Vizlab | X-Reality and GeoInformatics Lab 3 . With the iVR system set, the VOM is loaded on Mosis XP project ready for immersive geological interpretation where in this work we use 3D data from the plane measurement tool exported as xyz data.
The HMD used is an HTC Vive with a pair of controllers (also visible within software) tracked by two infrared towers. The computer specs required for the iVR system is relatively high due to the dual visual output of the HMD (one for each eye) at a high refresh rate (90Hz) to avoid motion sickness. The computer setup is an Intel Core i7-6820HK 2.70GHz 2 mosis.vizlab.cc 3 http://www.vizlab.unisinos.br/ (3.60GHz) CPU, 64GB RAM, an Nvidia GTX 1080 8GB RAM graphic card, and Windows 10, 64 bit operational system.

D. VALIDATION
For validation of the proposed method to generate the 3D stochastic DFN we evaluate the Fisher statistics of both stochastic and deterministic models. Each generated and interpreted set will have a mean strike and dip, that considering the polar coordinates, is possible to determine the angle distance between the stochastic and deterministic fracture sets. This validation is done considering 30 stochastic generation for each fracture set where we compute the mean and standard deviation of the angle distance.
To validate the proposed method to balance the K-means clusters after the elbow method definition, we run the original spherical K-means clustering and the modified proposed method 10 times for each set considering a small subset. The dispersion Fisher k value is used in terms of mean, standard deviation, min and max values to show the impact of the proposed method.

E. EXPERIMENTAL SETTINGS
The fracture plane data from Mosis XP consists of x,y, and z coordinates which are converted to strike, dip and centroid data following the steps presented in the prior section, while the algorithm to obtain the optimal number of clusters or fracture families is presented in Algorithm 1. We use the average of 50 runs as max iterations considering 1 to 10 clusters.
After computing Fisher and spacing statistics the fracture planes for the 3D DFN model are generated. The volumetric VOLUME 4, 2022 7 This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2022.3182332 sampling in this case will give the P 32 areal fracture intensity per volume with mean and standard values. The 3D stochastic model to be generated will have a size of 20m 3 with individual cube samples of 1m 3 , while fracture planes will have 4m 2 .

IV. RESULTS
The UAV-SfM-MVS photogrammetric processing (Figure 6) resulted in a textured 3D model with 899,055 triangles faces and texture resolution of 6 times 8k pixels showed in Figure  7a. The iVR interpretation is illustrated in Figure 7b, where we captured 638 planes using the bedding plane tool ( Figure  7c). Lastly, the measured plane locations are shown in Figure  7d. Applying the Elbow method in the fracture plane data from the iVR interpretation, we identified the optimal number of clusters as shown in Figure 8, highlighted by the inflection point with 7 clusters.
The Table 1 shows the results of the clustering balancing method in comparison to the original spherical clustering, where in 10 runs for each of the selected number of clusters considering a subset of 10% of the data, the proposed method showed a lower standard deviation. Also, the unmodified method achieved Fisher k values lower as 1 indicating groups with no directional trend. Within each of the seven clusters shown in Figure 9 we computed the Fisher statistics as shown in Table 2, while a visual representation of the Fisher dispersion is shown in Figure 10. The mean and standard deviation of the spacing in each set were also calculated as shown in Table 3. In Figure 11 the stochastic model is shown with the fracture generated for each set, alongside the volumetric model with 8,000 sample blocks colored according to the volumetric fracture intensity P 32 , which presented a mean intensity of 22.57m −1 , and standard deviation of 7.92.
The Figure 12 shows the stereonets with fracture planes generated for each clusters, matching the Figure 10, while the number of fracture planes generated are shown in Table  4. For the stochastic process validation considering 30 stochastic runs we achieve low deviation on the distance angle values indicating the statistical stability, while the mean angle distances is minimal indicating that the stochastic generations replicated the same behavior found in the deterministic model as shown in Table 5.

V. DISCUSSION
The first of the proposed improvements, namely the modified Elbow method to identify the optimal number of clusters 8 VOLUME 4, 2022 This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication.   ( Figure 8), proved an important step to the stochastic DFN generation workflow as it reduce the bias when selecting the number of fracture sets from the gathered data. It also contributed to generate fracture data respecting a lower direction dispersion used for the Fisher generation. This is a crucial factor as a reduced number of set can generate sets that not follow a visible trend, and prior works like [21]- [25] have generally ignored these characteristics of groups' dispersion. Additionally, the proposed clustering balancing using the Fisher k values of dispersion guarantee that the selected FIGURE 9: Stereonet with fracture poles clustered using the proposed method considering 7 clusters identified by the Elbow method.
clusters will follow a more defined trend. This is shown in our validation step as without our method, the K-means spherical clustering generated groups with values as low as 1.02 for the Fisher k value of dispersion, while the balanced results presented a minimum values of 30.43 and lower standard deviation between groups (Table 1). Our approach tackle a concern also find by [25], [26], however they only adjust the ideal position of clusters centers and not the final clusters.
With cluster balancing, the data distributed in the sphere VOLUME 4, 2022 9 This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication.  in their respective sets shown in Figure 9 and the results presented in Table 2, prove the validity of the proposed method. Additionally, from the proposed optimizations, our stochastic model is able to reproduce the input data (10 and 11) with average strike/dip differences equal to 0.03°, having a set with a difference of 0.19°and all others with values less 0.006° (Table 5).
Regarding the DFN modeling, an important aspect to be set is the plane form and dimensions that represent the fracture, while the most common forms are ellipsoidal, rectangular, or with the planes extending to infinity. This will impact how the area of the plane will be measured to estimate the fracture intensity in the volume P 32 or how the topology will be constructed as suggested by [55]. However, the squared shape represented was the most appropriate to our fracture model. Besides that, the sample volume size needs to be further explored, as prior works have employed sizes varying from centimeters to tens of meters [21], [56]- [58].
As for the VOM interpretation, the constant evolution and improvement of UAV-SfM-MVS techniques for generating VOMs have resulted in increasingly lighter meshes [59], which can increase the accessibility for such immersive interpretations, as performance capabilities of today's computers, graphic cards, and HMDs enable more people to experience virtual reality.
The studied area presents steep and narrow sinkholes and canyons from karstification and subsequent erosion of these faults [29], [30], which makes access to these places difficult and dangerous. The adequate collection of fracture planes was possible due to the particularity of the immersive virtual environment that allows navigating the outcrop safely and reaching places that would not be possible in a real environment. Thus, the proposed framework showed a viable alternative as a mean to safely provide threedimensional fracture data in adequate quantity and quality for the generation of numerical models from realistic fracture networks, which reinforces previous work on these themes 10 VOLUME 4, 2022 This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2022.3182332  [60], [61].

VI. CONCLUSION
In this work we presented a workflow to generate 3D stochastic DFN models, with algorithms optimization for fracture plane clusterization. To obtain three-dimensional data, we interpreted a virtual outcrop model in an immersive environment. The mains conclusions are listed as follows: • The decision regarding the number of fracture sets often disregarded can be automatized adapting the Elbow method to work with spherical data, reducing the user bias in the process; • Clusterization data can be improved by the analysis of spherical statistics; • The automatic cluster balancing brings more data quality for the stochastic processes generating fractures with similar behavior to the input data.
Additional future works aim to integrate DFN model with multi-scale data from regional observations and porosity data to generate a representative geo-cellular model of the region. Also, further validation of the statistical models is to be carried out incorporating machine learning techniques. Further analysis need to be made concerning the number of interpretations to verify how representative to the DFN the acquired interpretations are.

CODE AVAILABILITY
The code script developed in Python and the related data is available at the Github repository (https://bit.ly/3vtOGRx), under the BSD-3 license.

DECLARATION OF INTERESTS
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.