Estimation of Filtration Properties of Host Rocks in Sandstone-Type Uranium Deposits Using Machine Learning Methods

The fission of uranium is one of the cleanest ways to meet the growing energy demand. The uranium needed for power plants is mainly extracted by two methods in roughly equal amounts: conventional mining (underground and open pit) and in-situ leaching (ISL). The effective use of ISL requires, among other things, the correct determination of the filtration characteristics of the host rocks. In Kazakhstan, this calculation is still based on methods that were developed more than 50 years ago and that, in some cases, give inaccurate results. At the same time, knowledge of filtration characteristics is necessary for the calculation of recoverable reserves, prediction of production dynamics, calculation of the optimum number of wells, etc. This paper describes a method for calculating the filtration coefficient of ore-bearing rocks using machine learning. The proposed method is based on nonlinear regression models. It also allows the estimation of the filtration properties of rocks within the process acidification zone, where the existing method is not applicable. The proposed method applies to approximately half of the uranium mined in the world and makes it possible to significantly (by 22%–70%) increase the accuracy of the filtration coefficient determination and, accordingly, improve the accuracy of recoverable reserves calculation and economic indicators of mining processes.


FIGURE 1. Countries with the largest uranium reserves.
Second, the filtration properties of the rocks must be known to estimate the reserves and extract the maximum amount of uranium.
Inaccuracies in determining the lithological composition and filtration characteristics lead to errors in the technological process of filter installation and errors in determining ore reserves. For example, economic losses from incorrect lithological classification in the deposits of Kazakhstan can be estimated to be approximately 1 to 4 million dollars per year [4].
These inaccuracies are caused by both the technological limitations of logging readings and, to a large extent, by the methods used to determine the lithological composition and filtration properties of rocks. When determining the filtration properties of rocks in the field, the key aspect is to determine the filtration coefficient (K f ) at the stage of exploratory drilling, which is further used to calculate the filtration properties of technological wells. However, the accepted methodology, based on analytical methods, has not changed since the end of the last century [5]. Meanwhile, the correct determination of K f is necessary for the calculation of recoverable reserves, prediction of production dynamics, calculation of the number of wells, and the distance between them (hexagonal cell diameter or distance between well rows), and selection of optimal length and location of filter by depth.
This study considers the application of machine learning methods to estimate the filtration characteristics of ore-bearing rocks. The method is based on nonlinear regression models and has shown results 22%–70% better than calculations using the existing methodology applied in Kazakhstan. The method also allows for the estimation of the filtration properties of rocks within the technological acidification zone. The proposed method concerns approximately half of the mined uranium in the world.
The work consists of the following sections:
- The first section briefly describes the existing techniques for determining the filtration coefficient and its shortcomings.
- In the second section (related works), we provide an overview of the work devoted to the application of machine learning methods to mining problems.
- In the third section, we present the methodological scheme of the study, describe the machine learning models we applied, and the metrics for evaluating the quality of their performance.
- The fourth section presents and discusses the results obtained.
- The conclusion briefly describes the results obtained, the limitations of the method, and formulates the objectives of future research.

II. THE METHODOLOGY USED IN PRACTICE FOR DETERMINING THE FILTRATION PROPERTIES OF HOST ROCKS AND ITS LIMITATIONS
To determine the relationships between the filtration properties of host rocks and the value of apparent resistivity (AR), hydrogeological studies (pumping) were carried out at the stage of exploration, and sampling for the analysis of rock grain size distribution (GS) was carried out.
Analysis of GS samples of rocks of productive horizons in hydrogenous fields shows that the distribution of particle sizes can be well approximated by the log-normal law. To characterize the rocks of productive horizons, it is advisable to distinguish the following lithological types, each of which is characterized by a certain range of particle fractions (in millimeters) (Table 1) [6].
If the mass fraction of clay-silty particles exceeds 50%, the rock is identified as clay. Sand is subdivided into fine, close, medium, and coarse grained depending on which fraction exceeds 50% by mass. If the mass fraction of gravel particles exceeds 50%, the rock is identified as gravel. If none of the fractions exceeds 50% by mass, the rock is identified by the fractions whose summed mass fractions exceed 50%. For example, if the sum of the mass fractions of the fine- and medium-grained fractions exceeds 50%, the rock is identified as fine-to-medium-grained sand. If the sum of the mass fractions of three or more sand fractions exceeds 50%, the rock is identified as multigrained sand.
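The identification rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fraction keys (`clay_silt`, `gravel`, and the four sand names taken from the text) are hypothetical labels, the rule "fractions whose sum exceeds 50%" is interpreted as accumulating the largest fractions first, and the `mixed` label for combinations the text does not cover is invented for the example.

```python
# Sand fractions named in the text, ordered from finest to coarsest (keys are hypothetical).
SAND_ORDER = ["fine", "close", "medium", "coarse"]


def classify_rock(fractions):
    """Classify a sample from a dict {fraction name: mass fraction in percent}."""
    # Rule 1: a single fraction above 50 % names the rock.
    for name, share in fractions.items():
        if share > 50:
            if name == "clay_silt":
                return "clay"
            if name == "gravel":
                return "gravel"
            return f"{name}-grained sand"

    # Rule 2 (assumed interpretation): add up the largest fractions until the sum exceeds 50 %.
    ranked = sorted(fractions.items(), key=lambda item: item[1], reverse=True)
    chosen, total = [], 0.0
    for name, share in ranked:
        chosen.append(name)
        total += share
        if total > 50:
            break

    sand = sorted((n for n in chosen if n in SAND_ORDER), key=SAND_ORDER.index)
    if len(sand) >= 3:
        return "multigrained sand"
    if len(sand) == len(chosen) == 2:
        return f"{sand[0]}-to-{sand[1]}-grained sand"
    # Combinations not covered by the rules in the text get a placeholder label.
    return "mixed: " + ", ".join(chosen)
```

For instance, a sample with 30% fine, 35% medium, and 25% coarse sand plus 10% clay-silt has no dominant fraction, and the two largest fractions together exceed 50%, so it is labeled as fine-to-medium-grained sand.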
As a result of the joint processing of electric logging data, the results of the analysis of particle size distribution, and data on the filtration properties of rocks of productive horizons in fields of infiltration type, the following regularities were established:
- The most stable parameters that characterize individual lithologic rock types are the particle diameters $d_e = d_{0.1}$ (the so-called effective diameter) and $d_{0.6}$, the average particle diameters with relative mass fractions of 0.1 and 0.6, respectively.
- Sands, as a rule, are characterized by a coefficient of heterogeneity $K_H = d_{0.6}/d_{0.1} \le 5$, which allows us to classify them as homogeneous rocks.
- The effective diameter $d_{0.1}$ (or $d_{0.6}$, mainly for sands) carries the main information about which type a rock belongs to.
- The parameters $d_{0.1}$ and $d_{0.6}$ are connected by statistical dependences with the electric parameters $\rho_\kappa$ and spontaneous polarization (SP), where max and min denote the maximum and minimum values of the corresponding parameters (apparent resistivity and potential) within the productive horizon, respectively. As a rule, at small values of the electric parameters their connection with $d_{0.1}$ is more stable, and at large values, with $d_{0.6}$.
- The filtration coefficient $K_f$ and the parameters $d_{0.1}$ ($d_{0.6}$) are statistically connected by a dependence of the form $K_f = A d_{0.1}^2$ or $K_f = A_1 d_{0.6}^2$, where $A$ and $A_1$ are constant multipliers for a given productive horizon, which are determined from the results of pilot pumping in hydrogeological wells with known values of $d_{0.1}$ and $d_{0.6}$ of the rocks comprising the productive horizon.
Appendix B provides a detailed explanation of the physical aspects of logging data acquisition (https://www.dropbox.com/s/8d26z0umi8s5gow/Appendix_B.pdf?dl=0).
The statistical relationship between the parameters $d_{0.1}$ and/or $d_{0.6}$ and the electrical parameters $\rho_\kappa$, $\alpha_{\rho\kappa}$, or $\alpha_{sp}$ makes it possible to restore the values of $d_{0.1}$ and/or $d_{0.6}$ and, therefore, to identify the rock type and evaluate the filtration coefficient. The obtained dependencies are given in the standards and are used to calculate $K_f$ based on the average apparent resistivity within the allocated lithological interval.
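As an illustration of the quadratic dependence between the filtration coefficient and the effective diameter, the sketch below calibrates the multiplier A from pumping data and then applies it. The least-squares fit through the origin is an assumption made for the example (the text only states that A is determined from pilot pumping results), and the function names are hypothetical.

```python
def fit_multiplier(diameters, kf_pumped):
    """Least-squares estimate of A in K_f = A * d**2 (fit through the origin, assumed)."""
    squares = [d * d for d in diameters]
    numerator = sum(s * k for s, k in zip(squares, kf_pumped))
    denominator = sum(s * s for s in squares)
    return numerator / denominator


def estimate_kf(multiplier, diameter):
    """Filtration coefficient predicted from the effective diameter."""
    return multiplier * diameter * diameter
```

Because A is constant only for a given productive horizon, a separate fit would be needed for each horizon.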
To obtain data on the lithological structure of a sandstone-type uranium deposit, the following electrical logging methods were used: induction log (IL), apparent resistivity logging (AR), and spontaneous polarization potential (SP). During the logging process, a probe is lowered into the drilled borehole and, when lifted, provides measurement data in 10-cm increments. Appendix C explains the process of assessment of filtration properties at the exploration stage (https://www.dropbox.com/s/wfwjnk8ufv675oo/Appendix_C.pdf?dl=0).
When interpreting the logging data, the expert uses characteristic points of the AR curve to identify the boundaries of lithological intervals, within which the average value of the apparent resistivity is determined. The filtration coefficient is then calculated from this average using the established dependencies [7]. Appendix D (https://www.dropbox.com/s/vnlg2lfiw40bxb6/Appendix_D.pdf?dl=0) provides an example.
By obtaining information about the rock distribution, filtration properties, and depth of the ore body occurrence, we can proceed to determine the optimal location of filters for acid injection and pumping of the productive solution (the filter position depends on whether the well is injection or pumping) [8].
A significant drawback of the K f determination method used in practice is that it uses data from only one logging method. Consequently, it becomes inapplicable when the record is distorted. This occurs most often in acidified blocks, that is, where rocks have been exposed to acid and have changed their physical properties. To control the acidification process, IL logging is carried out in the wells of the geological site at intervals of 1-2 years; the conductivity of the rocks is measured, and the degree of acidification is determined by the increment of conductivity, that is, by the drop in the resistance of the rocks. An example of such a sequential conductivity measurement is presented in Fig. 2.
Induction log curves were recorded for 2018, 2020, and 2021. We can see their consistent increment at the acidification interval (205-235m), highlighted in yellow. The increment is maximum (up to five times) for the reservoirs with the highest filtration coefficients. Fig. 3 shows a passport of the well with highlighted acidification intervals, for which the AR curve values were underestimated, IL values were overestimated, and SP curve values were not distorted.
Because the conductivity increment indicates a resistance drop, it is practically impossible to apply the existing method to acidification intervals. Therefore, for these intervals, the filtration coefficient values of the corresponding tracks were deleted. At the same time, such intervals are present in 30%-40% of wells and, as a rule, in the ore-bearing horizon, which is of the greatest interest for the interpreter. It is difficult to use the IL curve to determine filtration properties because such measurements are not taken at the exploration stage; hence, they cannot be compared with the pumping results. In addition, in Kazakhstan, IL is not part of the standard set of measurements, that is, it is not performed in all wells. Thus, the standard method for determining the filtration coefficient has clear drawbacks that prevent accurate determination of reserves and planning of production processes.

III. RELATED WORKS. MACHINE LEARNING METHODS IN MINING TASKS
Machine learning (ML) is a subset of artificial intelligence techniques that allows computer systems to learn from previous experiences (i.e., from data observations) and improve their behavior to perform a particular task [9]. ML solves the problems of regression, classification, clustering, and data dimensionality reduction. ML models are divided into five classes [10], [11]: unsupervised learning (UL) or cluster analysis [12], supervised learning (SL) [13], semi-supervised learning (including self-learning) (SSL), reinforcement learning (RL), and deep learning (DL). UL models solve the problems of clustering and data dimensionality reduction when a set of unlabeled objects is partitioned into groups by an automatic procedure based on the properties of these objects [14], [15]. SL models solve classification or regression problems. A classification problem arises when finite groups of objects in a potentially infinite set of objects are distinguished by labeling [16]. Labeling is often performed by experts. The classification algorithm, using this initial classification as a pattern, must assign the unlabeled objects to this or that group based on the properties of these objects. Regression is the task of predicting a continuous quantity.
The SSL, RL, and DL models are often used for classification and regression tasks. A distinctive feature of DL is the possibility of end-to-end learning, which, in turn, requires large volumes of labeled data. ML models can be roughly divided into ''classic'' and ''modern'' (Table 2).
ML methods are used in geological mapping to search for ore deposits, risk assessment, and hydrological and environmental modeling [46]. One of the most popular tasks is lithological classification. For example, the work in [47] deals with the lithological mapping of Hutti province in India using AVIRIS-NG multispectral data. As a result of the comparison, the Support Vector Classifier (SVC) algorithm was chosen to solve this problem. In [48], it was shown that SVC and Ensemble Methods (EM) classified rock types as well as or better than standard classification methods in 2D or 3D modeling of geological objects. That article emphasizes the high dependence of the results on the quality of expert labeling. The lithological mapping task using SVC and remote sensing data for the southern provinces of Morocco was also considered in [49]. The lithological classification of crystalline rocks based on logging data was considered in [50]. Again, SVC was chosen as the best algorithm.
Another popular direction is the analysis of remote sensing data of the earth's surface and the evaluation of mining prospects.
For example, [51] discussed the applications of machine learning to analyze remotely sensed data in mineral prospecting. Classification of the covering surface near uranium ore processing zones for nuclear nonproliferation treaty compliance assessment using machine learning techniques and remotely sensed earth surface data was considered in [52]. In [53], a method for evaluating the prospectivity of tungsten deposits using machine learning and deep learning techniques was proposed. RF, SVC, ANN, and CNN were used to solve the classification problem in this study. Based on the results of the quality assessment of the methods, RF was chosen as the main prediction method.
VOLUME 10, 2022
The application of machine learning to solve the problems of stratigraphy at uranium deposits in Kazakhstan was considered in [54]. Some papers by the authors of this article are devoted to the problems of classifying lithological types in uranium deposits using ''classical'' algorithms [55], combining the results of several classification models [56], comparative analysis of ''classical'' models and some deep learning methods [8], [57], evaluation of the quality of expert labelling of logging data in solving the problem of lithological classification [58], [59]. In the papers mentioned above, a quantitative assessment of the influence of experts on the solution of the lithological classification problem was carried out. Regression methods were used to estimate the amount of silica and iron in the ore [60]. The authors have shown that boosting methods and EM demonstrate good prediction result (96-98%).
The analysis of these works shows a great interest of researchers in solving classification and regression problems in the mining industry using ''classical'' machine learning algorithms. However, we have not been able to identify works devoted to the calculation of the filtration properties of rocks in uranium deposits of the formation-infiltration type using machine learning methods, despite the fact that the analytical method used in practice yields inaccurate results. At the same time, the correct estimation of K f is critical for calculating the amount of recoverable reserves, predicting the dynamics of production, determining the number of necessary wells, and choosing the parameters and location of the filter to be installed within the productive horizon. Therefore, improving the accuracy of K f estimation by applying machine learning methods is the goal of this research. This task can be solved using regression models because it is a problem of predicting continuous values.

IV. METHOD
Given the shortcomings of the existing K f estimation methodology, we propose a machine learning model that receives basic logging data as input and generates filtration coefficients as output. Such a model can be trained on data from exploratory wells that have actual (pumped-out) K fpo values. The trained model can then be used to calculate K f of the production wells.
The problem of estimating filtration properties from logging data belongs to the class of example-based, or supervised, learning problems. Mathematically, it is reasonable to consider learning by examples as an optimization problem, which can be solved by searching for the minimum of the cost function $J(\theta)$ over all available examples, defined as the sum of squares of the differences between the predicted value and the real value on the set of $m$ examples. In this case, a hypothesis $h_\theta(x)$ is selected that provides the minimum value of $J(\theta)$ over the parameters $\theta_i$:

$$J(\theta) = \sum_{i=1}^{m} \left( h_\theta\left(x^{(i)}\right) - y^{(i)} \right)^2 \rightarrow \min_{\theta},$$

where $m$ is the number of training examples and $h_\theta$ is the hypothesis function, which can be linear or nonlinear. To find the optimal function $h_\theta(x)$, the gradient descent algorithm is used, the essence of which is to change the parameters $\theta_0, \theta_1$ sequentially using the expression

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1),$$

where $\alpha$ is the learning rate and $\frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$ is the partial derivative of the cost function with respect to $\theta_j$. The sign := denotes assignment, as opposed to the equality sign (=) in algebraic expressions.
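The update rule can be illustrated with a minimal sketch for a linear hypothesis h(x) = θ0 + θ1·x and a sum-of-squares cost. The function name, the default learning rate, and the iteration count are assumptions chosen for the example and are not taken from the study.

```python
def gradient_descent(xs, ys, alpha=0.01, n_iter=5000):
    """Fit h(x) = theta0 + theta1 * x by minimizing the sum of squared errors."""
    theta0, theta1 = 0.0, 0.0
    for _ in range(n_iter):
        # Prediction errors h(x_i) - y_i for the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of J = sum(errors**2) with respect to theta0 and theta1.
        grad0 = 2.0 * sum(errors)
        grad1 = 2.0 * sum(e * x for e, x in zip(errors, xs))
        # Simultaneous update: theta_j := theta_j - alpha * dJ/dtheta_j.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1
```

Both parameters are updated from gradients computed at the same point, which is why the assignment is written as a single tuple update. If the learning rate is too large for the data, the iteration diverges instead of converging.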
Two algorithms were chosen for the experiments: gradient boosting (an ensemble of weak decision trees aggregated into a meta-model by the boosting method) and a neural network with a hidden layer.
The essence of gradient boosting [26] is that after the optimal values of the regression coefficients are calculated and the hypothesis function $h_\theta(x)$ is obtained using algorithm (a), the error is calculated and a new function $h_{b\theta}(x)$ is selected, possibly using another algorithm (b), so as to minimize the error of the previous one.
In other words, we are talking about minimizing the function

$$J_b(\theta) = \sum_{i=1}^{m} L\left(y^{(i)},\; h_\theta\left(x^{(i)}\right) + h_{b\theta}\left(x^{(i)}\right)\right),$$

where $L$ is an error function that considers the results of algorithms (a) and (b). If $J_b(\theta)$ is still large, a third algorithm (c) is chosen. Often, decision trees of relatively small depth are used as the algorithms, and each new algorithm is fitted to the negative gradient of the error function; that is, when training algorithm (b), the pairs $\left(x^{(i)}, -L'\left(y^{(i)}, h_\theta(x^{(i)})\right)\right)$ are used instead of $\left(x^{(i)}, y^{(i)}\right)$.
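The idea of fitting each new weak learner to the error of the previous ones can be sketched as follows. This is a simplified illustration, not the XGBoost implementation used in the study: it assumes a single input feature, depth-1 regression trees (stumps), and a squared-error loss, for which the negative gradient is simply the residual y − h(x). All names and default values are hypothetical.

```python
def fit_stump(xs, targets):
    """Depth-1 regression tree on one feature: pick the threshold with the lowest squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        error = (sum((t - left_mean) ** 2 for t in left)
                 + sum((t - right_mean) ** 2 for t in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: fall back to a constant prediction
        mean = sum(targets) / len(targets)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds=50, learning_rate=0.5):
    """Gradient boosting for squared error: each stump is fitted to the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_rounds):
        # For squared error, the negative gradient equals the residual y - h(x).
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x) for p, x in zip(predictions, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)
```

The learning rate shrinks the contribution of each stump, so the ensemble approaches the target gradually over the boosting rounds.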
Multilayer artificial neural networks (multilayer perceptrons) are one of the most popular methods of supervised learning, especially in the case of multiple classes. To adjust the weights $\theta$ of the neural network (network training), a cost function is used in which $L$ is the number of layers of the neural network, $s_l$ is the number of neurons in layer $l$, $K$ is the number of classes (equal to the number of neurons in the output layer), and $\theta$ is the weight matrix; the hypothesis function is often a sigmoid (logistic) function.
To minimize the cost function of a multilayer ANN (that is, to train it), the error backpropagation (BPE) algorithm [61] is used, together with modifications that speed up the learning process.
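The quality of the constructed regression dependence was assessed using the following indices.
Coefficient of determination:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(y^{(i)} - h^{(i)}\right)^2}{\sum_{i=1}^{n} \left(y^{(i)} - \bar{y}\right)^2},$$

where $y^{(i)}$ is the actual value for the i-th sample, $h^{(i)}$ is the calculated (predicted) value (hypothesis function value) for the i-th sample of $n$ total samples, and $\bar{y}$ is the mean of the actual values.
In existing libraries, $R^2$ is denoted by r2_score. The best value is r2_score = 1.
Root mean square error:

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y^{(i)} - h^{(i)}\right)^2}.$$

Linear correlation coefficient (or Pearson correlation coefficient):

$$R = \frac{\sum_{i=1}^{n} \left(y^{(i)} - \bar{y}\right)\left(h^{(i)} - \bar{h}\right)}{\sqrt{\sum_{i=1}^{n} \left(y^{(i)} - \bar{y}\right)^2 \sum_{i=1}^{n} \left(h^{(i)} - \bar{h}\right)^2}},$$

where $\bar{h}$ is the mean of the calculated values.
The methodological scheme of the study consists of the following steps:
- Data collection and preprocessing. This step is necessary to form a set of input variables and to select the target variable.
- Application of machine learning methods in two experiments:
a) Experiment 1: an ANN-based regression model using data from exploratory wells of the Budennovskoye field;
b) Experiment 2: regression models based on ANN and Extreme Gradient Boosting (XGBoost) using data from the Inkai field.
- Verification of results using RMSE, R 2 , and R.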
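The three verification indices can be computed with a few lines of standard-library Python. The sketch below uses the usual textbook definitions of the coefficient of determination, the root mean square error, and the Pearson correlation coefficient; the function names are hypothetical and are not the library calls used in the study.

```python
import math


def coefficient_of_determination(actual, predicted):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((y - h) ** 2 for y, h in zip(actual, predicted))
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    return math.sqrt(sum((y - h) ** 2 for y, h in zip(actual, predicted)) / len(actual))


def pearson_r(actual, predicted):
    """Linear (Pearson) correlation coefficient."""
    mean_y = sum(actual) / len(actual)
    mean_h = sum(predicted) / len(predicted)
    covariance = sum((y - mean_y) * (h - mean_h) for y, h in zip(actual, predicted))
    spread_y = sum((y - mean_y) ** 2 for y in actual)
    spread_h = sum((h - mean_h) ** 2 for h in predicted)
    return covariance / math.sqrt(spread_y * spread_h)
```

Note that a model can have a high correlation R and still a large RMSE, for example when its predictions are systematically scaled, which is why the indices are reported together.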

V. DATA AND RESULTS
To increase the reliability of K f estimation, it is desirable to consider as much data as possible. However, during the exploration phase, only a limited set of geophysical surveys is usually performed, and it does not include IL or neutron methods. Therefore, the models had to be limited to the rock code and the AR and SP log data (as a set of values in 0.1 m increments).

Experiment 1 (Calculation of Filtration Coefficients of Budennovskoye Field):
To train the model, a dataset was generated containing data for the Budennovskoye field, part of which is shown in Fig. 4. AR and SP are given for 90-centimeter intervals, for which, in turn, the actual values K fpo obtained by pumping were determined. As a result, the input variable set consisted of 19 values: the rock code and the AR and SP readings. The target column is K fpo. The full dataset with the obtained calculation results is available at the following link [62].
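One way the 19-value input vector could be assembled is sketched below. The split into nine AR readings, nine SP readings, and one rock code is an assumption inferred from the 90-cm interval and the 10-cm logging step; the function name and the ordering of the values are hypothetical.

```python
READINGS_PER_INTERVAL = 9  # assumed: 90-cm interval sampled every 10 cm


def build_features(rock_code, ar_readings, sp_readings):
    """Concatenate the rock code with the AR and SP readings of one interval."""
    if len(ar_readings) != READINGS_PER_INTERVAL or len(sp_readings) != READINGS_PER_INTERVAL:
        raise ValueError("expected nine AR and nine SP readings per interval")
    return [float(rock_code)] + [float(v) for v in ar_readings] + [float(v) for v in sp_readings]
```

Each such vector would form one row of the training set, paired with the pumped-out filtration coefficient of the same interval as the target.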
The regression model was based on an ANN with one hidden layer consisting of 31 neurons. K f values were also calculated for all intervals of the specified dataset using the currently used procedure (K fc).
The results showed that neither method fully agrees with the actual data (K fpo). However, the correlation of the regression model results (K fr) with the actual data (K fpo) is significantly higher than the correlation between K fpo and K fc. Accordingly, the RMSE value of the regression model was lower (Table 3).

Experiment 2 (Calculation of Filtration Coefficients of Inkai Field):
The experiment for the Inkai field was designed so that the models were trained and verified on the data of exploration wells, as in Experiment 1. Then, the best model according to the R 2 and RMSE estimates was used to calculate K f of technological wells. Because there were no actual data (K fpo) for the technological wells, the calculation results for them were compared with the well flow rates (debits).
A much larger dataset was used for training, containing approximately 600 intervals from more than 30 exploration wells. Table 4 shows the results of XGBoost, ANN (hidden_layer_sizes = 91), Support Vector Regressor (SVR), and Random Forest Regressor (RFR) for different input datasets, which differed in their input parameters. In Table 4 and Fig. 5, Input is the set of input variables, AR is the set of input variables consisting of AR values for the interval under consideration, SP is the set of input variables consisting of SP values, and LC is the lithological code set by the expert. The best results were obtained using AR and LC as the input parameters. The worst RMSE results were obtained using the calculations by the existing methodology. The model based on SP showed a weak correlation. It can be assumed that this is due to the poor quality of SP curve recording, as high-quality recording imposes strict requirements on drilling fluid preparation, which are often neglected in practice. Because the dataset was small, the training time for all algorithms was less than 1 s; methods based on decision trees learned faster.
Because the XGBoost regressor showed the best results when using (AR, LC) as input parameters (Fig. 5), but the values of both AR and LC are incorrect for acidified intervals, we propose a hybrid model. In this model, XGBoost (AR, LC) is used for all non-acidified intervals and XGBoost (SP) for acidified intervals, as only SP preserves true values on acidified intervals. Nevertheless, because of the poor quality of the SP record, the error in estimating K f of acidified intervals is expected to be quite high.
The XGBoost-based models were tested on eight technological wells that contained technological acidification intervals. The calculation results for one of the wells are presented in Table 5. The last column shows the results of the hybrid model, in which the input variables (AR, LC) were used at all intervals except the acidified ones, where the SP-based model was used because the AR and rock code on the acidified intervals are distorted.
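The dispatch logic of such a hybrid model can be sketched as follows. The two trained regressors are represented by plain callables, the interval layout (a dict with `rock_code`, `ar`, and `sp` keys) is hypothetical, and rock code 28 is assumed to mark acidified intervals as it does in the example well; other wells or fields may use a different marker.

```python
ACIDIFIED_ROCK_CODE = 28  # assumed marker of technological acidification intervals


def hybrid_kf(intervals, model_ar_lc, model_sp):
    """Estimate K_f per interval, switching to the SP-based model on acidified intervals."""
    estimates = []
    for interval in intervals:
        if interval["rock_code"] == ACIDIFIED_ROCK_CODE:
            # AR and the rock code are distorted here, so only SP is used.
            estimates.append(model_sp(interval["sp"]))
        else:
            estimates.append(model_ar_lc(interval["ar"], interval["rock_code"]))
    return estimates
```

In practice, `model_ar_lc` and `model_sp` would wrap the prediction calls of the two trained regression models; they are kept abstract here so that the sketch does not depend on a particular library.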
The technological acidification interval is highlighted in yellow (rock code 28). It can be seen that for this interval the K f values calculated by the existing methodology are significantly underestimated (1.1 m/day).
It is worth noting that this value (1.1 m/day) is actually not the result of calculation, but was forcibly set for these intervals as the minimum possible value for permeable intervals.
Because it is difficult to obtain actual values of filtration coefficients at technological wells, the correctness of the calculated K f can be estimated only indirectly, by comparing it with the well flow rates (maximum volume of injected or pumped fluid). It is logical to assume that the well flow rate is higher when the filtration properties of the rocks in the near-filter zone are better and the filter is longer. However, in Kazakhstan the zone of solution movement used in calculations is not the filter itself but the so-called zone of active movement of solutions (ZAMS), which extends 2 m upward and 6 m downward relative to the actual filter location. Therefore, the product of the average K f value within the filter by the filter length (LF) and the product of the average K f value within ZAMS by the length of ZAMS (LZAMS) were also used for comparison with well flow rates. Part of the calculation results for the process wells is listed in Table 6. The calculation results for all 46 wells are given in Appendix E (https://www.dropbox.com/s/a8rtrgzohykcrob/Appendix_E.pdf?dl=0).
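The two products compared with the well flow rates can be computed as in the sketch below. It assumes that the K f profile is given as a list of (top depth, bottom depth, K f) tuples with depth increasing downward and that averages are thickness-weighted; both the data layout and the function names are hypothetical.

```python
def mean_kf(intervals, top, bottom):
    """Thickness-weighted mean K_f over the depth range [top, bottom]."""
    weighted_sum, covered = 0.0, 0.0
    for interval_top, interval_bottom, kf in intervals:
        overlap = min(bottom, interval_bottom) - max(top, interval_top)
        if overlap > 0:
            weighted_sum += kf * overlap
            covered += overlap
    return weighted_sum / covered if covered else 0.0


def kf_length_products(intervals, filter_top, filter_bottom, up=2.0, down=6.0):
    """Return (mean K_f * LF, mean K_f * LZAMS) for one well."""
    filter_length = filter_bottom - filter_top
    # ZAMS extends 2 m above and 6 m below the filter.
    zams_top, zams_bottom = filter_top - up, filter_bottom + down
    zams_length = zams_bottom - zams_top
    return (mean_kf(intervals, filter_top, filter_bottom) * filter_length,
            mean_kf(intervals, zams_top, zams_bottom) * zams_length)
```

For a filter placed at 212–218 m in a profile with K f of 8 m/day between 210 and 220 m and 4 m/day between 220 and 230 m, the filter product is 8 × 6 = 48, while the ZAMS (210–224 m) product is 8 × 10 + 4 × 4 = 96.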
Column K fr shows the results of the XGBoost-based regression model using the set (AR, LC) as input parameters. Column K frh shows the results of the hybrid model, which also uses an XGBoost-based regression model but relies only on SP data for acidified intervals. Wells 42, 43, 45, and 46 contain zones of technological acidification. It can be seen that the filtration properties calculated for them according to the current instructions are significantly underestimated, which is clearly visible in the example of well 45. Overall, the filtration properties estimated by the hybrid model correlate significantly better with the actual well flow rates after development (R = 0.550) than the calculation based on the existing method (R = 0.164), even when the acidified wells are considered (the last row of the table).

VI. DISCUSSION
According to the current methodology, the parameters of the rock filtration properties and the actual value of the filtration coefficient (K f ) were identified at the exploration drilling stage. Subsequently, the obtained parameters were used to calculate the filtration properties of the technological wells. Correct calculation of K f affects the estimation of recoverable reserves and parameters of the production process. However, the K f estimation is inaccurate. A comparison of the calculated data with actual data shows that the RMSE is 13.89 and the linear correlation value is 0.584 (see Table 7). In addition, the calculated values correlated poorly with the flow rates (debits) of the wells (R = 0.164).
TABLE 5. Results of K f calculation using the existing method and using regression models.
Based on the results of the experiments, a two-stage scheme for determining filtration coefficients in the fields of Kazakhstan using machine-learning models was proposed (Fig. 6).
In the first stage, machine learning models are tuned using data from exploratory wells. A hybrid model is then formed from the tuned models: for acidified well sections it uses the model whose input data are SP values (XGBoost(SP)), and for non-acidified sections it uses the model based on the AR and LC data (XGBoost(AR,LC)).
Because technogenic acidification intervals are found only in production wells, they are not present in the dataset generated from exploration wells. Therefore, it is impossible to directly train a model to predict K f correctly for acidified intervals. Because of the poor quality of SP curve recording, adding it as an input regression parameter usually slightly worsens the accuracy.
However, in acidified well sections, the AR curve is too distorted, and the lithological code only indicates the acidification interval (not the actual rock type). Moreover, this distortion is dependent on the lithological composition of the rocks and the amount of acid. Therefore, the only option in this case is to use the SP, even though it has low accuracy.
The proposed model for determining the filtration coefficient is not only much more accurate (RMSE = 4.89, R 2 = 0.59), but also correlates much better with the actual well flow rates (R = 0.550) ( Table 7).
It can be noted that applying the approved methodology to data from exploration wells (Table 4) gave very poor results (RMSE = 13.89), significantly inferior to those of the regression models. At the same time, when it is applied to technological wells (Table 6), its correlation with well flow rates, if acidified wells are not considered, is not much inferior to that of the regression models. This is because the existing methodology is designed to use the average value of resistivity within the allocated lithological interval, which can be determined correctly only for intervals of at least 1.5-2 m, as the downhole device used to record AR in the fields of Kazakhstan has a distance between the electrodes of 1-1.1 m.
During the training on data from exploration wells, only intervals with a thickness of 0.5 m were used for comparison with data from hydrogeological studies, which led to a high RMSE value. In the technological well data, lithologic intervals were mainly more than 2 m thick, so the result of the approved methodology was relatively good (in wells without acidified intervals, almost comparable with regression models). Based on this, we can draw the following conclusions.
1. Regression models work well for all intervals, while the current methodology is only suitable for intervals greater than 1.5-2m.
2. The current methodology is not applicable to wells containing acidified intervals.
3. The hybrid model can be applied to wells containing acidified intervals.

VII. CONCLUSION
Uranium mining by in situ leaching requires a fairly accurate assessment of the lithological composition and filtration properties of ore-bearing rocks. The methodology used in Kazakhstan to estimate filtration properties is based on the fact that at the stage of exploratory drilling, the parameters are determined, which are then used to calculate the filtration properties of technological wells. However, the existing methodology yields inaccurate results and cannot be used in technological acidification zones, which account for up to 40% of all considered data. Inaccuracies in determining the filtration coefficient lead to errors in the technological production process and inaccurate calculation of recoverable reserves.
To overcome the shortcomings of the existing approach, we propose a method for calculating the filtration coefficient based on the use of regression models. The proposed model receives electric logging data as input and produces the calculated filtration coefficient as output. To improve its quality, the model is made hybrid; that is, it is formed from two models. For non-acidified intervals, a model with AR and LC as the input variables is used. For acidified intervals, a model with input variables consisting of SP data is used.
The analysis shows that the proposed method yields a much smaller mean square error in determining the filtration coefficient, correlates better with well flow rates (by 70%) and with actual filtration coefficient values (by 27%), is applicable to thin intervals, and can also be used to calculate the filtration coefficient in acidified zones.
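The two kinds of quality measure referred to here, the root mean square error and the correlation with reference values, can be computed as in the following sketch. The function names are our own, and the Pearson coefficient is used as one common choice of correlation measure; this is an assumption, since the specific coefficient is not restated in this section.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean square error between predicted and reference values."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

A typical use would be to call `rmse` on predicted and hydrogeologically measured filtration coefficients, and `pearson` on per-well predicted values and measured well flow rates.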

A. LIMITATION OF THE METHOD
Application of the method may be limited when analyzing data from fields explored 20-30 years ago. In such cases, it is not possible to obtain hydrogeological data, and the logging data are often fragmentary and unsuitable for the formation of a training dataset.

B. FUTURE RESEARCH
In future studies, we plan to analyze the possibility of applying pre-trained models for such cases. In other words, it is planned to investigate the possibility of using transfer learning methods to calculate the filtration coefficient of technological wells in fields where data from exploration wells are incomplete.
The second direction of research is to develop methods for interpolation of K_f in the interwell space, which can improve the accuracy of estimating K_f using data from nearby wells.

APPENDIX A. A BRIEF OVERVIEW OF THE RESERVES AND PRODUCTION OF NATURAL URANIUM IN THE WORLD
Uranium is the main resource for the operation of nuclear power plants all over the world. Deposits of uranium ore are not evenly distributed around the globe. Today, only 28 countries extract this raw material from their subsurface, and the main world reserves of uranium are located in 10 countries. Table 8 summarizes the uranium reserves and the number and names of deposits in the leading countries in terms of uranium reserves according to [1], [2].
Leadership in uranium reserves does not by itself imply leadership in uranium mining.
In 2018, the world's largest uranium mining companies accounted for 86% of world uranium production, according to the World Nuclear Association. The leading producers are mining corporations from Kazakhstan, Canada and Australia, which together account for two-thirds of world production [2] (Fig. 7). Table 9 shows the top ten uranium mines based on 2018 production results. At least three of these top ten mines (Rössing, Arlit (SOMAÏR) and Ranger), representing 10% of 2018 production, are scheduled or expected to close before the end of the 2020s and will need to be replaced by new mine capacity by then in order to avoid a further reduction of primary uranium production [3].
During the development of the uranium market, mining and processing technology has changed more than once. Uranium ore is conventionally mined in two ways, by underground mine or by open pit, depending on the depth of the ore-bearing layers. Open-pit mining involves less radiation exposure and higher safety. Underground mining allows higher quality uranium ore to be extracted, but it is also more dangerous because of radon, a radioactive gas that accumulates in mines.
Underground (in-situ) leaching of uranium ores is the most advanced uranium mining technology and was first used in 1957. In this method, a special chemical solvent is injected underground into the uranium ore layer, where it reacts with uranium compounds. The resulting solution is then brought to the surface and processed. The disadvantage of this method is that it can be used only in sandstone and below the groundwater level.
The method has gained particular popularity in Kazakhstan, Uzbekistan and the United States, and it is also used in Canada, Australia and China. In-situ leaching is steadily increasing its share of total production, mainly due to Kazakhstan (this method covers more than half of the production). Since 2015, global production has proceeded as follows: conventional mines have a mill where the ore is crushed, ground, and then leached with sulfuric acid to dissolve the uranium oxides. At the mill of a conventional mine or the treatment plant of an ISL operation, the uranium is then separated by ion exchange before drying and packaging, usually as uranium oxide (U3O8). Some mills and ISL operations (especially in the US) use carbonate leaching instead of sulfuric acid, depending on the orebody.
Today, the approximate distribution of uranium mining methods is as follows [3] (Table 10). According to the table, in-situ leaching accounts for approximately the same share of production as the two conventional methods, underground mine and open pit, combined. Uranium extraction by in-situ leaching causes significantly less damage to the environment than the conventional methods described above.
Over time, the developed site undergoes natural reclamation. The use of this method can also reduce economic costs. However, it has its limitations: it can be used only in sandstone and below the water table.
Almost all the uranium mining companies listed above use all of the existing mining technologies, depending on the characteristics of specific deposits and mines.

APPENDIX B. PHYSICAL ASPECTS OF LOGGING DATA ACQUISITION
Apparent resistivity (AR) measurements are performed using a four-electrode AMNB array. Two electrodes, A and B (supply electrodes), are connected to the current source, and two electrodes, M and N (measuring electrodes), are connected to the meter. The downhole probe consists of three electrodes set at strictly defined distances from each other. The fourth electrode is placed on the surface and is called a "fish". Electrodes A and B, and likewise M and N, are called paired electrodes, whereas AM, AN, BM and BN are unpaired.
Gradient probes and potential probes are mainly used to measure ρ_k of rocks in resistivity logging of sedimentary sections. Gradient probes are probes in which the distance between the paired electrodes M and N (or A and B) is at least 7 times smaller than the distance between unpaired ones. The midpoint between the closely spaced paired electrodes is called the recording point, and the distance from it to the unpaired electrode is called the length of the gradient probe. Potential probes are probes in which the distance between unpaired electrodes is small compared to the distance between paired electrodes. The length of a potential probe is the distance between the unpaired electrodes. In brief, a current I is passed through the supply electrodes of the probe located in the borehole, which creates an electric field in the medium under study. Using the measuring electrodes M and N, the potential difference U between two points of this electric field is measured.
Consequently, the resistivity, which is called apparent resistivity, is equal to:

$$\rho_k = K \frac{U}{I}$$

where U is the potential difference between the measuring electrodes, I is the current supplying the probe, and K is the probe coefficient (a constant value depending on the distance between the electrodes). The probe coefficient K is calculated by the formula:

$$K = 4\pi \frac{MA \cdot MB}{AB}$$

where MA, MB and AB are distances between electrodes.
The spontaneous polarization potential (SP) measurement reduces to measuring the natural potential difference between the M electrode moving along the borehole and the N electrode located on the surface near the borehole head.
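As an illustration of the apparent resistivity calculation described above, the following sketch evaluates the probe coefficient and ρ_k for a single reading. It assumes the standard three-electrode relation K = 4π·MA·MB/AB with distances in metres, U in volts and I in amperes. The electrode spacings and the reading in the example are hypothetical and are not taken from the tools used in this study.

```python
import math


def probe_coefficient(ma: float, mb: float, ab: float) -> float:
    """Probe coefficient K in metres (assumed form: K = 4*pi*MA*MB/AB)."""
    if ma <= 0 or mb <= 0 or ab <= 0:
        raise ValueError("electrode distances must be positive")
    return 4.0 * math.pi * ma * mb / ab


def apparent_resistivity(u: float, i: float, k: float) -> float:
    """Apparent resistivity rho_k = K * U / I, in ohm-metres."""
    if i == 0:
        raise ValueError("supply current must be non-zero")
    return k * u / i


# Hypothetical spacings for a gradient probe about 1 m long (illustration only)
k_example = probe_coefficient(ma=0.95, mb=1.05, ab=0.10)
# Hypothetical reading: U = 0.02 V at I = 0.5 A
rho_example = apparent_resistivity(u=0.02, i=0.5, k=k_example)
```

Because K depends only on the probe geometry, it can be computed once per tool and reused for every reading along the borehole.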
We used a model of a bottom gradient probe with a probe length of 1 m, which is mainly used for logging in the uranium deposits of Kazakhstan. Its scheme is shown in Fig. 8.
In the scheme, A, B and M are the downhole electrodes, electrode N is on the surface, and L is the length of the probe (1 m in our case). The recording point of AR is between electrodes A and B, and the recording point of SP is at electrode M.
The distance between the AR and SP recording points is 1 m (ten 10-cm interlayers). The registered value of SP depends only on the value at point M, whereas the registered value of AR depends on the values of all the interlayers located between electrode M and the midpoint between electrodes A and B.

APPENDIX C. ASSESSMENT OF FILTRATION PROPERTIES OF ROCKS AT THE EXPLORATION STAGE
At the exploration stage in the uranium deposits of Kazakhstan, a set of hydrogeological studies is carried out to assess the filtration properties of the host rocks. The average filtration coefficient of the water-bearing horizon is determined from the rate of water level recovery in the well after pumping. Then, to determine layer-by-layer filtration coefficients, the flow velocity is measured with a flow meter (Fig. 9).
The method of determining layer-by-layer filtration properties of rocks in the uranium deposits of Kazakhstan is described in detail in [1]. The essence of flowmetry is that the rate of axial water flow measured in the wellbore in free-flow, pumping, filling or injection mode changes only in intervals of permeable (water-bearing) rocks, and within impermeable rocks remains constant or equal to zero. As a consequence, the flow-metric graph Q = f(h), constructed from a set of water flow measurements in the experimental well, makes it possible to determine the depth, thickness and hydrodynamic characteristics of permeable (water-bearing) formations. The boundaries of formations that differ in their filtration properties are fixed by the breakpoints of the flow-metric graph (Fig. 10).
FIGURE 10. Scheme of the flow-metric study of the well during pumping: (a) flow diagram of fluid flows along the wellbore; (b) flow chart Q = f(h); (c) differential flow chart Q = f(h); h_o is the steady-state combined water level in the well; h_∂ is the dynamic water level in the well during pumping.
The water inflow rate (water absorption) of any permeable layer is determined by the difference between the flow rates of water circulating in the borehole at its top and at its bottom.
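The difference rule above can be sketched as follows. The example assumes that the flow-metric graph has already been reduced to a list of breakpoints, each a pair of depth and axial flow rate, ordered by increasing depth. The function name and the numbers in the example are hypothetical and serve only to show the arithmetic.

```python
from typing import Dict, List, Sequence, Tuple


def layer_inflows(breakpoints: Sequence[Tuple[float, float]]) -> List[Dict[str, float]]:
    """Split the well into layers at the breakpoints of Q = f(h).

    breakpoints: (depth_m, flow_rate) pairs ordered by increasing depth.
    The inflow of each layer is the flow rate at its top minus that at its bottom.
    """
    depths = [d for d, _ in breakpoints]
    if depths != sorted(depths):
        raise ValueError("breakpoints must be ordered by increasing depth")
    layers = []
    for (d_top, q_top), (d_bot, q_bot) in zip(breakpoints, breakpoints[1:]):
        layers.append({
            "top": d_top,
            "bottom": d_bot,
            "thickness": d_bot - d_top,
            "inflow": q_top - q_bot,
        })
    return layers


# Hypothetical pumping profile: flow rate grows upward across permeable layers
profile = [(100.0, 12.0), (104.0, 12.0), (110.0, 3.0), (112.0, 3.0), (115.0, 0.0)]
layers = layer_inflows(profile)
inflows = [layer["inflow"] for layer in layers]
```

In this profile the intervals with zero inflow correspond to impermeable rocks, where the axial flow rate stays constant, while the intervals with non-zero inflow are the permeable layers.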

APPENDIX D. THE DEPENDENCE OF K_f ON ρ_k
See Table 11.

APPENDIX E. RESULTS OF APPLYING MACHINE LEARNING MODELS TO CALCULATE K_f FOR TECHNOLOGICAL WELLS
See Table 12.

RAVIL I. MUKHAMEDIEV received the Engineering degree in radio-electronics and the Ph.D. degree in information systems from the Riga Civil Aviation Engineers Institute, Riga, Latvia, in 1983 and 1987, respectively. From 1987 to 1995, he worked as an Assistant Professor, a Lecturer, and an Assistant Professor at RCAEI. From 2004 to 2010, he was an Assistant Professor and the Department Head with the Higher School of Information System Management, Riga. He is currently a Professor with Kazakh National Technical University (Satbayev University). He is the author of three books, more than 200 articles, and a leading researcher of six research projects. His research interests include applications of machine learning, data processing, and decision support systems.
YAN KUCHIN was born in Almaty, Kazakhstan, in 1980. He received the bachelor's degree in physics from Kazakh National University, Almaty, in 2002, the bachelor's degree in computer science from IITU, Almaty, in 2011, and the master's degree in computer science, in 2016. He is currently pursuing the Ph.D. degree with Riga Technical University. From 2004 to 2018, he worked at Kazatomprom. Since 2018, he has been a Programmer-Engineer with the Institute of Information and Computing Technologies and a Researcher with Satbayev University. He is the author of more than 30 articles. His research interests include geophysics, machine learning, data processing, and natural language processing.
YEDILKHAN AMIRGALIYEV is currently a Doctor of Technical Sciences, a Professor, and the Head of the Laboratory of Artificial Intelligence and Robotics, Institute of Information and Computational Technologies, Science Committee, Ministry of Education and Science of the Republic of Kazakhstan. His research interests include theoretical aspects of artificial intelligence and practical applications.
NADIYA YUNICHEVA was born in Frunze, Kyrgyzstan, in 1968. She received the Science degree in specialty 05.13.01 "Management in technical systems," in 1998. In 2002, she was an Associate Professor with the Higher Attestation Commission of the Republic of Kazakhstan, specializing in 05.13.00. Her research interests include the intersection of computer science, control theory, and data analysis of geophysical and climatic research.
ELENA MUHAMEDIJEVA received the master's degree in computer science from the ISMA University of Applied Sciences, Riga, Latvia, in 2008. Since 2010, she has been a Researcher with the Institute of Information and Computational Technologies, Kazakhstan. Her research interests include machine learning, data processing, and decision support systems.