Incorporation of frailties into a non-proportional hazard regression model and its diagnostics for reliability modeling of downhole safety valves

In this paper, we propose incorporating frailty into a statistical methodology for modeling time-to-event data based on a non-proportional hazards regression model. Specifically, we use the generalized time-dependent logistic (GTDL) model with a frailty term introduced in the hazard function to control for unobservable heterogeneity among the sampling units. We also add a regression structure in the parameter that measures the effect of time, since it can directly reflect the influence of covariates on the effect of time-to-failure. The practical relevance of the proposed model is illustrated in a real problem based on a data set for downhole safety valves (DHSVs) used in offshore oil and gas production wells. The reliability estimation of DHSVs can be used, among other purposes, to predict blowout occurrence, assess workover demand and aid decision-making.


I. INTRODUCTION
The idea of modeling time-to-event data is well established in statistics and widely used in the medical sciences (in the context of survival analysis) and engineering (in the context of reliability analysis). In any of these situations, we are interested in representing the distribution of a non-negative random variable T , based on one of its representative functions, such as density, cumulative distribution, or the hazard function.
Many authors have chosen to model survival data in the presence of covariates through the hazard function, which is attractive because of its interpretation: it represents the instantaneous failure rate over time. Perhaps the best-known model dedicated to hazard modeling is the Cox model [1], which brought this modeling possibility to light. Cox's proportional hazards model is quite flexible and used extensively in survival analysis. It can be easily extended to incorporate, for instance, the effect of time-dependent covariates. A strong assumption, and probably the most problematic one, of this model is that the failure rates of any two individuals are proportional, which popularized the name Cox proportional hazards (PH) model. The proportionality assumption is not always in accordance with the reality observed in the field, which motivates the study and development of models that relax this hypothesis.
Several techniques have been proposed as alternatives to PH modeling. Among others, we can cite the non-parametric accelerated failure time model [2], [3], the hybrid hazard model [4] and its extensions [5], [6]. Another approach is the generalized time-dependent logistic (GTDL) model introduced by MacKenzie [7], which provides a fully parametric competitor for the Cox model. More recently, Louzada-Neto et al. [8] proposed a Bayesian approach to the GTDL model; Louzada-Neto et al. [9] compared several techniques for building confidence intervals using parametric and non-parametric resampling methods; MacKenzie and Peng [10] extended the GTDL model by incorporating a random effect into the hazard function and using h-likelihood procedures that obviate the need to marginalize the risk and survival functions; and Milani et al. [11] extended the GTDL model by including a gamma frailty term in the modeling. These models have been successfully applied to situations where all units are susceptible to the event of interest, i.e., where the presence of a cure fraction in the population is not feasible.
A typical assumption in reliability data analysis is that all of the study units or systems will eventually experience the event of interest if they are followed long enough. Nevertheless, the event may not occur for some units, even after a long period of time. In manufacturing, for instance, the items that neither failed nor malfunctioned during the examination time comprise the cured fraction [12]. Nelson [13] observed the life of insulation on electric motors operated at several temperature levels; at low temperatures the motors lasted almost indefinitely, while at high temperatures breakdown occurred quickly. From a Bayesian perspective, Lin and Zhu [14] proposed a new approach to the reliability analysis of complex systems, where some subsystems are considered long-lived compared with the entire system; thus, the system will not fail due to these subsystems.
Hence, usual survival models, such as the Cox PH model or the accelerated failure time model, are not suitable for such cured individuals. As a result, cure models have been developed for the analysis of survival data with a cure fraction. Boag [15] introduced the standard cure rate model, which is the most widely used one. His objective was to study cases in which there was a fraction of cured patients among those who had received treatment for mouth cancer; the failure time of the susceptible group was modeled with the lognormal distribution, and the cure probability was assumed constant. The mixture cure model was further developed in [16] and later studied extensively by various authors [17], [18], [19], [20], [21], [22], [23], [24], among others.
Usual models implicitly assume a homogeneous population of susceptible systems, although explanatory variables can be included to account for the observable heterogeneity. Nonetheless, genetic or environmental factors, or even information that, for some reason, was not considered at the planning stage, can produce a portion of unobserved heterogeneity. Hougaard [25] discussed the benefits of adopting two sources of heterogeneity, the observable (given by explanatory variables) and the unobservable, considering several distribution families for the latter. Unobserved heterogeneity can be controlled by introducing a random effect into the hazard function, known as frailty (the term "frailty" was introduced in [26]). Frailty models are widely used in this situation; for more details, we refer the reader to [27]. The exclusion of a relevant explanatory variable from the modeling increases the amount of unobservable heterogeneity; thus, the frailty makes it possible to evaluate the effects of explanatory variables that were not considered in the modeling. Therefore, the frailty, besides explaining the heterogeneity between systems, also helps alleviate the absence of important covariates.
The challenges in the construction of oil wells are increasing over time, either because of the greater technical difficulty arising from the complexity of the areas to be explored or because of stricter rules from regulatory bodies aiming at increased safety. The DHSV (downhole safety valve) is a subsurface safety valve positioned in the oil production pipeline column below the seabed; its function is to enable the production column to be closed almost instantly, preventing uncontrolled leakage of hydrocarbons into the environment in the event of a catastrophic wellhead accident. A failure of the DHSV (closing or opening unwantedly, among other unexpected actions) generates several unforeseen events, causing great financial losses. Demonstrating the reliability performance of DHSVs is an important activity related to risk assessment and management of offshore well systems [28].
The study of reliability associated with DHSVs contemplates many ramifications, even within statistics itself, including (but not limited to): (i) investigating current failures; (ii) evaluating their root causes, failure mechanisms and effects; (iii) estimating and improving the reliability of their components; and (iv) developing degradation models as part of a testing strategy. Selvik and Abrahamsen [28] studied and discussed the specific statistics for the period 2002–2013, focusing on reliability aspects of the collected data. Their study also included a literature review and some testing data collected directly from oil and gas companies, to provide a more nuanced picture of the reliability issues. Rausand and Vatn [29] discussed the impact of using a Weibull life distribution instead of an exponential distribution, based on a specific data set for surface-controlled subsurface safety valves used in offshore oil and gas production wells. Oliveira [30] compared the reliability of some control system models taking into account the equipment positions throughout the system and their failure rates, with a view more focused on loss of production than on safety. Colombo et al. [31] analyzed the behavior of several machine learning models to assess the reliability of DHSVs for comparison against traditional statistical techniques, based on an experimental evaluation over a data set collected from a Brazilian oil and gas company. In the context of this study, we would like to identify the association of the failure rate behavior with some environmental variables, such as the hydrogen sulfide (H2S) concentration, temperature, pressure, gas/oil ratio and water column. For this, we use the GTDL and GTDL frailty models, since the PH assumption was not validated and, consequently, the Cox model cannot be used.
After fitting a model, it is necessary to check the validity of its assumptions, as well as to carry out robustness studies to detect possible influential or extreme observations that can provoke distortions in the results of the analysis. There are several works in the survival analysis setting that present such analysis [32], [33], [34]. In this study, we discuss the global influence starting from case-deletion, in which the influence of the i-th observation on the parameter estimates is investigated by excluding it from the analysis. We propose diagnostic measures based on case-deletion for the GTDL and GTDL frailty regression models, in order to determine which units might be influential in the analysis. To motivate our research, we describe the following real data set related to DHSVs.

II. MOTIVATING EXAMPLE IN OIL AND GAS INDUSTRY
The motivation for our study came from a real-world reliability data set corresponding to the DHSVs used in the exploration of Petrobras' (abbreviation of Petróleo Brasileiro S.A.) oil wells in Brazil. Illustrated in Figure 1, the DHSV is a subsurface safety valve whose function is to prevent uncontrolled leakage of hydrocarbons into the environment in the event of a catastrophic wellhead accident.
The records show the lifetime (in years) of each valve, whether or not there was a suspension of use, and some other explanatory variables, presented in Table I, which are divided into groups according to their characteristics. The type of each variable is also highlighted.
Our objective is to study the time until failure of the DHSVs and to identify possible associations between the characteristics of the valve, the environment and its functioning [36]. The cumulative log-hazard plots shown in Figure 2 indicate non-proportionality for the covariates CWT, FR, Family and GOR, thus suggesting that the Cox model is not suitable for accurate modeling. Hence, models with non-proportional hazard properties are preferable; in this study, we use the GTDL and the GTDL with gamma frailty models. The remaining graphs are presented in Figure 14 of Appendix A. In summary, it is necessary to consider non-PH models in order to analyze the DHSV data set. Nevertheless, the inclusion of a frailty term in the traditional models may also be needed. As mentioned earlier, unobservable heterogeneity among units or systems may play an important role in assessing reliability, and its omission can cause biased results. Hence, this example motivates the joint modeling of heterogeneity among valves through their frailties and the possible presence of a cured fraction of them when predicting their reliability.
The remainder of the paper is organized as follows. In Section III, we provide further details on the GTDL and GTDL frailty models, such as survival and hazard functions, their cure rate version and inference methods based on the likelihood function. Section IV describes and discusses the influence diagnostics based on case-deletion. Section V presents the fitted models to the groups of variables and the diagnostic analysis. Finally, Section VI gives some concluding remarks.

III. BACKGROUND
In this section, we present the GTDL and GTDL with gamma frailty regression models, highlighting the hazard, reliability and probability density functions. These models are useful for data sets with covariates that do not satisfy the proportional hazards assumption.

A. GTDL model
Let T > 0 be a random variable representing the failure time and h(t) the instantaneous failure rate or baseline hazard function. According to [7], the hazard function of the GTDL model is given by

h(t|x) = λ exp(αt + x⊤β) / [1 + exp(αt + x⊤β)],   (1)

where λ > 0 is a scalar, α ∈ R is a measure of the time effect, x = (x_1, ..., x_p)⊤ is the vector of covariates and β = (β_1, ..., β_p)⊤ is the vector of regression coefficients. The corresponding reliability function, R(t|x), and probability density function, f(t|x), are given, respectively, by

R(t|x) = [(1 + exp(αt + x⊤β)) / (1 + exp(x⊤β))]^(−λ/α)   (2)

and f(t|x) = h(t|x) R(t|x). The hazard function (1) has monotonic behavior, determined by the value of the parameter α. More specifically, when α > 0, the hazard function is increasing; when α < 0, it is decreasing; and, finally, when α = 0, the hazard function is constant over time, that is, the resulting model is a PH model with exponential baseline hazard function, as highlighted in [10].
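To make the monotonic behavior of the GTDL hazard concrete, the following sketch (in Python for illustration, although the paper's computations were done in R; all parameter values are hypothetical) evaluates the hazard λ exp(αt + x⊤β)/[1 + exp(αt + x⊤β)] and exercises the three signs of α.

```python
import math

def gtdl_hazard(t, x, lam, alpha, beta):
    # GTDL hazard: h(t|x) = lam * exp(alpha*t + x'beta) / (1 + exp(alpha*t + x'beta))
    eta = alpha * t + sum(b * v for b, v in zip(beta, x))
    return lam * math.exp(eta) / (1.0 + math.exp(eta))
```

Note that for α > 0 the hazard saturates at λ for large t, so λ acts as an upper bound on the instantaneous failure rate.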
The GTDL model is said to be a non-PH model because the ratio of the hazard functions of two individuals is not constant over time. Consider two systems i and j, i ≠ j, with covariate vectors x_i and x_j, x_i ≠ x_j, for i, j = 1, ..., n. Then, the ratio of the hazard functions is given by

h(t|x_i) / h(t|x_j) = [exp(x_i⊤β) (1 + exp(αt + x_j⊤β))] / [exp(x_j⊤β) (1 + exp(αt + x_i⊤β))].   (3)

Note from (3) that the time effect does not disappear, and hence the non-PH condition becomes evident. As mentioned in [7], the GTDL model is neither a PH model nor an accelerated life model. The reliability function (2) is proper for α > 0, i.e., R(0|x) = 1 and lim_{t→∞} R(t|x) = 0. However, when the parameter α is negative, the GTDL model naturally acquires an improper distribution, i.e., R(0|x) = 1 and lim_{t→∞} R(t|x) = p > 0; hence, the GTDL model is a cure rate model when α < 0. An advantage of the GTDL model over the mixture model [16] is that the former makes no assumption about the existence of a cure rate, leaving the data to indicate the presence or absence of a cure fraction. In the literature, models with this property have recently been called "defective" [37], [38], [39].
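The non-PH behavior can be verified numerically: the hazard ratio of two covariate profiles changes with t whenever α ≠ 0, and is constant when α = 0. A minimal sketch (Python for illustration; parameter values hypothetical):

```python
import math

def gtdl_hazard_ratio(t, xi, xj, lam, alpha, beta):
    # ratio h(t|xi) / h(t|xj) for the GTDL hazard; the common factor lam cancels
    def h(x):
        eta = alpha * t + sum(b * v for b, v in zip(beta, x))
        return lam * math.exp(eta) / (1.0 + math.exp(eta))
    return h(xi) / h(xj)
```

For α > 0 both hazards approach λ as t grows, so the ratio drifts toward one over time; this is the qualitative pattern of hazard ratios starting away from one and levelling off that reappears in the application section.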
In reliability, the event (failure or error) may not occur for some units, even after a very long period of time. Thus, the cure rate in the population is obtained as the limit of the reliability function (2) when α < 0, given by

p(x) = lim_{t→∞} R(t|x) = [1 + exp(x⊤β)]^(λ/α),   α < 0.

Let T_i > 0 be a random variable denoting the failure time of the i-th unit, and δ_i a censoring indicator variable, with δ_i = 0 if the observed time is censored and δ_i = 1 otherwise, for i = 1, ..., n. Also, consider η = (λ, α, β_1, ..., β_p) and assume that the T_i's are independent and identically distributed (IID) random variables with hazard and reliability functions specified, respectively, by (1) and (2). Then, the likelihood function for right-censored reliability data is given by

L(η) = ∏_{i=1}^{n} [h(t_i|x_i)]^{δ_i} R(t_i|x_i),

and the log-likelihood function, ℓ(η) = log(L(η)), is given by

ℓ(η) = ∑_{i=1}^{n} δ_i log h(t_i|x_i) + ∑_{i=1}^{n} log R(t_i|x_i).
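The improper-reliability (defective) behavior for α < 0 can be checked numerically. The sketch below (Python for illustration; hypothetical parameter values) encodes the GTDL reliability and its limit: for α < 0 the reliability plateaus at the cure fraction instead of vanishing.

```python
import math

def gtdl_reliability(t, x, lam, alpha, beta):
    # R(t|x) = [(1 + exp(alpha*t + x'beta)) / (1 + exp(x'beta))] ** (-lam/alpha), alpha != 0
    xb = sum(b * v for b, v in zip(beta, x))
    num = 1.0 + math.exp(alpha * t + xb)
    den = 1.0 + math.exp(xb)
    return (num / den) ** (-lam / alpha)

def gtdl_cure_fraction(x, lam, alpha, beta):
    # limit of R(t|x) as t -> infinity; a genuine cure fraction only when alpha < 0
    xb = sum(b * v for b, v in zip(beta, x))
    return (1.0 + math.exp(xb)) ** (lam / alpha)
```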

B. GTDL frailty model
The frailty model is characterized by the use of an unobservable random effect, which represents information that cannot or has not been observed, such as environmental factors, or information that has not been considered in planning. In conventional frailty models, the frailty variable is introduced in the modeling of the hazard function, with the aim of controlling the unobservable heterogeneity of the units under study, including the dependence of the units that share the same factors.
Based on the GTDL model, the hazard function of the i-th individual with the multiplicative frailty term v_i is given by

h(t|x_i, v_i) = v_i h_0(t|x_i),

where v_i represents a value of the random variable V, h_0(t|x_i) denotes the baseline hazard function and h(t|x_i, v_i) is the conditional hazard function of the i-th individual given v_i. When v_i > 1, individual i is more frail, and it is stronger when v_i < 1; hence the model's name, "frailty". It is necessary to adopt a known distribution for the random variable V; as it can only assume positive values, the natural candidates are the gamma, inverse Gaussian, Weibull, positive stable and power variance function (PVF) distributions, among others; for more details, see [40] and [27]. In general, the restriction adopted is E[V] = 1 and Var[V] = θ, where θ is interpretable as a measure of unobserved heterogeneity; this restriction was proposed in [26].
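For the common gamma choice with E[V] = 1 and Var[V] = θ, i.e., V ~ Gamma(1/θ, 1/θ), the marginalization over the frailty has a closed form via the Laplace transform of V. We sketch this standard result here, with H_0 denoting the cumulative baseline hazard:

```latex
R(t \mid \mathbf{x}) = \mathbb{E}_V\!\left[\exp\{-V\,H_0(t \mid \mathbf{x})\}\right]
                     = \left\{1 + \theta\,H_0(t \mid \mathbf{x})\right\}^{-1/\theta},
\qquad
h(t \mid \mathbf{x}) = \frac{h_0(t \mid \mathbf{x})}{1 + \theta\,H_0(t \mid \mathbf{x})}.
```

As θ → 0 these expressions reduce to exp{−H_0} and h_0, recovering the model without frailty.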
In order to make inferences on frailty models, we have some options. For instance, obtaining the marginal hazard and reliability functions and using the traditional likelihood function; or choosing other methods that obviate the need for marginalization, such as the h-likelihood approach proposed by Ha et al. [41] and used in [42]. This paper considers the marginal hazard and reliability functions.

C. The GTDL gamma frailty model
The GTDL gamma frailty model was proposed in [11], wherein a frailty term is added to the hazard function in a multiplicative way and a gamma distribution is assumed for it, i.e., V ~ Gamma(1/θ, 1/θ), so that E[V] = 1 and Var[V] = θ. The marginal reliability function is given by

R(t|x) = {1 + θ (λ/α) log[(1 + exp(αt + x⊤β)) / (1 + exp(x⊤β))]}^(−1/θ),   (4)

the corresponding marginal hazard function is given by

h(t|x) = h_0(t|x) / {1 + θ (λ/α) log[(1 + exp(αt + x⊤β)) / (1 + exp(x⊤β))]},   (5)

and, finally, the density function is given by f(t|x) = h(t|x) R(t|x), where h_0(t|x) is the hazard function defined in (1). The hazard function (5) takes unimodal and decreasing forms; therefore, the inclusion of frailty makes the GTDL hazard function more flexible; for more details, see [43]. Such a hazard function clearly depends on time; consequently, the GTDL gamma frailty model also accommodates non-PH. On the other hand, the reliability function (4) behaves similarly to that of the GTDL model, with cure fraction, p(x), given by

p(x) = lim_{t→∞} R(t|x) = {1 − θ (λ/α) log[1 + exp(x⊤β)]}^(−1/θ),   α < 0.

The explanatory variables can be incorporated into the model through the hazard function (5) and the parameter α that measures the effect of time. The use of a regression structure in the α parameter is a more flexible approach, since it can directly reflect the influence of covariates on the effect of time-to-failure. Because the parameter α can be estimated as negative or positive, the identity link function is used, i.e., α(x*) = x*⊤α, where x* = (x*_1, ..., x*_q)⊤ is the vector of covariates and α = (α_0, α_1, ..., α_q)⊤ is the vector of regression coefficients. In practice, the covariate vectors may be the same, i.e., x = x*. Note that we can include the intercept in the vector x and, with that, also the parameter β_0. As mentioned in [44], the parameters β_0 and λ are interchangeable; we chose to include the parameter β_0 because we are interested in the interpretation of the explanatory variables. Hence, the model parameters are ν = (α_0, ..., α_q, β_0, ..., β_p, θ).
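A numerical sketch of these marginal quantities (Python for illustration, though the paper's fitting was done in R; parameter values hypothetical). It also checks two standard properties: as θ → 0 the frailty model collapses to the GTDL model, and for θ > 0 the marginal hazard is dragged below the baseline one as the frail units fail early.

```python
import math

def gtdl_cum_hazard(t, x, lam, alpha, beta):
    # H0(t|x) = (lam/alpha) * log[(1 + exp(alpha*t + x'beta)) / (1 + exp(x'beta))], alpha != 0
    xb = sum(b * v for b, v in zip(beta, x))
    return (lam / alpha) * (math.log1p(math.exp(alpha * t + xb)) - math.log1p(math.exp(xb)))

def frailty_reliability(t, x, lam, alpha, beta, theta):
    # marginal reliability under gamma frailty: [1 + theta * H0(t|x)] ** (-1/theta)
    return (1.0 + theta * gtdl_cum_hazard(t, x, lam, alpha, beta)) ** (-1.0 / theta)

def frailty_hazard(t, x, lam, alpha, beta, theta):
    # marginal hazard: h0(t|x) / [1 + theta * H0(t|x)]
    xb = sum(b * v for b, v in zip(beta, x))
    eta = alpha * t + xb
    h0 = lam * math.exp(eta) / (1.0 + math.exp(eta))
    return h0 / (1.0 + theta * gtdl_cum_hazard(t, x, lam, alpha, beta))
```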
Let T_i > 0 and δ_i be as previously defined, for i = 1, ..., n. Also, consider that the T_i's are IID random variables with reliability and hazard functions given, respectively, by (4) and (5). Then, the likelihood function for right-censored reliability data is given by

L(ν) = ∏_{i=1}^{n} [h(t_i|x_i)]^{δ_i} R(t_i|x_i),

and the log-likelihood function, ℓ(ν) = log(L(ν)), is given by

ℓ(ν) = ∑_{i=1}^{n} δ_i log h(t_i|x_i) + ∑_{i=1}^{n} log R(t_i|x_i).

The maximum likelihood estimates (MLEs) for the parameters of the GTDL and GTDL gamma frailty models are obtained by numerical maximization of the corresponding log-likelihood functions. In this work, we use the optimr function of the "optimx" package [45], [46], implemented in the R software [47].
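As an illustration of this numerical maximization (the paper uses optimr from the R package "optimx"; the Python sketch below, on hypothetical toy data, is only an analogue), we profile the GTDL log-likelihood in λ with α and β held fixed. For this one-dimensional profile the score equation dℓ/dλ = D/λ − S = 0 has the closed-form solution λ = D/S, where D is the number of failures and S the sum of the baseline cumulative-hazard terms; a crude grid search recovers it.

```python
import math

def loglik_lambda(lam, times, status, X, alpha, beta):
    # GTDL log-likelihood as a function of lambda only (alpha and beta held fixed)
    ll = 0.0
    for t, d, x in zip(times, status, X):
        xb = sum(b * v for b, v in zip(beta, x))
        eta = alpha * t + xb
        ll += d * math.log(lam * math.exp(eta) / (1.0 + math.exp(eta)))
        ll -= (lam / alpha) * (math.log1p(math.exp(eta)) - math.log1p(math.exp(xb)))
    return ll

# hypothetical toy data: two failures and one censored time
times, status, X = [1.0, 2.0, 3.0], [1, 1, 0], [[0.5], [1.0], [0.0]]
alpha, beta = 0.3, [0.4]

# crude grid search standing in for a proper optimizer such as optimr()
grid = [0.01 * k for k in range(1, 1001)]
lam_hat = max(grid, key=lambda L: loglik_lambda(L, times, status, X, alpha, beta))

# closed-form profile maximizer: lambda = D / S
D = sum(status)
S = sum((1.0 / alpha) * (math.log1p(math.exp(alpha * t + sum(b * v for b, v in zip(beta, x))))
                         - math.log1p(math.exp(sum(b * v for b, v in zip(beta, x)))))
        for t, x in zip(times, X))
lam_closed = D / S
```

In practice all parameters are maximized jointly by a quasi-Newton or derivative-free routine; the grid search is used here only to keep the sketch self-contained.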

IV. DIAGNOSTIC ANALYSIS
Global influence analysis consists of studying the effect of case-deletion on the data analysis. It was introduced by Cook [48] and later studied by several authors [49], [50], [32], [34], among others. We denote by the subscript "(i)" the removal of the i-th observation from the original data set. The log-likelihood function of the parameter vector ν is denoted by ℓ(ν), as previously given; when we delete the i-th observation, we represent it by ℓ_(i)(ν), with the respective MLE given by ν̂_(i) = (α̂_(i), β̂_(i), θ̂_(i)). In this study, we analyze two measures of global influence. The first is the generalized Cook's distance (GD), whose idea is to compare ν̂ and ν̂_(i); if the deleted observation seriously influences the estimates, more attention should be paid to that observation. The GD is given by

GD_i(ν) = (ν̂_(i) − ν̂)⊤ Σ(ν̂) (ν̂_(i) − ν̂),

where Σ(ν) is the expected Fisher information matrix. For the models under study, this matrix is extremely complex, so in practice we use its observed version. An alternative is to assess GD_i(α), GD_i(β) and GD_i(θ), whose values reveal the impact of the case-deletion on the estimates of α, β and θ, respectively. The second measure adopted here is the likelihood distance (LD), whose idea is to compare ℓ(ν̂) and ℓ(ν̂_(i)), similarly to the previous one; if the deleted observation seriously influences the value of the log-likelihood function, it deserves further attention. The LD is given by

LD_i(ν) = 2 [ℓ(ν̂) − ℓ(ν̂_(i))].

In order to investigate the impact of the detected influential cases, we calculate the relative change (RC), which is computed from the parameter estimates with and without the influential cases, as follows:

RC_{ν_j(i)} = |(ν̂_j − ν̂_{j(i)}) / ν̂_j| × 100%,

where ν̂_{j(i)} and SE(ν̂_{j(i)}) are the MLEs and their respective estimated standard errors (SEs) when the i-th case is deleted, with j = 1, ..., p + q + 3 and ν_1 = α_0, ..., ν_{q+1} = α_q, ν_{q+2} = β_0, ..., ν_{p+q+2} = β_p and ν_{p+q+3} = θ.
Note that this section was developed for the GTDL gamma frailty model, but if the adopted model is the GTDL one, it is only necessary to remove the parameter θ.
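The three diagnostic quantities of this section can be sketched in a matrix-free form as follows (Python for illustration; the inputs — full-data and case-deleted estimates, an information matrix and log-likelihood values — are hypothetical placeholders supplied by whatever fitting routine is used):

```python
def gen_cook_distance(nu_hat, nu_del, info):
    # GD_i = (nu_hat_(i) - nu_hat)' * Info * (nu_hat_(i) - nu_hat),
    # with Info the (observed) information matrix evaluated at nu_hat
    d = [a - b for a, b in zip(nu_del, nu_hat)]
    p = len(d)
    return sum(d[r] * info[r][c] * d[c] for r in range(p) for c in range(p))

def likelihood_distance(ll_full, ll_del):
    # LD_i = 2 * [ ell(nu_hat) - ell(nu_hat_(i)) ]
    return 2.0 * (ll_full - ll_del)

def relative_change(est_full, est_del):
    # RC_j (in %) between full-data and case-deleted estimates
    return [abs((f - d) / f) * 100.0 for f, d in zip(est_full, est_del)]
```

Cases whose GD_i or LD_i stand out from the bulk are flagged as potentially influential, and the RCs then quantify how much each parameter estimate moves when those cases are removed.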

V. APPLICATION
In this section, we apply the proposed methodology to the DHSV data set. A descriptive summary of the failure times or censoring (in years) provides the following main sample results: n = 366, mean = 5.0761, median = 3.6082, minimum = 0.0164 and maximum = 28.8000, with only 83 (22.68%) failure times, while the rest are censored times.
For better understanding, we present a descriptive statistical summary of the explanatory variables in Tables II and III.  Table II displays the minimum value (Min), median, mean, standard deviation (SD), coefficient of variation (CV), skewness (Sk) and kurtosis (K), maximum value (Max) and number of observations (n), for the continuous variables. The summary of failure times that are analyzed together with the covariates is also presented in this table. For the qualitative (categorical) variables, it is possible to observe the categories and the number of observations per category in an absolute and relative way, as shown in Table III.
The database contains 366 observations, but there is a lot of missing data in the explanatory variables. The removal of the cases with missing data reduces the database to only 54 observations, which makes multivariate analyses difficult. Hence, we decided to fit a model for each group of explanatory variables, thereby eliminating the observations with missing data within each group. The summary measures previously presented already consider this deletion. In order to select the explanatory variables, we initially adopted the GTDL model with gamma frailty term and the following steps. Step 1: for each group, select the significant covariates in the x⊤β structure, using the stepwise method and the generalized likelihood ratio test, considering α as a scalar. Step 2: for each group, select the significant covariates in the x*⊤α structure, using the stepwise method and the generalized likelihood ratio test, with the x⊤β structure containing the covariates obtained in Step 1. At the end of Step 2, we perform a hypothesis test to verify whether there is unobserved heterogeneity (H_0: θ = 0). In this case, the generalized likelihood ratio test was adopted with the modification presented in [21], because the value of the parameter under the null hypothesis lies on the boundary of the parameter space. If the null hypothesis is not rejected, the adopted model is the GTDL; otherwise, the GTDL model with gamma frailty term is chosen.
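A common form of the boundary correction for testing H_0: θ = 0 is that, with θ on the boundary of the parameter space, the likelihood-ratio statistic is asymptotically a 50:50 mixture of a point mass at zero and a chi-squared with one degree of freedom, so the p-value is half the usual chi-squared tail probability. A small sketch of that computation (Python for illustration; whether this is exactly the modification of [21] is an assumption):

```python
import math

def boundary_lrt_pvalue(lr_stat):
    # Under H0: theta = 0 (boundary case), LR ~ 0.5*chi2_0 + 0.5*chi2_1,
    # so p = 0.5 * P(chi2_1 > LR), and P(chi2_1 > x) = erfc(sqrt(x / 2))
    if lr_stat <= 0.0:
        return 0.5
    return 0.5 * math.erfc(math.sqrt(lr_stat / 2.0))
```

Halving the tail makes the test less conservative: an LR statistic of about 2.71 (the usual 10% critical value of chi2_1) already gives a p-value of about 0.05.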

A. Adjustment to the flow characteristics group
After applying the procedure previously described, we obtain in Step 1 that the statistically significant explanatory variables are H2S and BSW, while in Step 2 we do not identify any significant variables, and so only α 0 is included in the model. We performed the hypothesis test for the parameter that measures the unobserved heterogeneity and obtained a p-value greater than the significance level of 10%, so the null hypothesis is not rejected and the GTDL model (without frailty) is the adopted one. The MLEs, SEs and 90% confidence intervals (90% CIs) for the GTDL model parameters are presented in Table IV. By analyzing the confidence intervals, we conclude that all parameters are significant at the 10% level.  For the reliability functions and hazard ratios (HRs) that will be illustrated hereinafter in the text, we adopt the median value of continuous explanatory variables whenever necessary. When the objective is to present reliability functions for different values of a continuous covariate, we always choose the variation from the minimum to the maximum of the observed value.
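The significance calls read off such tables follow from the confidence intervals: a parameter is deemed significant at the 10% level when its 90% CI excludes zero. A minimal helper (Python for illustration; assuming, as is common, Wald-type intervals built from the MLE and its SE, and with hypothetical inputs):

```python
def wald_ci(est, se, z=1.645):
    # 90% Wald interval: estimate -/+ z * SE, with z = 1.645 for 90% coverage
    return (est - z * se, est + z * se)

def significant_at_10pct(est, se):
    # significant at the 10% level iff the 90% CI excludes zero
    lo, hi = wald_ci(est, se)
    return lo > 0.0 or hi < 0.0
```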
In order to illustrate the effect on the reliability function of an increase in the amount of H2S or BSW, we exhibit in Figure 3 (a) several reliability curves for different values of the H2S variable. We note that, as the value of the H2S variable increases, the reliability function decreases faster; that is, the higher the concentration of H2S, the lower the reliability of the DHSVs. It is known that the concentration of H2S is associated with failures of metallic components in the oil and gas exploration industry. This can be seen, for instance, in [51], which presents a summary of common corrosion threats, some of them involving H2S, and in [52], which gives recommendations for material selection when H2S is present.
In Figure 3 (b), we show the variation of the BSW variable reflected in the reliability function. Observe that alterations in the BSW value change the reliability curve: the higher the concentration of BSW, the greater the reliability.
In Figure 4 (a), we present the HR curve for the first and third quartiles of the H2S variable. We note that before 10 years of age the HR is less than one, so the risk of valve failure is greater when the H2S variable assumes the value of the third quartile; after 10 years, the HR is approximately one and, therefore, the hazards are approximately equal. In Figure 4 (b), we show the HR for the first and third quartiles of the BSW variable. From this plot, we observe that the HR is greater than one up to 14 years; therefore, in this period the valve has a higher risk of failure when the BSW variable takes on the value of the first quartile. After the first 14 years, the HR is approximately one, so the risk of failure is approximately equal in both groups. We highlight that this is a different result from the one that would be obtained with the Cox model, since the ratio of hazard functions in the Cox model would be constant over time. In order to check for the presence of influential observations, we calculated the GD and LD measures. The obtained results are presented in Figure 5, from which we can see the existence of four influential observations according to the Cook's distance (cases 2, 3, 34 and 70), while from the LD we also observe four influential observations (cases 2, 3, 24 and 70). Hence, the detected influential observations are cases 2, 3, 24, 34 and 70.
We checked the impact of the detected influential cases on the model inference. After removal of the influential observations, the RC values (in percentage, %) and p-values are displayed in Table V. At the 10% level, note that the parameters α_0, β_0 and β_1 remained significant in all scenarios, whereas the parameter β_2 was only significant in two scenarios (specifically, when excluding observation 24 and when excluding all influential observations). Therefore, the effect of time and the H2S variable were significant to explain the time-to-failure, while the BSW variable became non-significant in some scenarios. The RC of the parameter α is the largest when case 2 is excluded (RC of 20.4872%), while for the parameters β_0 and β_1 the RCs are the largest when case 3 is removed (RCs of 11.1820% and 13.3283%, respectively) and, finally, for the parameter β_2 the RC is the largest when case 24 is removed (RC of 48.0736%). When all influential observations are excluded, we find that the RCs are less than 6%, indicating that the estimates are not strongly affected.

B. Adjustment to the valve characteristics group
In the x⊤β structure, we identified only the variable Family as statistically significant, whereas in the x*⊤α structure we found the variables PC and Mfr. to be meaningful. Therefore, the effect of time is different for each level of these two variables. The hypothesis test for the frailty distribution's parameter resulted in the rejection of the null hypothesis. Hence, the GTDL gamma frailty model is the one that best fits these data. The obtained MLEs are shown in Table VI. Since the frailty parameter is significant, there is evidence that important variables were not included in the modeling, indicating that the variables PC, Mfr. and Family are not the only ones that impact the failure time of the valves.
It is worth mentioning that when the Manufacturer is the "Others" class, the GTDL model with gamma frailty term assumes a cure fraction, regardless of the values of the other explanatory variables, since in this case x*⊤α < 0, with the cure fraction value close to one. From a descriptive analysis, we identified that only two out of 30 observations in the "Others" class were failure times, occurring before one year of operation. Figure 6 shows the reliability functions for the variables Family, PC and Mfr. Note that there is little difference between the reliability curves of the "family A" and "family B" classes; the same occurs with the curves of the "Mfr. A" and "Mfr. B" classes. By analyzing the curves of the PC variable, we see that the lowest reliability is for PC equal to "7,500", while the highest one is for PC equal to "5,000".
The closeness between the reliability curves of the "family B" and "family A" levels is explained by the confidence interval of the "family B" level of the Family variable, since this level is not statistically different from the "family A" reference level. From this, we can conclude that the failure times showed no significant difference between these two levels of the Family variable. The same conclusion holds for the reliability curves of the PC levels "7,500" and "10,000", and also for the Manufacturers "Mfr. A" and "Mfr. B"; for that, however, it was necessary to change the reference classes and refit the model.
Due to the architecture imposed for safety valves in the deep water wells used by Petrobras, most of its valves have nitrogen chambers, a technology that mitigates the pressure sensitivity of the well and ensures, through a surface calibration for the individual condition of each well, the opening and closing pressures of a particular specification. In other words, the pressure class considered in the analysis is a control variable that is easy to handle for new DHSVs. Since the calibration is a feasible and low-cost action, with a relative impact on reducing the risks of this component, it is advisable to use the PC equal to 5,000 psi. Figure 7 presents an analysis of the behavior of the hazard function using the GTDL gamma frailty model fitted to the data. From the HRs of the Family variable shown in Figure 7 (a), we observe that the ratios of "family B" and "family A" to "Others" start below one, indicating that the "Others" family has a higher risk at first; however, after approximately 3 years this relationship is reversed. For the ratio between "family B" and "family A", we see that it is greater than one until approximately 5 years, i.e., at the beginning "family B" shows a higher risk of failure compared to "family A", but after 5 years it is "family A" that exhibits the higher risk. From Figure 7 (b), we observe that, initially, the PC of "10,000" shows more risk of failure than the PC of "5,000", and the PC of "7,500" is more at risk of failure than the PCs of "5,000" and "10,000"; nevertheless, these relationships are reversed over time. It is worth noting that the risk of failure of PC "7,500" reaches approximately 14 times the risk of failure of PC "5,000"; when the comparison is made with PC "10,000", the risk of PC "7,500" reaches approximately 6 times.
Finally, Figure 7 (c) exhibits the HRs for the Mfr. variable, from which we note that the ratios of "Others" to "Mfr. A" and to "Mfr. B" are always below one, indicating that the risk of failure is greater for Manufacturers "Mfr. B" and "Mfr. A". When comparing "Mfr. B" with "Mfr. A", we observe that the ratio is initially less than one, until approximately 3 years, indicating that "Mfr. A" has a higher risk of failure than "Mfr. B". After 3 years, however, the relationship is inverted and maintained over time.
The GD measure identified 18 influential observations, while the LD measure indicated 9; these indications can be seen in Figure 8. Cases 76, 99, 148, 164, 196 and 290 were detected by both metrics.
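The two case-deletion diagnostics can be sketched under their usual definitions: the likelihood displacement LD_i = 2[ℓ(θ̂) − ℓ(θ̂_(i))], where θ̂_(i) is the estimate refitted without case i, and the generalized Cook's distance GD_i = (θ̂_(i) − θ̂)ᵀ[−L̈(θ̂)](θ̂_(i) − θ̂), which weights the estimate shift by the observed information. This is a generic illustration with toy numbers, not the paper's fitted quantities:

```python
import numpy as np

def likelihood_displacement(loglik_at_full, loglik_at_deleted):
    """LD_i = 2 * [l(theta_hat) - l(theta_hat_(i))], both terms being the
    full-data log-likelihood, the second evaluated at the case-deleted MLE."""
    return 2.0 * (loglik_at_full - loglik_at_deleted)

def generalized_cook(theta_hat, theta_del, obs_info):
    """GD_i = (theta_(i) - theta)' [-L''(theta)] (theta_(i) - theta)."""
    d = theta_del - theta_hat
    return float(d @ obs_info @ d)

# Toy numbers (purely illustrative):
theta_hat = np.array([0.5, -1.2])            # full-data MLE
theta_del = np.array([0.6, -1.0])            # MLE refit without case i
info = np.array([[10.0, 1.0], [1.0, 4.0]])   # observed information at theta_hat
gd = generalized_cook(theta_hat, theta_del, info)   # -> 0.30
ld = likelihood_displacement(-120.0, -121.5)        # -> 3.0
```

Cases flagged by both measures, as observations 76, 99, 148, 164, 196 and 290 were here, are the strongest candidates for influence.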
In Table VII, we observe that the MLEs of the parameters α0, α1, α2, α3, α4, β0, β2 and θ are always statistically significant at the 10% level, while the MLE of β1 is only significant when all influential observations are removed.

C. Adjustment to the environment characteristics group
In Step 1, the variables OU, CWT and WC are statistically significant, while in Step 2 only the intercept is relevant. The GTDL model (without frailty) is the one that best fits these data, since we do not reject the null hypothesis (H0 : θ = 0) at the 10% significance level. From the MLEs displayed in Table VIII, we observe that all parameters are significant at the 10% level, except for β1, which measures the effect of the OU class "SB". Thus, we can say that there is no significant difference between the OU levels "CB" (reference) and "SB".
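The reliability curves discussed next follow from the GTDL survivor function, which has a closed form obtained by integrating the hazard λ exp(αt + x'β)/[1 + exp(αt + x'β)], namely S(t | x) = [(1 + exp(αt + x'β)) / (1 + exp(x'β))]^(−λ/α). A minimal sketch, with hypothetical coefficient values:

```python
from math import exp

def gtdl_reliability(t, lam, alpha, xb):
    """GTDL survivor function, from integrating the hazard
    lam * logistic(alpha*t + xb):
    S(t) = [(1 + exp(alpha*t + xb)) / (1 + exp(xb))] ** (-lam/alpha)."""
    num = 1.0 + exp(alpha * t + xb)
    den = 1.0 + exp(xb)
    return (num / den) ** (-lam / alpha)

# Illustrative linear predictors only: a positive coefficient on a
# covariate such as CWT yields a larger x'beta and hence a lower
# reliability at every t, matching the pattern described for CWT and WC.
xb_low, xb_high = 0.2, 1.5
r_low = gtdl_reliability(5.0, 1.0, 0.1, xb_low)
r_high = gtdl_reliability(5.0, 1.0, 0.1, xb_high)   # r_high < r_low
```

The same function, evaluated on a grid of t, reproduces the qualitative ordering of the curves in Figure 9.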
From Figure 9, we note that "ES" is the class with the lowest reliability among the three operating units, while "SB" is the one with the highest reliability. Moreover, the higher the value of the CWT and WC variables, the lower the reliability. Figure 10 (a) shows that the HRs of "CB" and "SB" relative to "ES" are below one at all times, indicating that the risk of valve failure is lower in the "CB" and "SB" operating units. When analyzing the HR between "CB" and "SB", we note that it is always greater than one, indicating that "SB" has the lower risk of failure. By analyzing the HRs involving the first and third quartiles of the CWT and WC variables (Figures 10 (b) and 10 (c), respectively), we observe that both are less than one, indicating that an increase in the value of these variables causes the risk to increase as well. Finally, Figure 11 exhibits the GD and LD measures for the GTDL model fitted to the environmental group data; in total, 17 influential observations are detected. In Table IX, we present the RCs and p-values, from which we note that the effect of time, the CWT variable and the OU class "ES" are significant at the 10% level in all arrangements, while the OU class "SB" is not significant in any scenario and the WC variable is significant in most of them. The RCs are largest when we remove all the influential observations, and this holds for all parameters; these changes range from 29.1455% (for the parameter β0) to 65.9195% (for the parameter β2).
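The RCs reported in Table IX can be sketched assuming the usual relative-change definition in case-deletion analysis, RC = |(θ̂ − θ̂_(I)) / θ̂| × 100%, where θ̂_(I) is the estimate refitted without the set I of influential observations. The numeric values below are illustrative only, chosen to land near the order of magnitude reported for β0:

```python
def relative_change(theta_full, theta_deleted):
    """RC (%) = |(theta_hat - theta_hat_(I)) / theta_hat| * 100,
    the usual relative-change measure after deleting the set I."""
    return abs((theta_full - theta_deleted) / theta_full) * 100.0

# Illustrative only (not the paper's estimates): a shift from 0.55 to
# 0.39 corresponds to an RC of about 29%, comparable to the smallest
# change reported in Table IX.
rc = relative_change(0.55, 0.39)   # ~29.09
```

Large RCs signal that the deleted cases materially drive the corresponding estimate, which is the inferential concern the diagnostic analysis is meant to expose.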

D. Adjustment to the operation characteristics group
In this last fitting, only the CWP variable is significant in Step 1, the effect of time is measured only by α0, and the GTDL model (without frailty) is the one that best fits these data. The estimation results are presented in Table X, from which we note that all parameters are significant at the 10% level. Figure 12 (a) exhibits the behavior of the reliability function when we vary the value of the CWP variable: as the CWP variable increases, so does the valve reliability. An example of the HR is shown in Figure 12 (b), using the first and third quartiles of the CWP variable, from which we see that the ratio is greater than one, so the risk of failure is greater when the CWP variable takes the value of the first quartile. We also observe that the ratio is initially close to 2, indicating that the risk of valve failure when operated at the first-quartile value is approximately twice that when operated at the third-quartile value. This risk ratio decreases with time but remains above 1.5 at 20 years.
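Because both quartile profiles share the single time effect α0 here, the GTDL hazard ratio between them decays monotonically toward one rather than crossing it. The sketch below uses the α0 = 0.1403 reported in the text; the linear-predictor values for the two CWP quartiles are hypothetical, chosen only to reproduce the qualitative shape of Figure 12 (b):

```python
from math import exp

def logistic(x):
    return exp(x) / (1.0 + exp(x))

# Time effect as reported for this fitting (Table X).
ALPHA0 = 0.1403

def hr_cwp(t, xb_q1=-3.0, xb_q3=-3.75):
    """GTDL hazard ratio between the first- and third-quartile CWP
    profiles; xb_q1 and xb_q3 are hypothetical x'beta values (a negative
    CWP coefficient makes the first quartile the riskier profile)."""
    return logistic(ALPHA0 * t + xb_q1) / logistic(ALPHA0 * t + xb_q3)

# With these values, hr_cwp(0) is roughly 2 and decays monotonically,
# while still remaining above 1.5 at t = 20 years.
```

As t grows, both hazards approach the common ceiling λ, so the ratio tends to one; what the data determine is how slowly that happens.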
The total numbers of influential observations detected by the GD and LD measures are 16 and 8, respectively, as can be seen in Figure 13. Note, however, that only observations 14 and 47 were identified by both metrics. After removing the influential observations, we can see in Table XI that the CWP variable remains significant in all configurations and that the effect of time is almost always significant. The point estimates underwent few changes when excluding only one influential observation, a fact quantified by RCs less than or equal to 20.0448%. But when we removed all the influential observations, we noticed major changes in the estimates of the parameters α0 and β1, with RCs of 266% and 165%, respectively. In this case, it is worth noting that the estimate of the parameter α0 changed from 0.1403 to 0.5145; thus, the effect of time is greater when the influential observations are removed.

VI. CONCLUSIONS
In this paper, we analyzed a real reliability data set on DHSVs used by the Brazilian oil company Petrobras. In the graphical analysis, we verified the presence of non-PH; therefore, the traditional Cox model cannot be used. Our proposed modeling was then developed using the GTDL and GTDL gamma frailty models, with regression also in the parameter that measures the effect of time. The modeling was divided into four groups due to the large amount of missing data. We identified that the variables H2S, BSW, PC, Mfr., Family, OU, CWT, WC and CWP are relevant to describe the time until failure. We also noted that the valve-characteristic variables alone are not enough to describe the time until failure, because the frailty model had to be adopted. The diagnostic analysis highlighted possible influential points in the fitted models. We presented summaries of the fits without the influential observations, because further information on and investigation of these observations are for the exclusive use of the company.
In this way, we demonstrated the importance of diagnostic analysis in the model because we detected inferential changes after eliminating potentially influential cases.
The variables were divided into four groups according to their characteristics, and the proposed modeling adopted this division, since the amount of missing data is large. As future work, the well-known statistical method of Principal Component Analysis (PCA) can be an alternative for modeling without considering groups of variables.