Counterfactual building and evaluation via eXplainable Support Vector Data Description

Increasingly, the mere prediction of a machine learning algorithm is considered insufficient to gain complete control over the event being predicted. A machine learning algorithm should be considered reliable in the way it allows more knowledge and information to be extracted than just having a prediction at hand. From this perspective, counterfactual theory plays a central role. By definition, a counterfactual is the smallest variation of the input such that it changes the predicted behaviour. This paper addresses counterfactuals through Support Vector Data Description (SVDD), empowered by explainability and a metric for assessing counterfactual quality. After showing the specific case in which an analytical solution may be found (under Euclidean distance and a linear kernel), an optimisation problem is posed for any type of distance and kernel. Vehicle platooning is the use case considered to demonstrate how the outlined methodology may support safety-critical applications, as well as how explanations may shed new light on the control of the system at hand.


I. INTRODUCTION

A. BACKGROUND
Counterfactual explanations (CEs), a concept borrowed from the philosophy of language and logic, were first adapted to the context of machine learning by Wachter et al. [1] as the minimal change required in the input features of a certain observation in order for the prediction of that observation to fall into the opposite class, in a binary classification problem. Specifically, a change of a certain delta in the features describing the observation x, belonging to class C, leads to the generation of an observation x′ (i.e., the counterfactual of x) that will be classified as belonging to class C′. These kinds of local explanations are gaining importance, especially in machine learning models dealing with images [2], as they add a degree of interpretability to the underlying behaviour of complex models like neural networks, in line with the demand of the European General Data Protection Regulation (GDPR)¹ for greater transparency when handling decisions made by a model. Different approaches have recently been proposed to produce realistic and feasible counterfactuals that provide local explanations for automated decision-making processes. Table 1 provides an overview of the related literature with regard to the methods for CE generation, the use cases, the validation approach, and open issues. For example, White et al. [3] determined counterfactuals by applying minimum perturbations to each feature separately and used them to generate local regression models, then evaluated the fidelity of these regressions in five different case studies. Poyiadzi et al. [4], instead, proposed a method for generating CEs by considering a trade-off between the length of the path from the point to its corresponding counterfactual and the data density along this path.

¹ https://gdpr.eu/tag/gdpr/
Finally, Mochaourab et al. [7] considered the design of robust CEs for privacy-preserving mechanisms based on binary Support Vector Machines, applying the bisection method between two points belonging to different classes and evaluating the trade-off between accuracy, privacy, and explainability.

TABLE 1. Overview of the related literature on CE generation.

White et al. [3] — CLEAR: minimization of the fidelity error, obtained by iteratively comparing progressive b-perturbations of each single feature with estimates of b-perturbations calculated using a local regression equation built around the initial point. Data sets (numerical): Pima Indians Diabetes, Iris, Default of Credit Card Clients, Adult, Breast Cancer Wisconsin. Validation: fidelity of regression, against LIME. Open issues: CE quality depends on the neighbourhood dataset used for step-wise regression.

Poyiadzi et al. [4] — FACE: minimization of the f-distance describing the trade-off between path length and data density along the path, through the Shortest Path First algorithm applied to a graph constructed over the data points using KDE, k-NN, or an ε-graph. Data sets: synthetic (numerical); MNIST (images). Validation: comparison with CEs generated by a baseline method [1]. Open issues: limited validation of the generated CEs.

Van Looveren et al. [5] — Addition of a prototype loss term in the objective function, to guide and speed up the search process; encoders or k-d trees may be used to define class prototypes. Data sets: Breast Cancer Wisconsin (numerical); MNIST (images). Validation: quantitative and visual interpretability, sparsity, and speed. Open issues: the interpretability measures depend on autoencoders trained on the original and counterfactual classes, hence are associated with prediction uncertainty.

Nemirovsky et al. [6] — CounteRGAN: a fixed target classifier (e.g., a neural network) coupled with an RGAN trained to produce residuals that are added to the input to produce CEs. Data sets: PIMA Indian Diabetes, COMPAS recidivism (numerical); MNIST (images). Validation: prediction gain, realism, latency, and actionability against related works (e.g., [5]). Open issues: applying this method to large image data sets would require more complex architectures and finer hyper-parameter tuning.

Mochaourab et al. [7] — Bisection method: starting from two prototypes of opposite class, according to a privacy-preserving SVM with RBF kernel. Data sets: Breast Cancer Wisconsin (numerical). Validation: trade-off between accuracy, privacy, and explainability. Open issues: the privacy requirement degrades the quality of the generated CEs; CE assessment is based on SVM prediction confidence.

Dhurandhar et al. [8] — CEM: optimization of the perturbation variable using the fast iterative shrinkage-thresholding algorithm (FISTA), coupled with the use of a CAE to evaluate the distance from the data manifold.
CEs are a rather versatile solution that can be applied to different contexts with various purposes. For example, they can be generated in order to understand which changes in the characteristics of a medical image lead to a certain diagnosis of pathology (e.g., [8] and [5]). Another possible use of counterfactuals recently proposed in the literature [11] concerns their application to provide actionable feedback (e.g., realistic changes in expected salary or an increase in work-experience word count) to candidates in a hiring marketplace in order to improve their profiles.
Whether an observation belongs to a certain class may depend on two categories of features: controllable features, which can be manipulated through internal/external intervention (e.g., therapies or lifestyle changes in clinical classification problems, or control algorithms in systems modelling and control problems), and non-controllable features, which by their nature are not manipulable (e.g., the age of a subject in health prediction algorithms). Therefore, the search for realistic counterfactuals should be performed by perturbing only controllable variables. To our knowledge, the only attempt to force the generated CEs to have no change in terms of non-controllable characteristics was carried out by Nemirovsky et al. [11], who developed a method to produce counterfactuals able to provide actionable feedback in real time using Generative Adversarial Networks (GANs). However, in that case, feature immutability was imposed after the application of the counterfactual search algorithm, by resetting the values of the non-controllable features to their original values rather than to the values suggested by the counterfactual search algorithm. By contrast, in this study, for the first time, the search for counterfactuals is guided by directly perturbing only controllable features.
Previous related works validated the proposed CEs with respect to explanations obtained with other local explainability methods, like Local Interpretable Model-agnostic Explanations (LIME) or Layer-Wise Relevance Propagation (LRP) [3], [8], or with respect to other state-of-the-art methods for the generation of CEs [4], [9], [11]. Often, the validation measure relies on verifying that the CE is correctly associated with its target outcome, based on the prediction of a classifier. However, this measure is characterized by a degree of uncertainty, since it is not guaranteed that the real class matches the predicted class. To our knowledge, none of the approaches presented in the literature is supported by a validation of the generated CEs with computational simulations, capable of verifying that the CE belongs to a certain class, and rule-based models that explain the reason for this membership.
The aim of this paper is to introduce a novel methodology for counterfactual generation and validation. The counterfactuals generation method uses regions defined by Two Class-Support Vector Data Descriptors (TC-SVDDs) and is here introduced in both analytical (II-A) and numerical (II-B) form. The validation method combines computational simulations and eXplainable AI (XAI), specifically in the form of rule-based classification of counterfactuals. An example of application to collision detection in truck platooning is introduced to demonstrate the method (III).

B. CONTRIBUTION
The main contributions of this paper include:
• the introduction of constrained counterfactuals, whose search is based on perturbations of controllable features only;
• analytical and numerical formulations for the generation of counterfactuals, which include:
  – the introduction of the minimum distance problem between SVDD classes and its analytical solution in the linear case;
  – an SVDD-based counterfactual generation algorithm which is simpler than deep learning-based solutions;
  – the assessment of the Counterfactual Distance (i.e., whether a counterfactual is over- or under-dimensioned);
• the use of a global XAI method to extract knowledge from counterfactuals;
• the application of the newly introduced method to an example of cyber-physical system and its validation by means of simulations, together with the identification of the rules that characterise the decisions.

C. STRUCTURE OF THE PAPER
The paper is structured as follows. Section II introduces the concept of counterfactual SVDD, its analytical (II-A) and numerical (II-B) solutions, and the natively explainable method used to define the rules that characterize both factuals and counterfactuals (II-C). Section III describes an example of application of counterfactual SVDD to a case of truck platooning, and Section IV discusses our findings with respect to the related literature.

II. METHODOLOGY
Suppose we have a dataset X × Y ⊂ R^N × {−1, +1}, N ≥ 2, whose feature space consists of a subset of controllable features u and a subset of non-controllable features z, so that an observation x ∈ X can be described as

x = (u1, u2, . . . , un, z1, z2, . . . , zm) ∈ R^(n+m), with n + m = N.

We perform a TC-SVDD classification as in [12], obtaining two regions

Si = {x ∈ R^N : ||x − ai||² ≤ Ri²}, i = 1, 2,

where R1, R2 and a1, a2 are, respectively, the radii and the centers of the spheres of the computed TC-SVDD.
Given an object x = (u, z) ∈ S1, our goal is to determine the minimum variation ∆u* of the controllable variables so that the point belongs to the class S2. To determine ∆u*, we define the following minimization problem:

∆u* = arg min_∆u d(x, x*)      (2a)
s.t.  ||x* − a2||² ≤ R2²       (2b)
      ||x* − a1||² > R1²       (2c)

with x* = (u + ∆u, z), where d is a distance and (2b), (2c) are the constraints that require x* to belong to S2 and not to S1, respectively. In other words, the counterfactual x* is the nearest point, with respect to the distance d, that belongs to the class opposite to the original class of a given point x, taking into account that only the controllable features u can be modified.
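The feasibility logic of problem (2) can be illustrated with a brute-force sketch under Euclidean distance and spherical (linear-kernel) regions; the centers, radii, and factual below are hypothetical toy values, not taken from the paper.

```python
import numpy as np

def in_sphere(x, center, radius):
    """Membership test for a linear-kernel SVDD sphere."""
    return np.linalg.norm(x - center) <= radius

# Hypothetical toy regions: S1 (original class) and S2 (target class).
a1, R1 = np.array([0.0, 0.0]), 1.0
a2, R2 = np.array([3.0, 0.0]), 1.0

x = np.array([0.5, 0.0])      # factual: u = 0.5 (controllable), z = 0.0 (fixed)
best = None
for du in np.linspace(-5.0, 5.0, 2001):   # brute-force scan of the controllable feature
    x_star = np.array([x[0] + du, x[1]])  # z stays fixed, as required by (2)
    if in_sphere(x_star, a2, R2) and not in_sphere(x_star, a1, R1):
        dist = np.linalg.norm(x_star - x)  # Euclidean distance d(x, x*)
        if best is None or dist < best[0]:
            best = (dist, x_star)

print(best)  # nearest feasible point, approximately (2.0, 0.0) at distance 1.5
```

The scan makes the two constraints explicit: only points inside S2 and outside S1 are admissible, and among them the one closest to the factual is kept.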

1) Optimality
The optimality of a counterfactual refers to the identification, in the target output class (i.e., the class opposite to the one the point originally belongs to), of the point that exhibits the joint minimum variation of the input features with respect to the starting point (i.e., the factual, which is by definition a point outside the target class), as shown in (2). Typically, several combinations of feature variations are possible, although only one of these joint variations is at minimal distance. The proposed algorithm searches for this minimal joint variation (of all the control variables) through the minimum distance from the factual.

2) Closed-form versus numerical solution
Finding an analytical solution of (2) is not an easy task and might be impossible, since the space of constraints is not convex (i.e., constraint (2c) is not convex); moreover, the choice of the distance d must be taken into account. However, there are some cases where the solution of (2) can be made analytically explicit, for example by choosing the Euclidean norm as the distance, performing a linear TC-SVDD, and working in only two dimensions, with one controllable feature and one non-controllable feature. In the other cases, the solution of (2) is computed numerically, by sampling the classification regions with quasi-random methods and searching for the closest point to a given observation with respect to a fixed distance.

VOLUME 4, 2016

FIGURE 1. (a) Counterfactual solution for S1 ∩ S2 = ∅: the solution is obtained by simply setting λ2 = 0, i.e., deactivating the constraint (3c). (b) Counterfactual solution for S1 ∩ S2 ≠ ∅: in this case the optimal solution is not on the edge of the region S2 but inside it.

A. R 2 ANALYTICAL SOLUTION
Let X × Y ⊂ R² × {−1, 1} be a labelled two-dimensional dataset, in which each object x ∈ X consists of a controllable component u and a non-controllable one z, i.e., x = (u, z) ∈ R². After performing a linear TC-SVDD [12] and determining two regions S1, S2 ⊂ R², our goal is, given an object x = (u, z) ∈ S1, to find the minimum change ∆u* in the controllable variable so that the object x* = (u + ∆u*, z) is the closest point to x belonging to S2 and not belonging to S1. In R², the problem to be solved is the following:

min_∆u (∆u)²                  (3a)
s.t.  ||x* − a2||² ≤ R2²      (3b)
      ||x* − a1||² ≥ R1²      (3c)

with x* = (u + ∆u, z). Two slack variables ξ1, ξ2 are introduced and the above problem becomes:

min (∆u)² + D1 ξ1 + D2 ξ2     (4a)
s.t.  ||x* − a2||² ≤ R2² + ξ1  (4b)
      ||x* − a1||² ≥ R1² − ξ2  (4c)
      ξ1, ξ2 ≥ 0               (4d)

where the parameters D1, D2 control the trade-off between the distance and the error. Introducing the Lagrange multipliers λ1, λ2, λ3, λ4 ≥ 0, we get the Lagrangian function

L = (∆u)² + D1 ξ1 + D2 ξ2 + λ1 (||x* − a2||² − R2² − ξ1) + λ2 (R1² − ξ2 − ||x* − a1||²) − λ3 ξ1 − λ4 ξ2.    (5)

Setting the partial derivatives to zero gives the following constraints:

∆u = [λ1 (a2^u − u) − λ2 (a1^u − u)] / (1 + λ1 − λ2)    (6)
0 ≤ λ1 ≤ D1                                              (7)
0 ≤ λ2 ≤ D2                                              (8)

where a1^u, a2^u are the projections of a1, a2 onto the controllable variable u. By substituting (6) into the expression of L, we obtain a dual function of λ1 and λ2 alone, which must be maximized under the constraints (7) and (8) to get λ1* and λ2*, to be substituted into (6) to obtain the minimum variation ∆u*.
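As a numerical sanity check of the two-dimensional formulation above, the constrained problem can be solved directly with an off-the-shelf solver; the spheres and the factual below are hypothetical toy values, and with disjoint regions the solution lands on the boundary of S2 (the case in which the constraint on S1 is inactive).

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical disjoint spheres in R^2; u is controllable, z is fixed.
a1, R1 = np.array([0.0, 0.0]), 1.0
a2, R2 = np.array([4.0, 0.0]), 1.0
x = np.array([0.2, 0.3])  # factual: u = 0.2, z = 0.3

def x_star(du):
    return np.array([x[0] + du[0], x[1]])  # only u moves

res = minimize(
    lambda du: du[0] ** 2,  # objective: squared variation of the controllable feature
    x0=[3.0],
    constraints=[
        # x* must lie inside S2
        {"type": "ineq", "fun": lambda du: R2**2 - np.sum((x_star(du) - a2) ** 2)},
        # x* must lie outside S1 (inactive here, since the spheres are far apart)
        {"type": "ineq", "fun": lambda du: np.sum((x_star(du) - a1) ** 2) - R1**2},
    ],
)
u_new = x[0] + res.x[0]
print(u_new)  # lands on the boundary of S2: (u_new - 4)^2 + 0.3^2 = R2^2
```

The optimizer pushes the perturbation down until the S2 membership constraint becomes active, mirroring the complementary-slackness behaviour of the Lagrangian treatment.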

B. NUMERICAL SOLUTION
As the size of the feature space increases, and for more complicated distances d or kernels, the solution of (2) may be analytically unfeasible. Thus, a discrete search algorithm has been developed.

1) CounterfactualSVDD algorithm
Algorithm 1 returns the set C of counterfactuals of points belonging to S1. Of course, the same procedure can be applied to find the counterfactuals of the points belonging to S2, simply by reversing the roles of S1 and S2. For better understanding, Table II-B1 shows the meaning of the symbols and variables used in Algorithm 1. A TC-SVDD [12] is performed on X_tr × Y_tr and validated on X_vl × Y_vl in order to derive S1 and S2; the number of counterfactuals N_C > 0 is fixed.
The points for which a counterfactual is desired are randomly or directly sampled in S 1 , while their counterfactual is sought in the set G 2 , obtained from the intersection of S 2 with the set G, sampled in feature space using quasi-random sampling techniques [13], with the non-controllable features fixed. Thus, the accuracy of the counterfactual is related to the granularity of the sampling: the denser the sampling, the more accurate the counterfactual will be (bounds on the best number of random sampling points can be found in the literature [14]). Moreover, since the concept of counterfactual is closely related to explainability, a set of rules for each TC-SVDD class, R(S i ), is defined according to ExplainableSVDD algorithm [15], [16]. This is a further validation that will then also be used as a basis for extracting knowledge from the rules that characterise counterfactuals (see Section III).

2) Convergence
The counterfactual generation method can, in principle, converge to the optimal counterfactual based on the information available. According to statistical learning theory, this information corresponds to the set of points available for the method to choose the candidate optimal one. More specifically, convergence depends on the size (L) of the set of candidate counterfactuals, taken within the randomly sampled SVDD target region, over which the distance from the starting point (the factual) is computed to find the point at minimum distance. In this respect, [14] provides convergence guarantees, with a rate that is linear in L. It is also worth noting that the gap between the solution and the optimum grows exponentially with the dimension of the feature space.

3) Computational cost
The estimation of the computational cost of Algorithm 1 takes into account several aspects that need to be thoroughly investigated. First, there are two complexities involved: the SVDD and the search for the counterfactuals. Then, the counterfactual search itself involves other methods, each with its own complexity.
Since the SVDD is closely related to the SVM, we can assume a similar computational cost without loss of generality; denoting by n the number of points and by d the number of features, this cost is estimated as O(max(n, d) · min(n, d)²) [17]. Let us denote this computational time by O(SVDD).
Regarding the search for the counterfactuals, we have to take into account:
• the complexity of the quasi-random sampling;
• the number of counterfactuals N_C;
• the computation of the distance;
• the search for the minimum of a vector.
The complexity of the quasi-random sampling depends on the method used for sampling, and references for its estimation can be found in [18]; let us denote it by O(q). The number of counterfactuals N_C affects the computational time of the for-loop, which is O(N_C). Inside this loop, we have to compute the distance d which, in principle, can be based on any kind of distance definition [19]; let us denote its computational cost by O(D). Finally, the cost of finding the minimum of a vector is linear in the number of elements of the vector [20]; denoting g = #G_2|z=z_i, this cost is O(g). Therefore, putting together all the components computed so far, the total complexity of the counterfactual search, O(SC), can be estimated as O(max(q, N_C · max(D, g))). The total computational cost of Algorithm 1 can then be estimated as O(max(SVDD, SC)).

4) Counterfactual Distance
Since the counterfactual determined by the algorithm is an approximation of the real counterfactual, a metric of the quality of the extracted counterfactual is needed. Given a point, its counterfactual is, by definition, the nearest point belonging to the opposite class. Thus, a straightforward metric for evaluating the quality q of the counterfactual x′ of a point x ∈ S1 is its distance from S1:

q = ||x′ − a1|| − R1

where a1 and R1 are, respectively, the center and the radius of S1. We call this new metric the Counterfactual Distance (CD).

FIGURE 2.
2D linear example of CD: this metric evaluates the goodness of the counterfactual; the closer q is to zero, the closer the counterfactual is to optimal in terms of minimum distance. In the figure, q2 > q1 and the blue counterfactual x′ is worse than the green (optimal) one x*.
From Figure 2 it is easy to see that the lower the value of q, the better the counterfactual; moreover, if q < 0, the determined counterfactual is incorrect.
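The CD check is a one-liner; the sketch below assumes the linear-kernel form q = ||x′ − a1|| − R1 derived above, with a hypothetical S1.

```python
import numpy as np

def counterfactual_distance(x_cf, a1, R1):
    """Counterfactual Distance of x_cf with respect to S1 (linear-kernel form):
    q close to 0+ -> near-optimal; large q -> over-dimensioned; q < 0 -> incorrect."""
    return np.linalg.norm(x_cf - a1) - R1

a1, R1 = np.array([0.0, 0.0]), 1.0  # hypothetical center and radius of S1
q_good = counterfactual_distance(np.array([1.1, 0.0]), a1, R1)  # just outside S1
q_over = counterfactual_distance(np.array([3.0, 0.0]), a1, R1)  # over-dimensioned
q_bad  = counterfactual_distance(np.array([0.5, 0.0]), a1, R1)  # still inside S1
print(q_good, q_over, q_bad)
```

The three cases reproduce the situations of Figure 2: a near-optimal counterfactual, an over-dimensioned one, and an incorrect one (negative CD).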

C. EXPLAINABLE AI
XAI has gained a lot of importance in recent years. The already mentioned European GDPR, in 2018, stated that "the existence of automated decision-making should carry meaningful information about the logic involved". XAI is therefore a concept related to all those methods that can guarantee trustworthiness and understanding to humans; hence, they often come in the form of intelligible rules. XAI drives the SVDD counterfactual characterization and knowledge extraction. The Logic Learning Machine (LLM) is used to this aim. The LLM algorithm is based on a four-step process: discretization, latticization, shadow clustering, and rule generation, as defined in [21], [22]. First, each variable is transformed into a binary string in a proper Boolean lattice, using the inverse only-one code binarization; all strings are then concatenated into one unique large string per sample. Then, a set of binary values, called implicants, is generated, which allows the identification of groups of points associated with a specific class. Finally, implicants are transformed into collections of simple conditions and combined into a set of intelligible rules. Therefore, the decision process of an LLM can be summarized as a set of m intelligible rules of the form IF (premise) THEN (consequence), where the premise is the logical product of n_k conditions and the consequence is the output class. The relevance of a rule r_k is associated with two measures, namely

C(r_k) = TP(r_k) / (TP(r_k) + FN(r_k)),   E(r_k) = FP(r_k) / (FP(r_k) + TN(r_k)),

where TP(r_k), FP(r_k), TN(r_k), and FN(r_k) are the true positives, false positives, true negatives, and false negatives associated with the rule r_k. The covering C(r_k) is the percentage of points of the target class for which the rule is true. The error E(r_k) is the percentage of points of the other classes for which the rule is true.
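The two rule-relevance measures reduce to simple ratios over the confusion counts; the sketch below uses the standard covering/error forms C(r_k) = TP/(TP+FN) and E(r_k) = FP/(FP+TN), with hypothetical counts.

```python
def covering(tp, fn):
    """C(r_k): fraction of target-class points the rule captures."""
    return tp / (tp + fn)

def error(fp, tn):
    """E(r_k): fraction of non-target points the rule (wrongly) captures."""
    return fp / (fp + tn)

# Hypothetical confusion counts for a rule r_k:
tp, fp, tn, fn = 80, 5, 95, 20
cov, err = covering(tp, fn), error(fp, tn)
print(cov, err)  # 0.8 0.05
```

A good rule has high covering and low error; rulesets are typically read in decreasing order of covering.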
Like decision trees, the LLM is explainable by design, and it is a global method, as it discovers rules that map clusters of points into classes. Other XAI methods, such as Anchors and its optimised variants [23], are "local", as they specialise rules around each individual sample. More specifically, Anchors explains the results of any black-box classifier by approximating it locally through linearization, as in LIME [24], [25], together with an interpretable model. Extending the validity (covering) of a local rule over neighbouring points is not a straightforward matter [23]; for this reason, the LLM is preferred here to facilitate knowledge extraction from the SVDD counterfactuals, following the approach in [15], [16]. This approach applies the LLM around the boundary of the SVDD, thus maintaining the global structure of the rule-based clustering while limiting the number of involved points and the inherent computational burden.

1) Feature ranking
Feature ranking helps rule interpretation and knowledge discovery. It gives the importance of each feature in inferring the right classification (e.g., distance and speed of vehicles, as outlined later on). It is also used for feature reduction, in order to synthesize the model (using just the most relevant features). Whatever the XAI solver is, feature ranking may easily be derived from the ruleset by applying sensitivity analysis to the model accuracy, with and without the feature to be ranked. The interested reader is referred to [26] for further details on this subject. Feature ranking is later used to synthesize the knowledge extracted from the factual and counterfactual rulesets at hand.
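The sensitivity-analysis idea can be sketched as follows: a classifier is retrained with and without each feature, and features are ranked by the resulting accuracy drop. This is a simplification of the procedure referenced in [26]; the nearest-centroid model and the data are hypothetical toys.

```python
import numpy as np

def nearest_centroid(X, y):
    """Toy stand-in classifier: predict the class of the nearest centroid."""
    classes = np.unique(y)
    cents = np.stack([X[y == c].mean(axis=0) for c in classes])
    def predict(Z):
        D = np.linalg.norm(Z[:, None, :] - cents[None, :, :], axis=2)
        return classes[D.argmin(axis=1)]
    return predict

def accuracy(predict, X, y):
    return float((predict(X) == y).mean())

def feature_ranking(X, y):
    """Rank features by the accuracy drop when each one is removed."""
    base = accuracy(nearest_centroid(X, y), X, y)
    drops = [base - accuracy(nearest_centroid(np.delete(X, j, 1), y),
                             np.delete(X, j, 1), y) for j in range(X.shape[1])]
    return np.argsort(drops)[::-1]  # most relevant feature first

# Feature 0 separates the two classes; feature 1 is pure noise.
rng = np.random.default_rng(0)
X = np.column_stack([np.r_[rng.normal(-2, 0.5, 50), rng.normal(2, 0.5, 50)],
                     rng.normal(0.0, 1.0, 100)])
y = np.r_[np.zeros(50), np.ones(50)]
ranking = feature_ranking(X, y)
print(ranking)  # feature 0 comes first
```

Dropping the informative feature collapses the accuracy, while dropping the noise feature leaves it essentially unchanged, which is exactly what the ranking captures.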

III. EXPERIMENT: VEHICLE PLATOONING
The following safety-critical application is considered. Vehicle platooning is one of the most challenging problems in smart mobility scenarios. It consists of a group of wirelessly interconnected vehicles that travel autonomously; the aim is to find a compromise between performance (e.g., maximize speed and minimize reciprocal distance, thus also minimizing air drag resistance and fuel consumption) and safety (avoid collisions, even in the presence of anomalous events such as sudden brakes or cyberattacks [27]). The aim here is to determine the minimum variation, in terms of controllable factors (i.e., the initial mutual distance and speed between two consecutive vehicles in the platoon, respectively d0 and v0), that allows for a change in system safety (collision/non-collision or vice versa; a point of the dataset is labelled as collision if the distance between any couple of vehicles, during the simulation run, becomes lower than 2 meters).

1) Data set Description
The data set concerning collision prediction in vehicle platooning is taken from [27], [28]. The machine learning solution is based on a supervised classification task that maps the features into a potential collision in the near future; the features are: braking force of the lead vehicle (at the top of the platoon), current speed, distance and acceleration, number and weight of vehicles, as well as quality of service of the communication channel (loss probability and delay). Controllable variables are speed and distance only, thus making the restrictions on counterfactual generation (with respect to the other variables), as well as the search in the grid of the destination SVDD, very tight. In this scenario, the counterfactual explanation can play an effective role in improving the safety of the platooning system: given a combination of the platoon input parameters that brings the system into collision, the counterfactual finds the minimal change in the controllable features such that the platoon no longer collides. Finding such a minimal change simplifies the recovery operation (from collision). The behaviour of the platooning system is synthesised by the following vector of features:

x = (N, F, m, d_ms, p, d0, v0)

where N is the total number of vehicles in the platoon, F is the braking force applied by the leader, m is the weight of the vehicles, d_ms is the communication delay in milliseconds, p is the probability of packet loss, and d0 and v0 are the mutual distance and speed between each pair of vehicles in the initial condition. Data points are sampled by implementing the CACC simulator as in [27]. The considered ranges are very challenging, as they cover a very large set of working conditions.
As already said, since the control of the dynamical system reacts by changing the initial distance and speed, we consider the variables d0 and v0 as the only controllable ones and the others as non-controllable; therefore, denoting the platooning dataset by X_PL, an observation x ∈ X_PL can be written as x = (u, z), where u = (d0, v0) and z = (N, F, m, d_ms, p). The analysed platooning data set includes 20000 records with equally distributed samples for the collision (+1) and non-collision (−1) classes. A TC-SVDD with Gaussian kernel [29] has been trained (σ = 1.87, C1 = C2 = 1, C3 = 1/(νN1), C4 = 1/(νN−1), where N1 and N−1 are the sizes of the collision and non-collision classes, respectively, and ν = 0.05 as in [12]) on 60% of the data and evaluated on the remaining 40%. A set of 10000 CEs has been generated through the implementation of Algorithm 1 and validated both with rule analysis and with simulations. Figure 3 presents the scatterplots of all the possible pairs of features in the platooning data set, grouped by target class, and reveals how the separation between safety and collision can hardly be found without complex combinations of more than two features.

2) Results
The TC-SVDD trained on the platooning data achieved the following classification performance: training accuracy of 0.88, test accuracy of 0.88, sensitivity of 1.00, specificity of 0.75. LLM decision rules describing the two SVDD regions are extracted as in [15], [16] and presented in Table 3. Specifically, the collision region is described by four rules (average number of conditions = 2.75), whereas the non-collision region is described by ten rules (average number of conditions = 3.3). The feature ranking in Figure 4 helps understand the most relevant features for class separation. Distance, braking force, and delay are the most meaningful ones; surprisingly, speed and number of vehicles are less important than expected. The left and right directions of the bars indicate relevance for decreasing and increasing values of the feature, respectively. The directions of distance and speed are coherent with intuition, e.g., decreasing distance increases the frequency of collision. The direction of the bar associated with the delay feature in the safety class (no collision) is, however, counter-intuitive, as it states that safety is achieved by increasing delay. This is not uncommon in machine learning analysis, which can give unexpected insights into the problem. In this case, the delay effect is superseded by those of the other variables; the delay subplots in Figure 3 show the spread of red (collision) points over almost the entire delay range (except very low delays). Together with Table 3, the ranking figures help understand how much global XAI drives a more synthetic knowledge extraction than local XAI (such as through LIME, often used in counterfactual explanation), which gives rules that are built around the point of interest and have a limited covering over the rest of the dataset. Global XAI still has the local explanation property (as outlined in Table 4), but it can give global insight, too (as outlined later in Figure 6c).

3) Explanation
To determine counterfactual explanations for X_PL, 10000 points were randomly sampled from the collision class (+1) and a counterfactual was determined for each of them through Algorithm 1, using the distance induced by the Gaussian kernel K [30]:

d(x, y) = sqrt(2 − 2K(x, y)).

Ten examples are shown in Table 4. Each row of Table 4 reports: the point belonging to the collision class, classified with the SVDD and the LLM, together with the largest-covering rule that it satisfies; the corresponding CE, also classified with the SVDD and the LLM, together with the rule it satisfies; and, in the last column, the minimum change ∆u in distance and speed that allowed the point to move from the collision class to the non-collision class.
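Using the standard feature-space identity ||φ(x) − φ(y)||² = K(x,x) − 2K(x,y) + K(y,y), the Gaussian-kernel-induced distance reduces to sqrt(2 − 2K(x,y)), since K(x,x) = K(y,y) = 1. A minimal sketch follows; σ is the value used in the experiment, while the two points are hypothetical.

```python
import numpy as np

def gaussian_kernel(x, y, sigma):
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

def kernel_distance(x, y, sigma):
    """Feature-space distance induced by the Gaussian kernel K:
    d(x, y) = sqrt(K(x,x) - 2 K(x,y) + K(y,y)) = sqrt(2 - 2 K(x,y))."""
    return np.sqrt(2.0 - 2.0 * gaussian_kernel(x, y, sigma))

# sigma as in the platooning experiment; the two points are hypothetical.
dist = kernel_distance(np.array([0.0, 0.0]), np.array([1.0, 1.0]), sigma=1.87)
print(dist)  # bounded in [0, sqrt(2)) no matter how far x and y are
```

Note that this distance saturates at sqrt(2) for far-apart points, which is one reason CD values in the experiment concentrate on a small positive range.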

4) Validation
The validation of the counterfactuals' safety is as follows: the 10000 CEs determined by Algorithm 1 were tested with the CACC simulator [27], obtaining a 7.82% error rate (i.e., the determined counterfactual still brings the system into collision) and 92.18% actual counterfactuals, of which only 2.07% are found to be overestimated. Overestimation is defined with respect to a final distance larger than 10 meters; such a distance is measured at the end of the simulation run, which is driven by the counterfactual (a collision is registered, in the original dataset, when the distance falls below the threshold of 2 meters). Figure 5 deals with the temporal behaviour of three significant cases: the first two (top and middle subplots) are optimal counterfactuals (the first with a change in speed and the second with a change in distance), as they lead to a final condition which is very close to collision; the last subplot (at the bottom of the figure) highlights an over-dimensioned counterfactual, as the final distance is much larger than the boundary distance (between collision and non-collision).

5) On the minimum distance
The analysis would suggest more insightful thinking on the concept of "minimum" counterfactual distance, which is ubiquitous in the literature. In the platooning application, that concept would imply "almost collision" because the counterfactual, by construction, should lie in the safety SVDD (under the constraint of non-controllable variables), but still closest to the collision one. On the one hand, this corroborates the flexibility of counterfactual construction through the SVDD with respect to deep learning, in which the positioning of the (constrained and with minimum distance) counterfactual should be mapped into a very complex training cost. On the other hand, it would lead to other, more restricted, forms of counterfactual construction, when safety plays a crucial role. This topic is left open for future research.

6) Quality
The validation of the counterfactuals' quality is as follows. The CD of each CE is calculated (see Section II-B), evidencing satisfactory statistics, as shown in Figure 6a, in line with the simulation evidence (Figure 6b). The CD metric synthesises the overestimation issue well; recall that a high CD means low quality of the counterfactual. In order to extract further knowledge from the CD analysis, the following supervised problem is defined over the CD values and solved via the LLM. The factuals (i.e., the points of the collision class, which are mapped into the corresponding counterfactuals) are divided into two classes, labelling CD values under and above the 0.03 threshold. Values larger than the threshold represent overdimensioned and almost overdimensioned points, as evidenced in Figure 6a. The resulting feature ranking in Figure 6c (for CD > threshold) shows that high-CD samples are associated with critical factuals, namely with increasing delay, leader acceleration (force divided by mass), loss, speed, and number of vehicles, as well as decreasing distance. The rationale is that critical factuals need to go deeper inside the destination class (thus leading to a larger CD) to replace the original conditions of collision with new safe ones. Moreover, the rules identifying high CD may drive further optimisation of the respective counterfactuals, e.g., through a finer granularity of the grid in a reduced search space, identified by the ruleset itself [31]. This is left open for future research as well.

IV. DISCUSSION
This study aims to define a new method for generating local explanations by building counterfactuals from observations characterized by controllable and non-controllable features. Nemirovsky et al. [6] first introduced the concept of CEs with controllable and non-controllable features in a diabetes prediction algorithm; however, they first applied the counterfactual search to all the features and then removed the perturbations related to non-controllable features such as age and the number of pregnancies. In this study, controllable and non-controllable features are handled in a more straightforward way, since the search for counterfactuals is performed by perturbing only the controllable features (i.e., d0 and v0) in the kernel space, keeping the non-controllable variables fixed.

[Figure 6 caption: the platoon collides when the minimum distance in the simulation is less than or equal to 2 (red dots); black dots refer to counterfactuals that overestimate the correction (minimum distance greater than 10). (c) Feature ranking describing the relevance of the features in classifying high CD values.]

Most of the recently proposed methods are deep learning based [6], [8], thus requiring more complex architectures and a higher computational cost for training. The use of TC-SVDD allows the two regions to be defined at a reduced computational cost, while still achieving more than satisfactory accuracy (e.g., > 85%). Furthermore, the additional rule-based description of the SVDD regions provides transparency to the point classification process, allowing for a robust validation of the correctness and consistency of the generated CEs. Specifically, as shown in Table 4, in the platooning example, CEs are generally associated with a greater initial distance and a reduced initial velocity of the platoon. Moreover, the quality of the explanations has been evaluated in terms of distance from the region associated with the opposite outcome. The optimal CE of x is the point, with opposite class, located at minimum distance from x. The introduction of a quality metric (CD) allows the correctness of CEs generated with the proposed numerical approximation to be verified: a distance greater than zero ensures the non-intersection between the two SVDD regions, thus the belonging of the CE to the correct class, with a level of confidence defined by the TC-SVDD (i.e., 88% in the platooning example), while a distance close to zero ensures the minimum-distance requirement. Figure 6b shows CD values for the generated platooning CEs, demonstrating the effectiveness of the proposed method, as most of the points are associated with a low but positive CD value. Indeed, almost 40% of the points have a CD lower than 0.02 and about 92% of the points have a CD lower than 0.1.
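The controllable-feature search just described can be made concrete with a minimal sketch, assuming a toy safe region shaped as a ball in feature space (the linear-kernel SVDD case for which the paper notes an analytical solution), hypothetical feature values, and a plain grid over the two controllable features; none of the numbers come from the paper's dataset.

```python
import numpy as np

# Toy safe-region SVDD with a linear kernel: a ball of centre c and radius R.
c = np.array([5.0, 3.0, 0.1, 0.2])   # hypothetical centre
R = 1.5

def in_safe_region(p):
    return np.linalg.norm(p - c) <= R

# Factual from the collision class; the first two features are the
# controllable ones (initial distance d0 and initial speed v0), while
# the remaining, non-controllable ones stay fixed.
x = np.array([2.0, 1.0, 0.1, 0.2])

# Grid search over the controllable features only.
d0_grid = np.linspace(0.0, 10.0, 101)
v0_grid = np.linspace(0.0, 6.0, 61)
best, best_dist = None, np.inf
for d0 in d0_grid:
    for v0 in v0_grid:
        cand = x.copy()
        cand[0], cand[1] = d0, v0
        if in_safe_region(cand) and (d := np.linalg.norm(cand - x)) < best_dist:
            best, best_dist = cand, d

print(best, best_dist)
```

The grid minimum approximates the analytical one (the distance from x to the ball's surface along the centre direction); refining the grid in a reduced search space, as suggested for high-CD points, would tighten the approximation.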
Unlike previous works in this area, the validation of the generated counterfactuals is not only based on class prediction via SVDD, but is further supported by validation via simulations. In fact, the attribution of a point to the correct class according to the prediction of the previously trained model does not guarantee its real belonging to that class, because of the existence of a certain number of false positives and false negatives that, even if minimized, should not be neglected. The validation process through the CACC simulator (see Figure 6a) has proven that the generated CEs are descriptive of the non-collision class with more than satisfactory accuracy, and that only a small fraction of the generated points overestimates the minimum distance. Hence, CEs in platooning are applicable to the generation of control algorithms, based on the correction of the system dynamics, to prevent collisions.
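A minimal sketch of such a simulation-based validation loop follows; the simulator stand-in, its dynamics, and the generated counterfactuals are invented for illustration, and only the minimum-distance thresholds of 2 (collision) and 10 (overestimation) come from the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for the CACC simulator: returns the minimum inter-vehicle
# distance observed during a run (hypothetical dynamics, sketch only).
def simulate_min_distance(d0, v0):
    return max(0.0, d0 - 0.4 * v0 + rng.normal(0.0, 0.5))

# Hypothetical generated counterfactuals as (d0, v0) pairs.
ces = rng.uniform([4.0, 1.0], [9.0, 5.0], size=(500, 2))

min_d = np.array([simulate_min_distance(d0, v0) for d0, v0 in ces])

collision = np.mean(min_d <= 2.0)       # CE failed: the platoon still collides
overestimated = np.mean(min_d > 10.0)   # CE overcorrects the dynamics
valid = 1.0 - collision - overestimated

print(f"valid: {valid:.2%}, collision: {collision:.2%}, over: {overestimated:.2%}")
```

The point of the loop is exactly the one made above: the simulator provides counterfactual-driven ground truth that the SVDD class prediction alone cannot give.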

1) Other applications
The considered application naturally extends to cyber-physical systems empowered by simulated digital twins. However, the method is applicable to a wide range of applications. Examples may lie in the following sectors: health (e.g., disease prediction and prevention), human behavioural analysis (fraud detection) and social networks (guidance of public opinion [32]5). The health sector is currently our next step, as it introduces some conceptual differences in the validation process. As already pointed out for cyber-physical systems, testing tools (via simulation, emulation, or replicable experiments) may support validation through additional counterfactual-driven ground truth (i.e., testing the exact counterfactual collision avoidance). Clinical analysis, on the other hand, cannot exploit controllable ground truth in a straightforward manner (i.e., applying a medical treatment just in accordance with the counterfactual!). The health scenarios call for additional interaction between the AI and the clinician, who interprets the (explained) artificial reasoning (i.e., the suggested counterfactual) and maps it into current clinical practice. In this case, the testing environment would consist of dedicated medical trial campaigns.

2) Diabetes characterization and prevention
In [33], CEs were used to characterize the smallest changes in biomarker values that distinguish diabetic patients from non-diabetic ones. Preliminary results have shown that non-diabetic patients have, on average, lower values of fasting blood sugar (-0.88 mmol/L) and body mass index (-0.14 kg/m²) and higher values of high-density lipoprotein (0.26 mmol/L) with respect to diabetic ones. In particular, the changes in biomarkers tend to increase with age. These variations, albeit small, reflect the literature on risk factors for Type 2 diabetes and suggest the importance, in biomedical applications, of integrating AI-generated recommendations with medical knowledge and clinical guidelines. Possible next developments could head in this direction, as CEs generated through the application of variable-distance perturbations could provide an estimate of risk in the case of chronic diseases, such as diabetes, and contribute to the formulation of preventive strategies. In fact, CEs generated at minimum distance are associated with a higher risk of developing the disease, whereas CEs generated at a progressively increasing distance are associated with a lower risk. The proposed framework proves to be trustworthy thanks to the use of the LLM, which characterizes the extracted CEs through readily interpretable rules that can be easily understood and validated by application-domain experts, even if they have no prior knowledge of artificial intelligence.
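The variable-distance idea can be sketched as follows; the ball-shaped region, the biomarker values, and the exponential risk proxy are purely illustrative assumptions, not the clinical model of [33].

```python
import numpy as np

# Ball-shaped "non-diabetic" region (illustrative stand-in for the SVDD):
c, R = np.array([5.5, 24.0, 1.4]), 2.0   # fasting glucose, BMI, HDL (toy units)
x = np.array([8.0, 26.0, 1.1])           # hypothetical diabetic factual

# CEs at progressively increasing depth inside the region: margin m is
# how far past the boundary the counterfactual is pushed.
direction = (c - x) / np.linalg.norm(c - x)
risks = []
for m in (0.0, 0.5, 1.0, 1.5):
    ce = x + (np.linalg.norm(c - x) - R + m) * direction
    depth = R - np.linalg.norm(ce - c)   # distance inside the boundary
    risk = np.exp(-depth)                # hypothetical monotone risk proxy
    risks.append(risk)
    print(f"margin {m:.1f}: risk proxy {risk:.3f}")
```

The monotone decrease of the proxy with the margin captures the qualitative claim above: minimum-distance CEs sit on the boundary (highest residual risk), while deeper CEs correspond to progressively lower risk.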

V. CONCLUSION AND FUTURE WORKS
Future research will need to focus on further optimization of the method, as anticipated in the results section, as well as on extending the proposed method to handle categorical variables and images. Moreover, the method shall be compared with other state-of-the-art solutions and investigated in different application domains, such as disease prevention, for example using observations derived from electronic medical records, from longitudinal population studies, or from individual monitoring devices.