Prototype-Based Interpretable Graph Neural Networks

Graph neural networks have proved to be a key tool for dealing with many problems and domains, such as chemistry, natural language processing, and social networks. While the structure of the layers is simple, it is difficult to identify the patterns learned by the graph neural network. Several works propose post hoc methods to explain graph predictions, but few of them try to generate interpretable models. Conversely, the topic of interpretable models is highly investigated in image recognition. Given the similarity between image and graph domains, we analyze the adaptability of prototype-based neural networks for graph and node classification. In particular, we investigate the use of two interpretable networks, ProtoPNet and TesNet, in the graph domain. We show that the adapted networks manage to reach equal or higher accuracy scores than their respective black-box models and comparable performances with state-of-the-art self-explainable models. Showing how to extract ProtoPNet and TesNet explanations on graph neural networks, we further study how to obtain global and local explanations for the trained models. We then evaluate the explanations of the interpretable models by comparing them with post hoc approaches and self-explainable models. Our findings show that the application of TesNet and ProtoPNet to the graph domain produces qualitative predictions while improving their reliability and transparency.


Alessio Ragno, Biagio La Rosa, and Roberto Capobianco
Impact Statement-Explainability works for graph neural networks mainly focus on post hoc explanations rather than developing self-explainable models. By adapting two self-interpretable prototype-based networks from the image to the graph domain, we analyze a self-explainable graph technique that enables both graph and node classification. We show that the explainable methods are able to obtain equal or higher accuracy scores than their respective black-box versions while providing the possibility to obtain insights about their reasoning process. We set the basis for future works in the explainability of graph neural networks. We also provide an open-source implementation that can be employed for approaching several scientific problems, such as computational chemistry and natural language processing.

Index Terms-Artificial neural networks, case-based reasoning, classification and regression, deep learning, explainable artificial intelligence, interpretable artificial intelligence.

I. INTRODUCTION
The ability of graph neural networks (GNNs) to adapt to several domains, such as social networks [1], molecules [2], [3], and textual data [4], has made these architectures attractive. Although GNNs succeed in solving many problems, it is difficult to understand the motivations behind their predictions or to identify the learned patterns.
The field of explainable artificial intelligence (XAI) addresses this issue by proposing ways to understand the inner workings of the models and provide interpretations of their predictions. XAI approaches can be categorized into two main groups: methods that interpret already trained models, known as post hoc methods; and the ones that aim at defining models that are interpretable by design, i.e., intrinsic methods [5]. In this study, we focus on the latter.
The field of image recognition is one of the most active on this topic. Some recent works in this field focus on prototype-based networks that learn representations able to encode entire samples, or common parts of the input, called prototypes, which are then used during the reasoning process. Among them, ProtoPNet [6] and TesNet [7] use similarity distances to identify relevant portions of the images close to the learned prototypes and use these to classify the images. Additionally, the case-based reasoning behind these models makes it easier to interpret their predictions.
The aim of this work is to take a step in this direction by investigating the application of part-based prototype models to the graph domain for both graph and node classification tasks. The idea is to look at images as a particular type of graph and replace 2D-convolution layers with graph ones. In particular, we focus on adapting ProtoPNet and TesNet to learn prototypes that represent node embeddings and that are able to identify relevant class-aware motifs. At inference time, the models actively use the prototypes to generate the prediction, and we can use them to extract explanations about the model's behavior too.
In more detail, since the prediction is based on the similarity between the node prototypes and the input graph/node, we inspect the subgraphs that most activate the prototypes of a certain class to understand the reasoning process of the model. In fact, when using k graph convolution layers, the node embeddings contain a latent representation of the surrounding k-hop subgraph. This information allows us to explain the models both locally, i.e., for a certain prediction, and globally, i.e., by visualizing the patterns learned by the network.
We then train such networks to perform graph and node classification on seven datasets and compare their performance against their black-box variants. We further investigate the explanations by comparing different ways to compute them and by providing a qualitative and quantitative comparison against alternative methods, i.e., post hoc ones.
Summarizing, the purpose of this work is to show and analyze the benefits of the adaptation of ProtoPNet and TesNet to the graph domain. Indeed, we note that they are often overlooked by current research, despite their capabilities. We highlight the changes in the architecture needed for the adaptation, and propose a different procedure tailored for the graph domain to extract the explanations. In more detail, the contributions can be summarized as follows.
1) We analyze the adaptation of ProtoPNet and TesNet to GNNs and study the elements that impact the performance and explanations.
2) We show how to explain the obtained models through the inspection of the prototype activations.
3) We analyze the performance of these approaches by comparing the models' classification and explanation performances with black-box architectures over seven graph and node classification datasets.
4) We compare the performance of these approaches with other state-of-the-art self-explainable architectures over three classification datasets.
5) We analyze the quality of the explanations provided by the interpretable models by comparing them with the ones of post hoc methods and self-explainable models.
6) We provide an open-source implementation useful as a baseline for future work.

The rest of this work is organized as follows. Section II presents the related work for GNNs and interpretability; Section III describes prototype-based neural networks and how they can be translated to the graph domain; in Section IV, we analyze the results for both performances and explanations. Finally, Section V concludes this article.

II. RELATED WORK
A. Prototype-Based Networks
The idea of prototype-based neural networks is to use an embedding for clustering points of the datasets around specific ones called prototypes. Snell et al. [8] first propose this framework for performing few-shot classification, where a classifier has to be able to generalize to new classes not seen during the training process. Having some representative instances that resemble a big portion of the dataset, prototypes can also be used by post hoc methods, in particular those that generate explanations by example. These methods explain the prediction by identifying samples similar to the input that the model associates with either a similar prediction (i.e., factuals) or an opposite one (i.e., counterfactuals). When it is possible to assign a semantic meaning to the set of samples identified by a prototype, we refer to them as concepts. For instance, Kim et al. [9] identify concepts by using concept activation vectors, which are vectors in the direction of the neuron activations of a concept's set of examples.
Explainable architectures can learn and use concepts too, both in a supervised [10] and an unsupervised fashion [6], [7], [11]. They usually use embedding layers to represent concepts and feed them into a final layer to make the prediction.
The supervised process aligns the latent representation to specific labeled concepts. Even if it produces a meaningful representation, this approach requires a labeled concept dataset. On the other hand, unsupervised methods aim to extract relevant patterns for predicting the desired output in the embedding layer.
Among the unsupervised methods, there are different solutions for tackling various tasks. For time series, for example, Ni et al. [12] proposed to split input sequences into relevant segments that are represented with prototypes. Alvarez Melis and Jaakkola [13] instead propose a general form of self-explainable neural network (SENN) based on prototypes, which consists of the following three components: a concept encoder that transforms input features into basis concepts; an input-dependent parametrizer that calculates relevance scores for the basis concepts; and an aggregation function that combines the first two to produce the prediction. Starting from the SENN architecture, Li et al. [14] proposed a model without the input-dependent parametrizer, where the prediction is performed on the similarity between the input and the basis concepts.
Chen et al. [6] proposed ProtoPNet, a slight modification of SENN, where the encoder consists of an embedding layer, trained using clustering and separation losses based on the $L_2$-distance, and the parametrizer does not depend on the input. Since the $L_2$-distance used by ProtoPNet cannot ensure the disentanglement of the prototypes, Wang et al. [7] proposed TesNet, replacing the $L_2$-distance with the inner product similarity. In this architecture, the prototypes are trained to produce an orthonormal space, which minimizes and maximizes the similarity of same-class and different-class concepts, respectively.
We contribute to this literature by investigating the application of ProtoPNet and TesNet to the graph domain. We substitute 2D-convolutional layers with graph ones and analyze the modifications needed to adapt such models.

B. Graph Neural Networks
GNNs are based on a message passing paradigm that consists of a neighborhood aggregation scheme, where the latent representation of a node is computed by recursively aggregating and updating the ones of its adjacent nodes [15].
We can define different variants of GNNs by changing the aggregate and update functions.
In this work, we employ the following three popular GNN architectures: graph convolutional networks (GCNs) [16], graph attention networks (GATs) [17], and graph isomorphism networks (GINs) [18]. With GCN, Kipf and Welling [16] proposed a layer that updates the node representations by means of the average latent representations of neighbor nodes, as in 2D-convolution. While GCN assumes equal contribution between the connections of different nodes, GAT [17] changes the convolution scheme by weighting the aggregation with attention scores between pairs of nodes. Finally, using a theoretical approach based on the Weisfeiler-Lehman algorithm [19], Xu et al. [18] find that previous message passing neural networks fail to distinguish some different subgraphs based on the generated graph embedding and, to address this problem, they propose GIN, which adjusts the weight of the central node with a trainable parameter.
While this article focuses on these three architectures, its findings can be easily translated to other architectures that share the same update and aggregate paradigm.
A first drawback of these approaches, especially the ones based on permutations, is the time needed to compute a reasonable explanation. At the same time, Kindermans et al. [25] and Adebayo et al. [26] show that some of these methods are independent of both the training data and the model parameters, a crucial problem when they are used to debug the models. Other works propose an alternative to post hoc methods in the form of neural networks interpretable by design, thus enforcing reasoning processes that are easier to explain.
To the best of the authors' knowledge, little effort has been made toward building interpretable GNNs. The closest work to ours is the one by Zhang et al. [27], who propose ProtGNN. In this case, the prototypes are computed on graph embeddings and the classification is based on the similarity between them and the current input graph embedding. ProtGNN+ is an evolution that uses a subsampling layer to identify substructures in the graph. Differently from them, we do not compare the similarity at a graph level but at a node level and focus on learning graph-prototypical parts. Another related approach is the one proposed by Dai and Wang [28]: SEGNN is a node classification network that uses subgraph similarity to find K-nearest nodes for performing the prediction.
While they represent a promising direction, current research seems to overlook similar methods proposed for the image counterparts. Indeed, none of the self-explainable architectures for GNNs published so far compares against ProtoPNet, TesNet, or similar architectures. Thus, one of the goals of our work is to encourage future research to compare with this kind of model by providing an implementation ready to be used.
Several proxy metrics have been proposed to quantitatively evaluate the explanations. As most XAI methods for GNNs aim to find structural motifs that are relevant for a certain prediction, many metrics focus on input attribution approaches. Among them, we consider Fidelity+ [20], defined as the average difference in accuracy between the predictions on the original graphs and the predictions after masking out the relevant nodes and edges.
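As a rough illustration of the metric, the following Python sketch computes an accuracy-based Fidelity+ for an arbitrary classifier. The names `fidelity_plus`, `predict`, and `masked_graphs` are hypothetical and are not taken from the released implementation; the sketch assumes that the masked version of each graph (with the relevant nodes and edges removed) has already been produced by the explanation method.

```python
from typing import Callable, Sequence, TypeVar

G = TypeVar("G")  # any graph representation accepted by `predict`


def fidelity_plus(
    predict: Callable[[G], int],
    graphs: Sequence[G],
    masked_graphs: Sequence[G],
    labels: Sequence[int],
) -> float:
    """Average drop in correctness after removing the explained substructure.

    graphs[i] is the original input, masked_graphs[i] is the same input with
    the nodes/edges marked as relevant removed, labels[i] is the true class.
    """
    if not graphs or not (len(graphs) == len(masked_graphs) == len(labels)):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0
    for graph, masked, label in zip(graphs, masked_graphs, labels):
        correct_original = int(predict(graph) == label)
        correct_masked = int(predict(masked) == label)
        total += correct_original - correct_masked
    return total / len(graphs)
```

Under this definition, a score close to 1 indicates that removing the highlighted substructures breaks most of the correct predictions, whereas a score close to 0 suggests that the model did not rely on them.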

III. UNSUPERVISED PROTOTYPE-BASED GRAPH NETWORKS
This section analyzes the different strategies needed to adapt prototype-based image recognition networks to graphs. First, we present the details of ProtoPNet and TesNet, and then we provide the mathematical formulation for their graph application.
A. Prototype-Based Neural Networks
1) ProtoPNet: ProtoPNet's architecture consists of a convolutional neural network $f$, followed by a prototype layer $g_P$ and a final linear layer $l$ without bias.
Given an input image $x$, $f$ extracts features organized as a matrix of dimensions $H \times W \times D$
$$z = f(x), \quad z \in \mathbb{R}^{H \times W \times D}.$$
$g_P$ then computes the squared $L_2$-distances between the $m$ prototypes and all the possible patches $z_i \in \mathbb{R}^{H_P \times W_P \times D}$ of $z$ and returns the minimum distance
$$g_{P_j}(z) = \min_{z_i \in \mathrm{patches}(z)} \lVert z_i - p_j \rVert_2^2$$
where $p_j$ is the $j$th prototype learned by the layer $g_P$
$$p_j \in \mathbb{R}^{H_P \times W_P \times D}, \quad H_P \le H, \quad W_P \le W, \quad j \in \{1, \ldots, m\}.$$
Finally, the distances are converted into similarity scores, using a logarithmic activation function, and the last layer $l$ takes them as input and returns the prediction. The prototypes are allocated such that there exist $m_k$ prototypes for each class $k \in \{1, \ldots, K\}$. The subset of prototypes allocated to class $k$ is denoted as $P_k \subseteq P$.
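To make the computation concrete, the following sketch reproduces the prototype layer on plain Python lists. It assumes that patches and prototypes have already been flattened into vectors, and it uses the logarithmic activation log((d + 1) / (d + ε)) from the original ProtoPNet formulation [6], with ε = 1e-4 chosen here only for illustration. All function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def squared_l2(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def min_patch_distance(patches: Sequence[Vector], prototype: Vector) -> float:
    """Distance between a prototype and its closest patch."""
    return min(squared_l2(z, prototype) for z in patches)


def log_similarity(distance: float, eps: float = 1e-4) -> float:
    """Map a distance to a similarity: large when the distance is near zero."""
    return math.log((distance + 1.0) / (distance + eps))


def prototype_similarities(
    patches: Sequence[Vector], prototypes: Sequence[Vector], eps: float = 1e-4
) -> List[float]:
    """One similarity score per prototype, as fed to the last linear layer."""
    return [log_similarity(min_patch_distance(patches, p), eps) for p in prototypes]
```

The activation is monotonically decreasing in the distance, so the patch closest to a prototype is also the one that activates it the most.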
Each epoch consists of the following three phases:
1) stochastic gradient descent of the layers before the last layer;
2) projection of prototypes;
3) convex optimization of the last layer.
First phase: The first phase optimizes the following problem:
$$\min_{P, w} \; \mathrm{CrsEnt} + \lambda_1 \mathrm{Clst} + \lambda_2 \mathrm{Sep}$$
where $P$ is the set of prototypes, $w$ are the weights of the convolutional layers $f$, and $\lambda_1$ and $\lambda_2$ are two penalty coefficients set to 0.8 and 0.08, as proposed by Chen et al. [6]. CrsEnt is the cross entropy loss for the classification task, and Clst and Sep are two cost functions, which encourage same-class and penalize different-class prototype similarity, respectively.
Second phase: The second phase consists of replacing the $j$th prototype of class $k$, $p_j \in P_k$, with the closest point $\arg\min_{z_i} L_2(p_j, z_i)$ over all the patches $z_i$ of the features extracted from images of class $k$.
Third phase: In the last phase, the weights $w_l$ of the last layer are optimized starting from the following initialization:
$$w_l^{(k,j)} = \begin{cases} 1, & \text{if } p_j \in P_k \\ -0.5, & \text{otherwise} \end{cases}$$
where the $(k, j)$th entry of $w_l$ is the weight between the output $g_{P_j}$ and the logit of class $k$. This setting induces the model to learn the weights such that the similarity to a class $k$ prototype should increase the predicted probability that the image belongs to class $k$, whereas the similarity to a nonclass $k$ prototype should decrease class $k$'s predicted probability.
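The second and third phases can be sketched in a few lines of Python. The example below is only illustrative: the function names are hypothetical, embeddings are plain lists, and the initial weights (1 for prototypes of the same class, -0.5 for the others) are the values used by the original ProtoPNet [6], assumed here rather than taken from the released code.

```python
from typing import List, Sequence

Vector = Sequence[float]


def project_prototype(prototype: Vector, class_patches: Sequence[Vector]) -> Vector:
    """Second phase: replace a prototype with the closest patch embedding
    (squared L2 distance) among the patches of its own class."""
    return min(
        class_patches,
        key=lambda z: sum((a - b) ** 2 for a, b in zip(z, prototype)),
    )


def init_last_layer(
    prototype_classes: Sequence[int],
    num_classes: int,
    same_class: float = 1.0,
    other_class: float = -0.5,
) -> List[List[float]]:
    """Third phase: initial weight matrix with one row per class and one
    column per prototype; prototype_classes[j] is the class of prototype j."""
    return [
        [same_class if cls == k else other_class for cls in prototype_classes]
        for k in range(num_classes)
    ]
```

After projection every prototype coincides with an embedding that actually occurs in the training data, which is what makes it possible to visualize it.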
2) TesNet: TesNet shares the same learning paradigm as ProtoPNet, organized in three phases, but it uses an alternative formulation of the optimization problem that involves two further cost functions: ClassSep, which encourages the similarity between the basis concepts of different classes, and Orth, which enforces the orthonormality among the basis concepts of a class.
3) Prototype-Based Explanations: ProtoPNet and TesNet allow producing both global and local explanations. The former can be obtained by inspecting the images represented by prototypes, while the latter is obtained by looking at the similarities of a certain input to the various prototypes. Additionally, we can visualize the image patches that most activate a prototype by plotting its activations (similarities) on the different pixels in a heatmap. In particular, given an image $x$, we can extract the activation of the $(i, j)$th patch of $x$ with respect to the $k$th prototype as
$$a_{i,j}^{k} = s\left(z_{i,j}, p_k\right), \quad z = f(x)$$
where $s : (\mathbb{R}^D, \mathbb{R}^D) \to [0, 1]$ is a similarity function between the embedding and the prototypes, such as the logarithmic activation for ProtoPNet and the cosine similarity for TesNet.
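The two TesNet-related ingredients mentioned above, a cosine-similarity activation map and an orthonormality term, can be sketched as follows. The penalty is written as the squared Frobenius norm of the difference between the Gram matrix of a class's basis concepts and the identity, which is one common way of expressing orthonormality and is an assumption of this sketch, not a transcription of the objective. All names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def activation_map(
    embeddings: Sequence[Sequence[Vector]], prototype: Vector
) -> List[List[float]]:
    """Similarity of every (i, j) patch embedding to one prototype,
    i.e. the values that would be plotted as a heatmap."""
    return [[cosine_similarity(z, prototype) for z in row] for row in embeddings]


def orthonormality_penalty(basis: Sequence[Vector]) -> float:
    """Sum of squared entries of (B B^T - I) for the basis concepts of a class."""
    total = 0.0
    for i, b_i in enumerate(basis):
        for j, b_j in enumerate(basis):
            dot = sum(x * y for x, y in zip(b_i, b_j))
            target = 1.0 if i == j else 0.0
            total += (dot - target) ** 2
    return total
```

The penalty is zero exactly when the basis concepts have unit norm and are mutually orthogonal, and it grows as two concepts of the same class become aligned.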

B. Graph Neural Networks
GNNs are a particular kind of neural network designed to work with graph domains. Most of them use a message passing schema formed by two basic operations: aggregation and update. The generic $k$th layer of a graph network can be defined in the following way:
$$a_v^{(k)} = \mathrm{AGGREGATE}^{(k)}\left(\left\{h_u^{(k-1)} : u \in N(v)\right\}\right), \quad h_v^{(k)} = \mathrm{UPDATE}^{(k)}\left(h_v^{(k-1)}, a_v^{(k)}\right)$$
where $v$ is the current node, $h_v$ is the feature vector of $v$, and $N(v)$ is the set of the neighbor nodes of $v$.
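GCNs [16] update the representation by aggregating an average contribution from the neighboring nodes. GATs [17], instead, weight the aggregation mean by learning self-attention scores between pairs of nodes. Finally, GINs [18] base aggregation and update on the Weisfeiler-Lehman test. The three aggregate functions are summarized as follows:
$$\text{GCN:} \quad h_v^{(k)} = \sigma\left(W^{(k)} \cdot \mathrm{MEAN}\left\{h_u^{(k-1)} : u \in N(v) \cup \{v\}\right\}\right)$$
$$\text{GAT:} \quad h_v^{(k)} = \sigma\left(\sum_{u \in N(v) \cup \{v\}} \alpha_{vu}^{(k)} W^{(k)} h_u^{(k-1)}\right)$$
$$\text{GIN:} \quad h_v^{(k)} = \mathrm{MLP}^{(k)}\left(\left(1 + \epsilon^{(k)}\right) h_v^{(k-1)} + \sum_{u \in N(v)} h_u^{(k-1)}\right)$$
where $W^{(k)}$ are the learnable weights of the convolutional layers, $\sigma$ is a nonlinearity, $\alpha_{vu}^{(k)}$ are the attention scores, and $\epsilon^{(k)}$ is the trainable parameter that weights the central node. The structure of a GNN is composed of a sequence of convolutional layers that generate a latent representation of the nodes. For node classification tasks, a final feed-forward layer takes this representation and produces the logits. For graph classification, a graph representation is obtained by means of a readout layer, such as a global pooling layer, followed by a final classification layer that makes the prediction [15].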
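The message passing scheme can be illustrated with a deliberately simplified layer that only averages a node's vector with those of its neighbors. Learnable weights and nonlinearities are omitted for brevity, which is an assumption of this sketch; the names `mean_aggregate_layer` and `propagate` are hypothetical.

```python
from typing import Dict, List

Features = Dict[int, List[float]]
Adjacency = Dict[int, List[int]]


def mean_aggregate_layer(adj: Adjacency, h: Features) -> Features:
    """One round of message passing: each node averages its own vector
    with the vectors of its neighbors (GCN-like mean, no weights)."""
    new_h: Features = {}
    for v, neighbors in adj.items():
        group = [h[v]] + [h[u] for u in neighbors]
        dim = len(h[v])
        new_h[v] = [sum(vec[d] for vec in group) / len(group) for d in range(dim)]
    return new_h


def propagate(adj: Adjacency, h: Features, num_layers: int) -> Features:
    """Stack `num_layers` rounds of mean aggregation."""
    for _ in range(num_layers):
        h = mean_aggregate_layer(adj, h)
    return h
```

On a path graph 0-1-2-3 where only node 3 carries a nonzero feature, node 0 remains at zero after two rounds and becomes nonzero only after the third, which shows that stacking k layers makes each embedding depend on the k-hop neighborhood.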

C. Adapting Prototype Networks to the Graph Domain
Recall that a graph $G = (N, E)$ is a tuple of nodes and the edges that connect them. We can represent graphs by using two matrices: the node feature matrix $X \in \mathbb{R}^{|N| \times F}$, where $F$ is the number of node features, and the adjacency matrix $A \in \{0, 1\}^{|N| \times |N|}$. A prototype-based GNN is composed of the following three functions.
1) $f$ is a GNN feature extractor formed of graph convolutional layers that return node embeddings: $z = f(X, A)$, with $z \in \mathbb{R}^{|N| \times D}$.
2) $g_P$ is a prototype layer that projects the node embeddings and calculates the similarities between them and the prototypes.
3) $l$ is a classification layer that takes the prototype similarities and outputs the probabilities of a graph, or a node, to belong to a certain class.
Additionally, in the case of graph classification, as $g_P$ returns the similarities between prototypes and projected node embeddings, we need to add a global max-pooling readout layer after $g_P$, which selects the highest similarity between the nodes and each prototype. The resulting similarity of the graph to the $j$th prototype, $d_j$, is calculated as follows:
$$d_j = \max_{v \in N} g_{P_j}(z_v)$$
In this way, $l$ classifies using prototype similarities at graph level. Summarizing, since node embeddings encode information of a subgraph whose radius equals the number of GNN layers, the prototype layer checks for the presence of particular patterns inside the graph to predict the graph or node properties. Figs. 1 and 2 show a schema of the described architecture for performing graph and node classification, respectively. Note that while in the original TesNet and ProtoPNet the prototypes identify patches of the images, here the prototypes are obtained from the node embeddings, and thus, their meaning is different. In fact, since the node representation over k convolutions contains information of the k-hop subgraphs around the nodes, we are actually embedding the information of subgraphs of radius k inside the prototypes.
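The graph-level readout described above can be sketched as a column-wise maximum over a node-by-prototype similarity matrix, followed by a bias-free linear layer. The function names and the example weight matrix are hypothetical and serve only to illustrate the data flow.

```python
from typing import List, Sequence


def graph_prototype_similarities(node_sims: Sequence[Sequence[float]]) -> List[float]:
    """Global max-pooling: node_sims[v][j] is the similarity of node v to
    prototype j; the result holds the highest similarity per prototype."""
    return [max(column) for column in zip(*node_sims)]


def most_activated_nodes(node_sims: Sequence[Sequence[float]]) -> List[int]:
    """Index of the node that maximizes the similarity to each prototype."""
    return [
        max(range(len(column)), key=column.__getitem__)
        for column in zip(*node_sims)
    ]


def class_logits(
    similarities: Sequence[float], weights: Sequence[Sequence[float]]
) -> List[float]:
    """Linear layer without bias: one row of weights per class."""
    return [sum(w * s for w, s in zip(row, similarities)) for row in weights]
```

Keeping track of the node that attains each maximum is useful later, since that node is the natural center of the subgraph shown as an explanation.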
1) Prototype-Based Explanations: Inspecting prototypes is a key procedure for understanding the model's reasoning process. Chen et al. [6] and Wang et al. [7] provide prototype visualization by means of a heatmap that shows the prototype activations over different image patches. This procedure allows them to identify the areas in the image that most activate a certain prototype. Similarly, we can identify the most important subgraphs learned by the model.
While images usually have thousands of pixels, in most datasets graphs might be composed of fewer than a hundred nodes. This difference would result in a less sparse visualization of the prototype activations. Instead of showing the prototype activations over all the nodes, we propose to highlight only the k-hop subgraph around the node that is most similar to the specific prototype, where k is the number of GNN layers. This is justified by the fact that by using k GNN layers, each node embedding contains information about the node and the neighbor nodes within a radius of k edges.
For node classification, instead, as the datasets are composed of only one graph, where nodes are masked throughout training and validation procedures, we propose to extract the (k+2)-hop subgraph around the most similar node and only highlight the k-hop subgraph with the prototype activations. This allows us to extract a small subgraph from the original one, without the need of plotting all the nodes.
The proposed technique allows dropping unnecessary attributions derived from prototype activations that are not actually used for the prediction, by only selecting those that are within a k-hop subgraph from the activated node.
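The extraction step can be sketched with a breadth-first search on an adjacency list. This is a minimal illustration under the assumption that the per-node similarities to the chosen prototype are already available; the names `k_hop_nodes` and `explanation_subgraph` are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Adjacency = Dict[int, List[int]]


def k_hop_nodes(adj: Adjacency, center: int, k: int) -> Set[int]:
    """Nodes reachable from `center` in at most k edges (breadth-first search)."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        v = queue.popleft()
        if dist[v] == k:
            continue  # do not expand beyond radius k
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return set(dist)


def explanation_subgraph(
    adj: Adjacency, node_sims: Dict[int, float], k: int
) -> Tuple[int, Set[int], Set[Tuple[int, int]]]:
    """Select the node most similar to a prototype and return it together
    with the nodes and undirected edges of its k-hop subgraph."""
    center = max(node_sims, key=node_sims.get)
    nodes = k_hop_nodes(adj, center, k)
    edges = {(u, v) for u in nodes for v in adj[u] if v in nodes and u < v}
    return center, nodes, edges
```

For node classification, the same helper could be called a second time with radius k + 2 to obtain the surrounding context that is plotted but not highlighted.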

IV. EXPERIMENTS
In this section, we analyze the classification and explanation performances of ProtoPNet and TesNet applied to the graph domain: first, we present the experimental setting; subsequently we compare the classification performances on the presented datasets with black-box and self-explainable models, including an analysis of the impact of the number of prototypes and transfer learning; finally, we study the explanations of the trained models providing visual and quantitative results.

A. Experimental Setup
We train and evaluate the models on the following graph and node classification datasets:
1) MUTAG [29], BBBP [30], and BACE [31], which are molecule datasets for graph classification, where atoms and bonds are encoded as nodes and edges, respectively, and the labels classify molecules as either active or inactive against some targets.
2) BA-Shapes [23] and BA-Community [23], which are two synthetic node classification datasets that contain graphs obtained by the Barabási-Albert model with house-like five-node motifs.
3) BA-2Motifs [24], which is a synthetic graph classification dataset where graphs are discriminated by the presence of a house-like motif between their nodes.
4) Tree-Grid [23], which is a synthetic node classification dataset where nodes are labeled by their presence in a grid motif inside some tree-shaped graphs.
5) Cora, Citeseer, and Pubmed [32], which are three widely used real-world benchmark datasets. They consist of graphs where nodes represent scientific publications labeled in several classes while links represent the citations. While for Cora and Pubmed we use the publicly available splits, for Citeseer we refer to the version from [28] and [33] for comparison purposes.
For each dataset, we compare three black-box models composed of three GCN, GAT, and GIN layers, respectively, against their ProtoPNet and TesNet variants, obtained by adding the prototype layers. Following the dive-into-graphs benchmark repository [23], for the molecular graph datasets, the GNN layers are formed of 128 units, and each GNN layer's output is activated by a ReLU unit. We set the number of prototypes per class to 10, as done by [6] and [7]. We repeat each experiment 15 times to compare different seeds for statistical significance.

TABLE I CLASSIFICATION PERFORMANCES

B. Classification Performances
In this section, we test whether adding the prototype layers leads to a decrease in performance. Wang et al. [7] and Chen et al. [6] show that, for images, this is not necessarily true, and self-interpretable models sometimes manage to perform better than the black-box ones.
1) Small Datasets: In Table I, we report the accuracy scores for the graph and node classification tasks using GCN, GAT, and GIN networks as backbone. We observe that, in general, explainable models reach higher accuracy than the black-box counterparts. Furthermore, in most cases, the TesNet models reach the highest score, thus confirming the findings in the image domain. Our hypothesis is that by using basis concepts in place of the prototypes, we encourage learning a disentangled prototype space that is crucial for the graph domain too.
In more detail, we see that TesNet-based GIN models achieve the highest performances for almost all the tasks. While for the molecule graph classification the difference between TesNet models and the baseline is subtle, for the synthetic datasets, and in particular for node classification, it gets sharper. We relate these findings to the observations by Xu et al. [18], where they show that GNNs suffer from not being able to distinguish between some graph structures and propose GIN as a remedy.
These results demonstrate that prototype-based models, such as TesNet and ProtoPNet, can be adapted to the field of graphs, and they can also be employed for training self-interpretable models for the node classification task.
2) Big Datasets: Now, we compare the performances with other state-of-the-art models on bigger datasets. In particular, we use three big node classification datasets and train and compare GCN-TesNet and GCN-ProtoPNet models with two state-of-the-art self-explainable models: SEGNN [28] and ProtGNN [27]. We also show the accuracy score for the black-box GCN model for comparison purposes (see Table II).
We observe comparable performances on Cora and Pubmed for both ProtoPNet and TesNet with respect to the top performer (SEGNN). On Citeseer, the performances are lower than SEGNN but higher than ProtGNN. The lower performance is probably due to the difficulty of generalizing the publications with a limited set of prototypes. Moreover, this gap is mitigated by the higher explanation performances of TesNet and ProtoPNet (see Section IV-C).
3) Impact of the Number of Prototypes: Here, we investigate the impact of the number of prototypes per class on the accuracy of the transparent models. We test three different values (5, 10, and 15) on four datasets, using GIN as backbone model. Although in some cases varying the number of prototypes increases or decreases the model's performance, we do not find a general pattern over the datasets (see Fig. 3). The reason is that, for most datasets, the discrimination between the classes is represented by few patterns and the models, even if they have different prototypes, end up learning the same subgraphs. On the other hand, the number of prototypes is crucial when we know a priori that there exists a specific number of patterns to look at.
Therefore, the number of prototypes is a hyperparameter to be tuned, possibly using the explanations returned by the model (see Section IV-C).
4) Impact of Transfer Learning: In this section, we investigate two ways of reproducing the transfer learning procedure in our setting. In fact, the models analyzed in the previous section are trained from scratch. This differs from the image domain, where the networks are obtained by means of transfer learning of models trained on large corpora. However, for graphs there is no available library of pretrained GNNs. Moreover, while with images it is possible to preprocess the input to fixed shapes and channels, with GNNs node and edge features might vary between different tasks and domains.
Therefore, we test pretraining on a dataset and then use the learned weights of the convolutional layers to initialize the explainable networks for a new training phase. In this way, we end up having a task-specific feature extraction network to which we add a prototype layer. In particular, we compare the performance when the network is pretrained both on the same dataset and on another dataset that shares the same features. For instance, the datasets BBBP and BACE represent the molecules with the same features; hence, we can use the first for pretraining and the second for the standard training procedure.
Using pretrained models on the same dataset, both TesNet and ProtoPNet almost reproduce the performance of the earlier results, although in general the values are slightly lower (see Fig. 4). Overall, transfer learning tests do not provide any advantage and the performances decrease when the pretrained network is built on another dataset. This issue represents the main dissimilarity between the application of ProtoPNet and TesNet to images and graphs. For the molecule datasets, we hypothesize that the datasets are particularly different, and the molecules belong to separate chemical spaces. This problem is known as domain of applicability [34] in chemistry. Similarly, on the other datasets, the substructures in the graphs are dissimilar from each other, and, for this reason, the features learned on BA-Shapes might not be adequate for BA-Community.

C. Explanations
In this section, we analyze the explanations provided by ProtoPNet and TesNet on the graph domain. We test different ways of computing them, we compare their quality against some post hoc methods, and finally we provide some visual examples to show how users can exploit this type of explanation to extract insights about the model behavior.
TABLE IV EXPLANATIONS PERFORMANCES ON BIG DATASETS

We start the analysis by comparing the explanations obtained with two different methods (see Section III-C1): the original method from ProtoPNet and TesNet, which plots the activation scores over the whole input, and our modification, which restricts the explanation to the nodes involved in the actual prediction. To compare them, we use the Fidelity+ score, which measures how much the prediction changes when the nodes marked as relevant are masked out. Although the two methods produce the same results in terms of Fidelity+ (see Table III), the visualization of our method appears clearer (see Fig. 5) and gives insights into the activations of a given prototype: the model recognizes the house-like motif with the chosen prototype. Conversely, the classic method does not highlight the specific subgraph that is used in the reasoning process. These results show that neglecting the activations of the other nodes does not reduce the performance of the explanation and, at the same time, provides a more "human-readable" interpretation.
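As an illustration of the restricted visualization, the sketch below keeps the prototype activations only for the nodes within k hops of the node being explained and sets the others to zero. It rests on an assumption: that the "nodes involved in the actual prediction" are the k-hop neighborhood of the target node, with k equal to the number of message-passing layers. The function names, the toy graph, and the activation values are hypothetical.

```python
from collections import deque


def k_hop_nodes(adjacency, target, k):
    """Breadth-first search: nodes at most k edges away from target."""
    seen = {target}
    queue = deque([(target, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == k:
            continue  # do not expand beyond k hops
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return seen


def restrict_activations(activations, adjacency, target, k):
    """Zero out activations of nodes outside the k-hop neighborhood."""
    keep = k_hop_nodes(adjacency, target, k)
    return {n: (a if n in keep else 0.0) for n, a in activations.items()}


# Toy path graph 0-1-2-3-4 with made-up prototype activations per node.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
activations = {0: 0.9, 1: 0.8, 2: 0.7, 3: 0.6, 4: 0.5}

restricted = restrict_activations(activations, adjacency, target=0, k=2)
print(restricted)
```

In this toy case only nodes 0, 1, and 2 retain their activation, which mirrors the idea that a node outside the receptive field cannot contribute to the prediction of the target node.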
Once the method is chosen, we study the quality of the explanations provided by these models by comparing them against explanations computed on both the interpretable and the black-box models using GNNExplainer and DeepLIFT (see Fig. 6).
The explanations of the white-box models report relatively higher scores than the post hoc ones. For the graph classification task, the explanations of ProtoPNet are more faithful than those of TesNet, while for node classification the opposite holds. In some cases, the black-box models reach a higher Fidelity+ than the white-box ones, but by a small margin. This is partly compensated by the shorter time required by ProtoPNet and TesNet to produce the explanations: while DeepLIFT, which uses the gradients, takes a similar time to ProtoPNet and TesNet to explain a prediction, GNNExplainer takes up to 20 times longer because it has to optimize the graph mask.
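To make the Fidelity+ comparison concrete, the following sketch computes the score under one common formulation from the GNN explainability literature: the average drop in the predicted probability of the true class after the nodes marked as important are removed. Whether this probability-based variant matches the one used for the reported results is an assumption, and the toy model and data are hypothetical.

```python
def fidelity_plus(predict_proba, graphs, labels, important_nodes):
    """Mean drop in true-class probability after removing the important nodes."""
    drops = []
    for graph, label, important in zip(graphs, labels, important_nodes):
        original = predict_proba(graph)[label]
        masked_graph = [n for n in graph if n not in important]
        masked = predict_proba(masked_graph)[label]
        drops.append(original - masked)
    return sum(drops) / len(drops)


def toy_predict_proba(graph):
    """Hypothetical model: a graph is a list of node ids, and the class 1
    probability is the fraction of "motif" nodes (ids >= 10)."""
    if not graph:
        return [0.5, 0.5]
    p1 = sum(1 for n in graph if n >= 10) / len(graph)
    return [1.0 - p1, p1]


graphs = [[1, 2, 10, 11], [3, 4, 12, 13]]
labels = [1, 1]
good_explanation = [{10, 11}, {12, 13}]  # marks the motif nodes
poor_explanation = [{1}, {3}]            # marks irrelevant nodes

good = fidelity_plus(toy_predict_proba, graphs, labels, good_explanation)
poor = fidelity_plus(toy_predict_proba, graphs, labels, poor_explanation)
print(good, poor)
```

An explanation that marks the nodes the model actually relies on produces a large drop, hence a high Fidelity+, whereas masking irrelevant nodes leaves the prediction unchanged or even reinforces it.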
Regarding the self-explainable models, we observe higher fidelity scores on the explanations extracted by ProtoPNet and TesNet (see Table IV). Indeed, ProtoPNet reports the highest fidelity score on the Citeseer dataset, while TesNet is the leading explainer on Pubmed and Cora. We attribute these findings to the fact that masking out relevant nodes has a high impact on the prototype activations and therefore affects the prediction. On the other hand, SEGNN, which does not have a fixed number of prototypes, might rely on the similarity to other labeled nodes.
We also report the ROC-AUC scores of the predictions on masked inputs: lower ROC-AUC scores are expected when relevant edges are masked out. While DeepLIFT obtains the lowest ROC-AUC values, TesNet generally reports lower values than SEGNN and ProtGNN, which, in accordance with the Fidelity+ score, confirms the better explainability of the adapted models.
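The ROC-AUC check on masked inputs can be sketched as follows. The function uses the pairwise definition of ROC-AUC for a binary task (the probability that a positive sample is scored above a negative one, counting ties as one half); the labels and scores are made-up values that only illustrate the expected direction of the effect.

```python
def roc_auc(labels, scores):
    """Pairwise ROC-AUC for binary labels (1 = positive, 0 = negative)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


labels = [1, 1, 0, 0]
scores_full = [0.9, 0.8, 0.2, 0.1]     # predictions on the original inputs
scores_masked = [0.6, 0.15, 0.2, 0.1]  # predictions after masking relevant parts

auc_full = roc_auc(labels, scores_full)
auc_masked = roc_auc(labels, scores_masked)
print(auc_full, auc_masked)
```

A larger drop from the first value to the second indicates that the masked elements were indeed the ones the model needed, which is the reading applied to the reported ROC-AUC values.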
Finally, to inspect the model behavior, we can visually study which prototypes are the most activated for the elements of the dataset and plot the subgraphs that correspond to the learned prototypes, as shown in Section III-C1. Fig. 7 shows the learned prototypes for a ProtoPNet GCN model trained on Ba-2Motifs. For class 0, all the prototypes match the house motif. Instead, the prototypes of class 1 match a particular 3-node loop motif, captured in different positions on the graph. We further investigate this result by checking the activations of the class 1 graphs with respect to these prototypes, and we find that over 86% of the samples share an activation higher than 0.95, likely due to an artifact resulting from the automated generation of the dataset. Fig. 8 shows another example, where we identify the most important substructures of the best TesNet GIN model trained on the tree-grid dataset. In particular, we use the explanations to understand why the model mispredicts some nodes. Similarly to the example of Ba-2Motifs, we can obtain a global interpretation of the model behavior by analyzing the subgraphs that the model learns as the most important. The first two rows highlight the subgraphs matched for predicting the nodes that do not belong to a grid-shaped subgraph. Conversely, the last two rows clearly capture the grid shape for predicting the nodes.
Additionally, Fig. 9 shows the local explanations for a mispredicted node. It reports the activations of the nodes for each prototype together with the similarities between the analyzed node (red border) and the prototypes. The reason for the misprediction is that the node shares a similarity of 0.7 with the class 1 prototype due to the high degree of the node, which makes it more similar to a "grid node" than to a "tree node." These analyses show that we can understand the reasoning process that led to a specific prediction, at both the local and the global level, by visualizing the activations of the nodes with respect to the different prototypes.
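The kind of check described above, finding the most activated prototype and measuring how many samples exceed an activation threshold, can be sketched in a few lines. The helper names and all numbers are hypothetical and do not reproduce the reported 86% figure.

```python
def most_activated_prototype(similarities):
    """Index of the prototype with the highest similarity for one sample."""
    return max(range(len(similarities)), key=lambda j: similarities[j])


def share_above(activations, threshold):
    """Fraction of samples whose activation exceeds the threshold."""
    return sum(a > threshold for a in activations) / len(activations)


# Made-up similarities of one sample to three prototypes.
sample_similarities = [0.2, 0.9, 0.4]
best = most_activated_prototype(sample_similarities)

# Made-up activations of one prototype over four graphs of the same class.
class_activations = [0.99, 0.97, 0.5, 0.96]
share = share_above(class_activations, threshold=0.95)
print(best, share)
```

A share close to 1 for a single prototype suggests that most samples of the class contain the same substructure, which is how a dataset-generation artifact can be spotted.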

V. CONCLUSION
This article analyzed the application of ProtoPNet and TesNet to graph and node classification. We found that, in contrast to the image domain, the best models are obtained by starting from randomly initialized weights. We confirmed that these models reach performance equal to or higher than that of the black-box ones. Additionally, we studied the interpretability of the architectures, in terms of both global and local explanations, and we found that they produce more faithful explanations than post hoc methods and competitors. In particular, we showed that, while maintaining classification performance comparable with state-of-the-art self-explainable networks, our models outperform the others in terms of explanation capability.
We think that the findings of this work could open the path for further development of interpretable graph architectures. Future work might investigate the use of metrics better tailored to the graph domain, or methods to automatically modulate the number of prototypes learned by the model. Additionally, we release the open source code to encourage further research.
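$$\ldots \min_{p_j \in P_{y_i}} \min_{z \in \mathrm{patches}(f(x_i))} -\frac{z^{T} p_j}{\lVert z \rVert} \;\ldots\; \min_{p_j \notin P_{y_i}} \min_{z \in \mathrm{patches}(f(x_i))} \ldots$$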

Fig. 3. Accuracy scores of the explainable GIN models varying the number of prototypes on the BBBP, BACE, BA-Community, and Tree-Grid datasets.

Fig. 4. Accuracy scores of the transfer learning models using the GIN architecture. BBBP and BACE share the same node features, as do BA-Shapes and BA-Community.

Fig. 5. Comparison between classic (a) and our (b) visualization of ProtoPNet explanations. The prototype activations are highlighted with a shade of colors from light blue, low activation, to dark blue, high activation. The node used for the prediction is marked with a red border.

Fig. 6. Fidelity+ score comparison between explanations computed using post hoc (orange and green) and intrinsic methods (blue).

Fig. 7. Subgraphs learned by the ProtoPNet GCN model on the Ba-2Motifs dataset. The most activated node is marked with a red border and the subgraph nodes are highlighted in blue: the darker the node, the higher the activation.

Fig. 8. Subgraphs learned by the TesNet GIN model on the tree-grid dataset. The most activated node is marked with a red border and the subgraph nodes are highlighted in blue: the darker the node, the higher the activation.

TABLE III COMPARISON BETWEEN VANILLA AND K-HOP METHOD FOR EXTRACTING EXPLANATIONS FROM PROTOPNET AND TESNET