Feature Interpretation Using Generative Adversarial Networks (FIGAN): A Framework for Visualizing a CNN’s Learned Features

Convolutional neural networks (CNNs) are increasingly being explored and used for a variety of classification tasks in medical imaging, but current methods for post hoc explainability are limited. Most commonly used methods highlight portions of the input image that contribute to classification. While this provides a form of spatial localization relevant for focal disease processes, it may not be sufficient for co-localized or diffuse disease processes such as pulmonary edema or fibrosis. For the latter, new methods are required to isolate diffuse texture features employed by the CNN where localization alone is ambiguous. We therefore propose a novel strategy for eliciting explainability, called Feature Interpretation using Generative Adversarial Networks (FIGAN), which provides visualization of features used by a CNN for classification or regression. FIGAN uses a conditional generative adversarial network to synthesize images that span the range of a CNN’s principal embedded features. We apply FIGAN to two previously developed CNNs and show that the resulting feature interpretations can clarify ambiguities within attention areas highlighted by existing explainability methods. In addition, we perform a series of experiments to study the effect of auxiliary segmentations, training sample size, and image resolution on FIGAN’s ability to provide consistent and interpretable synthetic images.


I. INTRODUCTION
Artificial intelligence (AI) systems, particularly convolutional neural networks (CNNs), have become increasingly popular in biomedical imaging for their ability to perform and automate a variety of complex imaging tasks, including classification, segmentation, image registration, modality translation, and synthetic image generation [1], [2], [3], [4]. Their performance over traditional approaches is attributed to their increasingly complex architectures, which model highly nonlinear relationships in imaging tasks. However, their architectural complexity also makes it difficult to explain the imaging features used in each image evaluated by the CNN. Understanding the meaning of these underlying features in the context of a CNN task would both improve the transparency of these models for clinical application and potentially uncover new biomarkers for disease. Multiple post hoc methods for CNN explainability have been proposed but provide only partial visibility into the characteristics used by CNNs during inference, limiting the translation of these methods into clinical radiology practice [5].

(The associate editor coordinating the review of this manuscript and approving it for publication was Kaustubh Raosaheb Patil.)

A. LIMITATIONS OF COMMONLY USED EXPLAINABILITY METHODS
Several methods have been proposed to interpret network decisions [6], [7], [8], the majority of which are attribution methods, which highlight salient areas on an input image important for CNN prediction. These static visualizations each provide localization of the portions of the image utilized by the CNN in each inference, though not the specific characteristics within the active region of the image. It is perhaps for this reason that their effectiveness for explainability has been a subject of debate [9], [10], [11], [12]. Importantly, current methods are not able to convey relevant texture characteristics within the region of activation. This problem is particularly relevant for diffuse disease processes where large portions of the image are potentially relevant for classification.
For example, Fig. 1 shows several commonly used attribution (i.e. saliency) maps from two CNNs designed to evaluate two diffuse disease processes, pulmonary edema and hepatic fibrosis. The first is a regression CNN designed to infer log NT-pro B-type natriuretic peptide (BNPP) from chest radiographs, a task related to assessing the severity of pulmonary edema [13]. The second is a classification CNN designed to classify liver magnetic resonance images (MRI) as having adequate or suboptimal contrast uptake for cancer screening and surveillance, a task related to assessing the severity of liver fibrosis [14].
In both cases, the attribution maps highlight areas of attention, but it is not clear what specific characteristics within those areas are relevant to diagnosis. Any of the anatomic or texture characteristics within the highlighted regions could be responsible. In the case of the chest radiograph algorithm, attention could be attributed to any of the following: the vasculature, the interstitium, air spaces, or the edge of the cardiomediastinal silhouette. Similarly, for the liver MRI algorithm, attention is centered on the portal or hepatic veins, but could be attributed to any of the following: the veins themselves, contrast with hepatic parenchyma, or texture. With current methods, the relative importance of individual characteristics within the localized region is unknown. For CNN explainability, the ''what'' is likely as important as the ''where''. Current methods provide localization information but their effectiveness is limited in medical imaging, where often multiple objects are spatially co-localized.
With these limitations in mind, we therefore propose Feature Interpretation using Generative Adversarial Networks (FIGAN), a framework for the dynamic visualization of the features used by a CNN. FIGAN generates synthetic images that smoothly change with a CNN's features. The evolution of these synthetic images is then used to elicit the features' meanings. We apply FIGAN to two independently developed source CNNs and show that the resulting feature interpretations can clarify ambiguities within the attention areas highlighted by the approaches in Fig. 1. In addition, we perform a series of experiments to study the effect of auxiliary segmentations, training sample size, and image resolution on FIGAN's ability to provide consistent and interpretable synthetic images. This paper is organized into the following sections. Section II describes existing explainable AI approaches, with an emphasis on medical imaging. Section III introduces the proposed FIGAN framework. Sections IV and V evaluate the performance of FIGAN when applied to two separate source CNNs and Section VI presents experimental results. We discuss the results and conclude in Section VII.

II. EXPLAINABLE AI IN MEDICAL IMAGING
There are several surveys detailing a variety of explainable AI methods for CNNs, specifically applied to medical imaging [6], [7], [8]. We provide a brief summary of commonly used explainable AI methods to motivate the advantages of our proposed FIGAN framework for CNN feature interpretation.

A. ATTRIBUTION METHODS
Most explainable AI algorithms are attribution methods, a broad class of algorithms that highlight salient areas of an input image important for a CNN's prediction. Attribution methods are widely used since they are model-agnostic and often readily available as open-source implementations in a variety of neural network packages.

1) GRADIENT-BASED SALIENCY MAPS
The most popular form of attribution is gradient-based saliency, which computes the gradient of a prediction with respect to the pixels of the input image. Visualization of these gradients can then be interpreted as the influence of input pixels or regions on the CNN's prediction. Many gradient-based approaches have been proposed [15], [16], [17], [18], [19], [20], [21], [22], mostly differing in how the gradient is computed; vanilla gradient (VG) maps [15] and gradient-weighted class activation mapping (Grad-CAM) [18] are the most popular.
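To make the mechanism concrete, the sketch below computes a vanilla-gradient saliency map for a toy scalar-scoring function by finite differences. This is an illustrative stand-in only: the `model` here is a hypothetical surrogate for a CNN, and a practical implementation would obtain the gradient via automatic differentiation rather than perturbing pixels one at a time.

```python
import numpy as np

def vanilla_gradient_map(model, image, eps=1e-4):
    """Vanilla-gradient saliency: d(score)/d(pixel), by finite differences.

    `model` maps a 2D array to a scalar score. Real implementations use
    autodiff on the trained CNN; finite differences keep this sketch
    dependency-free.
    """
    grad = np.zeros_like(image, dtype=float)
    base = model(image)
    it = np.nditer(image, flags=["multi_index"])
    for _ in it:
        idx = it.multi_index
        bumped = image.copy()
        bumped[idx] += eps
        grad[idx] = (model(bumped) - base) / eps
    return np.abs(grad)  # saliency is typically displayed as |gradient|

# Toy "model": the score depends only on the centre 2x2 patch, so the
# saliency map should highlight exactly those four pixels.
model = lambda img: float(img[1:3, 1:3].sum())
saliency = vanilla_gradient_map(model, np.zeros((4, 4)))
```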

2) PERTURBATION-BASED SALIENCY MAPS
Perturbation maps are another form of saliency that visualize the effect of input feature perturbations on a CNN's prediction. Perturbed areas of the input image showing a relatively large effect on the CNN predictions are highlighted for interpretation. Popular approaches include Shapley values [23] and local interpretable model-agnostic explanations (LIME) [24].
Although saliency maps are widely used, studies have shown that the highlighted areas in these images may not have clinically relevant interpretations or be repeatable across aspects of the model training process, such as weight initialization [11], [12]. In addition, assuming a saliency map highlights clinically meaningful anatomical structures, the interpretation of the highlighted areas may remain unclear, especially if referring to imaging features such as texture or morphology. Saliency maps are also local interpretability methods and do not provide a global understanding of the CNN features used for prediction since the maps are generated at the image level [25].

FIGURE 1. Example saliency maps from a lung regression CNN designed to infer log BNPP from chest radiographs and a liver classification CNN designed to classify liver MRI images as having adequate or suboptimal contrast uptake. Although attribution methods are useful for identifying the location of attention, they do not necessarily uncover the underlying anatomical or pathophysiological nature of this attention.

B. ATTENTION NETWORKS
In contrast to attribution methods, which are applied after CNN training, attention networks incorporate attention modules directly within the CNN architecture to facilitate explainability of a CNN's predictions while improving CNN performance. Specifically, attention modules have been incorporated into a variety of architectures for classification [26] and segmentation [27]. Attention modules within the intermediate layers of a CNN act as feature selectors, enhancing features important for prediction and suppressing features that are not. These enhanced feature maps can then be visualized to determine the types of features important for the CNN's prediction.
However, this approach is not model-agnostic and, similar to attribution methods, provides only a localized understanding of the features used for prediction. Attention networks, and CNNs in general, also contain a large number of feature maps within each layer of the network, many of which show little or no activation, making exploration of this feature space less tractable.

C. FEATURE ANALYTIC METHODS
Visualization of the CNN feature space using low-dimensional embeddings has also been used to improve understanding of CNN decisions. Feature embeddings are often visualized across the output class distributions and can be used to identify cases or clusters of cases that might be difficult for automated assessment. Methods such as principal components analysis, t-SNE, and UMAP are commonly used [8].
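As a minimal illustration of this kind of feature-space visualization (not code from the surveyed works), the sketch below projects hypothetical 64-dimensional penultimate-layer features onto two principal components; plotting the result coloured by class or model output is how such embeddings are typically inspected.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical penultimate-layer CNN features for 200 cases (64-D),
# with a mean shift along one direction separating two classes.
features = rng.normal(size=(200, 64))
features[:100, 0] += 4.0                    # class 0 offset
labels = np.repeat([0, 1], 100)

# Two-component embedding; t-SNE or UMAP could be swapped in here.
embedding = PCA(n_components=2).fit_transform(features)
```

Scatter-plotting `embedding` against `labels` would reveal the two clusters; cases falling between them are candidates for manual review.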
Concept methods [28], [29] enforce networks to learn features representing ''human-friendly'' high-level concepts by incorporating user-defined concepts during the training process in a supervised manner. These methods achieve competitive accuracy with conventional end-to-end models while enabling human-friendly interpretation of the model features.

D. GENERATIVE METHODS
A creative approach to improve CNN explainability is to generate synthetic images that maximize the activation of an output neuron or to augment the appearance of existing images to appear as a different output class. Early activation maximization methods used gradient descent to generate an image that maximizes the activation of a specific neuron, thereby visualizing the imaging features important for a CNN's prediction [15]. However, the synthetic images produced by these methods do not appear realistic, which makes their interpretation quite difficult, especially in the context of medical images. Extending this idea, Nguyen et al. [30] proposed an end-to-end activation maximization approach that considerably improved the quality of synthetic images by prepending a deep generator network to the input layer of a given CNN. An alternative approach proposed by Seah et al. [31] involved permuting the CNN feature vectors corresponding to an input image until the CNN prediction changed. A generative network was then used to reconstruct an image from these permuted features. Changes in image appearance resulting from the feature permutation provided insight into the imaging features useful for the CNN decision.
Other generative adversarial network (GAN)-based methods have been proposed to ''disentangle'' a set of latent features that describe high-level concepts (e.g., pose, intensity) across a distribution of images in an unsupervised manner. Following GAN training, synthetic images are generated across the range of disentangled feature values to elicit feature meaning. Methods such as Information Maximizing GAN (InfoGAN) [32], Self-attention Conditional GAN (SCGAN) [33], and Improved Information Maximizing GAN (IInfoGAN) [34] incorporate additional information-theoretic constraints on the GAN minimax loss to learn these disentangled feature representations. Alternatively, Karras et al. [35] proposed StyleGAN, which included an entirely new generator architecture designed to perform the unsupervised task of disentangled feature learning. The quality of StyleGAN-generated images was then improved through changes in architecture and training in a later work [36]. The ability to embed outside images into a StyleGAN latent space was also studied [37].
We build on these generative methods to propose a model-agnostic GAN-based explainability method that facilitates global interpretation of embedded features important for CNN prediction, and show that the resulting embedded feature interpretations can clarify the ambiguities observed in commonly used attribution maps.

III. METHOD
This retrospective study is Health Insurance Portability and Accountability Act (HIPAA)-compliant and was approved by the institutional review boards of the participating institutions with waived requirement for written informed consent.
Let C be a previously developed regression or classification CNN trained to map an input image array A to an output vector y (i.e. C : A → y), and let x represent a vector comprising a subset of features from the intermediate layers of C. We propose a framework that uses a conditional generative CNN to elicit the meaning of the features x in the context of the task performed by source network C. Our proposed FIGAN framework, outlined in Fig. 2, is organized into three steps: 1) feature extraction, 2) generative model training, and 3) synthetic image analysis for feature interpretation.
In the feature extraction step, a set of images is propagated through C and features of interest, x, are extracted. Since intermediate layers of a network C are often large in dimension, we apply a supervised dimension reduction transformation, f, to x, resulting in x̃ = f(x). In a second step, a conditional generative CNN G is trained to map x̃ back to the input image space of C, producing a synthetic image Ã visually representing the embedded features x̃ (i.e. G : x̃ → Ã). In a final step, G is used to generate synthetic images across the range of observed x̃ values, which are then assessed using image analysis techniques to provide feature interpretation.

A. FEATURE EXTRACTION AND DIMENSION REDUCTION

1) FEATURE EXTRACTION
FIGURE 2. Proposed framework for Feature Interpretation using Generative Adversarial Networks (FIGAN). FIGAN is organized into three steps: 1) feature extraction, 2) generative model training, and 3) synthetic image analysis for feature interpretation. Images are propagated through a source CNN and features are extracted and reduced in dimension. A conditional generative CNN G is then trained to map the embedded features back to input image space. G is then used to generate a synthetic image sequence across the range of embedded feature values, which are then assessed using image analysis techniques to provide feature interpretation.

Let A_i, i = 1, . . . , I, represent a set of I two-dimensional input images to network C, each with dimension J × K × L, where L represents the channel axis. Note that images A_i are not necessarily the same images used to train C. Each A_i is propagated through C and feature vectors of interest x_i, each of dimension U, are extracted. Although features in each x_i can conceivably be extracted from any intermediate layer within C, for the remainder of the study we focus only on the high-level features from the final fully-connected layer of C, since these are typically the features of interest used to explain a CNN's decisions.

2) DIMENSION REDUCTION
The number of features, U, in the final fully-connected layer is often quite large (∼2048 features) and sparse. We therefore reduce the dimension of this feature space using one of the many dimension reduction techniques available. In this study, we select partial least squares (PLS) [38], [39] for its ability to project x_i, using a learned linear transformation f, onto an orthogonal subspace that explains the majority of variability in the source network outcome vector y_i. In effect, the majority of information contained within features x_i that is useful for completing the task performed by C is preserved within the low-dimensional embedding x̃_i. Note that other supervised dimension reduction techniques, both linear and nonlinear, may also apply, but the exact nature of the dimension reduction is not the focus of this study.

3) SELECTION OF Ũ
Traditionally, the PLS selection of Ũ, the dimension of the space spanned by x̃_i, is determined empirically by selecting the number of components that minimizes cross-validation error for predicting y_i [38]. However, preliminary experiments showed these approaches tend to select a larger number of components explaining minimal variation in y_i, and subsequently produced synthetic images with a feature distribution dissimilar to that of real images. We therefore determined an a priori selection of Ũ that produces the most ''realistic'' synthetic images as measured by the Fréchet Inception Distance (FID) criterion [40]:

FID = ||μ_{x^(A)} − μ_{x^(Ã)}||² + Tr(Σ_{x^(A)} + Σ_{x^(Ã)} − 2(Σ_{x^(A)} Σ_{x^(Ã)})^{1/2}),   (1)

where x^(A), x^(Ã) represent the feature vectors from C corresponding to real image A and synthetic image Ã, and μ_{x^(A)}, μ_{x^(Ã)} and Σ_{x^(A)}, Σ_{x^(Ã)} represent the means and covariances of the source network feature vectors, respectively. A smaller FID indicates the feature distribution of the synthetic images is similar to the feature distribution of real images.
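The FID criterion can be sketched as follows; this is a generic implementation of the Fréchet distance between two Gaussian fits, here applied to low-dimensional synthetic feature samples rather than actual source-network features:

```python
import numpy as np
from scipy import linalg

def fid(feats_real, feats_fake):
    """Fréchet distance between Gaussian fits of two feature samples:
    ||mu_r - mu_f||^2 + Tr(S_r + S_f - 2 (S_r S_f)^(1/2))."""
    mu_r, mu_f = feats_real.mean(0), feats_fake.mean(0)
    s_r = np.cov(feats_real, rowvar=False)
    s_f = np.cov(feats_fake, rowvar=False)
    covmean = linalg.sqrtm(s_r @ s_f)
    if np.iscomplexobj(covmean):      # drop tiny imaginary numerical noise
        covmean = covmean.real
    return float(((mu_r - mu_f) ** 2).sum()
                 + np.trace(s_r + s_f - 2.0 * covmean))

rng = np.random.default_rng(0)
real = rng.normal(size=(1000, 8))           # stand-in feature vectors
close = rng.normal(size=(1000, 8))          # same distribution
far = rng.normal(loc=3.0, size=(1000, 8))   # shifted distribution
```

In FIGAN the inputs would be the source-network feature vectors of real and synthetic images, and the Ũ yielding the smallest value is kept.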

B. CONDITIONAL GENERATIVE MODEL
Following feature reduction to x̃_i with preselected dimension Ũ, we train a conditional generative CNN to predict A_i (i.e. the input images of C) using information in x̃_i as input.
To accomplish this task, we select the pix2pix conditional GAN framework proposed by Isola et al. [41], given its proven ability to synthesize realistic images from minimally informative generator inputs (e.g., masks) across a variety of applications. Briefly, the pix2pix GAN comprises a generator network G and discriminator network D, which are adversarially trained through a minimax loss (Fig. 3). The generator G is trained to synthesize ''fake'' images that are indistinguishable from ''real'' images when evaluated by the discriminator D. In turn, the discriminator D is trained to determine if an image provided by G is ''real'' or ''fake''. We use the architectures for network G and patch network D with receptive field 70×70 as described in [41], except for the dropout layers in the generator, which are turned off at test time to enforce deterministic predictions for Ã_i.
Let X̃_i represent the input to generator G. X̃_i is a J × K × Ũ array whose ũth channel (ũ = 1, . . . , Ũ) is the product x̃_iũ × 1_{J×K} for the embedded feature x̃_iũ and ones matrix 1_{J×K}. The generator network maps X̃_i back to the input image space of C, producing synthetic image Ã_i = G(X̃_i). Input to D is Ã_i || X̃_i in the ''fake'' case or A_i || X̃_i in the ''real'' case, where a || b represents concatenation along the channel axis for two arrays a and b.
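Constructing this conditioning array can be sketched in a few lines; the function below broadcasts each embedded feature to a constant plane and appends optional auxiliary masks (names and shapes are illustrative, not from the released code):

```python
import numpy as np

def generator_input(x_embedded, segmentations=None, shape=(256, 256)):
    """Build the J x K x U~ (+auxiliary) conditioning array for G: each
    embedded feature becomes a constant J x K plane, and any auxiliary
    masks are appended as extra channels."""
    J, K = shape
    planes = [np.full((J, K), v, dtype=np.float32) for v in x_embedded]
    if segmentations is not None:
        planes.extend(np.asarray(s, dtype=np.float32) for s in segmentations)
    return np.stack(planes, axis=-1)

x_tilde = np.array([0.3, -0.7, 0.1])    # three embedded feature values
lung_mask = np.zeros((256, 256))        # hypothetical binary segmentation
conditioning = generator_input(x_tilde, segmentations=[lung_mask])
```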
Networks G and D are adversarially trained using gradient descent in the same manner described in [41], with GAN loss

G* = arg min_G max_D L_cGAN(G, D) + λ L_L1(G),   (2)

where L_cGAN(G, D) is the average binary cross-entropy loss of the discriminator output neurons corresponding to each 70×70 patch, L_L1(G) is the L1-norm between the ''real'' A_i and ''fake'' Ã_i images, and λ > 0 is a mixing parameter. The result of the training process is a generator G capable of synthesizing realistic images Ã_i that visually represent the meaning of the feature information contained within X̃_i, or equivalently x̃_i.
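The arithmetic of the pix2pix objective can be sketched without any deep learning framework; the functions below compute the patch-averaged cross-entropy and L1 terms on plain arrays standing in for discriminator outputs and images (an actual implementation would express these as differentiable losses):

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Mean binary cross entropy over an array of probabilities."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-(target * np.log(pred)
                   + (1.0 - target) * np.log(1.0 - pred)).mean())

def generator_loss(d_fake_patches, fake_img, real_img, lam=100.0):
    """Generator objective: fool the patch discriminator (target 1 on the
    fake) plus a lambda-weighted L1 term pulling the fake toward the real."""
    adv = bce(d_fake_patches, np.ones_like(d_fake_patches))
    l1 = float(np.abs(real_img - fake_img).mean())
    return adv + lam * l1

def discriminator_loss(d_real_patches, d_fake_patches):
    """Average patch BCE: real patches toward 1, fake patches toward 0."""
    return 0.5 * (bce(d_real_patches, np.ones_like(d_real_patches))
                  + bce(d_fake_patches, np.zeros_like(d_fake_patches)))
```

The λ = 100 default mirrors the mixing weight used in the text; a perfectly reconstructed image zeroes the L1 term, leaving only the adversarial term.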
Auxiliary information: Auxiliary information (e.g., segmentations, demographics, clinical data) relevant to the task performed by C, but not included in the development of C, may also be incorporated as additional channels in the input array X̃_i to improve GAN performance. In this study, we incorporate segmentations of the structures of interest to control for the spatial and morphological variation of structures across images in the training set. We later show this regularization approach can improve GAN performance and further facilitate feature interpretation.

C. SYNTHETIC IMAGE ANALYSIS

1) IMAGE INTERPOLATION
Following generator training, we propagate each X̃_i through G and apply image analysis techniques to the resulting set of synthetic images Ã_i for visual feature interpretation. Since the feature information x̃_i is not observed on a regular grid, we first interpolate the synthetic images across the range of feature values (illustrated in Fig. 4).
Let ã_ijkl represent the pixel in the jth row, kth column, and lth channel of synthetic image Ã_i. For each pixel j, k, l and each feature ũ, we estimate a separate univariate smoothing function g:

ã_ijkl = g^(ũ)_jkl(x̃_iũ) + ε^(ũ)_ijkl,   (3)

where ε^(ũ)_ijkl represents error. Note the superscript ũ is included in each of the terms to emphasize separate functions for each feature. The function estimate, ĝ^(ũ)_jkl, is then determined using one of many univariate smoothing methods available. We found a computationally efficient 1D smoother with Gaussian kernel to suffice (SciPy v1.1.0) [42]. The function estimate ĝ^(ũ)_jkl is then used to interpolate the synthetic images across a regular dense grid of values p̃_r, r = 0, . . . , R, that partition the range of x̃_ũ for each ũ:

â^(ũ)_rjkl = ĝ^(ũ)_jkl(p̃_r),   (4)

where p̃_r is the value of the rth grid point and â^(ũ)_rjkl is the interpolated synthetic value at pixel j, k, l for feature ũ at the rth grid point. The interpolated synthetic 2D image for feature ũ at the rth grid point is represented by Â^(ũ)_r. Note that we assume p̃_r is the same for each ũ.
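The per-pixel smoothing and grid interpolation can be sketched with a Nadaraya-Watson Gaussian-kernel smoother; the text mentions a SciPy 1-D Gaussian smoother, and the estimator below is a simple stand-in applied to one pixel's values across the feature range:

```python
import numpy as np

def gaussian_smooth_interp(x, y, grid, bandwidth=0.1):
    """Nadaraya-Watson Gaussian-kernel estimate of the smoothing function
    at regular grid points, from irregular samples (x, y). In FIGAN this
    would be run per pixel (j, k, l) and per embedded feature."""
    w = np.exp(-0.5 * ((grid[:, None] - x[None, :]) / bandwidth) ** 2)
    return (w @ y) / w.sum(axis=1)

rng = np.random.default_rng(0)
x_feat = rng.uniform(-1, 1, size=400)                # irregular feature values
pixel = 0.5 * x_feat + 0.05 * rng.normal(size=400)   # one pixel's values
grid = np.linspace(-0.9, 0.9, 19)                    # regular grid of points
smoothed = gaussian_smooth_interp(x_feat, pixel, grid)
```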

2) FEATURE INTERPRETATION
The interpolated synthetic images Â^(ũ)_r can be viewed as a sequence that visualizes the ''main effect'' of each feature ũ. That is, the evolution of Â^(ũ)_r as r increases, which can be explored using a gif or medical image viewer, elicits the features' interpretations. However, since viewing the sequence in this manner can render subtle changes imperceptible, we further facilitate visualization of the most salient changes in Â^(ũ)_r by identifying the dominant mode of variation in the sequence using the following M-dimensional principal components decomposition:

Â^(ũ)_r ≈ μ^(ũ) + Σ_{m=1}^{M} ξ^(ũ)_rm φ^(ũ)_m,   (5)

where μ^(ũ) is a J × K × L mean image across r, φ^(ũ)_m is the eigen image for the mth component, and ξ^(ũ)_rm is the corresponding component score at the rth grid point. The dominant mode is then visualized through deviation images

μ^(ũ) ± 2σ^(ũ)_1 φ^(ũ)_1,   (6)

where σ^(ũ)_1 is the standard deviation of the leading scores ξ^(ũ)_r1.
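The decomposition of the interpolated sequence can be sketched with an SVD-based PCA; the toy sequence below has a single linear mode, standing in for a stack of interpolated synthetic images:

```python
import numpy as np

def sequence_pca(images, n_components=1):
    """PCA over an image sequence (first axis indexes the grid points):
    returns the mean image, eigen images, and per-frame scores."""
    R = images.shape[0]
    flat = images.reshape(R, -1)
    mu = flat.mean(axis=0)
    centred = flat - mu
    # SVD of the centred sequence yields the principal modes directly.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    phi = vt[:n_components]                  # eigen images (flattened)
    xi = centred @ phi.T                     # scores for each frame
    return (mu.reshape(images.shape[1:]),
            phi.reshape((n_components,) + images.shape[1:]),
            xi)

# Toy sequence: one spatial mode whose amplitude grows with the grid index.
rng = np.random.default_rng(0)
mode = rng.normal(size=(16, 16))
seq = np.stack([r * mode for r in np.linspace(-1, 1, 11)])
mu, phi, xi = sequence_pca(seq)
# Deviation images (mean plus or minus a multiple of the leading eigen
# image) then visualize the dominant mode of variation.
```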

D. FIGAN IN PRACTICE
Unlike attribution maps, which are local interpretability methods that are applied to individual images, FIGAN is a global interpretability method, providing a holistic interpretation of the features used for CNN prediction [43]. FIGAN output comprises a single synthetic image sequence and its associated eigen and deviation images for each embedded feature.

IV. APPLICATION TO A LUNG CNN
We study the ability of the proposed FIGAN framework to facilitate global interpretation of features from a CNN (C) developed to infer the severity of pulmonary edema from chest radiographs, trained with concurrent serum NT-proBNP measurements.

A. SOURCE CNN ARCHITECTURE
The source network C is a ResNet152 regression CNN developed to predict log NT-pro B-type natriuretic peptide (BNPP), a biomarker for pulmonary edema, using 256 × 256 × 1 chest radiographs as input [13]. Note the radiographs were expanded to three channels to accommodate the ResNet152 pretrained weights.

B. IMAGING DATA AND PREPROCESSING
We collected the I = 21,374 radiographs (A_i) and log BNPP values (y_i) used to train the source network and an additional 500 for validation. The lungs, spine, and clavicles were then segmented using a 2D U-net CNN independently developed at our institution as part of another ongoing study [44]. Source radiographs were then used to extract the U = 2048 features (x_i) for each image from the final fully-connected layer of C. A PLS linear transformation, f, was then estimated using x_i and y_i (i = 1, . . . , I) and applied to x_i, producing the embedded features x̃_i. The embedded features x̃_i were then broadcast to the generator input array X̃_i of dimension 256 × 256 × (Ũ + 5), where the five additional input channels represent binary segmentations of the left and right lung, left and right clavicle, and spine as auxiliary information. Radiographs, segmentations, and each embedded feature were then scaled between -1 and 1 for GAN training.

C. GENERATIVE CNN TRAINING
We trained five separate pix2pix GANs with Ũ = 1, . . . , 5, respectively, where Ũ = 1 refers to the GAN using only the leading PLS component as input, Ũ = 2 uses the first two leading components as input, and so on. For each GAN, generator and discriminator networks were trained with a batch size of one using λ = 100 and the Adam optimizer with learning rate 0.0002 and momentum decay 0.5, identical to the hyperparameters specified in [41]. Each GAN was trained for 500,000 steps, and generator weights were saved every 10,000 steps. Total training time for each FIGAN instance required ∼48 hours of processing time on an NVIDIA Titan V graphics card.
To prevent generator overfitting, all inputs were aggressively augmented dynamically during training using random rotations (±25 degrees), horizontal and vertical shifts (±25 pixels), horizontal and vertical flips, and zoom (90%-110%). Although the augmentation process produces images outside of the native data distribution, information on the spatial orientation of the augmented images is implicitly contained within the auxiliary channels. This enabled sufficient regularization of the generator network while avoiding the diminished performance associated with spatial transformations outside the data distribution.
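One way to keep the image and its auxiliary channels aligned during augmentation is to transform them as a single stack, as in the sketch below (zoom from the scheme above is omitted for brevity; parameter ranges follow the text but the helper itself is illustrative):

```python
import numpy as np
from scipy import ndimage

def augment(stack, rng):
    """Apply one random rotation, shift, and flip to an H x W x C stack so
    the image and its segmentation channels receive identical transforms."""
    angle = rng.uniform(-25, 25)                      # degrees
    shift = rng.uniform(-25, 25, size=2)              # pixels (rows, cols)
    out = ndimage.rotate(stack, angle, axes=(0, 1), reshape=False, order=1)
    out = ndimage.shift(out, (shift[0], shift[1], 0), order=1)
    if rng.random() < 0.5:
        out = out[:, ::-1]                            # horizontal flip
    return out

rng = np.random.default_rng(0)
stack = np.zeros((64, 64, 2))
stack[24:40, 24:40, 0] = 1.0    # image content
stack[24:40, 24:40, 1] = 1.0    # matching segmentation channel
augmented = augment(stack, rng)
```

Because the channel axis receives no shift and rotations act only on the spatial axes, the segmentation stays registered to the image after every draw.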
The generator weights producing the minimum FID on the validation set across the number of PLS components Ũ and training steps were used for synthetic image analysis and feature interpretation. Since FIGAN is a global interpretability method that facilitates a holistic understanding of embedded features x̃_i using the synthetic image sequence [43], FIGAN is not applied to individual images during a testing phase. Therefore, generalizability assessment using a leave-out testing set is not necessary to evaluate FIGAN's utility for explainability.

D. FEATURE INTERPRETATION
Following GAN training, features x̃_i from the validation set and a randomly selected segmentation were used to generate synthetic images Ã_i. Synthetic images were then used to calculate the interpolated synthetic image sequence Â^(ũ)_r.

E. RESULTS FOR THE LUNG CNN

1) FEATURE EXTRACTION AND SELECTION
FID traces for the validation images (Fig. 5) across the number of training steps indicated Ũ = 1 to 3 to be the optimal number of features for realistic image generation. The generator weights for Ũ = 3 at training step 470,000 produced the minimum FID and were selected for subsequent analysis. The first three PLS components explained 77.74%, 5.14%, and 1.6% of the variation in log BNPP, respectively.

2) FEATURE INTERPRETATION
Instances of the synthetic image sequence, Â^(ũ)_r, and the leading eigen and deviation images for each feature are shown in Figs. 6 and 7. The leading principal components for ũ = 1, 2, 3 explained 85.27%, 92.30%, and 58.98% of variation in the synthetic image sequence Â^(ũ)_r, respectively. The eigen and deviation images for both ũ = 1 and ũ = 2 indicate cardiomegaly (enlarged heart) and increased perihilar vascularity as important imaging features for prediction. Feature two (ũ = 2) additionally emphasizes the importance of the chest wall soft tissues. Both ũ = 1 and ũ = 2 show negative correlations with body habitus. Feature 3 (ũ = 3) exhibits similar salient areas to feature 1 but with increased upper lobe cephalization and peripheral vascularity. Note, however, that ũ = 3 explained only 1.6% of variability in log BNPP and must be interpreted with caution.

3) COMPARISON TO VG AND GRAD-CAM
The VG map (Fig. 7) shows diffuse network attention ambiguously scattered throughout the radiograph. Grad-CAM shows attention toward the left lung and right hilum, but cannot resolve which co-localized structures in these areas drive the regression of NT-proBNP.

FIGURE 5. Loess-smoothed validation FID traces when applying FIGAN to the lung and liver source CNNs. Lung CNN traces indicate FIGAN training using the leading one to three reduced features produces the smallest FID. Liver CNN traces strongly indicate FIGAN training using the first three features produces the smallest FID. Traces for both CNNs exhibit increases in FID when using more than the first three features during FIGAN training, suggesting feature distributions of synthetic images that are dissimilar to that of real images. We selected the first three reduced features for both source CNNs in our implementation.

V. APPLICATION TO A LIVER CNN

A. SOURCE CNN ARCHITECTURE
The second source network is a ResNet50 classification CNN with customized feature fusion layer developed to classify hepatobiliary phase liver MRI images as having adequate or suboptimal contrast uptake for cancer screening and surveillance. Input to the liver CNN is a 224 × 224 × 1 masked liver image [14]. Note the liver images were expanded to three channels to accommodate the ResNet50 pretrained weights.

B. IMAGING DATA AND PREPROCESSING
We collected the I = 826 liver images (A_i), liver segmentations, and uptake classifications (y_i) used to train the source network and an additional 375 for validation. U = 2048 features (x_i) were extracted from the source CNN C for each liver image, and a learned PLS transformation was applied to estimate the embedded features x̃_i. The embedded features x̃_i were then expanded to the generator input array X̃_i of dimension 256 × 256 × (Ũ + 1), where the additional input channel represents the liver segmentation as auxiliary information. Liver images, segmentations, and each embedded feature were then scaled between -1 and 1 for GAN training. Note liver images and segmentations were resized to 256 × 256 resolution prior to training.

C. GENERATIVE CNN TRAINING
GAN training was performed in the same manner as for the FIGAN application to the lung CNN, with identical training parameters, augmentation, and FID criterion for generator weight selection. Total training time for each FIGAN instance required ∼36 hours of processing time on an NVIDIA Titan V graphics card.

D. FEATURE INTERPRETATION
FIGAN synthetic images, the leading eigen image φ^(ũ)_1, and its effect on deviations from the mean image, μ^(ũ) ± 2σ^(ũ)_1 φ^(ũ)_1, were then reviewed by an abdominal radiologist (G.M.C.) for feature interpretation.

FIGURE 9. Eigen and deviation images for the lung (left) and liver (right) CNNs when applying FIGAN without auxiliary segmentations. FIGAN images provided the same feature interpretations but contain boundary artifacts due to changes in anatomical morphology. Subtle features, such as heterogeneous liver texture (ũ = 2), are not as prominent.

FIGURE 10. Eigen and deviation images for the lung CNN when training FIGAN on 50 lung images without (left) and with (right) generator pretraining. FIGAN captures signal similar to that found in the complete training runs, with the exception of the third lung feature, which showed a decrease in heart size. Generator pretraining produces results consistent with the training runs using the complete datasets.

E. RESULTS FOR THE LIVER CNN
1) FEATURE EXTRACTION AND SELECTION
FID traces for the validation images (Fig. 5) across the number of training steps indicated Ũ = 3 to be the optimal number of features for realistic image generation. The first three PLS components explained 35.79%, 19.25%, and 7.5% of the variation in contrast uptake adequacy (y_i), respectively. Seven PLS components would have been necessary to exceed 80% of the variation in y_i.

2) FEATURE INTERPRETATION
Instances of the synthetic image sequence and the leading eigen and deviation images are shown in Figs. 6 and 7.
Leading principal components for ũ = 1, 2, 3 explained 47.97%, 70.58%, and 80.56% of the variation in the synthetic image sequence Â_r^(ũ), respectively. Note that FIGAN also simulated magnetic field inhomogeneity in the synthetic images, which made visualization of subtle features more difficult. We therefore detrended the synthetic image sequence at each grid point r using a 32 × 32 kernel prior to applying principal components analysis. The original eigen and deviation images without the detrending step can be found in the supplement (Figs. 13-14).
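A sketch of the detrending and summarization steps, assuming detrending means subtracting a local moving average (here via SciPy's `uniform_filter`) before extracting the leading principal component; data and sizes are toy placeholders, not the study's images.

```python
import numpy as np
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(2)
R, H = 25, 64                      # R synthetic images on the feature grid (toy)
seq = rng.normal(size=(R, H, H))   # synthetic image sequence (placeholder)

# Remove slowly varying intensity (e.g. simulated field inhomogeneity) by
# subtracting a local mean; the paper uses a 32 x 32 kernel.
detrended = seq - uniform_filter(seq, size=(1, 32, 32))

# Leading principal component over the sequence: each image is one
# observation on the pixel grid.
flat = detrended.reshape(R, -1)
flat = flat - flat.mean(axis=0)
_, s, Vt = np.linalg.svd(flat, full_matrices=False)
eigen_image = Vt[0].reshape(H, H)            # leading eigen image
explained = s[0] ** 2 / np.sum(s ** 2)       # fraction of variation explained

# Deviation images: mean image shifted along the leading component.
mu = seq.mean(axis=0)
scores = flat @ Vt[0]
dev_plus = mu + 2 * scores.std() * eigen_image
dev_minus = mu - 2 * scores.std() * eigen_image
```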
The eigen image for ũ = 1 indicates high vessel-liver contrast as an important predictor of adequate contrast uptake. This is apparent in the deviation images, where changes in the leading score are correlated with the appearance of vessels in the liver. Feature two (ũ = 2) was associated with heterogeneous liver texture, which is negatively associated with adequate contrast uptake and is often indicative of liver fibrosis. Finally, feature three (ũ = 3) was associated with general liver brightness, with brighter livers typically indicating adequate contrast uptake.

FIGURE 11. Eigen and deviation images for the liver CNN when training FIGAN on 50 liver images without (left) and with (right) generator pretraining. FIGAN images are similar to the images from the complete training run, with the exception of the second liver feature, which did not show obvious changes in texture. The interpretation of texture is preserved when using a pretrained generator.

3) COMPARISON TO VG AND GRAD-CAM
Both VG and Grad-CAM (Fig. 7) indicate attention toward the portal vein. However, VG also shows activations scattered throughout the liver, and the meaning of these activations is unclear. FIGAN images provide additional insight, highlighting the heterogeneous liver texture (ũ = 2) and poor liver enhancement (ũ = 3) used by the CNN to decide that this particular liver has suboptimal contrast uptake.

FIGURE 13. Synthetic image sequences when applying FIGAN to the liver source CNN without a detrending step.

VI. EXPERIMENTS
A. PERMUTED FEATURE REFERENCE
To ensure that the variability across the synthetic image sequence was not attributable to GAN-generated noise, we applied FIGAN to the lung and liver CNNs but permuted the features x̃_iũ at the image (i) level as a reference for comparison (i.e., shuffled x̃_iũ over the training set so that each training image corresponded to a different set of embedded features). Corresponding scaled eigen and deviation images across features for each CNN on their respective validation sets (Fig. 8) indicate little or no meaningful change in the synthetic image sequence, suggesting FIGAN is capturing meaningful signal between the features x̃_iũ and input images A_i.
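The permuted reference can be sketched as shuffling the embedded-feature rows across images, which preserves each feature's marginal distribution while destroying its pairing with the images (placeholder data; not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(3)
I, U_tilde = 826, 3
X_tilde = rng.normal(size=(I, U_tilde))   # embedded features (placeholder)

# Permute feature vectors across images (i level): each training image is
# paired with another image's embedded features, breaking any signal
# between the features and the images while keeping the feature distribution.
perm = rng.permutation(I)
X_perm = X_tilde[perm]
```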

B. NO AUXILIARY INFORMATION
Auxiliary information in the form of anatomical segmentations may not be available. We therefore repeated the analysis without segmentations. Since we initially relied on the segmentations to heavily augment the images during training, in their absence we controlled spatial orientation by including the plane z = x + 2y as an additional input channel in the generator input array. Eigen and deviation images (Fig. 9) provided the same feature interpretations, but also contained boundary artifacts attributed to changes in anatomical morphology throughout the synthetic image sequence, as observed in the deviation images, especially for the liver. Subtle features, such as heterogeneous texture, were not as prominent.
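A sketch of the orientation channel, assuming the plane z = x + 2y is evaluated over pixel coordinates and rescaled to [-1, 1] like the other inputs (the rescaling is our assumption, not stated in the paper):

```python
import numpy as np

H = 256
# Orientation plane z = x + 2y over pixel coordinates, used as an extra
# input channel when no segmentation is available.
x, y = np.meshgrid(np.arange(H), np.arange(H), indexing="xy")
z = (x + 2 * y).astype(float)
# Scale to [-1, 1], matching the scaling of the other generator inputs.
z = 2 * (z - z.min()) / (z.max() - z.min()) - 1
```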

C. SMALL SAMPLE SIZE
We randomly selected I = 50 images from the lung training set and trained FIGAN on these images using the same lung GAN settings, both with and without generator pretraining. The pretrained lung FIGAN generator used the weights from the liver FIGAN application. Similarly, we randomly selected I = 50 images from the liver training set and trained FIGAN on these images using the same liver GAN settings, both with and without generator pretraining. The pretrained liver FIGAN generator used the weights from the lung FIGAN application. Since the lung generator required additional channels to accommodate the corresponding anatomical segmentations, we expanded the liver generator input to 256 × 256 × (Ũ + 5), repeating the liver mask across the five channels for pretraining. Eigen and deviation images using the reduced samples with and without generator pretraining are shown in Figs. 10 and 11. For the lung CNN, FIGAN was able to capture signal similar to that found in the complete training runs, with the exception of the third lung feature, which showed a decrease in heart size. FIGAN images for the liver CNN were also similar to those from the complete training run, with the exception of the second liver feature, which did not show obvious changes in texture. However, in both cases generator pretraining produced results consistent with the training runs using the complete datasets, suggesting FIGAN application to smaller datasets is feasible, especially when using pretrained generators.
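The channel expansion for cross-domain pretraining can be sketched as repeating the single liver mask across the lung generator's five auxiliary channels (shapes only; all arrays are placeholders):

```python
import numpy as np

H, U_tilde = 256, 3
planes = np.zeros((H, H, U_tilde))   # embedded-feature planes (placeholder)
mask = np.ones((H, H))               # liver segmentation mask (placeholder)

# The lung generator expects five auxiliary channels (its anatomical
# segmentations), so for pretraining on liver data the single liver mask is
# repeated across all five, giving a 256 x 256 x (U~ + 5) input.
aux = np.repeat(mask[..., None], 5, axis=-1)
gen_input = np.concatenate([planes, aux], axis=-1)
```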

D. APPLICATION TO HIGHER RESOLUTION IMAGES
We also explored the effects of higher-resolution images on FIGAN performance. We applied FIGAN to the lung CNN using input images of size 512 × 512 × 1. The model was trained in the same manner as previously, except with a stride of four in the first convolutional layer of the discriminator to maintain a receptive field similar to that of the 256 × 256 × 1 application. The resulting eigen and deviation images (Fig. 12) showed greater detail in vascular changes across the synthetic image sequence, a subtle imaging feature that may be overlooked when using lower-resolution networks.
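The stride adjustment keeps the discriminator's first-layer output grid, and hence the downstream receptive field relative to the anatomy, the same at both resolutions. A quick check with the standard convolution output-size formula (the kernel and padding values are illustrative pix2pix-style defaults, not taken from the paper):

```python
# Output size of a 2-D convolution along one axis.
def conv_out(n: int, kernel: int, stride: int, padding: int) -> int:
    return (n + 2 * padding - kernel) // stride + 1

# Stride-2 first layer on 256 x 256 inputs and the stride-4 variant on
# 512 x 512 inputs both produce a 128 x 128 feature map, so the layers
# downstream see the same grid relative to the imaged anatomy.
assert conv_out(256, 4, 2, 1) == 128
assert conv_out(512, 4, 4, 1) == 128
```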

VII. CONCLUSION
In this study, we showed the proposed FIGAN framework can generate synthetic images that elicit the meaning of the features used by independently developed regression and classification CNNs. Application to a lung regression CNN revealed cardiomegaly (enlarged heart) and increased perihilar vascularity as the primary imaging features used to predict log BNPP, as well as body habitus, a lesser-known correlate [45]. Application to a liver classification CNN identified vessel-liver contrast, the primary imaging feature used by radiologists to determine contrast uptake adequacy [14]. Although not the primary features for adequacy assessment, FIGAN also identified heterogeneous texture and liver brightness as secondary features of importance, which are known correlates of liver pathology and, therefore, contrast uptake adequacy. We also showed that FIGAN images can clarify ambiguities in the interpretation of commonly used explainability visualizations, such as VG maps and Grad-CAMs.
Prior to GAN training, we used PLS to project features of the source CNN onto a low-dimensional orthogonal basis. This has the dual purpose of reducing the dimension of the sparse space while mitigating feature confounding. The dimension reduction step was shown to successfully discriminate between features with different interpretations (e.g., vessel-liver contrast vs. heterogeneous liver texture), as visualized in the synthetic images. However, we acknowledge that orthogonalization only implies linear independence, and nonlinear correlations between embedded features may exist. These can appear in the synthetic images as an interaction between the embedded features, where the effect of one feature on synthetic image appearance depends on the value of another. For simplicity, we focused only on the ''main effect'' of each embedded feature and reserve the exploration of methods designed to understand the manifestation of feature interactions in synthetic images as a direction for future research.
To facilitate feature interpretation of the GAN-synthesized images, we also proposed an approach for summarizing the salient features of the synthetic images corresponding to the observed feature values through interpolation across a regular grid. As an alternative, we considered predicting synthetic images on a regular grid directly from the generator network. Although useful in the single-feature case (Ũ = 1), this approach suffers from the curse of dimensionality: the number of synthetic images grows exponentially with Ũ, and there is no guarantee that all feature combinations are observable in higher dimensions, where the space is increasingly sparse (i.e., images with these feature values may not exist). In addition, following the interpolation step, we summarized the salient features across the synthetic image sequence using the leading principal component. One may alternatively visualize salient areas by subtracting the synthetic images corresponding to the feature extrema, but this approach fails to summarize variability across the entire sequence.
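For the single-feature case (Ũ = 1), the interpolation step can be sketched as pixel-wise linear interpolation of generator outputs onto a regular grid of observed feature values; the arrays below are toy placeholders standing in for generator outputs.

```python
import numpy as np

rng = np.random.default_rng(4)
# Stand-ins: feature values observed in the data and their corresponding
# synthetic images (in FIGAN these come from the trained generator).
obs_vals = np.sort(rng.uniform(-1, 1, size=40))
obs_imgs = rng.normal(size=(40, 8, 8))

# Interpolate pixel-wise onto a regular grid spanning the observed feature
# range, so the summary only uses feature values that actually occur.
grid = np.linspace(obs_vals.min(), obs_vals.max(), 25)
flat = obs_imgs.reshape(40, -1)
seq = np.stack([np.interp(grid, obs_vals, flat[:, p])
                for p in range(flat.shape[1])], axis=-1).reshape(25, 8, 8)
```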
Through a series of experiments, we showed FIGAN identifies legitimate signal relating the features x̃_i and images A_i when compared to a permuted reference. FIGAN is also capable of preserving feature interpretation at small sample sizes (n = 50), especially when using generator pretraining, and can be applied to images of higher resolution with minimal changes in architecture. In the absence of auxiliary segmentation information, FIGAN still provided meaningful feature interpretation. However, we recognize this is primarily attributable to the regions of interest being relatively colocalized near the center of the image, with minor morphological differences in anatomy across images. Nevertheless, the majority of radiological applications have standardized views of the regions of interest, and colocalization may be further improved using affine or rigid registration against an atlas reference.
In comparison with the existing literature, our approach is most similar to the application and method proposed by Seah et al. [29]. Seah et al. extracted features from a source CNN designed to predict BNP as a marker for congestive heart failure from chest radiographs. These feature vectors were then permuted until the classification of disease was removed. A generative network was then trained to synthesize the appearance of the chest radiographs with the disease removed using the permuted feature vectors as input. Their resulting visualizations of radiographs with and without disease produced interpretations and visualizations consistent with the leading eigen maps found in our study for the lung CNN.
Although similar to the method of Seah et al., there are key differences between our approaches, the primary difference being FIGAN's ability to discriminate between features with different interpretations. This is most apparent in the eigen images corresponding to the liver CNN, where FIGAN was capable of discriminating between vessel-liver contrast, texture, and intensity features through the PLS dimension reduction. In addition, FIGAN provides synthetic image sequences comprising images that smoothly change with a CNN's embedded features. These sequences can help us better understand the functional relationship between the embedded features and the source CNN input images, in contrast to the categorical visualization of images with and without disease in Seah et al.
Our proposed framework also has similarities to GAN-based disentangled feature learning methods [32], [33], [34], [35], [36], [37], which also generate synthetic images across feature values to elicit their conceptual meaning. However, the general goal of these methods is to learn an embedding comprising features that represent high-level concepts across a distribution of images in an unsupervised manner. In contrast, the goal of our proposed method is not to learn such an embedding, but to facilitate interpretation of features from an existing embedding extracted from an independently developed source CNN. That is, we propose FIGAN as a model-agnostic explainability method that facilitates global interpretation of embedded features from an existing CNN.
As with other explainability approaches, we recognize FIGAN has limitations. GANs can be challenging to train, requiring more hyperparameters to tune and several ad hoc strategies to prevent discriminator overfitting and GAN collapse [46]. However, we found the out-of-the-box implementation of the pix2pix architecture to be sufficient, owing to its robustness, though this may change with other applications. In addition, FIGAN operates independently on features from a source network, and therefore requires the source network to have satisfactory performance to ensure signal between x̃_i and A_i. In contrast, competing methods operate directly on the source CNN architecture and may be more useful for diagnosing poor CNN performance in this case.
In summary, the FIGAN framework can be a useful tool for interpreting the features used by a CNN to perform medical imaging related tasks, complementing current approaches that provide localizations. This is particularly relevant for diffuse diseases and images where anatomic structures are superimposed and closely adjacent (i.e. attribution methods may identify the ''where'' and FIGAN images may identify the ''what''). Holistic interpretations using these methods can improve the transparency and understanding of how classification and regression CNNs make predictions. This improved understanding can further facilitate the translation and acceptance of these algorithms into radiology clinical practice.

SUPPLEMENTAL FIGURES
See Figures 13 and 14.