Heatmap Assisted Accuracy Score Evaluation Method for Machine-Centric Explainable Deep Neural Networks


Overall architecture of the proposed HAAS evaluation scheme.

Abstract:

Many studies have addressed explainable artificial intelligence (XAI), which explains the logic behind complex deep neural networks that are often called black boxes. At the same time, researchers have tried to evaluate the explainability performance of various XAI algorithms. However, most previous evaluation methods are human-centric, that is, subjective: they rely on how closely the explanation matches what people base their decisions on rather than on which features actually affect the decision in the model. Their XAI selections also depend on the dataset. Furthermore, they focus only on the output variation of a target class. In contrast, this paper proposes a heatmap assisted accuracy score (HAAS) scheme that is robust across datasets and helps select machine-centric explanation algorithms that show what actually leads to the decision of a given classification network. The proposed method modifies the input image with the heatmap scores obtained by a given explanation algorithm and then feeds the resulting heatmap assisted (HA) images into the network to estimate the accuracy change. The resulting metric (HAAS) is computed as the ratio of the accuracy of the given network on the HA images to its accuracy on the original images. The proposed evaluation scheme is verified on image classification models, LeNet-5 for MNIST and VGG-16 for CIFAR-10, STL-10, and ILSVRC2012, with a total of 11 XAI algorithms: saliency map, deconvolution, and 9 layer-wise relevance propagation (LRP) configurations. LRP1 and LRP3 gave the largest HAAS values: 1.0088 and 1.0079 for MNIST, 1.1160 and 1.1254 for CIFAR-10, 1.0906 and 1.0918 for STL-10, and 1.3207 and 1.3469 for ILSVRC2012. While LRP1 uses ε-rules for the input, convolutional, and fully-connected layers, LRP3 adopts a bounded rule for the input layer and the same ε-rules as LRP1 for the other layers. The consistency of the evaluation results of HAAS and AOPC has been compared by means of Kullb...
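The evaluation loop described in the abstract can be illustrated with a minimal sketch. The abstract does not say exactly how the heatmap scores modify the input, so the rule below is an assumption made only for illustration: each heatmap is normalized to [-1, 1], pixels are scaled by (1 + h), and the result is clipped to [0, 1]. The names `heatmap_assisted`, `accuracy`, `haas`, and the `predict` callable are hypothetical, and the ratio is read here as accuracy on HA images divided by accuracy on original images.

```python
import numpy as np


def heatmap_assisted(images, heatmaps):
    """Build heatmap assisted (HA) images.

    ASSUMPTION: the modification rule is illustrative, not taken from the paper.
    Each heatmap is divided by its own maximum absolute value (range [-1, 1]),
    then every pixel is scaled by (1 + h) and clipped to [0, 1].
    """
    images = np.asarray(images, dtype=float)
    heatmaps = np.asarray(heatmaps, dtype=float)
    # Per-sample maximum absolute score; all-zero heatmaps leave the image unchanged.
    max_abs = np.abs(heatmaps).reshape(len(heatmaps), -1).max(axis=1)
    max_abs = np.where(max_abs == 0, 1.0, max_abs)
    # Reshape to (N, 1, ..., 1) so the division broadcasts over each sample.
    h = heatmaps / max_abs.reshape(-1, *([1] * (heatmaps.ndim - 1)))
    return np.clip(images * (1.0 + h), 0.0, 1.0)


def accuracy(predict, images, labels):
    """Fraction of samples whose predicted label equals the true label."""
    return float(np.mean(predict(images) == np.asarray(labels)))


def haas(predict, images, heatmaps, labels):
    """Ratio of accuracy on HA images to accuracy on the original images."""
    acc_orig = accuracy(predict, images, labels)
    if acc_orig == 0:
        raise ValueError("accuracy on original images is zero; ratio undefined")
    acc_ha = accuracy(predict, heatmap_assisted(images, heatmaps), labels)
    return acc_ha / acc_orig
```

Under this reading, a value above 1 means the network classified the HA images more accurately than the originals, and a value below 1 means the heatmap hurt accuracy. Here `predict` stands for any function that maps a batch of images to predicted class labels, and the heatmaps would come from the explanation algorithm under evaluation.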
Published in: IEEE Access ( Volume: 10)
Page(s): 64832 - 64849
Date of Publication: 20 June 2022
Electronic ISSN: 2169-3536