Loading [MathJax]/extensions/MathMenu.js
Interpreting Image Classifiers by Generating Discrete Masks | IEEE Journals & Magazine | IEEE Xplore

Interpreting Image Classifiers by Generating Discrete Masks


Abstract:

Deep models are commonly treated as black-boxes and lack interpretability. Here, we propose a novel approach to interpret deep image classifiers by generating discrete ma...Show More

Abstract:

Deep models are commonly treated as black-boxes and lack interpretability. Here, we propose a novel approach to interpret deep image classifiers by generating discrete masks. Our method follows the generative adversarial network formalism. The deep model to be interpreted is the discriminator while we train a generator to explain it. The generator is trained to capture discriminative image regions that should convey the same or similar meaning as the original image from the model’s perspective. It produces a probability map from which a discrete mask can be sampled. Then the discriminator is used to measure the quality of the sampled mask and provide feedbacks for updating. Due to the sampling operations, the generator cannot be trained directly by back-propagation. We propose to update it using policy gradient. Furthermore, we propose to incorporate gradients as auxiliary information to reduce the search space and facilitate training. We conduct both quantitative and qualitative experiments on the ILSVRC dataset. Experimental results indicate that our method can provide reasonable explanations for predictions and outperform existing approaches. In addition, our method can pass the model randomization test, indicating that it is reasoning the attribution of network predictions.
Page(s): 2019 - 2030
Date of Publication: 06 October 2020

ISSN Information:

PubMed ID: 33021938

Funding Agency:


Contact IEEE to Subscribe

References

References is not available for this document.