Abstract:
This paper presents an adversarial training framework WasserTrain for improving model robustness against the adversarial attacks in terms of the Wasserstein distance. Fir...Show MoreMetadata
Abstract:
This paper presents an adversarial training framework WasserTrain for improving model robustness against the adversarial attacks in terms of the Wasserstein distance. First, an effective attack method WasserAttack is introduced with a novel encoding of the optimization problem, which directly finds the worst point within the Wasserstein ball while keeping the relaxation error of the Wasserstein transformation as small as possible. The proposed adversarial training frame-work utilizes these high-quality adversarial examples to train robust models. Experiments on MNIST show that the adversarial loss arising from adversarial examples found by our method is about three times as much as that found by the PGD-based attack method. Furthermore, within the Wasserstein ball with a radius of 0.5, the WasserTrain model achieves 31% adversarial robustness against WasserAttack, which is 22% higher than that on the PGD-based training model.
Published in: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 23-27 May 2022
Date Added to IEEE Xplore: 27 April 2022
ISBN Information: