A Differentiable Language Model Adversarial Attack on Text Classifiers


The training phase of the DILMA architecture consists of several steps. First step: obtain logits from a Language Model for input. Second step: sampling from the multinom...

Abstract:

Transformer models play a crucial role in state-of-the-art solutions to problems in natural language processing (NLP). They have billions of parameters and are typically treated as black boxes. Because these huge Transformer-based models are so widely adopted, their robustness is an important question. One way to understand and improve the robustness of these models is to explore an adversarial attack scenario: check whether a small perturbation of an input, imperceptible to a human, can fool a model. Due to the discrete nature of textual data, gradient-based adversarial methods, widely used in computer vision, are not applicable per se. The standard strategy to overcome this issue is to develop token-level transformations, which do not take the whole sentence into account. The semantic meaning and grammatical correctness of the sentence are often lost in such approaches. In this paper, we propose a new black-box sentence-level attack. Our method fine-tunes a pre-trained language model to generate adversarial examples. The proposed differentiable loss function depends on a substitute classifier score and an approximate edit distance computed via a deep learning model. We show that the proposed attack outperforms competitors on a diverse set of NLP problems in both computed metrics and human evaluation. Moreover, because a fine-tuned language model is used, the generated adversarial examples are hard to detect, so current models are not robust. Hence, it is difficult to defend against the proposed attack, which is not the case for other attacks. Our attack demonstrates the largest decrease in classification accuracy on all datasets (on AG News: 0.95 without attack, 0.89 under the SamplingFool attack, 0.82 under the DILMA attack).
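The abstract states only that the loss depends on a substitute classifier score and an approximate edit distance; it does not give the formula. The sketch below is one possible way to combine two such terms, not the paper's actual loss. The function name attack_loss, the weight beta, the log form of the classifier term, and the candidate values are all assumptions made for illustration. Plain Python is used here to show the trade-off; in practice both terms would come from neural models inside an automatic differentiation framework so that gradients can reach the language model.

```python
import math

def attack_loss(p_true_class, approx_edit_distance, beta=1.0):
    """Hypothetical combined loss for one generated sentence.

    p_true_class: substitute classifier's probability of the original label
        for the generated text (lower means a stronger attack).
    approx_edit_distance: estimate of the edit distance to the original text
        (lower means the text changed less).
    beta: illustrative weight on the distance term (an assumption).
    """
    eps = 1e-12  # keeps log() finite when p_true_class is 1.0
    classifier_term = -math.log(max(1.0 - p_true_class, eps))
    distance_term = beta * approx_edit_distance
    return classifier_term + distance_term

# Made-up candidate rewrites of one input: (p_true_class, approx_edit_distance)
candidates = [(0.90, 1.0), (0.20, 2.0), (0.15, 9.0)]

# The lowest loss balances fooling the classifier against staying close
# to the original text.
best = min(candidates, key=lambda c: attack_loss(*c))
```

With these made-up values, the second candidate wins: it sharply lowers the classifier's confidence in the original label while changing the text only slightly, whereas the third candidate fools the classifier a little more at a much larger edit distance.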
Published in: IEEE Access (Volume: 10)
Page(s): 17966 - 17976
Date of Publication: 11 February 2022
Electronic ISSN: 2169-3536

Ivan Fursov
Skolkovo Institute of Science and Technology, Moscow, Russia
Ivan Fursov was born in Chelyabinsk, Russia. He received the master’s degree from the Skolkovo Institute of Science and Technology, in 2020. He is currently a Deep Learning Research Engineer at Tinkoff and continues to work on new approaches in adversarial attacks on NLP models. In his master’s thesis, he proposed a new adversarial attack on sequence classifiers.
Alexey Zaytsev
Skolkovo Institute of Science and Technology, Moscow, Russia
Alexey Zaytsev was born in Kharkiv, Ukraine. He graduated from MIPT in 2012. He received the Ph.D. degree in mathematics from IITP RAS in 2017. He is currently an Assistant Professor at Skoltech. His research interests include the development of new methods for sequential data, Bayesian optimization, and embeddings for weakly structured data. In his master’s thesis, he proposed a modification of the Bayesian approach to linear regression that allows automated feature selection.
Pavel Burnyshev
Skolkovo Institute of Science and Technology, Moscow, Russia
Huawei Noah’s Ark Laboratory, Moscow, Russia
Pavel Burnyshev was born in Perm, Russia. He graduated from the MIPT, in 2020. He is currently pursuing the Master of Science degree with the Skolkovo Institute of Science and Technology. He is also a Data Scientist at the NLP Department, Huawei, and works on adversarial attacks for machine translation.
Ekaterina Dmitrieva
HSE University, Moscow, Russia
Ekaterina Dmitrieva is currently pursuing the Ph.D. degree with the CS Faculty, HSE University. Her research interests include semantic parsing, in particular text2SQL models and adversarial attacks.
Nikita Klyuchnikov
Skolkovo Institute of Science and Technology, Moscow, Russia
Nikita Klyuchnikov received the M.Sc. degree in information science and technology from the Skolkovo Institute of Science and Technology, the M.Sc. degree in applied mathematics and physics from the Moscow Institute of Physics and Technology, in 2016, and the Ph.D. degree in computational and data science and engineering from the Skolkovo Institute of Science and Technology, in 2021. His main research interests include machine learning, Bayesian optimization, and their industrial applications.
Andrey Kravchenko
University of Oxford, Oxford, U.K.
Andrey Kravchenko is currently a Researcher at the University of Oxford and the Skolkovo Institute of Science and Technology. His Ph.D. research was at the intersection of machine learning and unstructured data extraction. He also played a significant role in the DIADEM project, which produced state-of-the-art research in the field of large-scale fully automated web data extraction. His current research interests include the theory and application of anomaly detection in big data using sequences and graphs, and in particular, the development of efficient machine learning algorithms based on the embedding of vectors. He also works on exploring the broader connection between black-box machine learning models and knowledge-based systems, with a particular focus on knowledge graphs.
Ekaterina Artemova
Huawei Noah’s Ark Laboratory, Moscow, Russia
HSE University, Moscow, Russia
Ekaterina Artemova graduated from HSE University. She received the Ph.D. degree from the Institute of System Analysis, RAS. She is currently a Postdoctoral Researcher at the CS Faculty, HSE University, and advises the Noah’s Ark NLP Team on advanced research topics. She focuses on NLU tasks, ranging from ToD systems to IE and creating new datasets.
Evgenia Komleva
Skolkovo Institute of Science and Technology, Moscow, Russia
Evgenia Komleva graduated from the MIPT, in 2021. She is currently pursuing the master’s degree in data science with the Skolkovo Institute of Science and Technology. She is also working on NLP problems at ABBYY and plans to continue her research on adversarial attacks.
Evgeny Burnaev
Skolkovo Institute of Science and Technology, Moscow, Russia
Artificial Intelligence Research Institute (AIRI), Moscow, Russia
Evgeny Burnaev received the M.Sc. degree from the Moscow Institute of Physics and Technology, in 2006, and the Ph.D. degree from the Institute for Information Transmission Problems, in 2008. He is currently an Associate Professor at the Skolkovo Institute of Science and Technology, Moscow, Russia. His research interests include Gaussian processes for multi-fidelity surrogate modeling and optimization, deep learning for 3D data analysis and manifold learning, and on-line learning for prediction and anomaly detection.
