Diversity Input Method [Eng]
Xie et al. / Improving Transferability of Adversarial Examples with Input Diversity / CVPR 2019
To read this review in Korean, please click here.
1. Introduction
✔Adversarial Attack
An adversarial attack is a technique that induces incorrect predictions from a model by intentionally adding noise to the image, as shown in the figure. Adversarial attacks are classified into targeted and non-targeted attacks. A targeted attack induces the target model to predict a specific class, while a non-targeted attack does not steer the prediction toward any particular class and simply causes a misprediction.
In a white-box attack, the attacker can access the model to be attacked, including its weights, so the gradient of the loss function with respect to the input image can be obtained. This gradient is used to create an adversarial image.
✔Transfer-Based Adversarial-Attack
If the model you want to attack is inaccessible, you should try a transfer-based adversarial attack, which exploits the transferability of the adversarial image: an adversarial image created by a white-box attack on a source model also attacks the target model. Therefore, to improve the success rate of transfer-based attacks, it is very important to prevent the overfitting phenomenon in which the adversarial image depends on the source model and performs well only against that model.
The Diversity Input Method (DIM) generates an adversarial image by feeding the model an image that has undergone random resizing and random padding. This is based on the assumption that an adversarial image should remain adversarial even when its size and position change. It prevents the adversarial image from overfitting the source model and maintains its adversarial effect across multiple models.
2. Method
Diversity Input Method(DIM)✨
The core idea of the Diversity Input Method (DIM) is to avoid the dependence of the adversarial image on the source model by using the gradient of an image transformed with random resizing and random padding. This transform process will be called the DI transform. The image below compares the original image with the image after the DI transform.
The implementation of the DI transformation in this paper is as follows:
Random resizing: resize the image to rnd × rnd × 3 (rnd ∈ [299, 330))
Random padding: randomly pad the image on the top, bottom, left, and right so that it becomes 330 × 330 × 3
The paper's implementation uses TensorFlow, and the image size is fixed to 330 × 330 × 3 after the DI transform (the image is later resized again to match the model's input size). I use PyTorch and keep the paper's random resizing and random padding, but resize the image back to its original size after the DI transform, so no post-processing step is needed.
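As a reading aid, here is a minimal sketch of such a DI transform in PyTorch. The function name `di_transform` and the default bounds (299 and 330, taken from the paper's setting) are illustrative assumptions, and the final resize back to the original size follows the choice described above rather than the paper's fixed 330 × 330 output.

```python
import torch
import torch.nn.functional as F

def di_transform(x, resize_min=299, resize_max=330):
    """Sketch of a DI transform: random resizing, random padding,
    then resizing back to the original spatial size."""
    _, _, h, w = x.shape
    # Random resizing: pick rnd in [resize_min, resize_max)
    rnd = torch.randint(resize_min, resize_max, (1,)).item()
    resized = F.interpolate(x, size=(rnd, rnd), mode='nearest')
    # Random padding: pad up to resize_max x resize_max
    pad_total = resize_max - rnd
    pad_left = torch.randint(0, pad_total + 1, (1,)).item()
    pad_top = torch.randint(0, pad_total + 1, (1,)).item()
    padded = F.pad(resized, [pad_left, pad_total - pad_left,
                             pad_top, pad_total - pad_top], value=0)
    # Resize back to the original input size so no post-processing is needed
    return F.interpolate(padded, size=(h, w), mode='nearest')
```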
The DI transform has the advantage that it can be combined with well-known transfer-based adversarial attacks (I-FGSM, MI-FGSM). When the I-FGSM attack is combined with the DI transform, the result is referred to as DI-FGSM. Each attack method is introduced in the related work below.
Related work✨
1) Iterative Fast Gradient Sign Method (I-FGSM)
The fast gradient sign method (FGSM) changes each pixel of the input image X by ε in the direction that increases the loss function L(X, y_true) for the true class y_true, creating an adversarial image X^adv.
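Written out, the standard FGSM update is:

$$X^{adv} = X + \epsilon \cdot \mathrm{sign}\left(\nabla_X L(X, y_{true})\right)$$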
The iterative fast gradient sign method (I-FGSM) repeatedly executes an FGSM-style update that changes each pixel by α.
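Starting from $X^{adv}_0 = X$, the standard iterative update can be written as:

$$X^{adv}_{t+1} = \mathrm{Clip}_X^{\epsilon}\left\{X^{adv}_t + \alpha \cdot \mathrm{sign}\left(\nabla_X L(X^{adv}_t, y_{true})\right)\right\}$$

where $\mathrm{Clip}_X^{\epsilon}\{\cdot\}$ keeps each pixel within the ε-ball around the original image X.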
2) momentum iterative FGSM (MI-FGSM)
One method of preventing overfitting to the source model uses momentum (MI-FGSM). MI-FGSM is performed iteratively like I-FGSM, but it accumulates gradient information g_t from the first iteration to the current one and uses it for the adversarial image update. The difference is that the sign of g_t, rather than the sign of the gradient of the loss function, is used for the update.
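With a decay factor $\mu$, the standard MI-FGSM update can be written as:

$$g_{t+1} = \mu \cdot g_t + \frac{\nabla_X L(X^{adv}_t, y_{true})}{\lVert \nabla_X L(X^{adv}_t, y_{true}) \rVert_1}, \qquad X^{adv}_{t+1} = \mathrm{Clip}_X^{\epsilon}\left\{X^{adv}_t + \alpha \cdot \mathrm{sign}(g_{t+1})\right\}$$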
Accumulating gradients helps the attack avoid poor local maxima, and it stabilizes the update because the direction of the adversarial perturbation stays similar across iterations, unlike in I-FGSM. Therefore, MI-FGSM shows better transferability than I-FGSM.
3. Implementation
Python version >= 3.6 is used
PyTorch is used in the code implementation
A manual seed is used to fix randomness (included in the example code below)
🔨 Environment
The environment required to implement the DI transform is provided as a yml file (env_di-fgsm.yml). You can set it up with an Anaconda virtual environment by entering the following command.
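For example, the usual conda workflow would look like this (the environment name to activate is whatever is defined inside the yml file):

```bash
# Create the environment from the provided yml file
conda env create -f env_di-fgsm.yml
# Activate it under the name defined in the yml file
conda activate <env-name-from-yml>
```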
📋DI-FGSM
In this file, DI-FGSM is implemented. I used _**comments**_ to explain the overall code. The sizes of the tensors are shown as examples based on the CIFAR-10 images (size: 32 × 32) used in the example file (Transfer Attack.py) introduced below.
The diverse_input function in the DIFGSM class is the core part of DI-FGSM; the random resizing and random padding are implemented there. After the diverse_input function is called in the forward function, backpropagation is performed.
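To make this structure concrete, below is a minimal, hypothetical sketch of what such a DIFGSM class can look like in PyTorch. The names and details (for example, how `di_pad_amount` controls the resize/pad bounds) are assumptions for illustration, so the repository code may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DIFGSM:
    """Illustrative DI-FGSM sketch: I-FGSM whose gradients are computed
    on DI-transformed (randomly resized + padded) inputs."""

    def __init__(self, model, eps=16/255, alpha=2/255, steps=10, di_pad_amount=5):
        self.model = model
        self.eps = eps                      # L_inf perturbation budget
        self.alpha = alpha                  # per-step size
        self.steps = steps                  # number of iterations
        self.di_pad_amount = di_pad_amount  # extra pixels for resize/pad (assumed meaning)
        self.loss_fn = nn.CrossEntropyLoss()

    def diverse_input(self, x):
        # Random resizing, random padding to a fixed size, then resizing
        # back to the original input size (mirrors the description above).
        _, _, h, w = x.shape
        max_size = h + self.di_pad_amount
        rnd = torch.randint(h, max_size, (1,)).item()
        x_resized = F.interpolate(x, size=(rnd, rnd), mode='nearest')
        pad_total = max_size - rnd
        pad_left = torch.randint(0, pad_total + 1, (1,)).item()
        pad_top = torch.randint(0, pad_total + 1, (1,)).item()
        x_padded = F.pad(x_resized, [pad_left, pad_total - pad_left,
                                     pad_top, pad_total - pad_top], value=0)
        return F.interpolate(x_padded, size=(h, w), mode='nearest')

    def forward(self, images, labels):
        images = images.clone().detach()
        adv = images.clone().detach()
        for _ in range(self.steps):
            adv.requires_grad_(True)
            # The gradient is taken with respect to the DI-transformed image
            logits = self.model(self.diverse_input(adv))
            loss = self.loss_fn(logits, labels)
            grad = torch.autograd.grad(loss, adv)[0]
            # I-FGSM update with the sign of the gradient, clipped to the eps-ball
            adv = adv.detach() + self.alpha * grad.sign()
            adv = torch.min(torch.max(adv, images - self.eps), images + self.eps)
            adv = torch.clamp(adv, 0, 1).detach()
        return adv
```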
📋Example code
In Transfer Attack.py, I tested the performance of Transfer Attack using DI-FGSM.
##3: This part shows the attack process and the result on the source model. You can specify the attack as `atk = DIFGSM(model, eps=16 / 255, alpha=2 / 255, steps=10, di_pad_amount=5)`.
##5, ##6: These show the clean accuracy of the target model on the validation set, and the robust accuracy of the target model on the adversarial images created in ##3.
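A condensed, hypothetical version of this evaluation flow is sketched below; it assumes the DIFGSM class sketched earlier, and `source_model`, `target_model`, and `val_loader` are placeholder names rather than the exact ones in Transfer Attack.py.

```python
import torch

# Craft adversarial images on the source model, then evaluate the target model
atk = DIFGSM(source_model, eps=16 / 255, alpha=2 / 255, steps=10, di_pad_amount=5)

correct_clean, correct_robust, total = 0, 0, 0
for images, labels in val_loader:                 # CIFAR-10 validation set (placeholder loader)
    adv_images = atk.forward(images, labels)      # adversarial images from the source model
    with torch.no_grad():
        clean_pred = target_model(images).argmax(dim=1)
        robust_pred = target_model(adv_images).argmax(dim=1)
    correct_clean += (clean_pred == labels).sum().item()
    correct_robust += (robust_pred == labels).sum().item()
    total += labels.size(0)

print(f"clean accuracy:  {100 * correct_clean / total:.2f} %")
print(f"robust accuracy: {100 * correct_robust / total:.2f} %")
```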
example🚀
results🚀
The clean accuracy of the target model on the validation set is 87.26%, which is relatively high.
On the other hand, the robust accuracy of the target model on adversarial images crafted with DI-FGSM on the source model drops to 38.87%, indicating a successful transfer-based adversarial attack.
Author / Reviewer information
Author😍
Hee-Seon Kim
KAIST EE
https://github.com/khslily98
hskim98@kaist.ac.kr
Reviewer😍
Korean name (English name): Affiliation / Contact information
Korean name (English name): Affiliation / Contact information
...
Reference & Additional materials
Citation of this paper
Official (unofficial) GitHub repository
Citation of related work
Other useful materials
...