Evaluating Robustness of a CNN Architecture introduced to the Adversarial Attacks

University essay from Blekinge Tekniska Högskola

Abstract:

Background: Previous research shows that state-of-the-art deep neural networks achieve impressive results on many image classification tasks. However, adversarial attacks can easily fool these networks by adding small amounts of noise to the input images. This vulnerability raises significant concerns about deploying deep neural network-based systems in security-sensitive real-world settings, so research on attacking and defending such architectures with adversarial examples has drawn considerable attention. Here, we use Convolutional Neural Networks (CNNs), a technique for image classification known for producing favorable results.

Objectives: This thesis reviews the types of adversarial attacks and CNN architectures in the current scientific literature. An experiment is conducted to build a CNN architecture that classifies the handwritten digits in the MNIST dataset, and adversarial attacks are applied to the images to evaluate the fluctuations in classification accuracy. The study also includes an experiment using the defensive distillation technique to improve the architecture's performance under adversarial attacks.

Methods: This thesis uses two methods. A systematic literature review identifies the best-performing CNN architectures and the best-performing adversarial attack techniques. The experiment consists of building a CNN model based on a modified LeNet architecture with two convolutional layers, one max-pooling layer, and two dropout layers. The model is trained and tested on the MNIST dataset, and the adversarial attacks FGSM, I-FGSM, and MI-FGSM are applied to the input images to evaluate the model's performance. The model is then modified slightly using the defensive distillation technique and tested against the same attacks to evaluate the architecture's performance.

Results: An experiment is conducted to evaluate the robustness of the CNN architecture in classifying the handwritten digits. Graphs show the accuracy on the test dataset before and after the adversarial attacks. The defensive distillation mechanism is then applied to counter the adversarial attacks and achieve a more robust architecture.

Conclusions: The results show that the FGSM, I-FGSM, and MI-FGSM attacks reduce the test accuracy from 95% to around 35%. At a maximum epsilon of 0.3, these three attacks on the proposed network successfully remove roughly 70% of the test accuracy in all three cases. With the defensive distillation mechanism, the test accuracy drops only from 90% to 88% at the same maximum epsilon of 0.3. The proposed defensive distillation process is therefore successful in defending against these adversarial attacks.
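For orientation, the following is a minimal PyTorch sketch of a modified LeNet-style model with two convolutional layers, one max-pooling layer, and two dropout layers, as the abstract describes. The filter counts, kernel sizes, dropout rates, and fully connected layer sizes are illustrative assumptions, not values taken from the thesis.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModifiedLeNet(nn.Module):
    """Sketch of a modified LeNet-style CNN for MNIST: two convolutional
    layers, one max-pooling layer, and two dropout layers. Hyperparameters
    here are assumptions, not the thesis' values."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3)   # MNIST input is 1 x 28 x 28
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)
        self.pool = nn.MaxPool2d(2)                     # the single max-pooling layer
        self.drop1 = nn.Dropout(0.25)                   # first dropout
        self.drop2 = nn.Dropout(0.5)                    # second dropout
        self.fc1 = nn.Linear(64 * 12 * 12, 128)
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x):
        x = F.relu(self.conv1(x))        # -> 32 x 26 x 26
        x = F.relu(self.conv2(x))        # -> 64 x 24 x 24
        x = self.pool(x)                 # -> 64 x 12 x 12
        x = self.drop1(x)
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        x = self.drop2(x)
        return self.fc2(x)               # logits over the ten digit classes
```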
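The FGSM, I-FGSM, and MI-FGSM attacks named in the abstract all perturb the input in the direction of the sign of the loss gradient: FGSM takes a single step of size epsilon, I-FGSM takes several smaller steps projected back into the epsilon-ball, and MI-FGSM adds a momentum term to the iterated gradient. The sketch below illustrates this general form; the step count and the choice alpha = eps / steps are assumptions, not the thesis' settings, and inputs are assumed to be scaled to [0, 1].

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """One-step FGSM: move x by eps in the direction of the sign of the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def mi_fgsm(model, x, y, eps, steps=10, mu=1.0):
    """Iterative FGSM with momentum (MI-FGSM); mu=0 recovers plain I-FGSM.
    alpha = eps / steps is an assumed step size."""
    alpha = eps / steps
    x_orig = x.clone().detach()
    x_adv = x.clone().detach()
    g = torch.zeros_like(x)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        # Accumulate momentum on the L1-normalised gradient.
        g = mu * g + grad / grad.abs().mean(dim=(1, 2, 3), keepdim=True)
        x_adv = x_adv.detach() + alpha * g.sign()
        # Project back into the eps-ball around the original image and the valid pixel range.
        x_adv = (x_orig + (x_adv - x_orig).clamp(-eps, eps)).clamp(0, 1)
    return x_adv
```

Calling, for example, `mi_fgsm(model, images, labels, eps=0.3)` corresponds to the maximum epsilon of 0.3 reported in the conclusions.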
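Defensive distillation, in its standard formulation, trains a teacher network at an elevated softmax temperature, uses the teacher's softened predictions as training targets for a distilled network, and deploys the distilled network at temperature 1. The sketch below shows the two pieces added on top of ordinary training under that formulation; the temperature value is an assumption, and the thesis' exact procedure may differ.

```python
import torch
import torch.nn.functional as F

def distillation_targets(teacher, x, T=20.0):
    """Soft labels from the teacher network at temperature T (T=20 is an assumed value)."""
    with torch.no_grad():
        return F.softmax(teacher(x) / T, dim=1)

def distillation_loss(student_logits, soft_targets, T=20.0):
    """Cross-entropy between the student's temperature-softened predictions and the
    teacher's soft labels; the distilled model is later evaluated at temperature 1."""
    log_probs = F.log_softmax(student_logits / T, dim=1)
    return -(soft_targets * log_probs).sum(dim=1).mean()
```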
