Securing Machine Learning: Streamlining Attacks and Defenses Under Realistic Adversary Models

School of Science | Doctoral thesis (article-based) | Defence date: 2022-08-12

Date

2022

Major/Subject

Mcode

Degree programme

Language

en

Pages

100 + app. 86

Series

Aalto University publication series DOCTORAL THESES, 90/2022

Abstract

Over the last decade, machine learning (ML) and artificial intelligence (AI) solutions have been widely adopted in many applications due to their remarkable performance across domains. However, the rapid progress of AI- and ML-driven systems introduces new security risks spanning the entire ML pipeline, as well as vulnerabilities that adversaries can exploit to attack these systems. Although various defense methods have been proposed to prevent, mitigate, or detect attacks, they remain limited or ineffective against continuously evolving attack strategies, which are fueled by the growing availability of ML tools, databases, and computing resources. For instance, evasion attacks, which aim to fool ML systems, have become a major threat to security- and safety-critical applications. Model extraction attacks, which attempt to compromise the confidentiality of ML models, are another important concern. Both attacks pose serious threats even when the ML model is deployed behind an application programming interface (API) and exposes no information about the model itself to end users.

This dissertation investigates the security threats that evasion and model extraction attacks pose to ML systems. It consists of three parts: adversarial examples in ML, ownership verification in ML, and model extraction as a realistic threat. In the first part, we develop evasion attacks against image classifiers and deep reinforcement learning (DRL) agents that are simultaneously effective and efficient, with a particular focus on operating within realistic adversary models. We show that our evasion attack on image classifiers matches the effectiveness of state-of-the-art attacks at a cost that is three orders of magnitude lower. We also demonstrate that we can destroy the performance of DRL agents with a small online cost and without modifying their internal states.

In the second part, we propose a novel approach that integrates ML watermarking into the federated learning process with low computational overhead (+3.2%) and negligible degradation in model performance (-0.17%). We also show that existing dataset tracing and watermarking methods can reliably demonstrate ownership only for large datasets with many classes (≥30) and against adversaries with limited capabilities.

In the third part, we show that the effectiveness of state-of-the-art model extraction attacks depends on several factors, such as the amount of information the API returns for each input and the adversary's knowledge of the task and the ML model architecture. We also develop alternative watermarking techniques that survive model extraction attacks and deter adversaries by increasing the cost of the attack. The findings of this dissertation will help ML model owners evaluate potential vulnerabilities and remedies against model evasion and extraction attacks under different security requirements and realistic adversary models.
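To illustrate the black-box evasion setting studied in the first part (Publication 1), the sketch below runs a simple targeted random-search attack against a toy classifier that is reachable only through a prediction API. It is a minimal illustration of query-based evasion, not the attack developed in the thesis; the toy model, step size, and query budget are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "victim" classifier exposed only through a query API -- a hypothetical
# stand-in for a real image classifier served behind an inference endpoint.
W = rng.normal(size=(10, 32 * 32))            # 10 classes, flattened 32x32 inputs

def query_api(x):
    """Return class probabilities for one input, as a prediction API might."""
    logits = W @ x.ravel()
    e = np.exp(logits - logits.max())
    return e / e.sum()

def targeted_random_search(x0, target, eps=0.05, budget=2000):
    """Greedy random search: keep a candidate perturbation only if it raises
    the target-class probability. Illustrative only, not the thesis's attack."""
    x = x0.copy()
    best = query_api(x)[target]
    queries = 1
    while queries < budget:
        cand = np.clip(x + rng.normal(scale=eps, size=x.shape), 0.0, 1.0)
        probs = query_api(cand)
        queries += 1
        if probs[target] > best:
            x, best = cand, probs[target]
            if probs.argmax() == target:      # attack succeeded
                break
    return x, best, queries

x0 = rng.uniform(size=(32, 32))               # placeholder input "image"
x_adv, p_target, used = targeted_random_search(x0, target=3)
print(f"target-class probability {p_target:.3f} after {used} queries")
```

The number of API queries consumed is the natural cost metric here, which is what the thesis's comparison of attack cost against the state of the art refers to.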
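The second part (Publication 3, WAFFLE) embeds a watermark into the global model during federated learning without access to clients' training data. The sketch below shows the general idea under simplifying assumptions: the aggregator averages client updates (plain FedAvg) and then briefly fine-tunes the global model on a server-side trigger set. The architecture, random trigger set, and hyperparameters are placeholders, not those used in the publication.

```python
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder global model and a random, arbitrarily labeled trigger set kept by
# the aggregator (stand-ins for the real architecture and trigger patterns).
global_model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10))
trigger_x = torch.rand(32, 1, 28, 28)
trigger_y = torch.randint(0, 10, (32,))

def fedavg(client_states):
    """Average client model parameters (plain FedAvg, equal weights)."""
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = torch.stack([s[key] for s in client_states]).mean(dim=0)
    return avg

def embed_watermark(model, steps=20, lr=0.01):
    """Fine-tune the aggregated model on the trigger set only (server side)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(trigger_x), trigger_y)
        loss.backward()
        opt.step()
    return model

# One illustrative round: pretend three clients returned locally trained weights.
client_states = [copy.deepcopy(global_model.state_dict()) for _ in range(3)]
global_model.load_state_dict(fedavg(client_states))
global_model = embed_watermark(global_model)

acc = (global_model(trigger_x).argmax(dim=1) == trigger_y).float().mean().item()
print(f"trigger-set accuracy after embedding: {acc:.2f}")
```

Ownership can later be claimed by showing that a suspect model predicts the trigger labels far more accurately than an independently trained model would.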
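The third part (Publications 4 and 5) concerns model extraction: an adversary queries the victim's prediction API, collects the returned outputs, and trains a surrogate model on them; how much the API reveals per query (full probability vectors versus top-1 labels) and the adversary's knowledge of the task both affect how well this works. The sketch below is a minimal illustration with synthetic data and logistic-regression victim and surrogate models, not the models, datasets, or attacks evaluated in the thesis.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the victim's task and model (hypothetical, for illustration).
X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           n_classes=4, random_state=0)
victim = LogisticRegression(max_iter=1000).fit(X[:1000], y[:1000])

def victim_api(queries, top1_only=True):
    """Prediction API: returns either only predicted labels (less information)
    or full probability vectors (more information for the adversary)."""
    return victim.predict(queries) if top1_only else victim.predict_proba(queries)

# Adversary: choose query inputs, collect API outputs, train a surrogate model.
rng = np.random.default_rng(1)
queries = rng.normal(size=(1000, 20))              # attacker-chosen query inputs
labels = victim_api(queries, top1_only=True)
surrogate = LogisticRegression(max_iter=1000).fit(queries, labels)

# Agreement with the victim on held-out task data measures extraction success.
agreement = (surrogate.predict(X[1000:]) == victim.predict(X[1000:])).mean()
print(f"surrogate/victim agreement: {agreement:.2f}")
```

Defenses studied in the thesis either make such surrogates less faithful or, as in DAWN-style dynamic watermarking, cause the extracted model to carry a verifiable watermark.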

Description

Supervising professor

Asokan, N., Adjunct Prof., Aalto University, Department of Computer Science, Finland

Thesis advisor

Marchal, Samuel, Dr., Aalto University, Department of Computer Science, Finland

Keywords

adversarial machine learning, model evasion, model extraction, ownership verification

Other note

Parts

  • [Publication 1]: Mika Juuti, Buse Gul Atli, N. Asokan. Making Targeted Black-box Evasion Attacks Effective and Efficient. In ACM Workshop on Artificial Intelligence and Security, London, UK, pp. 83-94, November 2019.
    DOI: 10.1145/3338501.3357366
  • [Publication 2]: Buse G. A. Tekgul, Shelly Wang, Samuel Marchal, N. Asokan. Real-time Adversarial Perturbations against Deep Reinforcement Learning Policies: Attacks and Defenses. In submission, 13 pages, 2022.
  • [Publication 3]: Buse G. A. Tekgul, Yuxi Xia, Samuel Marchal, N. Asokan. WAFFLE: Watermarking in Federated Learning. In International Symposium on Reliable Distributed Systems, Virtual, USA, pp. 310-320, September 2021.
  • [Publication 4]: Buse Gul Atli, Sebastian Szyller, Mika Juuti, Samuel Marchal, N. Asokan. Extraction of Complex DNN Models: Real Threat or Boogeyman? In International Workshop on Engineering Dependable and Secure Machine Learning Systems, New York, NY, USA, Springer CCIS, volume 1272, pp. 42-57, February 2020.
    DOI: 10.1007/978-3-030-62144-5_4
  • [Publication 5]: Sebastian Szyller, Buse Gul Atli, Samuel Marchal, N. Asokan. DAWN: Dynamic Adversarial Watermarking of Neural Networks. In ACM International Conference on Multimedia, Virtual, China, pp. 4417–4425, October 2021.
  • [Publication 6]: Buse Gul Atli Tekgul, N. Asokan. On the Effectiveness of Dataset Watermarking. In ACM International Workshop on Security and Privacy Analytics, Baltimore, USA, pp. 93-99, April 2022.

Citation