
A Brief Taxonomy Of AI Membership Inference Attacks

Introduction


Membership Inference Attack Taxonomy

In the taxonomy below, membership inference attacks are categorized by target model, adversarial knowledge, attack approach, training method, and target domain.
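For readers who prefer to browse the taxonomy programmatically, the categories, subcategories, and groups named in this article can be written down as a nested mapping. This is only a sketch: the names `TAXONOMY` and `find_path` are made up for illustration, and an empty list marks a subcategory that the article does not divide further.

```python
# Category -> subcategory -> groups, copied from the headings in this article.
TAXONOMY = {
    "target model": {
        "classification models": ["binary-class classifiers", "multi-class classifiers"],
        "generative models": ["GANs", "VAEs"],
        "regression models": ["deep regression"],
        "embedding models": ["NLP embedding", "graph embedding", "image encoder"],
    },
    "adversarial knowledge": {
        "black-box attacks": ["prediction vector", "top-k confidence", "label only"],
        "white-box attacks": [],
    },
    "attack approach": {
        "classifier-based attacks": ["shadow training"],
        "metric-based attacks": [
            "prediction correctness", "prediction loss", "prediction confidence",
            "prediction entropy", "adversarial perturbation", "hypothesis test",
        ],
        "differential comparisons-based attacks": ["BLINDMI"],
    },
    "training method": {
        "centralized training": [],
        "federated training": ["FedAvg", "FedSGD"],
    },
    "target domain": {
        "natural language processing (NLP)": ["text classification", "text generation", "word embedding"],
        "computer vision (CV)": ["image classification", "image generation", "image segmentation"],
        "graph": ["knowledge graphs", "node classification", "graph classification"],
        "audio": ["speech recognition"],
        "recommender system": ["collaborative filtering"],
    },
}

def find_path(group):
    """Return (category, subcategory) for a group name, or None if it is unknown."""
    for category, subcategories in TAXONOMY.items():
        for subcategory, groups in subcategories.items():
            if group in groups:
                return (category, subcategory)
    return None

print(find_path("shadow training"))
```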

Target Model

The target model category of this membership inference attack taxonomy is subcategorized into the following: classification models, generative models, regression models, and embedding models.

Classification Models

The classification models subcategory of the target model category of this membership inference attack taxonomy is divided into the following groups: binary-class classifiers and multi-class classifiers.

Binary-Class Classifiers

Multi-Class Classifiers

Generative Models

The generative models subcategory of the target model category of this membership inference attack taxonomy is divided into the following groups: generative adversarial networks (GANs) and variational autoencoders (VAEs).

GANs

VAEs

Regression Models

The regression models subcategory of the target model category of this membership inference attack taxonomy includes only one group: deep regression.

Deep Regression

Embedding Models

The embedding models subcategory of the target model category of this membership inference attack taxonomy is divided into the following groups: NLP embedding, graph embedding, and image encoder.

NLP Embedding

Graph Embedding

Image Encoder

Adversarial Knowledge

The adversarial knowledge category of this membership inference attack taxonomy is subcategorized into the following: black-box attacks and white-box attacks.

Black-Box Attacks

The black-box attacks subcategory of the adversarial knowledge category of this membership inference attack taxonomy is divided into the following groups: prediction vector, top-k confidence, and label only.
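The three black-box groups differ in how much of the model's output the adversary gets to see. The sketch below illustrates that difference for a single query; the function names and the example probabilities are hypothetical and are not taken from any particular attack implementation.

```python
def prediction_vector(probs):
    """Full view: the adversary sees the probability of every class."""
    return list(probs)

def top_k_confidence(probs, k=3):
    """Truncated view: only the k most likely (class, probability) pairs."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

def label_only(probs):
    """Minimal view: only the index of the predicted class."""
    return max(range(len(probs)), key=lambda i: probs[i])

# Hypothetical softmax output of a four-class target model for one query.
probs = [0.05, 0.70, 0.10, 0.15]

print(prediction_vector(probs))
print(top_k_confidence(probs, k=2))
print(label_only(probs))
```

Each view is a strict reduction of the one before it, so an attack that works with labels alone also works in the other two settings.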

Prediction Vector

Top-K Confidence

Label Only

White-Box Attacks

The white-box attacks subcategory of the adversarial knowledge category of this membership inference attack taxonomy is not further divided into additional groups.

Attack Approach

The attack approach category of this membership inference attack taxonomy is subcategorized into the following: classifier-based attacks, metric-based attacks, and differential comparisons-based attacks.

Classifier-Based Attacks

The classifier-based attacks subcategory of the attack approach category of this membership inference attack taxonomy includes only one group: shadow training.

Shadow Training
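In shadow training, the adversary trains its own "shadow" models on data it controls, records how those models respond to records that were and were not in their training sets, and fits an attack classifier on those labeled responses. The sketch below walks through that loop on toy data. Everything in it is an assumption made for illustration: the data generator, the deliberately memorizing stand-in model, the single attack feature, and the reduction of the attack classifier to one threshold.

```python
import math
import random

def make_data(rng, n):
    """Hypothetical toy data: one feature in [-1, 1] and a noisy binary label."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = int(x + rng.gauss(0.0, 0.5) > 0.0)
        data.append((x, y))
    return data

def train_model(train, bandwidth=0.01):
    """Stand-in model that memorizes its training set via a kernel-weighted vote."""
    def predict_proba(x):
        weights = [0.0, 0.0]
        for xi, yi in train:
            weights[yi] += math.exp(-((x - xi) ** 2) / bandwidth)
        total = sum(weights)
        return [w / total for w in weights] if total > 0.0 else [0.5, 0.5]
    return predict_proba

def attack_feature(model, record):
    """Single attack feature: the confidence the model gives the record's label."""
    x, y = record
    return model(x)[y]

def build_attack_set(rng, n_shadows=3, n=40):
    """Train shadow models and label their outputs as member (1) or non-member (0)."""
    rows = []
    for _ in range(n_shadows):
        shadow_in, shadow_out = make_data(rng, n), make_data(rng, n)
        shadow = train_model(shadow_in)
        rows += [(attack_feature(shadow, r), 1) for r in shadow_in]
        rows += [(attack_feature(shadow, r), 0) for r in shadow_out]
    return rows

def accuracy(rows, threshold):
    """Fraction of rows where 'feature >= threshold' matches the membership label."""
    return sum((f >= threshold) == bool(m) for f, m in rows) / len(rows)

def fit_threshold(rows):
    """Attack classifier reduced to one threshold, chosen for best training accuracy."""
    return max((f for f, _ in rows), key=lambda t: accuracy(rows, t))

rng = random.Random(0)
rows = build_attack_set(rng)
threshold = fit_threshold(rows)

# The target model is trained separately; the attacker only queries it.
target_train = make_data(rng, 40)
target = train_model(target_train)

def is_member(record):
    """Apply the attack learned on shadow models to the target model."""
    return attack_feature(target, record) >= threshold

print("attack accuracy on shadow data:", accuracy(rows, threshold))
```

A realistic attack would replace the threshold with a trained classifier over richer features, and would have to justify the assumption that shadow data resembles the target's training data.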

Metric-Based Attacks

The metric-based attacks subcategory of the attack approach category of this membership inference attack taxonomy is divided into the following groups: prediction correctness, prediction loss, prediction confidence, prediction entropy, adversarial perturbation, and hypothesis test.
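The first four metrics in this list can each be computed directly from a model's output for one record and then compared against a threshold. The sketch below shows one plausible way to write them so that a higher score always points toward "member". The function names and the threshold value are placeholders, and in practice a threshold would need to be calibrated rather than picked by hand.

```python
import math

def correctness_score(probs, true_label):
    """1.0 if the model predicts the true label, otherwise 0.0."""
    predicted = max(range(len(probs)), key=lambda i: probs[i])
    return 1.0 if predicted == true_label else 0.0

def loss_score(probs, true_label, eps=1e-12):
    """Negative cross-entropy loss, so that a low loss gives a high score."""
    return math.log(max(probs[true_label], eps))

def confidence_score(probs):
    """Largest predicted probability; does not need the true label."""
    return max(probs)

def entropy_score(probs, eps=1e-12):
    """Negative Shannon entropy, so that a peaked prediction gives a high score."""
    return sum(p * math.log(max(p, eps)) for p in probs)

def infer_membership(score, threshold):
    """Flag the record as a training-set member when the score clears the threshold."""
    return score >= threshold

# Hypothetical outputs: a confident prediction and an uncertain one.
confident = [0.02, 0.95, 0.03]
uncertain = [0.30, 0.40, 0.30]

for probs in (confident, uncertain):
    score = confidence_score(probs)
    print(probs, score, infer_membership(score, threshold=0.9))
```

The correctness and loss metrics need the record's true label, while confidence and entropy work from the prediction alone.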

Prediction Correctness

Prediction Loss

Prediction Confidence

  • ML-Leaks: Model and Data Independent Membership Inference Attacks and Defenses on Machine Learning Models – Salem et al. – https://arxiv.org/abs/1806.01246

Prediction Entropy

Adversarial Perturbation
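The intuition behind perturbation-based attacks is that a model's prediction on a training record tends to be harder to flip than its prediction on an unseen record. The sketch below is a loose illustration of that idea, not an implementation of any published attack: it swaps a real adversarial-example search for random Gaussian noise, and `toy_classifier`, `sigma`, and `n_queries` are hypothetical.

```python
import random

def toy_classifier(x):
    """Hypothetical label-only model: class 1 if the features sum above zero."""
    return int(sum(x) > 0.0)

def robustness_score(predict_label, x, rng, sigma=0.1, n_queries=50):
    """Fraction of noisy copies of x that keep the label predicted for x.

    Uses random noise as a stand-in for adversarial perturbation; a higher
    score means the prediction is harder to flip.
    """
    base_label = predict_label(x)
    kept = 0
    for _ in range(n_queries):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        kept += predict_label(noisy) == base_label
    return kept / n_queries

rng = random.Random(0)
far_from_boundary = [5.0, 5.0]
near_boundary = [0.01, -0.005]

print(robustness_score(toy_classifier, far_from_boundary, rng))
print(robustness_score(toy_classifier, near_boundary, rng))
```

An attacker would then threshold this score, treating records with unusually robust predictions as likely members. Only predicted labels are queried, which is why this family fits the label-only setting.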

Hypothesis Test

Differential Comparisons-Based Attacks

The differential comparisons-based attacks subcategory of the attack approach category of this membership inference attack taxonomy includes only one group: BLINDMI.

BLINDMI

Training Method

The training method category of this membership inference attack taxonomy is subcategorized into the following: centralized training and federated training.

Centralized Training

The centralized training subcategory of the training method category of this membership inference attack taxonomy is not further divided into additional groups.

Federated Training

The federated training subcategory of the training method category of this membership inference attack taxonomy is divided into the following groups: FedAvg and FedSGD.
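The two groups differ in what each participant sends to the server, which in turn shapes what an adversary inside the federation can observe: gradients in FedSGD, locally trained weights in FedAvg. The sketch below contrasts one round of each on a one-parameter linear model. The model, the client data, the learning rate, and the function names are all assumptions chosen to keep the example small.

```python
def gradient(w, data):
    """Mean-squared-error gradient for the one-parameter model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)

def fedsgd_round(w, clients, lr=0.1):
    """FedSGD: each client sends one gradient; the server averages them and steps."""
    total = sum(len(data) for data in clients)
    avg_grad = sum(len(data) * gradient(w, data) for data in clients) / total
    return w - lr * avg_grad

def fedavg_round(w, clients, lr=0.1, local_steps=5):
    """FedAvg: each client trains locally; the server averages the resulting weights."""
    total = sum(len(data) for data in clients)
    new_w = 0.0
    for data in clients:
        local_w = w
        for _ in range(local_steps):
            local_w -= lr * gradient(local_w, data)
        new_w += len(data) / total * local_w
    return new_w

# Hypothetical clients whose data all follow y = 2 * x.
clients = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]

w_sgd = w_avg = 0.0
for _ in range(20):
    w_sgd = fedsgd_round(w_sgd, clients)
    w_avg = fedavg_round(w_avg, clients)

print(w_sgd, w_avg)
```

With a single local step, the FedAvg round in this sketch reduces to the FedSGD round, since averaging one-step weights is the same as stepping along the averaged gradient.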

FedAvg

FedSGD

Target Domain

The target domain category of this membership inference attack taxonomy is subcategorized into the following: natural language processing (NLP), computer vision (CV), graph, audio, and recommender system.

Natural Language Processing (NLP)

The natural language processing (NLP) subcategory of the target domain category of this membership inference attack taxonomy is divided into the following groups: text classification, text generation, and word embedding.

Text Classification

Text Generation

Word Embedding

Computer Vision (CV)

The computer vision (CV) subcategory of the target domain category of this membership inference attack taxonomy is divided into the following groups: image classification, image generation, and image segmentation.

Image Classification

Image Generation

Image Segmentation

Graph

The graph subcategory of the target domain category of this membership inference attack taxonomy is divided into the following groups: knowledge graphs, node classification, and graph classification.

Knowledge Graphs

Node Classification

Graph Classification

Audio

The audio subcategory of the target domain category of this membership inference attack taxonomy includes only one group: speech recognition.

Speech Recognition

Recommender System

The recommender system subcategory of the target domain category of this membership inference attack taxonomy includes only one group: collaborative filtering.

Collaborative Filtering

Thanks for reading!