Author: Brian Colwell
Gradient And Update Leakage (GAUL) In Federated Learning
Gradient and Update Leakage attacks intercept and analyze the gradient updates or model changes exchanged in distributed or federated learning environments to reconstruct the underlying model or the private data used to train it. In federated learning or other collaborative training setups,…
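To see why shared gradients leak private information, consider a minimal sketch (a toy model of my own construction, not from the article): for a single training example on a linear softmax classifier, the weight gradient is the outer product of the logit error and the input, so anyone who intercepts the gradients can recover the private input exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear classifier: logits = W @ x + b, softmax cross-entropy loss.
n_classes, n_features = 3, 5
W = rng.normal(size=(n_classes, n_features))
b = rng.normal(size=n_classes)

# A client's private training example (one-hot label y).
x = rng.normal(size=n_features)
y = np.eye(n_classes)[1]

# Gradients the client would share in a federated-averaging round.
logits = W @ x + b
p = np.exp(logits - logits.max()); p /= p.sum()
delta = p - y                   # dL/d(logits)
grad_W = np.outer(delta, x)     # dL/dW = delta @ x^T
grad_b = delta                  # dL/db = delta

# Attacker recovers x from the shared gradients alone: every row of
# grad_W is a scalar multiple of x, and grad_b supplies the scalars.
i = np.argmax(np.abs(grad_b))
x_rec = grad_W[i] / grad_b[i]

print(np.allclose(x_rec, x))    # → True: the private input is reconstructed
```

Deeper networks require iterative gradient-matching optimization rather than this closed-form division, but the leakage principle is the same.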
An Introduction To AI Model Extraction
AI model extraction refers to an attack method where an adversary attempts to replicate the functionality of a machine learning model by systematically querying it and using its outputs to train a…
What Are The Types Of AI Model Extraction Attacks?
Model Extraction Attacks aim to steal model architecture, training hyperparameters, learned parameters, or model behavior, and are effective across a broad threat landscape with many practical attack vectors. Today, let’s discuss the most…
What Is Alignment-Aware Extraction?
Alignment-Aware Extraction goes beyond conventional extraction methods by strategically capturing both the functional capabilities and ethical guardrails implemented in modern AI systems. By specifically accounting for alignment procedures like Reinforcement Learning from Human Feedback…
Cloud Infrastructure Creates Vulnerabilities For AI Model Extraction
Cloud infrastructure vulnerabilities comprise security weaknesses in the cloud platforms and services that host machine learning models, which can be exploited to gain unauthorized access to model artifacts. Machine learning models deployed…
Model Deployment Creates Vulnerabilities For AI Model Extraction
Model Deployment Vulnerabilities are weaknesses in how models are deployed in production environments that can be exploited to extract model information or parameters. Production deployments often expose vulnerabilities, such as insufficient access…
What Are Equation-Solving Attacks?
Equation-Solving Attacks represent a specialized and powerful subset of extraction techniques that, while limited in scope to certain model types, achieve perfect extraction (100% replication) with only black-box access to the…
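A hedged sketch of the idea, for the simplest vulnerable model type (the secret model, query points, and API here are invented for illustration): a logistic-regression API that returns confidence scores p = sigmoid(w·x + b) turns every query into one linear equation, log(p / (1 − p)) = w·x + b, so d + 1 linearly independent queries determine all d + 1 parameters exactly.

```python
import numpy as np

rng = np.random.default_rng(1)

# Secret logistic-regression model hidden behind a black-box API
# that returns the confidence score p = sigmoid(w . x + b).
d = 4
w_secret, b_secret = rng.normal(size=d), rng.normal()

def api(X):
    return 1.0 / (1.0 + np.exp(-(X @ w_secret + b_secret)))

# Each confidence score yields one linear equation in the unknowns (w, b):
#   log(p / (1 - p)) = w . x + b
# so d + 1 independent queries pin down all d + 1 parameters.
X = rng.normal(size=(d + 1, d))
p = api(X)
A = np.hstack([X, np.ones((d + 1, 1))])         # [x | 1] rows
theta = np.linalg.solve(A, np.log(p / (1 - p)))
w_rec, b_rec = theta[:d], theta[-1]

# Recovery is exact up to floating-point error.
print(np.allclose(w_rec, w_secret), np.isclose(b_rec, b_secret))
```

This is the mechanism behind the classic Tramèr et al. results: when the API exposes raw confidence values, extraction reduces to solving a small linear system rather than training a substitute.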
What Is Model Leeching?
Model Leeching is a Model Extraction attack in which an adversary siphons task-specific knowledge from a target large language model (LLM) by interacting with it solely through its public API (API Querying) – the…
Introduction To API Querying In AI Model Extraction
API Querying is a systematic approach where attackers send repeated inputs to a model hosted as a service and collect the corresponding outputs to reconstruct the model’s functionality. This is the most…
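The basic loop can be sketched as follows (the toy target, query budget, and substitute model are my own choices, not from the article): send systematic queries to the black-box service, record its outputs, fit a substitute model on the query-output pairs, then measure how often the substitute agrees with the target.

```python
import numpy as np

rng = np.random.default_rng(2)

# Black-box target: a linear classifier exposed only through query().
w_t, b_t = np.array([2.0, -1.0]), 0.5
def query(X):                        # attacker sees labels only, not w_t/b_t
    return (X @ w_t + b_t > 0).astype(float)

# Step 1: send systematic queries and collect the outputs.
X_q = rng.uniform(-3, 3, size=(2000, 2))
y_q = query(X_q)

# Step 2: train a substitute (logistic regression via gradient descent).
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X_q @ w + b)))
    g = p - y_q
    w -= 0.1 * X_q.T @ g / len(X_q)
    b -= 0.1 * g.mean()

# Step 3: measure functional agreement on fresh inputs.
X_test = rng.uniform(-3, 3, size=(1000, 2))
agreement = ((X_test @ w + b > 0).astype(float) == query(X_test)).mean()
print(round(agreement, 2))
```

On this separable toy problem the substitute's agreement with the target is typically close to 1.0; against real services the same loop is constrained by query budgets, rate limits, and the richness of the outputs (labels vs. full confidence scores).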