Brian D. Colwell


What Are Equation-Solving Attacks?

Posted on June 7, 2025 by Brian Colwell

Equation-Solving Attacks represent a specialized and powerful subset of extraction techniques that, while limited in scope to certain model types, achieves perfect extraction scores (100% replication) with only black-box access to the target model. But unprecedented extraction precision and the ability to produce identical copies of the target model in as few as one to four queries per extracted model parameter come at a price: Equation-Solving Attacks do require precise mathematical formulation and a deep understanding of model mathematics.

How Do Equation-Solving Attacks Work?

Beyond IP theft and the risk of sensitive information disclosure, Equation-Solving Attacks pose several significant security concerns. For example, a successfully extracted copy improves the success rate of adversarial examples against the target and enables privacy attacks on it, including membership inference.

In Equation-Solving Attacks, adversaries extract a target model by treating its outputs as a system of mathematical equations in which the model parameters are the unknowns. Using non-adaptive querying methods, attackers select random or strategic test inputs, varying one feature at a time in order to observe nuanced differences in the outputs. Each query thereby yields an equation relating the input data, the unknown model parameters, and the observed output. Attackers repeat this input-output process, accumulating equations; with enough input-output pairs, the set of equations is complete and the system becomes solvable.
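As a minimal sketch of this probing step, suppose the target is a black-box linear model f(x) = w·x + b (a toy stand-in; the `query` function and its secret parameters are invented for the example). Querying the zero vector reveals the bias, and varying one feature at a time with unit vectors yields one equation per weight:

```python
import numpy as np

# Hypothetical black-box target: a linear model f(x) = w.x + b.
# The attacker can only call `query`, never read SECRET_W / SECRET_B.
SECRET_W = np.array([2.0, -1.0, 0.5])
SECRET_B = 0.25

def query(x):
    return float(SECRET_W @ x + SECRET_B)

# Non-adaptive probing: the zero vector reveals the bias, and each
# unit vector e_i (one feature varied at a time) yields the equation
# w_i + b = query(e_i), so w_i follows by subtraction.
bias = query(np.zeros(3))
weights = np.array([query(np.eye(3)[i]) - bias for i in range(3)])

print(bias)     # recovered bias, exactly SECRET_B
print(weights)  # recovered weights, exactly SECRET_W
```

Four queries (one per parameter) recover the model exactly, matching the one-query-per-parameter cost noted above.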

The system is solved using linear algebra techniques, by inverting activation functions, by isolating variables, or through optimization methods. When solved, this system of equations explicitly connects target model inputs, parameters, and outputs, thereby revealing the model’s exact internal configuration and enabling an attacker to reconstruct an identical copy of the original model.
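A hedged illustration of the solving step, using the classic case of a logistic regression whose confidence scores are exposed (the target parameters and `query` function are invented for the sketch): inverting the sigmoid turns each released probability into a linear equation, so d + 1 queries suffice to recover the d weights plus the bias.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target: black-box logistic regression that returns
# the confidence score p = sigmoid(w.x + b).
w_true, b_true = np.array([1.5, -2.0]), 0.7

def query(x):
    return 1.0 / (1.0 + np.exp(-(w_true @ x + b_true)))

# One query per unknown parameter: 2 weights + 1 bias = 3 queries.
X = rng.normal(size=(3, 2))
p = np.array([query(x) for x in X])

# Invert the activation: log(p / (1 - p)) = w.x + b, a linear system.
logits = np.log(p / (1 - p))
A = np.hstack([X, np.ones((3, 1))])   # columns for w_1, w_2, and b
w_b = np.linalg.solve(A, logits)

print(w_b)  # ≈ [1.5, -2.0, 0.7]
```

The recovered parameters match the originals up to floating-point error, which is exactly why confidence-score outputs are singled out as dangerous in the next paragraph.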

Equation-Solving Attacks are particularly effective against models that provide detailed outputs, such as probability scores or confidence values, because these outputs can be mathematically related back to the internal parameters of the target model. These attacks are especially problematic for simple neural architectures and linear models, such as logistic regression or Multi-Layer Perceptrons (MLPs) with invertible mathematical structures. Proprietary models deployed through Machine Learning as a Service (MLaaS) platforms are also at risk.

Defenses Against Equation-Solving Attacks

Defending against Equation-Solving Attacks requires a multi-layered approach that combines mathematical hardening, controlled information release, and strategic limitations on what adversaries can learn through interaction with the system. Let’s categorize defenses against Equation-Solving Attacks as either mathematical or information restriction-based defenses.

Core Mathematical Defenses Against Equation-Solving Attacks

The foundation of protection against equation-solving attacks lies in making the underlying mathematical structure of the system inherently resistant to reverse engineering. These defenses work by increasing the computational complexity required to solve for protected parameters, applying transformations that obscure relationships between inputs and outputs, and preventing information leakage through side channels.

Mathematical Complexity & Obfuscation

Mathematical Complexity and Obfuscation involves deliberately structuring the system to be mathematically challenging to reverse engineer. This approach uses non-linear equations that resist easy solving, incorporates one-way functions that are computationally simple to compute in the forward direction but extremely difficult to invert, increases the dimensionality by adding more variables and equations, and employs high-degree polynomials instead of simpler linear relationships. By raising the computational burden required to solve the system, this defense effectively makes equation-solving attacks impractical even with significant computing resources.
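As a back-of-the-envelope illustration of why added dimensionality and high-degree polynomials raise the attacker's burden: a degree-d polynomial over n input variables has C(n + d, d) unknown coefficients, and an equation-solving attacker needs at least that many independent equations before the system becomes solvable.

```python
from math import comb

# Unknown coefficients in a degree-d polynomial over n variables:
# C(n + d, d). The required number of independent equations (and
# hence queries) grows combinatorially with n and d.
for n, d in [(10, 1), (10, 3), (100, 3), (100, 5)]:
    print(f"n={n}, degree={d}: {comb(n + d, d)} unknowns")
```

Moving from a linear model in 10 variables (11 unknowns) to a degree-5 polynomial in 100 variables pushes the equation count into the tens of millions, which is the sense in which complexity alone can make the attack impractical.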

Noise Addition & Prediction Perturbation

Noise Addition and Prediction Perturbation focuses on introducing controlled randomness into the system’s outputs to prevent attackers from forming precise equations. This defense strategy deliberately incorporates noise or redundancy into responses, employs probabilistic rather than deterministic approaches to create inherent uncertainty, and applies strategic nonlinear transformations to output probabilities. Techniques such as Reverse Sigmoid, Maximizing Angular Deviation (MAD), and Adaptive Misinformation specifically target equation-solving vulnerabilities by distorting the information available for equation formation while preserving the utility of the system for legitimate users.
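A minimal sketch of prediction perturbation (the noise scale and rounding precision here are illustrative choices, not prescribed values): jittering and coarsely rounding a released probability means the attacker's inverted logit no longer satisfies an exact equation.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def perturb(p, decimals=2, noise_sd=0.01):
    """Illustrative defense: jitter, then coarsely round, a probability."""
    noisy = p + rng.normal(0.0, noise_sd)
    return float(np.clip(np.round(noisy, decimals), 1e-6, 1 - 1e-6))

z_true = 2.0                      # the true logit the attacker wants
p_released = perturb(sigmoid(z_true))

# The attacker's inverted logit no longer matches the true value, so
# every "equation" formed from released outputs carries an error term.
z_recovered = np.log(p_released / (1 - p_released))
print(abs(z_recovered - z_true))  # nonzero recovery error
```

Reverse Sigmoid, MAD, and Adaptive Misinformation are more sophisticated than this, but they exploit the same principle: corrupt the quantity the attacker must invert.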

Side-Channel Countermeasures

Side-Channel Countermeasures address the often-overlooked vulnerabilities where information about the system’s internal workings may leak through timing, power consumption, or other observable behavior. These defenses implement constant-time operations to prevent timing analysis, employ techniques to mask power consumption patterns, and introduce randomization in execution paths. By eliminating these alternative sources of information, side-channel countermeasures prevent attackers from bypassing direct mathematical defenses and gathering the data needed to formulate accurate equations.
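One small, concrete instance of a constant-time operation, using the Python standard library's `hmac.compare_digest` (the token-checking scenario is hypothetical): unlike `==`, which can return early at the first mismatching byte and thereby leak position information through timing, this comparison examines every byte regardless of where a mismatch occurs.

```python
import hmac

def check_token(secret: bytes, supplied: bytes) -> bool:
    """Constant-time equality check: running time does not depend on
    how many leading bytes of `supplied` match the secret."""
    return hmac.compare_digest(secret, supplied)

print(check_token(b"s3cret", b"s3cret"))  # True
print(check_token(b"s3cret", b"guess!"))  # False
```

The same principle applies to masking power draw and randomizing execution paths: the observable cost of an operation must be independent of the secret values it touches.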

Information Restriction-Based Defenses Against Equation-Solving Attacks

Beyond hardening the mathematical structure itself, effective defense against Equation-Solving Attacks requires controlling what information is made available to potential attackers and how they can interact with the system. Information restriction strategies limit an attacker’s ability to gather the necessary data points to formulate solvable equation systems, fundamentally undermining the feasibility of equation-solving attacks regardless of the attacker’s computational resources.

Time-Memory Trade-offs

Time-Memory Trade-offs exploit fundamental computational limits by designing protection schemes that force attackers to make impractical tradeoffs between computation time and memory requirements. This defense creates situations where solving the equations requires either an infeasible amount of memory or prohibitive computation time, making attacks practically impossible even if theoretically feasible. By carefully calibrating these requirements beyond reasonable resource constraints, systems can remain secure against equation-solving attempts while maintaining performance for legitimate use cases.

Top-1 Label Only (Hardening Output)

Top-1 Label Only (Hardening Output) significantly reduces the information available to attackers by providing only the highest confidence prediction rather than complete probability distributions. Instead of returning detailed softmax outputs that reveal gradient information and confidence levels across all possible classes, this approach returns only the winning class label. This straightforward but powerful defense effectively removes the rich probability data equation-solving attacks typically exploit, substantially limiting the precision with which attackers can formulate equations while still providing useful functionality for legitimate applications.
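The defense itself is nearly a one-liner; a sketch assuming the model's raw output is a probability vector:

```python
import numpy as np

def harden_output(probs):
    """Release only the argmax class index, never the distribution."""
    return int(np.argmax(probs))

probs = np.array([0.10, 0.72, 0.18])  # rich scores stay server-side
print(harden_output(probs))           # 1
```

The attacker now learns only one discrete fact per query instead of a full vector of real-valued equations, which is why label-only extraction attacks need orders of magnitude more queries.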

Query Rate Limiting & API Minimization

Query Rate Limiting and API Minimization directly addresses the data collection phase of equation-solving attacks by restricting how many queries can be made and what details are returned. This defense implements strict limits on query frequency, potentially with escalating restrictions for suspicious patterns, while also minimizing the detail in returned information through techniques like probability rounding or thresholding. Though persistent attackers may eventually overcome these restrictions given enough time, this defense significantly increases the cost and detectability of extraction attempts, making it an important component in a comprehensive protection strategy.
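A minimal sliding-window rate limiter sketch (the limit, window, and client-ID scheme are illustrative assumptions, and a production system would add escalation for suspicious patterns as described above):

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Illustrative sliding-window limiter: at most `limit` queries
    per `window` seconds for each client."""

    def __init__(self, limit=100, window=60.0):
        self.limit, self.window = limit, window
        self.history = defaultdict(deque)   # client_id -> timestamps

    def allow(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        q = self.history[client_id]
        while q and now - q[0] > self.window:
            q.popleft()                      # drop expired timestamps
        if len(q) >= self.limit:
            return False                     # over budget: reject
        q.append(now)
        return True

rl = RateLimiter(limit=3, window=60.0)
print([rl.allow("a", now=t) for t in (0, 1, 2, 3)])
# [True, True, True, False] -- the fourth query in the window is refused
```

Pairing this with probability rounding on whatever responses do go out attacks both halves of the extraction pipeline: fewer equations, each less precise.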

Information-Theoretic Defenses

Information-Theoretic Defenses apply formal mathematical principles from information theory to quantifiably limit what can be learned about protected parameters. These sophisticated defenses optimize system outputs to maximize uncertainty (conditional entropy) while preserving utility, minimize the mutual information between true values and what users receive, and provide provable guarantees about the maximum information leakage possible through interaction.

Thanks for reading!
