What Are Path-Finding Attacks?

Posted on June 7, 2025 by Brian Colwell

Path-Finding is a specialized model extraction attack that targets tree-based machine learning models, such as decision trees and random forests. It exploits confidence values and the rich information that prediction APIs return with each query, using that information as a pseudo-identifier for the path the input traversed through the tree: each query reply contains a unique identifier for the associated leaf, which allows attackers to map decision boundaries with perfect fidelity through “Leaf Node Enumeration” and “Threshold Probing”. By systematically varying the value of each input feature and cataloging the outputs associated with each leaf, attackers pinpoint the exact split thresholds between nodes, trace the input’s path through the tree, recover the tree’s structure, and reconstruct the decision rules governing each terminal node. These attacks also exploit the ability to query incomplete inputs.
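To make the probing step concrete, the sketch below shows how an attacker might binary-search a single feature to locate one split threshold. The query_model function, the leaf identifier it is assumed to return, and the assumption that exactly one split separates lo and hi are all illustrative, not details taken from the attack literature.

```python
def find_split_threshold(query_model, base_input, feature, lo, hi, tol=1e-6):
    """Binary-search one feature until the returned leaf identifier changes."""
    # query_model is a hypothetical prediction API that returns a leaf identifier.
    leaf_lo = query_model({**base_input, feature: lo})
    leaf_hi = query_model({**base_input, feature: hi})
    if leaf_lo == leaf_hi:
        return None  # no split for this feature between lo and hi
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if query_model({**base_input, feature: mid}) == leaf_lo:
            lo = mid  # mid falls on the same side of the split as lo
        else:
            hi = mid  # mid crossed the split; tighten from above
    return (lo + hi) / 2.0  # approximate location of the split threshold
```

Repeating this search for every feature, starting from every discovered leaf, is what drives the query counts discussed next.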

While query-intensive, often requiring 44 to 317 queries per extracted parameter, advanced techniques such as partial-input pattern analysis optimize the process, making it feasible to map complex tree structures efficiently. Note that Path-Finding is not an anytime attack: the full extraction process, and the thousands of queries it entails, must be completed before a useful model is produced, unlike some approximation methods that can yield partially useful models before completion.

Defending against Path-Finding Attacks requires a multi-layered approach, including Anomaly Detection, Ensemble Disagreement, Output Obfuscation, Query Rate Limiting, and Randomized Feature Masking.

Anomaly Detection

Machine learning-powered monitoring systems analyze query patterns to identify attack signatures, such as repetitive feature threshold probing. These systems combine behavioral analysis (e.g., query frequency spikes) with content inspection (e.g., systematic input variations) to flag suspicious activity. When integrated with real-time blocking capabilities, Anomaly Detection enables proactive defense while adapting to evolving attack methodologies through continuous model retraining.
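As one illustration of the content-inspection side, the minimal sketch below flags a client whose recent queries change only one feature at a time, a pattern consistent with threshold probing. The ProbingDetector class, window size, and flag ratio are hypothetical choices, not a production detector.

```python
from collections import defaultdict, deque

class ProbingDetector:
    """Flag clients whose consecutive queries differ in exactly one feature."""

    def __init__(self, window=50, flag_ratio=0.8):
        self.window = window          # recent queries kept per client (illustrative)
        self.flag_ratio = flag_ratio  # fraction of one-feature diffs that triggers a flag
        self.history = defaultdict(lambda: deque(maxlen=window))

    def observe(self, client_id, features):
        """Record one query (a dict of feature -> value); return True if suspicious."""
        hist = self.history[client_id]
        hist.append(dict(features))
        if len(hist) < self.window:
            return False
        queries = list(hist)
        one_feature_diffs = sum(
            1 for prev, cur in zip(queries, queries[1:])
            if sum(v != prev.get(k) for k, v in cur.items()) == 1
        )
        return one_feature_diffs / (len(queries) - 1) >= self.flag_ratio
```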

Ensemble Disagreement

Ensemble Disagreement defense deploys multiple diverse models that collectively serve requests but individually handle only a subset of queries. By routing different queries to different ensemble members and introducing controlled variability in responses, this approach obscures the true decision boundaries of any single model. Attackers attempting Path-Finding extraction receive inconsistent signals that confound their boundary mapping efforts. The key benefit is that this defense maintains high overall accuracy while significantly increasing the complexity and query requirements for successful extraction, effectively raising the cost of attacks without substantial impact on legitimate users.
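A minimal routing sketch follows, assuming models is a list of independently trained tree ensembles with a scikit-learn-style predict() method. The hash-based assignment is one illustrative way to ensure that nearby probing queries are answered by different members, while any single input always gets a consistent answer.

```python
import hashlib

def route_and_predict(models, features):
    """Serve one query with a single ensemble member chosen by hashing the input."""
    items = sorted(features.items())
    key = hashlib.sha256(repr(items).encode()).hexdigest()
    member = models[int(key, 16) % len(models)]   # deterministic per-input routing
    row = [value for _, value in items]           # fixed feature order for predict()
    return member.predict([row])[0]
```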

Output Obfuscation

Output Obfuscation reduces the precision of model responses to mask decision boundaries. Techniques include returning only class labels instead of probability scores, rounding numerical outputs, or injecting controlled noise via differential privacy mechanisms. These methods obscure the granular data attackers need to map node splits while maintaining sufficient utility for most legitimate applications. The strategy integrates well with broader privacy-preserving frameworks, such as federated learning.
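The sketch below combines the three techniques mentioned above (label-only responses, rounding, and Laplace noise) on a class-probability vector; the precision and noise scale are illustrative values, not recommended settings.

```python
import numpy as np

def obfuscate(proba, decimals=1, noise_scale=0.05, label_only=False):
    """Coarsen or hide class probabilities before returning them to the client."""
    proba = np.asarray(proba, dtype=float)
    if label_only:
        return int(np.argmax(proba))                       # hard label only, no scores
    noisy = proba + np.random.laplace(0.0, noise_scale, size=proba.shape)
    noisy = np.clip(noisy, 0.0, None)
    noisy = noisy / (noisy.sum() + 1e-12)                  # renormalize to a distribution
    return np.round(noisy, decimals)                       # coarse precision hides thresholds
```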

Query Rate Limiting

This defense restricts the number of queries a user can submit within specific time windows, typically implemented through API gateways or token bucket algorithms. By enforcing strict quotas (e.g., 100 queries/minute) and temporary bans for violators, organizations significantly increase the time and cost required for successful attacks. The approach preserves service availability for legitimate users while requiring minimal infrastructure changes, making it a practical first-line defense.
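A minimal token-bucket sketch follows; the capacity and refill rate mirror the illustrative 100-queries-per-minute quota above and would be tuned per deployment.

```python
import time

class TokenBucket:
    """Per-client token bucket: `capacity` tokens, refilled at `rate` tokens per second."""

    def __init__(self, capacity=100, rate=100 / 60):       # roughly 100 queries per minute
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self):
        """Return True if a query may proceed, consuming one token."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```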

Randomized Feature Masking

Randomized feature masking selectively obscures or transforms certain input features when processing queries, with the specific features and transformation changing across different queries. This technique disrupts the consistency needed for effective Path-Finding by preventing attackers from building reliable mappings between inputs and outputs. The model maintains accuracy by focusing the randomization on features with redundant information or by implementing coordinated transformations that preserve overall prediction quality. The benefit of randomized feature masking is its ability to significantly complicate Path-Finding Attacks while maintaining high utility for legitimate use cases that don’t rely on extracting precise decision boundaries.
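The sketch below masks a random subset of features on each query by substituting neutral defaults (for example, training-set means); mask_fraction and the defaults mapping are illustrative assumptions rather than prescribed settings.

```python
import random

def mask_features(features, defaults, mask_fraction=0.2, rng=random):
    """Return a copy of `features` with a random subset replaced by neutral defaults."""
    names = list(features)
    k = max(1, int(len(names) * mask_fraction))
    masked = dict(features)
    for name in rng.sample(names, k):
        masked[name] = defaults[name]   # e.g., a training-set mean, hiding the true value
    return masked
```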

Thanks for reading!
