[Image: a digital brain composed of puzzle pieces connected to a futuristic circuit board, symbolizing AI and neural networks.]

The Big List Of AI Data Poisoning Attack And Defense References And Resources 

Introduction

This post collects references and resources on data poisoning attacks against machine learning models and on the defenses developed to counter them.

Data Poisoning Attack And Defense References And Resources

Note that the references below are listed in alphabetical order by title. Enjoy!

  1. A Backdoor Approach With Inverted Labels Using Dirty Label-Flipping Attacks – https://arxiv.org/html/2404.00076v1
  2. A Backdoor Attack Against LSTM-Based Text Classification Systems – https://ieeexplore.ieee.org/document/8836465 
  3. Advances In Neural Information Processing Systems 26 (NIPS 2013) – https://papers.nips.cc/paper_files/paper/2013 
  4. Adversarial Clean Label Backdoor Attacks And Defenses On Text Classification Systems – https://arxiv.org/abs/2305.19607 
  5. Adversarial Label Flips Attack On Support Vector Machines – https://www.sec.in.tum.de/i20/publications/adversarial-label-flips-attack-on-support-vector-machines 
  6. A Game-theoretic Analysis Of Label Flipping Attacks On Distributed Support Vector Machines – https://ieeexplore.ieee.org/document/7926118 
  7. A Label Flipping Attack On Machine Learning Model And Its Defense Mechanism – https://www.researchgate.net/publication/367031053_A_Label_Flipping_Attack_on_Machine_Learning_Model_and_Its_Defense_Mechanism 
  8. Analysis Of Causative Attacks Against SVMs Learning From Data Streams – https://faculty.washington.edu/lagesse/publications/CausativeSVM.pdf
  9. Analyzing Federated Learning Through An Adversarial Lens – https://arxiv.org/abs/1811.12470
  10. An Embarrassingly Simple Approach For Trojan Attack In Deep Neural Networks – https://arxiv.org/abs/2006.08131
  11. Anti-backdoor Learning: Training Clean Models On Poisoned Data – https://arxiv.org/abs/2110.11571 
  12. Artificial Intelligence Crime: An Overview Of Malicious Use And Abuse Of AI – https://ieeexplore.ieee.org/document/9831441
  13. A Semantic And Clean-label Backdoor Attack Against Graph Convolutional Networks – https://arxiv.org/pdf/2503.14922
  14. Awesome Learning With Noisy Labels – https://github.com/subeeshvasu/Awesome-Learning-with-Label-Noise 
  15. BAAAN: Backdoor Attacks Against Autoencoder And GAN-based Machine Learning Models – https://arxiv.org/abs/2010.03007
  16. Backdoor Attacks Against Deep Learning Systems In The Physical World – https://ieeexplore.ieee.org/document/9577800
  17. Backdoor Embedding In Convolutional Neural Network Models Via Invisible Perturbation – https://arxiv.org/abs/1808.10307
  18. Backdooring And Poisoning Neural Networks With Image-scaling Attacks – https://arxiv.org/abs/2003.08633
  19. Backdoor Scanning For Deep Neural Networks Through K-arm Optimization – https://arxiv.org/abs/2102.05123
  20. BadNets: Identifying Vulnerabilities In The Machine Learning Model Supply Chain – https://machine-learning-and-security.github.io/papers/mlsec17_paper_51.pdf 
  21. BadNL: Backdoor Attacks Against NLP Models With Semantic-preserving Improvements – https://arxiv.org/abs/2006.01043
  22. Bagging Classifiers For Fighting Poisoning Attacks In Adversarial Classification Tasks – https://dl.acm.org/doi/abs/10.5555/2040895.2040945
  23. Be Careful About Poisoned Word Embeddings: Exploring The Vulnerability Of The Embedding Layers In NLP Models – https://arxiv.org/pdf/2103.15543
  24. Blind Backdoors In Deep Learning Models – https://arxiv.org/abs/2005.03823 
  25. Blockwise p-tampering Attacks On Cryptographic Primitives, Extractors, And Learners – https://eprint.iacr.org/2017/950.pdf
  26. Bullseye Polytope: A Scalable Clean-label Poisoning Attack With Improved Transferability – https://arxiv.org/pdf/2005.00191
  27. Casting Out Demons: Sanitizing Training Data For Anomaly Sensors – https://ieeexplore.ieee.org/abstract/document/4531146 
  28. Certified Robustness To Label-Flipping Attacks Via Randomized Smoothing – https://arxiv.org/abs/2002.03018
  29. Clean-label Backdoor Attack And Defense: An Examination Of Language Model Vulnerability – https://dl.acm.org/doi/10.1016/j.eswa.2024.125856
  30. Clean-label Backdoor Attacks By Selectively Poisoning With Limited Information From Target Class – https://openreview.net/pdf?id=JvUuutHa2s 
  31. Clean-Label Backdoor Attacks On Video Recognition Models – https://openaccess.thecvf.com/content_CVPR_2020/html/Zhao_Clean-Label_Backdoor_Attacks_on_Video_Recognition_Models_CVPR_2020_paper.html 
  32. Clean-Label Feature Collision Attacks On A Keras Classifier – https://github.com/Trusted-AI/adversarial-robustness-toolbox/blob/main/notebooks/poisoning_attack_feature_collision.ipynb 
  33. COMBAT: Alternated Training For Effective Clean-Label Backdoor Attacks – https://ojs.aaai.org/index.php/AAAI/article/view/28019
  34. Concealed Data Poisoning Attacks On NLP Models – https://arxiv.org/pdf/2010.12563 
  35. Customizing Triggers With Concealed Data Poisoning – https://pdfs.semanticscholar.org/6d8d/d81f2d18e86b2fa23d52ef14dbcba39864b4.pdf
  36. DarkMind: Latent Chain-Of-Thought Backdoor In Customized LLMs – https://arxiv.org/abs/2501.18617
  37. DarkMind: A New Backdoor Attack That Leverages The Reasoning Capabilities Of LLMs – https://techxplore.com/news/2025-02-darkmind-backdoor-leverages-capabilities-llms.html
  38. Data Poisoning Against Differentially-private Learners: Attacks And Defenses – https://arxiv.org/abs/1903.09860
  39. Data Poisoning Attacks On Factorization-based Collaborative Filtering – https://papers.nips.cc/paper_files/paper/2016/file/83fa5a432ae55c253d0e60dbfa716723-Paper.pdf
  40. Dataset Security For Machine Learning: Data Poisoning, Backdoor Attacks, And Defenses – https://arxiv.org/pdf/2012.10544
  41. DeepInspect: A Black-box Trojan Detection And Mitigation Framework For Deep Neural Networks – https://www.ijcai.org/proceedings/2019/0647.pdf
  42. Deep k-NN Defense Against Clean-Label Data Poisoning Attacks – https://dl.acm.org/doi/10.1007/978-3-030-66415-2_4
  43. Defending Neural Backdoors Via Generative Distribution Modeling – https://proceedings.neurips.cc/paper_files/paper/2019/file/78211247db84d96acf4e00092a7fba80-Paper.pdf
  44. Demon In The Variant: Statistical Analysis Of DNNs For Robust Backdoor Contamination Detection – https://arxiv.org/abs/1908.00686
  45. Design Of Intentional Backdoors In Sequential Models – https://arxiv.org/abs/1902.09972
  46. Detecting AI Trojans Using Meta Neural Analysis – https://ieeexplore.ieee.org/document/9519467
  47. Detecting Backdoor Attacks On Deep Neural Networks By Activation Clustering – https://arxiv.org/abs/1811.03728
  48. Detection Of Adversarial Training Examples In Poisoning Attacks Through Anomaly Detection – https://arxiv.org/abs/1802.03041
  49. DP-InstaHide: Provably Defusing Poisoning And Backdoor Attacks With Differentially Private Data Augmentations – https://arxiv.org/pdf/2103.02079
  50. Dynamic Backdoor Attacks Against Machine Learning Models – https://arxiv.org/abs/2003.03675
  51. Effective Clean-Label Backdoor Attacks On Graph Neural Networks – https://dl.acm.org/doi/10.1145/3627673.3679905
  52. Efficient Label Contamination Attacks Against Black-Box Learning Models – https://www.researchgate.net/publication/317252983_Efficient_Label_Contamination_Attacks_Against_Black-Box_Learning_Models 
  53. Explanation-guided Backdoor Poisoning Attacks Against Malware Classifiers – https://www.usenix.org/system/files/sec21-severi.pdf
  54. Fast Adversarial Label-Flipping Attack On Tabular Data – https://arxiv.org/abs/2310.10744
  55. Fawkes: Protecting Privacy Against Unauthorized Deep Learning Models – https://arxiv.org/abs/2002.08327
  56. Fine-pruning: Defending Against Backdooring Attacks On Deep Neural Networks – https://link.springer.com/chapter/10.1007/978-3-030-00470-5_13 
  57. GangSweep: Sweep Out Neural Backdoors By GAN – https://dl.acm.org/doi/pdf/10.1145/3394171.3413546
  58. Generative AI Misuse: A Taxonomy Of Tactics And Insights From Real-World Data – https://arxiv.org/abs/2406.13843
  59. Generative Poisoning Attack Method Against Neural Networks – https://arxiv.org/pdf/1703.01340 
  60. Google SAIF “Top Risks Of Generative AI Systems” – https://saif.google/secure-ai-framework/risks 
  61. Handcrafted Backdoors In Deep Neural Networks – https://arxiv.org/pdf/2106.04690
  62. Hardware Trojan Attacks On Neural Networks – https://arxiv.org/abs/1806.05768 
  63. Hidden Killer: Invisible Textual Backdoor Attacks With Syntactic Trigger – https://arxiv.org/abs/2105.12400
  64. Hidden Trigger Backdoor Attacks – https://arxiv.org/abs/1910.00033 
  65. How To Backdoor Federated Learning – https://proceedings.mlr.press/v108/bagdasaryan20a.html
  66. ImageNet – https://www.image-net.org/
  67. InfoGAN: Interpretable Representation Learning By Information Maximizing Generative Adversarial Nets – https://arxiv.org/abs/1606.03657
  68. Influence Functions In Deep Learning Are Fragile – https://arxiv.org/abs/2006.14651
  69. Influence Function Based Data Poisoning Attacks To Top-n Recommender Systems – https://arxiv.org/abs/2002.08025
  70. Invisible Black-Box Backdoor Attack Against Deep Cross-Modal Hashing Retrieval – https://dl.acm.org/doi/10.1145/3650205 
  71. Invisible Backdoor Attack With Sample-specific Triggers – https://arxiv.org/abs/2012.03816
  72. Is Feature Selection Secure Against Training Data Poisoning? – https://arxiv.org/abs/1804.07933
  73. Label-consistent Backdoor Attacks – https://arxiv.org/abs/1912.02771
  74. Learning To Confuse: Generating Training Time Adversarial Data With Auto-encoder – https://arxiv.org/abs/1905.09027
  75. Learning Under p-tampering Attacks – https://proceedings.mlr.press/v83/mahloujifar18a/mahloujifar18a.pdf
  76. Learning With Noisy Labels – https://papers.nips.cc/paper_files/paper/2013/hash/3871bd64012152bfb53fdf04b401193f-Abstract.html
  77. Less Is More: Stealthy And Adaptive Clean-Image Backdoor Attacks With Few Poisoned – https://openreview.net/forum?id=LsTIW9VAF7 
  78. LFGurad: A Defense Against Label Flipping Attack In Federated Learning For Vehicular Network – https://www.sciencedirect.com/science/article/abs/pii/S1389128624006005
  79. Local Model Poisoning Attacks To Byzantine-robust Federated Learning – https://www.usenix.org/conference/usenixsecurity20/presentation/fang
  80. Malicious ML Models Discovered On Hugging Face Platform – https://www.reversinglabs.com/blog/rl-identifies-malware-ml-model-hosted-on-hugging-face
  81. Manipulating Machine Learning: Poisoning Attacks And Countermeasures For Regression Learning – https://ieeexplore.ieee.org/document/8418594 
  82. Mapping The Misuse Of Generative AI – https://deepmind.google/discover/blog/mapping-the-misuse-of-generative-ai/
  83. MetaPoison: Practical General-purpose Clean-label Data Poisoning – https://proceedings.neurips.cc/paper_files/paper/2020/file/8ce6fc704072e351679ac97d4a985574-Paper.pdf
  84. Mitigating Poisoning Attacks On Machine Learning Models: A Data Provenance Based Approach – https://www.cs.purdue.edu/homes/bb/nit/20_nathalie-Mitigating_Poisoning_Attacks_on_Machine_Learning_Models_A_Data_Provenance_Based_Approach.pdf
  85. ML Attack Models: Adversarial Attacks And Data Poisoning Attacks – https://arxiv.org/abs/2112.02797 
  86. Narcissus: A Practical Clean-Label Backdoor Attack With Limited Information – https://arxiv.org/abs/2204.05255 
  87. Neural Cleanse: Identifying And Mitigating Backdoor Attacks In Neural Networks – https://ieeexplore.ieee.org/abstract/document/8835365
  88. Neural Trojans – https://arxiv.org/pdf/1710.00942
  89. NIC: Detecting Adversarial Samples With Neural Network Invariant Checking – https://par.nsf.gov/servlets/purl/10139597 
  90. On Defending Against Label Flipping Attacks On Malware Detection Systems – https://arxiv.org/abs/1908.04473
  91. One-pixel Signature: Characterizing CNN Models For Backdoor Detection – https://arxiv.org/abs/2008.07711
  92. On The Effectiveness Of Mitigating Data Poisoning Attacks With Gradient Shaping – https://arxiv.org/abs/2002.11497
  93. Poison Forensics: Traceback Of Data Poisoning Attacks In Neural Networks – https://www.usenix.org/system/files/sec22-shan.pdf
  94. Poison Frogs! Targeted Clean-label Poisoning Attacks On Neural Networks – https://arxiv.org/abs/1804.00792
  95. Poisoned Classifiers Are Not Only Backdoored, They Are Fundamentally Broken – https://arxiv.org/abs/2010.09080
  96. Poisoning And Backdooring Contrastive Learning – https://arxiv.org/abs/2106.09667 
  97. Poisoning Attack In Federated Learning Using Generative Adversarial Nets – https://ieeexplore.ieee.org/document/8887357
  98. Poisoning Attacks Against Support Vector Machines – https://arxiv.org/abs/1206.6389
  99. Poisoning Attacks On Algorithmic Fairness – https://arxiv.org/abs/2004.07401
  100. Poisoning Attacks With Generative Adversarial Nets – https://arxiv.org/abs/1906.07773 
  101. Poisoning Deep Reinforcement Learning Agents With In-distribution Triggers – https://arxiv.org/abs/2106.07798
  102. Poisoning Language Models During Instruction Tuning – https://arxiv.org/pdf/2305.00944 
  103. Poisoning Web-Scale Training Datasets Is Practical – https://arxiv.org/abs/2302.10149 
  104. Practical Detection Of Trojan Neural Networks: Data-limited And Data-free Cases – https://arxiv.org/abs/2007.15802
  105. Practical Poisoning Attacks On Neural Networks – https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123720137.pdf
  106. Preventing Unauthorized Use Of Proprietary Data: Poisoning For Secure Dataset Release – https://arxiv.org/abs/2103.02683
  107. Protecting Intellectual Property Of Deep Neural Networks With Watermarking – https://dl.acm.org/doi/10.1145/3196494.3196550
  108. Protecting The Public From Abusive AI-Generated Content – https://cdn-dynmedia-1.microsoft.com/is/content/microsoftcorp/microsoft/msc/documents/presentations/CSR/Protecting-Public-Abusive-AI-Generated-Content.pdf 
  109. Provable Robustness Against Backdoor Attacks – https://arxiv.org/abs/2003.08904
  110. Radioactive Data: Tracing Through Training – https://arxiv.org/abs/2002.00937
  111. REFIT: A Unified Watermark Removal Framework For Deep Learning Systems With Limited Data – https://arxiv.org/pdf/1911.07205
  112. Removing Backdoor-based Watermarks In Neural Networks With Limited Data – https://arxiv.org/pdf/2008.00407
  113. Robust Linear Regression Against Training Data Poisoning – https://dl.acm.org/doi/10.1145/3128572.3140447 
  114. Seal Your Backdoor With Variational Defense – https://arxiv.org/pdf/2503.08829
  115. SentiNet: Detecting Localized Universal Attack Against Deep Learning Systems – https://ieeexplore.ieee.org/document/9283822
  116. Silent Killer: A Stealthy, Clean-Label, Black-Box Backdoor Attack – https://arxiv.org/abs/2301.02615 
  117. Smart Lexical Search For Label Flipping Adversial Attack – https://aclanthology.org/2024.privatenlp-1.11.pdf
  118. Spectral Signatures In Backdoor Attacks – https://proceedings.neurips.cc/paper_files/paper/2018/file/280cf18baf4311c92aa5a042336587d3-Paper.pdf
  119. Stop-and-Go: Exploring Backdoor Attacks On Deep Reinforcement Learning-based Traffic Congestion Control Systems – https://arxiv.org/abs/2003.07859
  120. STRIP: A Defence Against Trojan Attacks On Deep Neural Networks – https://dl.acm.org/doi/abs/10.1145/3359789.3359790
  121. Strong Data Augmentation Sanitizes Poisoning And Backdoor Attacks Without An Accuracy Tradeoff – https://arxiv.org/pdf/2011.09527
  122. Stronger Data Poisoning Attacks Break Data Sanitization Defenses – https://arxiv.org/pdf/1811.00741 
  123. Support Vector Machines Under Adversarial Label Noise – https://proceedings.mlr.press/v20/biggio11.html
  124. TABOR: A Highly Accurate Approach To Inspecting And Restoring Trojan Backdoors In AI Systems – https://arxiv.org/pdf/1908.01763
  125. Targeted Backdoor Attacks On Deep Learning Systems Using Data Poisoning – https://arxiv.org/abs/1712.05526 
  126. Targeted Poisoning Attacks On Social Recommender Systems – https://ieeexplore.ieee.org/document/9013539
  127. TensorClog: An Imperceptible Poisoning Attack On Deep Neural Network Applications – https://ieeexplore.ieee.org/document/8668758
  128. The Art Of Deception: Robust Backdoor Attack Using Dynamic Stacking Of Triggers – https://arxiv.org/html/2401.01537v4 
  129. The Curse Of Concentration In Robust Learning: Evasion And Poisoning Attacks From Concentration Of Measure – https://ojs.aaai.org/index.php/AAAI/article/view/4373
  130. The Path To Defence: A Roadmap To Characterising Data Poisoning Attacks On Victim Models – https://dl.acm.org/doi/10.1145/3627536
  131. Towards Clean-Label Backdoor Attacks In The Physical World – https://arxiv.org/html/2407.19203v1
  132. Towards Data Poisoning Attacks In Crowd Sensing Systems – https://cse.buffalo.edu/~lusu/papers/MobiHoc2018.pdf
  133. Towards Poisoning Of Deep Learning Algorithms With Back-gradient Optimization – https://arxiv.org/pdf/1708.08689
  134. Transferable Clean-label Poisoning Attacks On Deep Neural Nets – https://arxiv.org/abs/1905.05897
  135. Triggerless Backdoor Attack For NLP Tasks With Clean Labels – https://arxiv.org/abs/2111.07970
  136. Trojan Attack On Deep Generative Models In Autonomous Driving – https://link.springer.com/chapter/10.1007/978-3-030-37228-6_15
  137. Trojaning Attack On Neural Networks – https://www.ndss-symposium.org/wp-content/uploads/2018/02/ndss2018_03A-5_Liu_paper.pdf
  138. Trojaning Language Models For Fun And Profit – https://arxiv.org/abs/2008.00312
  139. TrojDRL: Evaluation Of Backdoor Attacks On Deep Reinforcement Learning – https://dl.acm.org/doi/10.5555/3437539.3437570
  140. Truth Serum: Poisoning Machine Learning Models To Reveal Their Secrets – https://arxiv.org/abs/2204.00032
  141. Turning Your Weakness Into A Strength: Watermarking Deep Neural Networks By Backdooring – https://www.usenix.org/conference/usenixsecurity18/presentation/adi
  142. Understanding Black-box Predictions Via Influence Functions – https://proceedings.mlr.press/v70/koh17a/koh17a.pdf
  143. Universal Litmus Patterns: Revealing Backdoor Attacks In CNNs – https://arxiv.org/abs/1906.10842
  144. Universal Multi-Party Poisoning Attacks – https://arxiv.org/abs/1809.03474
  145. Unlearnable Examples: Making Personal Data Unexploitable – https://arxiv.org/abs/2101.04898
  146. Using Machine Teaching To Identify Optimal Training-set Attacks On Machine Learners – https://ojs.aaai.org/index.php/AAAI/article/view/9569
  147. Weight Poisoning Attacks On Pretrained Models – https://aclanthology.org/2020.acl-main.249.pdf
  148. What Doesn’t Kill You Makes You Robust(er): How To Adversarially Train Against Data Poisoning – https://arxiv.org/abs/2102.13624
  149. Wicked Oddities: Selectively Poisoning For Effective Clean-Label Backdoor Attacks – https://arxiv.org/abs/2407.10825
  150. Witches’ Brew: Industrial Scale Data Poisoning Via Gradient Matching – https://arxiv.org/abs/2009.02276
  151. You Autocomplete Me: Poisoning Vulnerabilities In Neural Code Completion – https://www.usenix.org/system/files/sec21-schuster.pdf
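
To make the subject of many of these papers concrete, below is a minimal, hypothetical sketch (in Python, using NumPy and scikit-learn) of the simplest poisoning technique covered above: a random label-flipping attack of the kind studied in entries such as #5, #54, and #123. An attacker who can corrupt a fraction of the training labels degrades the accuracy of the resulting model. This is a toy illustration of the general idea, not the method of any specific paper listed here; the synthetic dataset, function names, and parameters are invented for the example.

```python
# Toy label-flipping poisoning demo (illustrative only, not from any
# specific paper above). Requires numpy and scikit-learn.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# A synthetic binary classification task stands in for a real training set.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5,
                                                    random_state=0)

def accuracy_after_flipping(flip_fraction):
    """Flip a random fraction of training labels, then train and score an SVM."""
    y_poisoned = y_train.copy()
    n_flip = int(flip_fraction * len(y_poisoned))
    flip_idx = rng.choice(len(y_poisoned), size=n_flip, replace=False)
    y_poisoned[flip_idx] = 1 - y_poisoned[flip_idx]  # invert the 0/1 labels
    model = SVC(kernel="linear").fit(X_train, y_poisoned)
    return model.score(X_test, y_test)

for frac in (0.0, 0.1, 0.2, 0.4):
    print(f"flipped {frac:.0%} of labels -> test accuracy "
          f"{accuracy_after_flipping(frac):.3f}")
```

Running the sketch shows test accuracy falling as the flip fraction grows, which is exactly the contamination that defenses surveyed above (for example, the activation clustering of #47 or the randomized smoothing certificates of #28) aim to detect or tolerate.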

Thanks for reading!