The Big List Of AI Data Poisoning Attack And Defense References And Resources 

Posted on June 8, 2025 (updated June 10, 2025) by Brian Colwell

Note that the references below are listed in alphabetical order by title. Enjoy!

  1. A Backdoor Approach With Inverted Labels Using Dirty Label-Flipping Attacks – https://arxiv.org/html/2404.00076v1
  2. A Backdoor Attack Against LSTM-Based Text Classification Systems – https://ieeexplore.ieee.org/document/8836465 
  3. Advances In Neural Information Processing Systems 26 (NIPS 2013) – https://papers.nips.cc/paper_files/paper/2013 
  4. Adversarial Clean Label Backdoor Attacks And Defenses On Text Classification Systems – https://arxiv.org/abs/2305.19607 
  5. Adversarial Label Flips Attack On Support Vector Machines – https://www.sec.in.tum.de/i20/publications/adversarial-label-flips-attack-on-support-vector-machines 
  6. A Game-theoretic Analysis Of Label Flipping Attacks On Distributed Support Vector Machines – https://ieeexplore.ieee.org/document/7926118 
  7. A Label Flipping Attack On Machine Learning Model And Its Defense Mechanism – https://www.researchgate.net/publication/367031053_A_Label_Flipping_Attack_on_Machine_Learning_Model_and_Its_Defense_Mechanism 
  8. Analysis Of Causative Attacks Against SVMs Learning From Data Streams – https://faculty.washington.edu/lagesse/publications/CausativeSVM.pdf
  9. Analyzing Federated Learning Through An Adversarial Lens – https://arxiv.org/abs/1811.12470
  10. An Embarrassingly Simple Approach For Trojan Attack In Deep Neural Networks – https://arxiv.org/abs/2006.08131
  11. Anti-backdoor Learning: Training Clean Models On Poisoned Data – https://arxiv.org/abs/2110.11571 
  12. Artificial Intelligence Crime: An Overview of Malicious Use And Abuse Of AI – https://ieeexplore.ieee.org/document/9831441
  13. A Semantic And Clean-label Backdoor Attack Against Graph Convolutional Networks – https://arxiv.org/pdf/2503.14922
  14. Awesome Learning With Noisy Labels – https://github.com/subeeshvasu/Awesome-Learning-with-Label-Noise 
  15. BAAAN: Backdoor Attacks Against Autoencoder And GAN-based Machine Learning Models – https://arxiv.org/abs/2010.03007
  16. Backdoor Attacks Against Deep Learning Systems In The Physical World – https://ieeexplore.ieee.org/document/9577800
  17. Backdoor Embedding In Convolutional Neural Network Models Via Invisible Perturbation – https://arxiv.org/abs/1808.10307
  18. Backdooring And Poisoning Neural Networks With Image-scaling Attacks – https://arxiv.org/abs/2003.08633
  19. Backdoor Scanning For Deep Neural Networks Through K-arm Optimization – https://arxiv.org/abs/2102.05123
  20. BadNets: Identifying Vulnerabilities In The Machine Learning Model Supply Chain – https://machine-learning-and-security.github.io/papers/mlsec17_paper_51.pdf 
  21. BadNL: Backdoor Attacks Against NLP Models With Semantic-Preserving Improvements – https://arxiv.org/abs/2006.01043
  22. Bagging Classifiers For Fighting Poisoning Attacks In Adversarial Classification Tasks – https://dl.acm.org/doi/abs/10.5555/2040895.2040945
  23. Be Careful About Poisoned Word Embeddings: Exploring The Vulnerability Of The Embedding Layers In NLP Models – https://arxiv.org/pdf/2103.15543
  24. Blind Backdoors In Deep Learning Models – https://arxiv.org/abs/2005.03823 
  25. Blockwise p-tampering Attacks On Cryptographic Primitives, Extractors, And Learners – https://eprint.iacr.org/2017/950.pdf
  26. Bullseye Polytope: A Scalable Clean-label Poisoning Attack With Improved Transferability – https://arxiv.org/pdf/2005.00191
  27. Casting Out Demons: Sanitizing Training Data For Anomaly Sensors – https://ieeexplore.ieee.org/abstract/document/4531146 
  28. Certified Robustness To Label-Flipping Attacks Via Randomized Smoothing – https://arxiv.org/abs/2002.03018
  29. Clean-label Backdoor Attack And Defense: An Examination Of Language Model Vulnerability – https://dl.acm.org/doi/10.1016/j.eswa.2024.125856
  30. Clean-label Backdoor Attacks By Selectively Poisoning With Limited Information From Target Class – https://openreview.net/pdf?id=JvUuutHa2s 
  31. Clean-Label Backdoor Attacks On Video Recognition Models – https://openaccess.thecvf.com/content_CVPR_2020/html/Zhao_Clean-Label_Backdoor_Attacks_on_Video_Recognition_Models_CVPR_2020_paper.html 
  32. Clean-Label Feature Collision Attacks On A Keras Classifier – https://github.com/Trusted-AI/adversarial-robustness-toolbox/blob/main/notebooks/poisoning_attack_feature_collision.ipynb 
  33. COMBAT: Alternated Training For Effective Clean-Label Backdoor Attacks – https://ojs.aaai.org/index.php/AAAI/article/view/28019
  34. Concealed Data Poisoning Attacks On NLP Models – https://arxiv.org/pdf/2010.12563 
  35. Customizing Triggers With Concealed Data Poisoning – https://pdfs.semanticscholar.org/6d8d/d81f2d18e86b2fa23d52ef14dbcba39864b4.pdf
  36. DarkMind: Latent Chain-Of-Thought Backdoor In Customized LLMs – https://arxiv.org/abs/2501.18617
  37. DarkMind: A New Backdoor Attack That Leverages The Reasoning Capabilities Of LLMs – https://techxplore.com/news/2025-02-darkmind-backdoor-leverages-capabilities-llms.html
  38. Data Poisoning Against Differentially-private Learners: Attacks And Defenses – https://arxiv.org/abs/1903.09860
  39. Data Poisoning Attacks On Factorization-based Collaborative Filtering – https://papers.nips.cc/paper_files/paper/2016/file/83fa5a432ae55c253d0e60dbfa716723-Paper.pdf
  40. Dataset Security For Machine Learning: Data Poisoning, Backdoor Attacks, And Defenses – https://arxiv.org/pdf/2012.10544
  41. DeepInspect: A Black-box Trojan Detection And Mitigation Framework For Deep Neural Networks – https://www.ijcai.org/proceedings/2019/0647.pdf
  42. Deep k-NN Defense Against Clean-Label Data Poisoning Attacks – https://dl.acm.org/doi/10.1007/978-3-030-66415-2_4
  43. Defending Neural Backdoors Via Generative Distribution Modeling – https://proceedings.neurips.cc/paper_files/paper/2019/file/78211247db84d96acf4e00092a7fba80-Paper.pdf
  44. Demon In The Variant: Statistical Analysis Of DNNs For Robust Backdoor Contamination Detection – https://arxiv.org/abs/1908.00686
  45. Design Of Intentional Backdoors In Sequential Models – https://arxiv.org/abs/1902.09972
  46. Detecting AI Trojans Using Meta Neural Analysis – https://ieeexplore.ieee.org/document/9519467
  47. Detecting Backdoor Attacks On Deep Neural Networks By Activation Clustering – https://arxiv.org/abs/1811.03728
  48. Detection Of Adversarial Training Examples In Poisoning Attacks Through Anomaly Detection – https://arxiv.org/abs/1802.03041
  49. DP-InstaHide: Provably Defusing Poisoning And Backdoor Attacks With Differentially Private Data Augmentations – https://arxiv.org/pdf/2103.02079
  50. Dynamic Backdoor Attacks Against Machine Learning Models – https://arxiv.org/abs/2003.03675
  51. Effective Clean-Label Backdoor Attacks On Graph Neural Networks – https://dl.acm.org/doi/10.1145/3627673.3679905
  52. Efficient Label Contamination Attacks Against Black-Box Learning Models – https://www.researchgate.net/publication/317252983_Efficient_Label_Contamination_Attacks_Against_Black-Box_Learning_Models 
  53. Explanation-guided Backdoor Poisoning Attacks Against Malware Classifiers – https://www.usenix.org/system/files/sec21-severi.pdf
  54. Fast Adversarial Label-Flipping Attack On Tabular Data – https://arxiv.org/abs/2310.10744
  55. Fawkes: Protecting Privacy Against Unauthorized Deep Learning Models – https://arxiv.org/abs/2002.08327
  56. Fine-pruning: Defending Against Backdooring Attacks On Deep Neural Networks – https://link.springer.com/chapter/10.1007/978-3-030-00470-5_13 
  57. GangSweep: Sweep Out Neural Backdoors By GAN – https://dl.acm.org/doi/pdf/10.1145/3394171.3413546
  58. Generative AI Misuse: A Taxonomy Of Tactics And Insights From Real-World Data – https://arxiv.org/abs/2406.13843
  59. Generative Poisoning Attack Method Against Neural Networks – https://arxiv.org/pdf/1703.01340 
  60. Google SAIF “Top Risks Of Generative AI Systems” – https://saif.google/secure-ai-framework/risks 
  61. Handcrafted Backdoors In Deep Neural Networks – https://arxiv.org/pdf/2106.04690
  62. Hardware Trojan Attacks On Neural Networks – https://arxiv.org/abs/1806.05768 
  63. Hidden Killer: Invisible Textual Backdoor Attacks With Syntactic Trigger – https://arxiv.org/abs/2105.12400
  64. Hidden Trigger Backdoor Attacks – https://arxiv.org/abs/1910.00033 
  65. How To Backdoor Federated Learning – https://proceedings.mlr.press/v108/bagdasaryan20a.html
  66. ImageNet – https://www.image-net.org/
  67. InfoGAN: Interpretable Representation Learning By Information Maximizing Generative Adversarial Nets – https://arxiv.org/abs/1606.03657
  68. Influence Functions In Deep Learning Are Fragile – https://arxiv.org/abs/2006.14651
  69. Influence Function Based Data Poisoning Attacks To Top-n Recommender Systems – https://arxiv.org/abs/2002.08025
  70. Invisible Black-Box Backdoor Attack Against Deep Cross-Modal Hashing Retrieval – https://dl.acm.org/doi/10.1145/3650205 
  71. Invisible Backdoor Attack With Sample-specific Triggers – https://arxiv.org/abs/2012.03816
  72. Is Feature Selection Secure Against Training Data Poisoning? – https://arxiv.org/abs/1804.07933
  73. Label-consistent Backdoor Attacks – https://arxiv.org/abs/1912.02771
  74. Learning To Confuse: Generating Training Time Adversarial Data With Auto-encoder – https://arxiv.org/abs/1905.09027
  75. Learning Under p-tampering Attacks – https://proceedings.mlr.press/v83/mahloujifar18a/mahloujifar18a.pdf
  76. Learning With Noisy Labels – https://papers.nips.cc/paper_files/paper/2013/hash/3871bd64012152bfb53fdf04b401193f-Abstract.html
  77. Less Is More: Stealthy And Adaptive Clean-Image Backdoor Attacks With Few Poisoned – https://openreview.net/forum?id=LsTIW9VAF7 
  78. LFGurad: A Defense Against Label Flipping Attack In Federated Learning For Vehicular Network – https://www.sciencedirect.com/science/article/abs/pii/S1389128624006005
  79. Local Model Poisoning Attacks To Byzantine-robust Federated Learning – https://www.usenix.org/conference/usenixsecurity20/presentation/fang
  80. Malicious ML Models Discovered On Hugging Face Platform – https://www.reversinglabs.com/blog/rl-identifies-malware-ml-model-hosted-on-hugging-face
  81. Manipulating Machine Learning: Poisoning Attacks And Countermeasures For Regression Learning – https://ieeexplore.ieee.org/document/8418594 
  82. Mapping The Misuse Of Generative AI – https://deepmind.google/discover/blog/mapping-the-misuse-of-generative-ai/
  83. MetaPoison: Practical General-purpose Clean-label Data Poisoning – https://proceedings.neurips.cc/paper_files/paper/2020/file/8ce6fc704072e351679ac97d4a985574-Paper.pdf
  84. Mitigating Poisoning Attacks On Machine Learning Models: A Data Provenance Based Approach – https://www.cs.purdue.edu/homes/bb/nit/20_nathalie-Mitigating_Poisoning_Attacks_on_Machine_Learning_Models_A_Data_Provenance_Based_Approach.pdf
  85. ML Attack Models: Adversarial Attacks And Data Poisoning Attacks – https://arxiv.org/abs/2112.02797 
  86. Narcissus: A Practical Clean-Label Backdoor Attack With Limited Information – https://arxiv.org/abs/2204.05255 
  87. Neural Cleanse: Identifying And Mitigating Backdoor Attacks In Neural Networks – https://ieeexplore.ieee.org/abstract/document/8835365
  88. Neural Trojans – https://arxiv.org/pdf/1710.00942
  89. NIC: Detecting Adversarial Samples With Neural Network Invariant Checking – https://par.nsf.gov/servlets/purl/10139597 
  90. On Defending Against Label Flipping Attacks On Malware Detection Systems – https://arxiv.org/abs/1908.04473
  91. One-pixel Signature: Characterizing CNN Models For Backdoor Detection – https://arxiv.org/abs/2008.07711
  92. On The Effectiveness Of Mitigating Data Poisoning Attacks With Gradient Shaping – https://arxiv.org/abs/2002.11497
  93. Poison Forensics: Traceback Of Data Poisoning Attacks In Neural Networks – https://www.usenix.org/system/files/sec22-shan.pdf
  94. Poison Frogs! Targeted Clean-label Poisoning Attacks On Neural Networks – https://arxiv.org/abs/1804.00792
  95. Poisoned Classifiers Are Not Only Backdoored, They Are Fundamentally Broken – https://arxiv.org/abs/2010.09080
  96. Poisoning And Backdooring Contrastive Learning – https://arxiv.org/abs/2106.09667 
  97. Poisoning Attack In Federated Learning Using Generative Adversarial Nets – https://ieeexplore.ieee.org/document/8887357
  98. Poisoning Attacks Against Support Vector Machines – https://arxiv.org/abs/1206.6389
  99. Poisoning Attacks On Algorithmic Fairness – https://arxiv.org/abs/2004.07401
  100. Poisoning Attacks With Generative Adversarial Nets – https://arxiv.org/abs/1906.07773 
  101. Poisoning Deep Reinforcement Learning Agents With In-distribution Triggers – https://arxiv.org/abs/2106.07798
  102. Poisoning Language Models During Instruction Tuning – https://arxiv.org/pdf/2305.00944 
  103. Poisoning Web-Scale Training Datasets Is Practical – https://arxiv.org/abs/2302.10149 
  104. Practical Detection Of Trojan Neural Networks: Data-limited And Data-free Cases – https://arxiv.org/abs/2007.15802
  105. Practical Poisoning Attacks On Neural Networks – https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123720137.pdf
  106. Preventing Unauthorized Use Of Proprietary Data: Poisoning For Secure Dataset Release – https://arxiv.org/abs/2103.02683
  107. Protecting Intellectual Property Of Deep Neural Networks With Watermarking – https://dl.acm.org/doi/10.1145/3196494.3196550
  108. Protecting The Public From Abusive AI-Generated Content – https://cdn-dynmedia-1.microsoft.com/is/content/microsoftcorp/microsoft/msc/documents/presentations/CSR/Protecting-Public-Abusive-AI-Generated-Content.pdf 
  109. Provable Robustness Against Backdoor Attacks – https://arxiv.org/abs/2003.08904
  110. Radioactive Data: Tracing Through Training – https://arxiv.org/abs/2002.00937
  111. REFIT: A Unified Watermark Removal Framework For Deep Learning Systems With Limited Data – https://arxiv.org/pdf/1911.07205
  112. Removing Backdoor-based Watermarks In Neural Networks With Limited Data – https://arxiv.org/pdf/2008.00407
  113. Robust Linear Regression Against Training Data Poisoning – https://dl.acm.org/doi/10.1145/3128572.3140447 
  114. Seal Your Backdoor With Variational Defense – https://arxiv.org/pdf/2503.08829
  115. SentiNet: Detecting Localized Universal Attack Against Deep Learning Systems – https://ieeexplore.ieee.org/document/9283822
  116. Silent Killer: A Stealthy, Clean-Label, Black-Box Backdoor Attack – https://arxiv.org/abs/2301.02615 
  117. Smart Lexical Search For Label Flipping Adversial Attack – https://aclanthology.org/2024.privatenlp-1.11.pdf
  118. Spectral Signatures In Backdoor Attacks – https://proceedings.neurips.cc/paper_files/paper/2018/file/280cf18baf4311c92aa5a042336587d3-Paper.pdf
  119. Stop-and-Go: Exploring Backdoor Attacks On Deep Reinforcement Learning-based Traffic Congestion Control Systems – https://arxiv.org/abs/2003.07859
  120. STRIP: A Defence Against Trojan Attacks On Deep Neural Networks – https://dl.acm.org/doi/abs/10.1145/3359789.3359790
  121. Strong Data Augmentation Sanitizes Poisoning And Backdoor Attacks Without An Accuracy Tradeoff – https://arxiv.org/pdf/2011.09527
  122. Stronger Data Poisoning Attacks Break Data Sanitization Defenses – https://arxiv.org/pdf/1811.00741 
  123. Support Vector Machines Under Adversarial Label Noise – https://proceedings.mlr.press/v20/biggio11.html
  124. TABOR: A Highly Accurate Approach To Inspecting And Restoring Trojan Backdoors In AI Systems – https://arxiv.org/pdf/1908.01763
  125. Targeted Backdoor Attacks On Deep Learning Systems Using Data Poisoning – https://arxiv.org/abs/1712.05526 
  126. Targeted Poisoning Attacks On Social Recommender Systems – https://ieeexplore.ieee.org/document/9013539
  127. TensorClog: An Imperceptible Poisoning Attack On Deep Neural Network Applications – https://ieeexplore.ieee.org/document/8668758
  128. The Art Of Deception: Robust Backdoor Attack Using Dynamic Stacking Of Triggers – https://arxiv.org/html/2401.01537v4 
  129. The Curse Of Concentration In Robust Learning: Evasion And Poisoning Attacks From Concentration Of Measure – https://ojs.aaai.org/index.php/AAAI/article/view/4373
  130. The Path To Defence: A Roadmap To Characterising Data Poisoning Attacks On Victim Models – https://dl.acm.org/doi/10.1145/3627536
  131. Top Risks Of Generative AI Systems – https://saif.google/secure-ai-framework/risks 
  132. Towards Clean-Label Backdoor Attacks In The Physical World – https://arxiv.org/html/2407.19203v1
  133. Towards Data Poisoning Attacks In Crowd Sensing Systems – https://cse.buffalo.edu/~lusu/papers/MobiHoc2018.pdf
  134. Towards Poisoning Of Deep Learning Algorithms With Back-gradient Optimization – https://arxiv.org/pdf/1708.08689
  135. Transferable Clean-label Poisoning Attacks On Deep Neural Nets – https://arxiv.org/abs/1905.05897
  136. Triggerless Backdoor Attack For NLP Tasks With Clean Labels – https://arxiv.org/abs/2111.07970 
  137. Trojan Attack On Deep Generative Models In Autonomous Driving – https://link.springer.com/chapter/10.1007/978-3-030-37228-6_15
  138. Trojaning Attack On Neural Networks – https://www.ndss-symposium.org/wp-content/uploads/2018/02/ndss2018_03A-5_Liu_paper.pdf
  139. Trojaning Language Models For Fun And Profit – https://arxiv.org/abs/2008.00312
  140. TrojDRL: Evaluation Of Backdoor Attacks On Deep Reinforcement Learning – https://dl.acm.org/doi/10.5555/3437539.3437570
  141. Truth Serum: Poisoning Machine Learning Models To Reveal Their Secrets – https://arxiv.org/abs/2204.00032
  142. Turning Your Weakness Into A Strength: Watermarking Deep Neural Networks By Backdooring – https://www.usenix.org/conference/usenixsecurity18/presentation/adi
  143. Understanding Black-box Predictions Via Influence Functions – https://proceedings.mlr.press/v70/koh17a/koh17a.pdf
  144. Universal Litmus Patterns: Revealing Backdoor Attacks In CNNs – https://arxiv.org/abs/1906.10842
  145. Universal Multi-Party Poisoning Attacks – https://arxiv.org/abs/1809.03474
  146. Unlearnable Examples: Making Personal Data Unexploitable – https://arxiv.org/abs/2101.04898
  147. Using Machine Teaching To Identify Optimal Training-set Attacks On Machine Learners – https://ojs.aaai.org/index.php/AAAI/article/view/9569
  148. Weight Poisoning Attacks On Pretrained Models – https://aclanthology.org/2020.acl-main.249.pdf 
  149. What Doesn’t Kill You Makes You Robust(er): How To Adversarially Train Against Data Poisoning – https://arxiv.org/abs/2102.13624
  150. Wicked Oddities: Selectively Poisoning For Effective Clean-Label Backdoor Attacks – https://arxiv.org/abs/2407.10825
  151. Witches’ Brew: Industrial Scale Data Poisoning Via Gradient Matching – https://arxiv.org/abs/2009.02276
  152. You Autocomplete Me: Poisoning Vulnerabilities In Neural Code Completion – https://www.usenix.org/system/files/sec21-schuster.pdf

Thanks for reading!
