Note that the references below are listed in alphabetical order by title. Enjoy!
- A Backdoor Approach With Inverted Labels Using Dirty Label-Flipping Attacks – https://arxiv.org/html/2404.00076v1
- A Backdoor Attack Against LSTM-Based Text Classification Systems – https://ieeexplore.ieee.org/document/8836465
- A Brief Introduction To AI Data Poisoning – https://cryptopunk4762.com/f/a-brief-introduction-to-ai-data-poisoning
- A Brief Introduction To AI Data Poisoning Defenses – https://cryptopunk4762.com/f/a-brief-introduction-to-ai-data-poisoning-defenses
- A Brief Introduction To Backdoor AI Data Poisoning Attacks – https://cryptopunk4762.com/f/a-brief-introduction-to-backdoor-ai-data-poisoning-attacks
- A Brief Introduction To Clean Label AI Data Poisoning – https://cryptopunk4762.com/f/a-brief-introduction-to-clean-label-ai-data-poisoning-attacks
- A Brief Introduction To Dirty-Label AI Data Poisoning Attacks – https://cryptopunk4762.com/f/a-brief-introduction-to-dirty-label-ai-data-poisoning-attacks
- A Brief Taxonomy Of AI Data Poisoning Attacks – https://cryptopunk4762.com/f/a-brief-taxonomy-of-ai-data-poisoning-attacks
- A Brief Taxonomy Of AI Data Poisoning Defenses – https://cryptopunk4762.com/f/a-brief-taxonomy-of-ai-data-poisoning-defenses
- Advances In Neural Information Processing Systems 26 (NIPS 2013) – https://papers.nips.cc/paper_files/paper/2013
- Adversarial Clean Label Backdoor Attacks And Defenses On Text Classification Systems – https://arxiv.org/abs/2305.19607
- Adversarial Label Flips Attack On Support Vector Machines – https://www.sec.in.tum.de/i20/publications/adversarial-label-flips-attack-on-support-vector-machines
- A Game-theoretic Analysis Of Label Flipping Attacks On Distributed Support Vector Machines – https://ieeexplore.ieee.org/document/7926118
- A Label Flipping Attack On Machine Learning Model And Its Defense Mechanism – https://www.researchgate.net/publication/367031053_A_Label_Flipping_Attack_on_Machine_Learning_Model_and_Its_Defense_Mechanism
- Analysis Of Causative Attacks Against SVMs Learning From Data Streams – https://faculty.washington.edu/lagesse/publications/CausativeSVM.pdf
- Analyzing Federated Learning Through An Adversarial Lens – https://arxiv.org/abs/1811.12470
- An Embarrassingly Simple Approach For Trojan Attack In Deep Neural Networks – https://arxiv.org/abs/2006.08131
- Anti-backdoor Learning: Training Clean Models On Poisoned Data – https://arxiv.org/abs/2110.11571
- Artificial Intelligence Crime: An Overview Of Malicious Use And Abuse Of AI – https://ieeexplore.ieee.org/document/9831441
- A Semantic And Clean-label Backdoor Attack Against Graph Convolutional Networks – https://arxiv.org/pdf/2503.14922
- Awesome Learning With Noisy Labels – https://github.com/subeeshvasu/Awesome-Learning-with-Label-Noise
- BAAAN: Backdoor Attacks Against Autoencoder And GAN-based Machine Learning Models – https://arxiv.org/abs/2010.03007
- Backdoor Attacks Against Deep Learning Systems In The Physical World – https://ieeexplore.ieee.org/document/9577800
- Backdoor Embedding In Convolutional Neural Network Models Via Invisible Perturbation – https://arxiv.org/abs/1808.10307
- Backdooring And Poisoning Neural Networks With Image-scaling Attacks – https://arxiv.org/abs/2003.08633
- Backdoor Scanning For Deep Neural Networks Through K-arm Optimization – https://arxiv.org/abs/2102.05123
- BadNets: Identifying Vulnerabilities In The Machine Learning Model Supply Chain – https://machine-learning-and-security.github.io/papers/mlsec17_paper_51.pdf
- BadNL: Backdoor Attacks Against NLP Models With Semantic-preserving Improvements – https://arxiv.org/abs/2006.01043
- Bagging Classifiers For Fighting Poisoning Attacks In Adversarial Classification Tasks – https://dl.acm.org/doi/abs/10.5555/2040895.2040945
- Be Careful About Poisoned Word Embeddings: Exploring The Vulnerability Of The Embedding Layers In NLP Models – https://arxiv.org/pdf/2103.15543
- Blind Backdoors In Deep Learning Models – https://arxiv.org/abs/2005.03823
- Blockwise p-tampering Attacks On Cryptographic Primitives, Extractors, And Learners – https://eprint.iacr.org/2017/950.pdf
- Bullseye Polytope: A Scalable Clean-label Poisoning Attack With Improved Transferability – https://arxiv.org/pdf/2005.00191
- Casting Out Demons: Sanitizing Training Data For Anomaly Sensors – https://ieeexplore.ieee.org/abstract/document/4531146
- Certified Robustness To Label-Flipping Attacks Via Randomized Smoothing – https://arxiv.org/abs/2002.03018
- Clean-label Backdoor Attack And Defense: An Examination Of Language Model Vulnerability – https://dl.acm.org/doi/10.1016/j.eswa.2024.125856
- Clean-label Backdoor Attacks By Selectively Poisoning With Limited Information From Target Class – https://openreview.net/pdf?id=JvUuutHa2s
- Clean-Label Backdoor Attacks On Video Recognition Models – https://openaccess.thecvf.com/content_CVPR_2020/html/Zhao_Clean-Label_Backdoor_Attacks_on_Video_Recognition_Models_CVPR_2020_paper.html
- Clean-Label Feature Collision Attacks On A Keras Classifier – https://github.com/Trusted-AI/adversarial-robustness-toolbox/blob/main/notebooks/poisoning_attack_feature_collision.ipynb
- COMBAT: Alternated Training For Effective Clean-Label Backdoor Attacks – https://ojs.aaai.org/index.php/AAAI/article/view/28019
- Concealed Data Poisoning Attacks On NLP Models – https://arxiv.org/pdf/2010.12563
- Customizing Triggers With Concealed Data Poisoning – https://pdfs.semanticscholar.org/6d8d/d81f2d18e86b2fa23d52ef14dbcba39864b4.pdf
- DarkMind: Latent Chain-Of-Thought Backdoor In Customized LLMs – https://arxiv.org/abs/2501.18617
- DarkMind: A New Backdoor Attack That Leverages The Reasoning Capabilities Of LLMs – https://techxplore.com/news/2025-02-darkmind-backdoor-leverages-capabilities-llms.html
- Data Poisoning Against Differentially-private Learners: Attacks And Defenses – https://arxiv.org/abs/1903.09860
- Data Poisoning Attacks On Factorization-based Collaborative Filtering – https://papers.nips.cc/paper_files/paper/2016/file/83fa5a432ae55c253d0e60dbfa716723-Paper.pdf
- Dataset Security For Machine Learning: Data Poisoning, Backdoor Attacks, And Defenses – https://arxiv.org/pdf/2012.10544
- DeepInspect: A Black-box Trojan Detection And Mitigation Framework For Deep Neural Networks – https://www.ijcai.org/proceedings/2019/0647.pdf
- Deep k-NN Defense Against Clean-Label Data Poisoning Attacks – https://dl.acm.org/doi/10.1007/978-3-030-66415-2_4
- Defending Neural Backdoors Via Generative Distribution Modeling – https://proceedings.neurips.cc/paper_files/paper/2019/file/78211247db84d96acf4e00092a7fba80-Paper.pdf
- Demon In The Variant: Statistical Analysis Of DNNs For Robust Backdoor Contamination Detection – https://arxiv.org/abs/1908.00686
- Design Of Intentional Backdoors In Sequential Models – https://arxiv.org/abs/1902.09972
- Detecting AI Trojans Using Meta Neural Analysis – https://ieeexplore.ieee.org/document/9519467
- Detecting Backdoor Attacks On Deep Neural Networks By Activation Clustering – https://arxiv.org/abs/1811.03728
- Detection Of Adversarial Training Examples In Poisoning Attacks Through Anomaly Detection – https://arxiv.org/abs/1802.03041
- DP-InstaHide: Provably Defusing Poisoning And Backdoor Attacks With Differentially Private Data Augmentations – https://arxiv.org/pdf/2103.02079
- Dynamic Backdoor Attacks Against Machine Learning Models – https://arxiv.org/abs/2003.03675
- Effective Clean-Label Backdoor Attacks On Graph Neural Networks – https://dl.acm.org/doi/10.1145/3627673.3679905
- Efficient Label Contamination Attacks Against Black-Box Learning Models – https://www.researchgate.net/publication/317252983_Efficient_Label_Contamination_Attacks_Against_Black-Box_Learning_Models
- Explanation-guided Backdoor Poisoning Attacks Against Malware Classifiers – https://www.usenix.org/system/files/sec21-severi.pdf
- Fast Adversarial Label-Flipping Attack On Tabular Data – https://arxiv.org/abs/2310.10744
- Fawkes: Protecting Privacy Against Unauthorized Deep Learning Models – https://arxiv.org/abs/2002.08327
- Fine-pruning: Defending Against Backdooring Attacks On Deep Neural Networks – https://link.springer.com/chapter/10.1007/978-3-030-00470-5_13
- GangSweep: Sweep Out Neural Backdoors By GAN – https://dl.acm.org/doi/pdf/10.1145/3394171.3413546
- Generative AI Misuse: A Taxonomy Of Tactics And Insights From Real-World Data – https://arxiv.org/abs/2406.13843
- Generative Poisoning Attack Method Against Neural Networks – https://arxiv.org/pdf/1703.01340
- Google SAIF “Top Risks Of Generative AI Systems” – https://saif.google/secure-ai-framework/risks
- Handcrafted Backdoors In Deep Neural Networks – https://arxiv.org/pdf/2106.04690
- Hardware Trojan Attacks On Neural Networks – https://arxiv.org/abs/1806.05768
- Hidden Killer: Invisible Textual Backdoor Attacks With Syntactic Trigger – https://arxiv.org/abs/2105.12400
- Hidden Trigger Backdoor Attacks – https://arxiv.org/abs/1910.00033
- How To Backdoor Federated Learning – https://proceedings.mlr.press/v108/bagdasaryan20a.html
- ImageNet – https://www.image-net.org/
- InfoGAN: Interpretable Representation Learning By Information Maximizing Generative Adversarial Nets – https://arxiv.org/abs/1606.03657
- Influence Function Based Data Poisoning Attacks To Top-n Recommender Systems – https://arxiv.org/abs/2002.08025
- Influence Functions In Deep Learning Are Fragile – https://arxiv.org/abs/2006.14651
- Invisible Black-Box Backdoor Attack Against Deep Cross-Modal Hashing Retrieval – https://dl.acm.org/doi/10.1145/3650205
- Invisible Backdoor Attack With Sample-specific Triggers – https://arxiv.org/abs/2012.03816
- Is AGI An Asymmetric Threat? – https://cryptopunk4762.com/f/is-agi-an-asymmetric-threat
- Is Feature Selection Secure Against Training Data Poisoning? – https://arxiv.org/abs/1804.07933
- Label-consistent Backdoor Attacks – https://arxiv.org/abs/1912.02771
- Learning To Confuse: Generating Training Time Adversarial Data With Auto-encoder – https://arxiv.org/abs/1905.09027
- Learning Under p-tampering Attacks – https://proceedings.mlr.press/v83/mahloujifar18a/mahloujifar18a.pdf
- Learning With Noisy Labels – https://papers.nips.cc/paper_files/paper/2013/hash/3871bd64012152bfb53fdf04b401193f-Abstract.html
- Less Is More: Stealthy And Adaptive Clean-Image Backdoor Attacks With Few Poisoned – https://openreview.net/forum?id=LsTIW9VAF7
- LFGurad: A Defense Against Label Flipping Attack In Federated Learning For Vehicular Network – https://www.sciencedirect.com/science/article/abs/pii/S1389128624006005
- Local Model Poisoning Attacks To Byzantine-robust Federated Learning – https://www.usenix.org/conference/usenixsecurity20/presentation/fang
- Malicious ML Models Discovered On Hugging Face Platform – https://www.reversinglabs.com/blog/rl-identifies-malware-ml-model-hosted-on-hugging-face
- Manipulating Machine Learning: Poisoning Attacks And Countermeasures For Regression Learning – https://ieeexplore.ieee.org/document/8418594
- Mapping The Misuse Of Generative AI – https://deepmind.google/discover/blog/mapping-the-misuse-of-generative-ai/
- MetaPoison: Practical General-purpose Clean-label Data Poisoning – https://proceedings.neurips.cc/paper_files/paper/2020/file/8ce6fc704072e351679ac97d4a985574-Paper.pdf
- Mitigating Poisoning Attacks On Machine Learning Models: A Data Provenance Based Approach – https://www.cs.purdue.edu/homes/bb/nit/20_nathalie-Mitigating_Poisoning_Attacks_on_Machine_Learning_Models_A_Data_Provenance_Based_Approach.pdf
- ML Attack Models: Adversarial Attacks And Data Poisoning Attacks – https://arxiv.org/abs/2112.02797
- Narcissus: A Practical Clean-Label Backdoor Attack With Limited Information – https://arxiv.org/abs/2204.05255
- Neural Cleanse: Identifying And Mitigating Backdoor Attacks In Neural Networks – https://ieeexplore.ieee.org/abstract/document/8835365
- Neural Trojans – https://arxiv.org/pdf/1710.00942
- NIC: Detecting Adversarial Samples With Neural Network Invariant Checking – https://par.nsf.gov/servlets/purl/10139597
- On Defending Against Label Flipping Attacks On Malware Detection Systems – https://arxiv.org/abs/1908.04473
- One-pixel Signature: Characterizing CNN Models For Backdoor Detection – https://arxiv.org/abs/2008.07711
- On The Effectiveness Of Mitigating Data Poisoning Attacks With Gradient Shaping – https://arxiv.org/abs/2002.11497
- Poison Forensics: Traceback Of Data Poisoning Attacks In Neural Networks – https://www.usenix.org/system/files/sec22-shan.pdf
- Poison Frogs! Targeted Clean-label Poisoning Attacks On Neural Networks – https://arxiv.org/abs/1804.00792
- Poisoned Classifiers Are Not Only Backdoored, They Are Fundamentally Broken – https://arxiv.org/abs/2010.09080
- Poisoning And Backdooring Contrastive Learning – https://arxiv.org/abs/2106.09667
- Poisoning Attack In Federated Learning Using Generative Adversarial Nets – https://ieeexplore.ieee.org/document/8887357
- Poisoning Attacks Against Support Vector Machines – https://arxiv.org/abs/1206.6389
- Poisoning Attacks On Algorithmic Fairness – https://arxiv.org/abs/2004.07401
- Poisoning Attacks With Generative Adversarial Nets – https://arxiv.org/abs/1906.07773
- Poisoning Deep Reinforcement Learning Agents With In-distribution Triggers – https://arxiv.org/abs/2106.07798
- Poisoning Language Models During Instruction Tuning – https://arxiv.org/pdf/2305.00944
- Poisoning Web-Scale Training Datasets Is Practical – https://arxiv.org/abs/2302.10149
- Practical Detection Of Trojan Neural Networks: Data-limited And Data-free Cases – https://arxiv.org/abs/2007.15802
- Practical Poisoning Attacks On Neural Networks – https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123720137.pdf
- Preventing Unauthorized Use Of Proprietary Data: Poisoning For Secure Dataset Release – https://arxiv.org/abs/2103.02683
- Protecting Intellectual Property Of Deep Neural Networks With Watermarking – https://dl.acm.org/doi/10.1145/3196494.3196550
- Protecting The Public From Abusive AI-Generated Content – https://cdn-dynmedia-1.microsoft.com/is/content/microsoftcorp/microsoft/msc/documents/presentations/CSR/Protecting-Public-Abusive-AI-Generated-Content.pdf
- Provable Robustness Against Backdoor Attacks – https://arxiv.org/abs/2003.08904
- Radioactive Data: Tracing Through Training – https://arxiv.org/abs/2002.00937
- REFIT: A Unified Watermark Removal Framework For Deep Learning Systems With Limited Data – https://arxiv.org/pdf/1911.07205
- Removing Backdoor-based Watermarks In Neural Networks With Limited Data – https://arxiv.org/pdf/2008.00407
- Robust Linear Regression Against Training Data Poisoning – https://dl.acm.org/doi/10.1145/3128572.3140447
- Seal Your Backdoor With Variational Defense – https://arxiv.org/pdf/2503.08829
- SentiNet: Detecting Localized Universal Attack Against Deep Learning Systems – https://ieeexplore.ieee.org/document/9283822
- Silent Killer: A Stealthy, Clean-Label, Black-Box Backdoor Attack – https://arxiv.org/abs/2301.02615
- Smart Lexical Search For Label Flipping Adversarial Attack – https://aclanthology.org/2024.privatenlp-1.11.pdf
- Spectral Signatures In Backdoor Attacks – https://proceedings.neurips.cc/paper_files/paper/2018/file/280cf18baf4311c92aa5a042336587d3-Paper.pdf
- Stop-and-Go: Exploring Backdoor Attacks On Deep Reinforcement Learning-based Traffic Congestion Control Systems – https://arxiv.org/abs/2003.07859
- STRIP: A Defence Against Trojan Attacks On Deep Neural Networks – https://dl.acm.org/doi/abs/10.1145/3359789.3359790
- Strong Data Augmentation Sanitizes Poisoning And Backdoor Attacks Without An Accuracy Tradeoff – https://arxiv.org/pdf/2011.09527
- Stronger Data Poisoning Attacks Break Data Sanitization Defenses – https://arxiv.org/pdf/1811.00741
- Support Vector Machines Under Adversarial Label Noise – https://proceedings.mlr.press/v20/biggio11.html
- TABOR: A Highly Accurate Approach To Inspecting And Restoring Trojan Backdoors In AI Systems – https://arxiv.org/pdf/1908.01763
- Targeted Backdoor Attacks On Deep Learning Systems Using Data Poisoning – https://arxiv.org/abs/1712.05526
- Targeted Poisoning Attacks On Social Recommender Systems – https://ieeexplore.ieee.org/document/9013539
- TensorClog: An Imperceptible Poisoning Attack On Deep Neural Network Applications – https://ieeexplore.ieee.org/document/8668758
- The Art Of Deception: Robust Backdoor Attack Using Dynamic Stacking Of Triggers – https://arxiv.org/html/2401.01537v4
- The Curse Of Concentration In Robust Learning: Evasion And Poisoning Attacks From Concentration Of Measure – https://ojs.aaai.org/index.php/AAAI/article/view/4373
- The Path To Defence: A Roadmap To Characterising Data Poisoning Attacks On Victim Models – https://dl.acm.org/doi/10.1145/3627536
- Towards Clean-Label Backdoor Attacks In The Physical World – https://arxiv.org/html/2407.19203v1
- Towards Data Poisoning Attacks In Crowd Sensing Systems – https://cse.buffalo.edu/~lusu/papers/MobiHoc2018.pdf
- Towards Poisoning Of Deep Learning Algorithms With Back-gradient Optimization – https://arxiv.org/pdf/1708.08689
- Transferable Clean-label Poisoning Attacks On Deep Neural Nets – https://arxiv.org/abs/1905.05897
- Triggerless Backdoor Attack For NLP Tasks With Clean Labels – https://arxiv.org/abs/2111.07970
- Trojan Attack On Deep Generative Models In Autonomous Driving – https://link.springer.com/chapter/10.1007/978-3-030-37228-6_15
- Trojaning Attack On Neural Networks – https://www.ndss-symposium.org/wp-content/uploads/2018/02/ndss2018_03A-5_Liu_paper.pdf
- Trojaning Language Models For Fun And Profit – https://arxiv.org/abs/2008.00312
- TrojDRL: Evaluation Of Backdoor Attacks On Deep Reinforcement Learning – https://dl.acm.org/doi/10.5555/3437539.3437570
- Truth Serum: Poisoning Machine Learning Models To Reveal Their Secrets – https://arxiv.org/abs/2204.00032
- Turning Your Weakness Into A Strength: Watermarking Deep Neural Networks By Backdooring – https://www.usenix.org/conference/usenixsecurity18/presentation/adi
- Understanding Black-box Predictions Via Influence Functions – https://proceedings.mlr.press/v70/koh17a/koh17a.pdf
- Universal Litmus Patterns: Revealing Backdoor Attacks In CNNs – https://arxiv.org/abs/1906.10842
- Universal Multi-Party Poisoning Attacks – https://arxiv.org/abs/1809.03474
- Unlearnable Examples: Making Personal Data Unexploitable – https://arxiv.org/abs/2101.04898
- Using Machine Teaching To Identify Optimal Training-set Attacks On Machine Learners – https://ojs.aaai.org/index.php/AAAI/article/view/9569
- Weight Poisoning Attacks On Pretrained Models – https://aclanthology.org/2020.acl-main.249.pdf
- What Are The Ethical Risks Of Strong AI? – https://cryptopunk4762.com/f/what-are-the-ethical-risks-of-strong-ai
- What Doesn’t Kill You Makes You Robust(er): How To Adversarially Train Against Data Poisoning – https://arxiv.org/abs/2102.13624
- Wicked Oddities: Selectively Poisoning For Effective Clean-Label Backdoor Attacks – https://arxiv.org/abs/2407.10825
- Witches’ Brew: Industrial Scale Data Poisoning Via Gradient Matching – https://arxiv.org/abs/2009.02276
- You Autocomplete Me: Poisoning Vulnerabilities In Neural Code Completion – https://www.usenix.org/system/files/sec21-schuster.pdf
Thanks for reading!