Adversarial robustness refers to a model's robustness under worst-case (adversarial) attacks. The goal of this work is to explore methods that enhance the robustness of machine learning models, with a special emphasis on defense against adversarial examples.
Attack on TRADES is a project where we design our own evasion attack to fool TRADES, a state-of-the-art method for training adversarially robust deep neural networks. Our attack uses a gradient-based iterative approach that injects randomization at each step. Against the TRADES model, the attack achieves a robust accuracy of 95.1% on MNIST (6th place on the leaderboard) and 54.65% on CIFAR-10 (9th place on the leaderboard); lower robust accuracy indicates a stronger attack.
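A gradient-based iterative attack with per-step randomization can be sketched as follows. This is a minimal illustration in PyTorch, not the exact attack submitted to the leaderboard; the function name, step sizes, and noise scale are assumptions for the example.

```python
import torch
import torch.nn as nn

def randomized_iterative_attack(model, x, y, eps=0.3, alpha=0.01,
                                steps=40, noise_std=0.01):
    """Hypothetical sketch of a gradient-based iterative evasion attack.

    At every step we add small random noise (the randomization component),
    take a signed-gradient ascent step on the loss, and project the
    adversarial example back into the L-infinity ball of radius eps.
    """
    loss_fn = nn.CrossEntropyLoss()
    x_adv = x.clone().detach()
    for _ in range(steps):
        # randomization: perturb the current iterate with Gaussian noise
        x_adv = (x_adv + noise_std * torch.randn_like(x_adv)).clamp(0, 1)
        x_adv = x_adv.detach().requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # ascend the loss with a signed-gradient step
        x_adv = x_adv.detach() + alpha * grad.sign()
        # project back into the eps-ball around x and the valid pixel range
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()
```

In practice, `eps` is fixed by the benchmark (0.3 for MNIST, 0.031 for CIFAR-10 in the TRADES challenge), while the step size, number of iterations, and noise scale are tunable.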
Currently, we are working on a follow-up project investigating whether incorporating self-supervised learning into a model's pre-training stage can further enhance its robustness against adversarial examples.
- Core Features
- Gradient-based iterative approach
- Evasion attack designed to fool state-of-the-art classifier
- Investigation of self-supervised learning methods to enhance robustness
- Skills: Python 3, PyTorch, Adversarial Machine Learning, Evasion Attack
- Source: github.com/yyou22/TRADES
- Keywords: Machine Learning, Adversarial Attack, Self-supervised Learning