Adversarial robustness
What is it?
2 min read · Feb 12, 2022
- Supervised ML systems are often brittle, i.e., they fail spectacularly on slight changes to their input data. The general thesis is that large deep learning models rely heavily on memorization and can be poor at generalization.
Why do we care?
- Brittle ML systems being used in mission-critical applications such as healthcare, self-driving, and lending is a huge risk.

- Brittle ML systems are also susceptible to data-poisoning attacks, where a segment of the training data is manipulated to corrupt the model and degrade its results.
- Because of this heavy memorization, deep learning models are prone to being fooled by imperceptible perturbations of their inputs, known as adversarial attacks (see the sketch below).
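
To make this concrete, here is a minimal sketch of one classic attack, the Fast Gradient Sign Method (FGSM), in PyTorch. The `model` and `epsilon` budget are assumptions for illustration; this is one well-known attack, not the only one.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    """FGSM: nudge every input pixel by epsilon in the direction
    that most increases the loss, producing a perturbation that is
    often imperceptible but flips the model's prediction."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step in the sign of the input gradient, then clamp back to a valid image range.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```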
How to check for adversarial robustness?
- The simplest way is to collect adversarial datasets: specially curated datasets built by perturbing training examples. There can be multiple such datasets, one per class of perturbation.
- Run the adversarial datasets through the trained model and measure performance, as in the sketch below. If there is a drastic drop relative to the clean test set, the model does not generalize well.
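
A minimal evaluation loop for this check might look as follows. The names `model`, `clean_loader`, and `adversarial_loader` are hypothetical stand-ins for your trained network and your clean vs. perturbed test sets.

```python
import torch

@torch.no_grad()
def accuracy(model, loader, device="cpu"):
    """Plain classification accuracy over a DataLoader."""
    model.eval()
    correct = total = 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        preds = model(x).argmax(dim=1)
        correct += (preds == y).sum().item()
        total += y.numel()
    return correct / total

# Compare clean vs. adversarial performance; a large drop signals brittleness.
clean_acc = accuracy(model, clean_loader)
adv_acc = accuracy(model, adversarial_loader)
print(f"clean: {clean_acc:.3f}  adversarial: {adv_acc:.3f}  drop: {clean_acc - adv_acc:.3f}")
```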
How to prevent it?
Well, this is an active area of research! The first step is designing robust models.
- To generalize better, train on datasets collected with an explicit focus on minimizing bias.
- Self-supervised learning: first, create a pre-trained network without labels. This pre-trained network is generally more robust if it learns the underlying representation well (see the contrastive-loss sketch after this list).
- Use ensembles: instead of a single model, use an ensemble of diverse models to generalize better (sketched below).
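
For the self-supervised step, one popular recipe is SimCLR-style contrastive pre-training, where two augmented views of the same example are pulled together and all other examples pushed apart. This is just one possible recipe, not necessarily the one the author has in mind; `z1` and `z2` are the encoder's embeddings of the two views.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """Contrastive (NT-Xent) loss over two augmented views of a batch:
    each example's two views should be similar, all other pairs dissimilar."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2]), dim=1)          # (2N, d) unit vectors
    sim = z @ z.t() / temperature                        # pairwise cosine similarities
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float("-inf"))                # ignore self-similarity
    # The positive for row i is its other view: i+n in the first half, i-n in the second.
    idx = torch.arange(n, device=z.device)
    targets = torch.cat([idx + n, idx])
    return F.cross_entropy(sim, targets)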
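```

And for ensembling, a minimal sketch of prediction averaging, where `models` is a hypothetical list of independently trained, diverse networks:

```python
import torch

def ensemble_predict(models, x):
    """Average softmax probabilities across the ensemble members.
    Disagreement between diverse models tends to smooth out the brittle
    decision boundaries any single model learns."""
    probs = torch.stack([m(x).softmax(dim=1) for m in models])
    return probs.mean(dim=0).argmax(dim=1)
```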