attacks

Note

See the Glossary for the meaning of the acronyms used in this guide.

fgm.py

A task plugin module for the Fast Gradient Method evasion attack.

The Fast Gradient Method (FGM) [goodfellow2015] is an evasion attack that attempts to fool a trained classifier by perturbing a test image in the direction of the gradient of the classifier’s loss with respect to the input. This task plugin uses the Adversarial Robustness Toolbox’s [art2019] implementation of the Fast Gradient Method.
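As a rough illustration of the idea, here is a minimal NumPy sketch of a one-step FGM under the L-infinity norm. It is not the ART implementation that this plugin calls: the logistic-regression model, its weights `w` and `b`, and the helper names `loss_gradient` and `fgm_linf` are all assumptions made for the example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def loss_gradient(x, y, w, b):
    """Gradient of the binary cross-entropy loss with respect to the input x
    for a hypothetical logistic-regression model with weights w and bias b."""
    p = sigmoid(x @ w + b)
    return (p - y) * w


def fgm_linf(x, y, w, b, eps):
    """One-step FGM under the L-infinity norm: shift every feature by eps in
    the direction that increases the loss, then clip to the range [0, 1]."""
    grad = loss_gradient(x, y, w, b)
    return np.clip(x + eps * np.sign(grad), 0.0, 1.0)


# Toy model and a single "image" with three pixels (illustrative values only).
w = np.array([2.0, -3.0, 1.0])
b = 0.0
x = np.array([0.6, 0.2, 0.5])
y = 1.0  # true label

x_adv = fgm_linf(x, y, w, b, eps=0.3)

print("clean score:      ", sigmoid(x @ w + b))
print("adversarial score:", sigmoid(x_adv @ w + b))
```

In this toy setup the clean input is scored above 0.5 and the perturbed input below 0.5, even though no pixel moved by more than `eps`.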

References

art2019

M.-I. Nicolae et al., “Adversarial Robustness Toolbox v1.0.0,” Nov. 2019. [Online]. Available: arXiv:1807.01069v4 [cs.LG].

goodfellow2015

I. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and Harnessing Adversarial Examples,” presented at the Int. Conf. on Learn. Represent. 2015, San Diego, California, United States, May 2015. [Online]. Available: arXiv:1412.6572v3 [stat.ML].

create_adversarial_fgm_dataset(data_dir: str, adv_data_dir: Union[str, pathlib.Path], keras_classifier: art.estimators.classification.TensorFlowV2Classifier, image_size: Tuple[int, int, int], distance_metrics_list: Optional[List[Tuple[str, Callable[[...], numpy.ndarray]]]] = None, rescale: float = 0.00392156862745098, batch_size: int = 32, label_mode: str = 'categorical', eps: float = 0.3, eps_step: float = 0.1, minimal: bool = False, norm: Union[int, float, str] = numpy.inf) → pandas.DataFrame

Generates an adversarial dataset using the Fast Gradient Method attack.

Each generated adversarial image is saved as an image file in the directory specified by adv_data_dir, and the distance metric functions passed to distance_metrics_list are used to quantify the size of the perturbation applied to each image.

Parameters
  • data_dir – The directory containing the clean test images.

  • adv_data_dir – The directory to use when saving the generated adversarial images.

  • keras_classifier – A trained TensorFlowV2Classifier.

  • image_size – A tuple of integers (height, width, channels) used to preprocess the images so that they all have the same dimensions and number of color channels. channels=3 means RGB color images and channels=1 means grayscale images. Images with different dimensions will be resized. If channels=1, color images will be converted into grayscale.

  • distance_metrics_list – A list of distance metrics to compute after generating an adversarial image. If None, then no distance metrics will be calculated. The default is None.

  • rescale – The rescaling factor for the pixel vectors. If None or 0, no rescaling is applied; otherwise, the data is multiplied by the value provided (after all other transformations are applied). The default is 1.0 / 255.

  • batch_size – The size of the batch on which adversarial samples are generated. The default is 32.

  • label_mode – Determines how the label arrays for the dataset will be returned. The available choices are: “categorical”, “binary”, “sparse”, “input”, None. For information on the meaning of each choice, see the documentation for tf.keras.preprocessing.image.ImageDataGenerator.flow_from_directory(). The default is “categorical”.

  • eps – The attack step size (input variation). If minimal is True, eps is the maximum perturbation. The default is 0.3.

  • eps_step – The step size of the input variation for minimal perturbation computation. The default is 0.1.

  • minimal – If True, compute the minimal perturbation, using eps_step as the step size and eps as the maximum perturbation. The default is False.

  • norm – The norm of the adversarial perturbation. Can be “inf”, numpy.inf, 1, or 2. The default is numpy.inf.
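To show how eps, eps_step, and minimal relate to one another, the following sketch grows the perturbation in increments of `eps_step` and stops at the first size that changes the prediction, giving up once `eps` is exceeded. It is an illustration under stated assumptions, not ART's code: the linear classifier, the helper names `predict` and `minimal_fgm`, and the choice to return the unmodified input on failure are all hypothetical.

```python
import numpy as np


def predict(x, w, b):
    """Hypothetical binary linear classifier: 1 if the score is positive."""
    return int(x @ w + b > 0)


def minimal_fgm(x, y, w, b, eps=0.3, eps_step=0.1):
    """Search for the smallest L-infinity perturbation (in multiples of
    eps_step, capped at eps) that changes the classifier's prediction."""
    # For a linear score, the loss gradient has the sign of -w when y == 1
    # and the sign of +w when y == 0.
    direction = np.sign(w) * (1.0 if y == 0 else -1.0)
    current_eps = eps_step
    # The small tolerance guards against floating-point drift in the sum.
    while current_eps <= eps + 1e-12:
        x_adv = np.clip(x + current_eps * direction, 0.0, 1.0)
        if predict(x_adv, w, b) != y:
            return x_adv, current_eps
        current_eps += eps_step
    # Assumption of this sketch: return the clean input if the budget fails.
    return x, None


w = np.array([2.0, -3.0, 1.0])
b = 0.0
x = np.array([0.6, 0.2, 0.5])
y = 1

x_adv, found_eps = minimal_fgm(x, y, w, b, eps=0.3, eps_step=0.1)
print("perturbation size that flipped the prediction:", found_eps)
```

With these toy values, a perturbation of one step is not enough to change the prediction, while two steps are, so the search stops before reaching the full `eps` budget.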

Returns

A DataFrame containing the full distribution of the calculated distance metrics.
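The signature types distance_metrics_list as a list of (name, callable) tuples whose callables return a numpy.ndarray. The sketch below shows one way such a list might be assembled. The function names, the metric labels, and the assumption that each callable receives the clean batch followed by the adversarial batch are illustrative and are not confirmed by this page.

```python
import numpy as np


def linf_distance(clean, adv):
    """Per-image L-infinity distance between two batches of images."""
    diff = (adv - clean).reshape(len(clean), -1)
    return np.max(np.abs(diff), axis=1)


def l2_distance(clean, adv):
    """Per-image L2 (Euclidean) distance between two batches of images."""
    diff = (adv - clean).reshape(len(clean), -1)
    return np.linalg.norm(diff, axis=1)


# Hypothetical metric names paired with the callables defined above.
distance_metrics_list = [("l_inf", linf_distance), ("l_2", l2_distance)]

# Two 4x4 grayscale "images": the first is shifted uniformly by 0.1,
# the second has a single pixel changed by 0.3.
clean = np.zeros((2, 4, 4, 1))
adv = clean.copy()
adv[0] += 0.1
adv[1, 0, 0, 0] = 0.3

# One array of per-image distances for each named metric.
results = {name: fn(clean, adv) for name, fn in distance_metrics_list}
print(results)
```

A dictionary like `results`, with one column per metric name and one row per image, has the shape one would expect to load into a DataFrame.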

See also

  • tf.keras.preprocessing.image.ImageDataGenerator.flow_from_directory()