AFL.double_agent.Generator module

Data generation tools for creating synthetic datasets and sampling spaces.

This module provides classes that generate sampling spaces commonly used in materials science and machine learning applications: regular grids, compositional spaces, and specialized point distributions.

Key features:
  • Cartesian grid generation with flexible specifications

  • Barycentric grid generation for compositional spaces

  • Gaussian point distributions for exclusion zones

  • Support for multi-dimensional spaces

  • Integration with xarray data structures

class AFL.double_agent.Generator.BarycentricGrid(output_variable: str, components: List[str], sample_dim: str, pts_per_row: int = 50, basis: float = 1.0, dim: int = 3, eps: float = 1e-09, name='BarycentricGridGenerator')

Bases: Generator

Generator that produces a grid in barycentric coordinates.

Creates a grid suitable for compositional spaces where the sum of components must equal a fixed value (typically 1.0). The grid is generated by systematically sampling points that satisfy the barycentric constraint.

Parameters:
  • output_variable (str) – The name of the variable to be inserted into the dataset

  • components (List[str]) – List of component names for the compositional space

  • sample_dim (str) – Name of the dimension for different samples/points

  • pts_per_row (int, default=50) – Number of points to sample along each row of the simplex

  • basis (float, default=1.0) – The sum constraint for the compositions (typically 1.0)

  • dim (int, default=3) – Number of dimensions in the compositional space

  • eps (float, default=1e-9) – Small value for numerical stability in equality comparisons

  • name (str, default="BarycentricGridGenerator") – The name to use when added to a Pipeline

calculate(dataset: Dataset) → Self

Generate the barycentric grid.

Creates a grid of points that satisfy the barycentric constraint by systematically sampling the simplex space.

Parameters:

dataset (xr.Dataset) – The input dataset (not used by this generator)

Returns:

The generator instance with the created barycentric grid

Return type:

Self
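
The systematic sampling described above can be sketched in plain Python. This is not the library's implementation: the helper name `barycentric_grid` is hypothetical, and the sketch assumes that `pts_per_row` points are spaced evenly along each edge of the simplex, which this page does not confirm.

```python
from itertools import product


def barycentric_grid(pts_per_row=5, basis=1.0, dim=3):
    """Return lattice points whose coordinates sum to `basis`.

    Hypothetical helper for illustration; not part of AFL.double_agent.
    Assumes `pts_per_row` evenly spaced points along each simplex edge.
    """
    if pts_per_row < 2:
        raise ValueError("pts_per_row must be at least 2")
    n = pts_per_row - 1  # number of intervals along each edge
    points = []
    # Enumerate integer tuples for the first dim-1 coordinates; the last
    # coordinate is the remainder, so the integers always sum to n.
    for head in product(range(n + 1), repeat=dim - 1):
        last = n - sum(head)
        if last < 0:
            continue  # this combination lies outside the simplex
        points.append(tuple(basis * k / n for k in (*head, last)))
    return points


grid = barycentric_grid(pts_per_row=5, basis=1.0, dim=3)
```

Working with integer lattice indices and scaling at the end avoids floating-point equality checks in this sketch. A float-based approach would instead need a tolerance such as `eps` when testing whether a point sums to `basis`.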

class AFL.double_agent.Generator.CartesianGrid(output_variable: str, grid_spec: Dict[str, Dict[str, int | float]], sample_dim: str, component_dim: str = 'component', name: str = 'CartesianGridGenerator')

Bases: Generator

Generator that produces a cartesian grid according to user-provided specifications.

Creates a regular grid in N-dimensional space where each dimension can have its own min, max, and step size specifications. The resulting grid contains all possible combinations of points along each dimension.

Parameters:
  • output_variable (str) – The name of the variable to be inserted into the xarray.Dataset

  • grid_spec (Dict[str, Dict[str, int | float]]) – Dictionary keyed by component name. Each value is a sub-dictionary with the keys min, max, and steps, which define the minimum, maximum, and step size for that component.

  • sample_dim (str) – Name of the dimension for different samples/points in the grid

  • component_dim (str, default='component') – Name of the dimension for different components

  • name (str, default="CartesianGridGenerator") – The name to use when added to a Pipeline

calculate(dataset: Dataset) → Self

Generate the cartesian grid based on specifications.

Creates a grid by taking the cartesian product of points along each dimension as specified in the grid_spec.

Parameters:

dataset (xr.Dataset) – The input dataset (not used by this generator)

Returns:

The generator instance with the created grid

Return type:

Self
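
The cartesian product described above can be sketched with the standard library. The helpers `linspace` and `cartesian_grid` are hypothetical and are not the library's code. The sketch assumes that `steps` is the number of points along each axis (as in `numpy.linspace`). If `steps` is instead a step size, the axis construction would need to change accordingly.

```python
from itertools import product


def linspace(start, stop, num):
    """Return `num` evenly spaced values from `start` to `stop`, inclusive."""
    if num == 1:
        return [float(start)]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def cartesian_grid(grid_spec):
    """Hypothetical helper: build every combination of per-component values.

    Assumption: spec["steps"] is a point count, not a step size.
    """
    names = list(grid_spec)
    axes = [linspace(s["min"], s["max"], s["steps"]) for s in grid_spec.values()]
    # product() varies the last component fastest, like nested for-loops.
    return names, list(product(*axes))


grid_spec = {
    "A": {"min": 0.0, "max": 1.0, "steps": 3},
    "B": {"min": 10, "max": 20, "steps": 2},
}
names, points = cartesian_grid(grid_spec)
```

Each entry of `points` is one sample and each position within an entry is one component, which corresponds to the `sample_dim` and `component_dim` dimensions of the output variable.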

class AFL.double_agent.Generator.GaussianPoints(input_variable: str, sample_dim: str, output_variable: str, grid_variable: str, grid_dim: str, comps_dim: str = 'component', exclusion_depth: float = 0.001, exclusion_radius: float = 0.001, name: str = 'GaussianPointsGenerator')

Bases: Generator

Generator that creates Gaussian-distributed points for exclusion zones.

This generator places Gaussian distributions centered at specified points, useful for creating exclusion zones or smooth transitions around specific locations in the sampling space.

Parameters:
  • input_variable (str) – The name of the variable containing points to center Gaussians around

  • sample_dim (str) – Name of the dimension for different samples/points

  • output_variable (str) – The name of the variable to be inserted into the dataset

  • grid_variable (str) – The name of the grid variable to evaluate Gaussians on

  • grid_dim (str) – Name of the grid dimension

  • comps_dim (str, default="component") – Name of the components dimension

  • exclusion_depth (float, default=1e-3) – Maximum value of the Gaussian distributions

  • exclusion_radius (float, default=1e-3) – Width parameter for the Gaussian distributions

  • name (str, default="GaussianPointsGenerator") – The name to use when added to a Pipeline

calculate(dataset: Dataset) → Self

Generate the Gaussian field on the grid.

Places multivariate normal distributions centered at each input point, creating a field of Gaussian peaks that can be used for exclusion zones or smooth transitions.

Parameters:

dataset (xr.Dataset) – The input dataset containing points to center Gaussians around and the grid to evaluate them on

Returns:

The generator instance with the created Gaussian field

Return type:

Self
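
A minimal sketch of evaluating Gaussian peaks on a grid follows. The helper name `gaussian_field` is hypothetical. The sketch assumes isotropic peaks, treats `exclusion_radius` as a standard deviation and `exclusion_depth` as the peak height, and sums overlapping peaks. None of these details is confirmed by this page.

```python
import math


def gaussian_field(centers, grid, exclusion_depth=1e-3, exclusion_radius=1e-3):
    """Hypothetical helper: sum isotropic Gaussian peaks over grid points.

    Assumptions: exclusion_radius acts as a standard deviation, each peak
    reaches exclusion_depth at its center, and overlapping peaks add.
    """
    field = []
    for g in grid:
        total = 0.0
        for c in centers:
            # Squared Euclidean distance between grid point and peak center.
            sq_dist = sum((gi - ci) ** 2 for gi, ci in zip(g, c))
            total += exclusion_depth * math.exp(-sq_dist / (2 * exclusion_radius ** 2))
        field.append(total)
    return field


centers = [(0.5, 0.5)]
grid = [(0.5, 0.5), (0.5, 0.6), (0.0, 0.0)]
field = gaussian_field(centers, grid, exclusion_depth=1.0, exclusion_radius=0.1)
```

The field is largest at a center and decays with distance, so grid points near previously selected locations receive the strongest exclusion value.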

class AFL.double_agent.Generator.Generator(output_variable: str, input_variable: str = 'Generator', name: str = 'GeneratorBase')

Bases: PipelineOp

Base class for all data generation operations.

This abstract base class provides common functionality for generating synthetic data or sampling spaces. Unlike most PipelineOps, Generators typically don’t require input data but instead create new data based on parameters.

Parameters:
  • input_variable (str, default='Generator') – Generators generally do not use input variables, but this value can be used to name the input node for a generator

  • output_variable (str) – The name of the variable to be inserted into the xarray.Dataset by this PipelineOp

  • name (str, default='GeneratorBase') – The name to use when added to a Pipeline. This name is used when calling Pipeline.search()

calculate(dataset: Dataset) → Self

Apply this generator to the supplied dataset.

This method must be implemented by subclasses to define how the data generation is performed.

Parameters:

dataset (xr.Dataset) – The input dataset (typically not used by generators)

Returns:

The generator instance with generated outputs

Return type:

Self
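
The subclassing pattern described above can be sketched with stand-in classes. `GeneratorSketch`, its `output` dictionary, and `ConstantPoints` are all hypothetical and do not reproduce the real `Generator` or `PipelineOp` internals. Only the constructor arguments and the `calculate` signature are taken from this page.

```python
class GeneratorSketch:
    """Stand-in for the Generator base class; NOT the AFL implementation."""

    def __init__(self, output_variable, input_variable="Generator", name="GeneratorBase"):
        self.output_variable = output_variable
        self.input_variable = input_variable
        self.name = name
        self.output = {}  # hypothetical container for generated variables

    def calculate(self, dataset):
        # Subclasses define how the data is generated.
        raise NotImplementedError("Subclasses must implement calculate()")


class ConstantPoints(GeneratorSketch):
    """Hypothetical generator that ignores its input and emits fixed points."""

    def __init__(self, output_variable, points, name="ConstantPoints"):
        super().__init__(output_variable=output_variable, name=name)
        self.points = points

    def calculate(self, dataset):
        # The dataset argument is accepted only to match the interface.
        self.output[self.output_variable] = list(self.points)
        return self  # mirrors the documented `Self` return type


op = ConstantPoints("grid", points=[(0.0, 1.0), (1.0, 0.0)]).calculate(dataset=None)
```

Returning the instance from `calculate` lets the call be chained directly after construction, as in the last line.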