AFL.double_agent.AcquisitionFunction module

Acquisition functions for active learning and optimization.

This module provides various acquisition functions that help guide the selection of new samples in active learning and optimization tasks. Each acquisition function evaluates a set of candidate points and determines which points would be most valuable to sample next.

Key features:
  • Support for single and multi-point acquisition

  • Ability to exclude previously sampled points

  • Random and maximum value acquisition strategies

  • Upper confidence bound acquisition

  • Support for multi-modal acquisition with masking

exception AFL.double_agent.AcquisitionFunction.AcquisitionError

Bases: Exception

Exception raised when an error occurs in the acquisition decision.

class AFL.double_agent.AcquisitionFunction.AcquisitionFunction(input_variables: List[str], grid_variable: str, grid_dim: str = 'grid', output_prefix: str | None = None, output_variable: str = 'next_compositions', decision_rtol: float = 0.05, excluded_comps_variables: List[str] | None = None, excluded_comps_dim: str | None = None, exclusion_radius: float = 0.001, count: int = 1, name: str = 'AcquisitionFunctionBase')

Bases: PipelineOp

Base acquisition function for selecting next points to sample.

Acquisition functions gather one or more inputs and use that information to choose one or more points from a supplied composition grid. This base class provides common functionality for implementing specific acquisition strategies.

Parameters:
  • input_variables (List[str]) – The names of the xarray.Dataset data variables to extract from the input xarray.Dataset.

  • grid_variable (str) – The name of the xarray.Dataset data variable to use as an evaluation grid.

  • grid_dim (str) – The xarray dimension that indexes the grid points. It is the grid's equivalent of the sample dimension.

  • output_prefix (Optional[str]) – If provided, all outputs of this PipelineOp will be prefixed with this string

  • output_variable (str) – The name of the variable to be inserted into the xarray.Dataset by this PipelineOp

  • decision_rtol (float) – The next sample is chosen at random from all grid points whose decision-surface value lies within a relative tolerance of decision_rtol of the surface maximum.

  • excluded_comps_variables (Optional[List[str]]) – A list of xarray.Dataset composition variables to use in building an exclusion surface that is added to the decision surface. This exclusion surface is built by placing multidimensional inverted Gaussians at every composition point specified in excluded_comps_variables. This is done using the GaussianPoints generator.

  • excluded_comps_dim (Optional[str]) – The xarray dimension over the components of a composition.

  • exclusion_radius (float) – The width of the Gaussian placed by the GaussianPoints generator. See that documentation for more details.

  • count (int) – The number of samples to pull from the grid.

  • name (str) – The name to use when added to a Pipeline. This name is used when calling Pipeline.search()

calculate(dataset: Dataset) -> Self

Apply this PipelineOp to the supplied xarray.Dataset.

This method must be implemented by subclasses to define how the acquisition function evaluates and selects points from the grid.

Parameters:

dataset (xr.Dataset) – The input dataset containing variables to evaluate

Returns:

The acquisition function instance with updated outputs

Return type:

Self

exclude_previous_samples(dataset: Dataset) -> Dataset

Modify the decision surface by placing Gaussian exclusion zones.

The zones are centered on previously measured compositions, which lowers the decision surface near those points and discourages resampling of similar compositions.

Parameters:

dataset (xr.Dataset) – Dataset containing the decision surface and compositions to exclude

Returns:

Modified dataset with updated decision surface including exclusion zones

Return type:

xr.Dataset

Raises:

AcquisitionError – If required variables are missing from the dataset
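To illustrate the idea only, here is a minimal sketch in plain Python. It is not the library's implementation, which relies on the GaussianPoints generator. The function name apply_exclusion, the unit bump height, and the list-based data layout are all assumptions made for this example.

```python
import math

def apply_exclusion(grid, decision, measured, radius=0.001):
    """Return a copy of `decision` lowered near each measured composition.

    grid     : list of composition tuples, one per grid point
    decision : list of floats, one per grid point
    measured : list of composition tuples to exclude
    radius   : Gaussian width (plays the role of exclusion_radius)
    """
    out = list(decision)
    for i, point in enumerate(grid):
        for comp in measured:
            dist_sq = sum((p - c) ** 2 for p, c in zip(point, comp))
            # Inverted Gaussian: the penalty is largest exactly at the
            # measured composition and decays with distance from it.
            out[i] -= math.exp(-dist_sq / (2.0 * radius ** 2))
    return out

grid = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
decision = [1.0, 1.0, 1.0]
penalized = apply_exclusion(grid, decision, measured=[(0.5, 0.5)], radius=0.05)
```

In this toy case the middle grid point coincides with the measured composition, so its value drops by the full bump height, while the two distant points are left essentially unchanged.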

get_next_samples(dataset: Dataset) -> None

Choose the next compositions by evaluating the decision surface.

This method finds all compositions whose decision-surface value is within decision_rtol of the maximum value of the decision surface. From this set, it randomly chooses count compositions as the next sample compositions.

Parameters:

dataset (xr.Dataset) – Dataset containing the decision surface and composition grid

Raises:

AcquisitionError – If required variables are missing or if no valid points are found
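The selection rule described for get_next_samples can be sketched as follows. This is a hedged, plain-Python illustration rather than the method's actual code: the helper name choose_next is hypothetical, and treating the tolerance as relative to the magnitude of the maximum is an assumption based on the parameter name.

```python
import random

def choose_next(grid, decision, rtol=0.05, count=1, rng=None):
    """Pick up to `count` grid points whose value is near the surface maximum."""
    rng = rng or random.Random()
    top = max(decision)
    # Assumption: the tolerance is relative to the magnitude of the maximum.
    threshold = top - rtol * abs(top)
    candidates = [pt for pt, val in zip(grid, decision) if val >= threshold]
    # Sample without replacement; cap at the number of available candidates.
    return rng.sample(candidates, min(count, len(candidates)))

grid = [(0.1, 0.9), (0.2, 0.8), (0.3, 0.7), (0.4, 0.6)]
decision = [0.20, 0.97, 1.00, 0.50]
picks = choose_next(grid, decision, rtol=0.05, count=1, rng=random.Random(0))
```

With these values only the second and third grid points clear the 0.95 threshold, so the pick is always one of those two. Choosing randomly among near-maximal points avoids always returning the same composition when the surface has a plateau.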

class AFL.double_agent.AcquisitionFunction.MaxValueAF(input_variables: List[str], grid_variable: str, grid_dim: str = 'grid', combine_coeffs: List[float] | None = None, output_prefix: str | None = None, output_variable: str = 'next_samples', decision_rtol: float = 0.05, excluded_comps_variables: List[str] | None = None, excluded_comps_dim: str | None = None, exclusion_radius: float = 0.001, count: int = 1, name: str = 'MaxValueAF')

Bases: AcquisitionFunction

Acquisition function that selects points based on maximum values.

This acquisition function chooses points by finding the maximum values in the decision surface. It can combine multiple input variables with optional scaling coefficients and supports exclusion of previously measured points.

Parameters:
  • input_variables (List[str]) – The names of the xarray.Dataset data variables to extract from the input xarray.Dataset.

  • grid_variable (str) – The name of the xarray.Dataset data variable to use as an evaluation grid.

  • grid_dim (str) – The xarray dimension that indexes the grid points. It is the grid's equivalent of the sample dimension.

  • combine_coeffs (Optional[List[float]]) – If provided, the input variables will be scaled by these coefficients before being summed.

  • output_prefix (Optional[str]) – If provided, all outputs of this PipelineOp will be prefixed with this string

  • output_variable (str) – The name of the variable to be inserted into the xarray.Dataset by this PipelineOp

  • decision_rtol (float) – The next sample is chosen at random from all grid points whose decision-surface value lies within a relative tolerance of decision_rtol of the surface maximum.

  • excluded_comps_variables (Optional[List[str]]) – A list of xarray.Dataset composition variables to use in building an exclusion surface that is added to the decision surface. This exclusion surface is built by placing multidimensional inverted Gaussians at every composition point specified in excluded_comps_variables. This is done using the GaussianPoints generator.

  • excluded_comps_dim (Optional[str]) – The xarray dimension over the components of a composition.

  • exclusion_radius (float) – The width of the Gaussian placed by the GaussianPoints generator. See that documentation for more details.

  • count (int) – The number of samples to pull from the grid.

  • name (str) – The name to use when added to a Pipeline. This name is used when calling Pipeline.search()

calculate(dataset: Dataset) -> Self

Apply this MaxValueAF to the supplied dataset.

Combines multiple input variables with optional scaling coefficients to create a decision surface based on maximum values.

Parameters:

dataset (xr.Dataset) – The input dataset containing variables to evaluate

Returns:

The MaxValueAF instance with updated outputs

Return type:

Self
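As a rough sketch of the combination step, the weighted sum of several input surfaces can be written in plain Python as below. The helper name combine, the example variable names variance and entropy, and the choice of unit coefficients when none are given are assumptions for illustration, not details confirmed by this documentation.

```python
def combine(surfaces, coeffs=None):
    """Sum several per-grid-point surfaces, optionally scaling each one."""
    if coeffs is None:
        # Assumption: with no coefficients, every surface is weighted equally.
        coeffs = [1.0] * len(surfaces)
    if len(coeffs) != len(surfaces):
        raise ValueError("need exactly one coefficient per input surface")
    n_points = len(surfaces[0])
    return [
        sum(c * s[i] for c, s in zip(coeffs, surfaces))
        for i in range(n_points)
    ]

variance = [0.1, 0.4, 0.2]   # hypothetical input variable
entropy = [0.3, 0.1, 0.6]    # hypothetical input variable
decision = combine([variance, entropy], coeffs=[1.0, 0.5])

# Index of the grid point with the largest combined value.
best = max(range(len(decision)), key=decision.__getitem__)
```

Here the third grid point wins because its scaled entropy contribution outweighs the higher variance of the second point.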

class AFL.double_agent.AcquisitionFunction.MultimodalMask_MaxValueAF(decision_variable: str, mask_label_variable: str, phase_select_coords: dict[str, float], grid_variable: str, grid_dim: str = 'grid', combine_coeffs: List[float] | None = None, output_prefix: str | None = None, output_variable: str = 'next_samples', decision_rtol: float = 0.05, excluded_comps_variables: List[str] | None = None, excluded_comps_dim: str | None = None, exclusion_radius: float = 0.001, count: int = 1, name: str = 'MaxValueAF')

Bases: AcquisitionFunction

Acquisition function that selects points based on maximum values within specific phase regions.

This acquisition function extends MaxValueAF by adding the ability to focus sampling in specific phase regions of the composition space. It uses a mask label variable to identify different phase regions and selects points based on their proximity to target phase coordinates.

Parameters:
  • decision_variable (str) – The name of the variable containing the decision surface values

  • mask_label_variable (str) – The name of the variable containing phase region labels

  • phase_select_coords (dict[str,float]) – Dictionary mapping phase component names to target coordinates

  • grid_variable (str) – The name of the xarray.Dataset data variable to use as an evaluation grid

  • grid_dim (str, default="grid") – The xarray dimension that indexes the grid points; the grid's equivalent of the sample dimension

  • combine_coeffs (Optional[List[float]], default=None) – If provided, the input variables will be scaled by these coefficients before being summed

  • output_prefix (Optional[str], default=None) – If provided, all outputs of this PipelineOp will be prefixed with this string

  • output_variable (str, default="next_samples") – The name of the variable to be inserted into the dataset

  • decision_rtol (float, default=0.05) – The next sample is chosen at random from all grid points whose decision-surface value lies within a relative tolerance of decision_rtol of the surface maximum

  • excluded_comps_variables (Optional[List[str]], default=None) – List of composition variables to use in building an exclusion surface

  • excluded_comps_dim (Optional[str], default=None) – The dimension over the components of a composition

  • exclusion_radius (float, default=1e-3) – The width of the Gaussian placed by the GaussianPoints generator

  • count (int, default=1) – The number of samples to pull from the grid

  • name (str, default="MaxValueAF") – The name to use when added to a Pipeline

calculate(dataset: Dataset) -> Self

Apply this MultimodalMask_MaxValueAF to the supplied dataset.

Creates a decision surface by combining the decision variable with phase region masking based on proximity to target phase coordinates.

Parameters:

dataset (xr.Dataset) – The input dataset containing variables to evaluate

Returns:

The MultimodalMask_MaxValueAF instance with updated outputs

Return type:

Self
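The documentation does not spell out how the mask is built, so the following is only one plausible reading, sketched in plain Python: find the grid point closest to the target coordinates, take its phase label, and suppress the decision surface everywhere outside that labelled region. The helper name mask_to_target_phase, the component ordering tuple, and the use of negative infinity for masked points are all assumptions.

```python
import math

def mask_to_target_phase(grid, labels, decision, target):
    """Keep decision values only inside the phase region nearest to `target`."""
    # Grid point closest (Euclidean distance) to the target coordinates.
    nearest = min(range(len(grid)), key=lambda i: math.dist(grid[i], target))
    target_label = labels[nearest]
    # Assumption: points in other regions are pushed to -inf so that they
    # can never be selected as the maximum.
    return [
        val if lab == target_label else -math.inf
        for val, lab in zip(decision, labels)
    ]

components = ("A", "B")                      # hypothetical component order
phase_select_coords = {"A": 0.9, "B": 0.1}   # target inside the second region
target = tuple(phase_select_coords[name] for name in components)

grid = [(0.1, 0.9), (0.2, 0.8), (0.8, 0.2), (0.9, 0.1)]
labels = [0, 0, 1, 1]
decision = [0.9, 0.3, 0.5, 0.7]

masked = mask_to_target_phase(grid, labels, decision, target)
best = max(range(len(masked)), key=masked.__getitem__)
```

Although the global maximum of the unmasked surface sits at the first grid point, the masked surface selects the last point, which is the best candidate inside the targeted phase region.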

class AFL.double_agent.AcquisitionFunction.PseudoUCB(input_variables: List[str], grid_variable: str, grid_dim: str = 'grid', lambdas=None, output_prefix: str | None = None, output_variable: str = 'next_samples', decision_rtol: float = 0.05, excluded_comps_variables: List[str] | None = None, excluded_comps_dim: str | None = None, exclusion_radius: float = 0.001, count: int = 1, name: str = 'PseudoUCB')

Bases: AcquisitionFunction

Upper Confidence Bound (UCB) acquisition function.

This acquisition function implements a pseudo Upper Confidence Bound strategy where points are selected based on a weighted combination of mean predictions and uncertainty estimates. The weights (lambdas) control the exploration-exploitation trade-off.

Parameters:
  • input_variables (List[str]) – The names of the xarray.Dataset data variables to extract from the input xarray.Dataset.

  • grid_variable (str) – The name of the xarray.Dataset data variable to use as an evaluation grid.

  • grid_dim (str) – The xarray dimension that indexes the grid points. It is the grid's equivalent of the sample dimension.

  • lambdas (List[float]) – Scaling parameters for each input variable to control exploration-exploitation trade-off

  • output_prefix (Optional[str]) – If provided, all outputs of this PipelineOp will be prefixed with this string

  • output_variable (str) – The name of the variable to be inserted into the xarray.Dataset by this PipelineOp

  • decision_rtol (float) – The next sample is chosen at random from all grid points whose decision-surface value lies within a relative tolerance of decision_rtol of the surface maximum.

  • excluded_comps_variables (Optional[List[str]]) – A list of xarray.Dataset composition variables to use in building an exclusion surface that is added to the decision surface. This exclusion surface is built by placing multidimensional inverted Gaussians at every composition point specified in excluded_comps_variables. This is done using the GaussianPoints generator.

  • excluded_comps_dim (Optional[str]) – The xarray dimension over the components of a composition.

  • exclusion_radius (float) – The width of the Gaussian placed by the GaussianPoints generator. See that documentation for more details.

  • count (int) – The number of samples to pull from the grid.

  • name (str) – The name to use when added to a Pipeline. This name is used when calling Pipeline.search()

calculate(dataset: Dataset) -> Self

Apply this PseudoUCB to the supplied dataset.

Creates a decision surface by combining input variables with lambda weights to balance exploration and exploitation.

Parameters:

dataset (xr.Dataset) – The input dataset containing variables to evaluate

Returns:

The PseudoUCB instance with updated outputs

Return type:

Self
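To make the exploration-exploitation trade-off concrete, here is a small plain-Python sketch of a lambda-weighted sum of a mean surface and an uncertainty surface. It is an illustration under stated assumptions, not the class's code: the function names pseudo_ucb and argmax and the example arrays mean and std are hypothetical.

```python
def pseudo_ucb(inputs, lambdas):
    """Weighted sum of per-grid-point inputs, one lambda per input."""
    n_points = len(inputs[0])
    return [
        sum(lam * column[i] for lam, column in zip(lambdas, inputs))
        for i in range(n_points)
    ]

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)

mean = [0.8, 0.5, 0.2]    # hypothetical model prediction per grid point
std = [0.05, 0.2, 0.5]    # hypothetical model uncertainty per grid point

# Small weight on uncertainty favors the best predicted point (exploitation).
exploit_choice = argmax(pseudo_ucb([mean, std], lambdas=[1.0, 0.1]))

# Large weight on uncertainty favors the least-known point (exploration).
explore_choice = argmax(pseudo_ucb([mean, std], lambdas=[1.0, 2.0]))
```

The same inputs yield different selections: the first grid point when uncertainty is weighted lightly, and the third when it is weighted heavily.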

class AFL.double_agent.AcquisitionFunction.RandomAF(grid_variable: str, grid_dim: str = 'grid', output_prefix: str | None = None, output_variable: str = 'next_samples', decision_rtol: float = 0.05, excluded_comps_variables: str | None = None, excluded_comps_dim: str | None = None, exclusion_radius: float = 0.001, count: int = 1, name: str = 'RandomAF')

Bases: AcquisitionFunction

Randomly choose points from the grid with optional exclusion of previous measurements.

This acquisition function implements a simple random sampling strategy where points are chosen uniformly at random from the available grid points. It can optionally exclude previously measured points using Gaussian exclusion zones.

Parameters:
  • grid_variable (str) – The name of the xarray.Dataset data variable to use as an evaluation grid.

  • grid_dim (str) – The xarray dimension that indexes the grid points. It is the grid's equivalent of the sample dimension.

  • output_prefix (Optional[str]) – If provided, all outputs of this PipelineOp will be prefixed with this string

  • output_variable (str) – The name of the variable to be inserted into the xarray.Dataset by this PipelineOp

  • decision_rtol (float) – The next sample is chosen at random from all grid points whose decision-surface value lies within a relative tolerance of decision_rtol of the surface maximum.

  • excluded_comps_variables (Optional[List[str]]) – A list of xarray.Dataset composition variables to use in building an exclusion surface that is added to the decision surface. This exclusion surface is built by placing multidimensional inverted Gaussians at every composition point specified in excluded_comps_variables. This is done using the GaussianPoints generator.

  • excluded_comps_dim (Optional[str]) – The xarray dimension over the components of a composition.

  • exclusion_radius (float) – The width of the Gaussian placed by the GaussianPoints generator. See that documentation for more details.

  • count (int) – The number of samples to pull from the grid.

  • name (str) – The name to use when added to a Pipeline. This name is used when calling Pipeline.search()

calculate(dataset: Dataset) -> Self

Apply this RandomAF to the supplied dataset.

Creates a random decision surface and optionally applies exclusion zones around previously measured points.

Parameters:

dataset (xr.Dataset) – The input dataset containing the grid to sample from

Returns:

The RandomAF instance with updated outputs

Return type:

Self
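A random decision surface of the kind described above can be sketched in plain Python as follows. This is a hedged illustration, not the class's implementation: the helper name random_decision_surface and the use of uniform values in [0, 1) are assumptions.

```python
import random

def random_decision_surface(n_points, rng):
    """Assign an independent uniform random score to every grid point."""
    return [rng.random() for _ in range(n_points)]

grid = [(0.0, 1.0), (0.25, 0.75), (0.5, 0.5), (0.75, 0.25), (1.0, 0.0)]

# A seeded generator makes the example reproducible.
surface = random_decision_surface(len(grid), random.Random(42))

# Taking the maximum of a random surface is a uniform random pick.
next_point = grid[max(range(len(grid)), key=surface.__getitem__)]
```

Expressing random sampling as a decision surface means the shared machinery, such as exclusion zones around measured points, can be applied to it in the same way as to any other acquisition function.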