AFL.double_agent.PairMetric module

PipelineOps for Pairwise Metrics

This module contains operations that compute pairwise relationships between samples. PairMetrics generate matrices that capture similarity, distance, or other relationships between pairs of data points.

These metrics are useful for:
  • Measuring similarity or distance between samples

  • Constructing adjacency matrices for graph-based algorithms

  • Identifying clusters or patterns in data

  • Quantifying relationships between different observations

Each PairMetric is implemented as a PipelineOp that can be composed with others in a processing pipeline.

class AFL.double_agent.PairMetric.CombineMetric(input_variables: List[str], output_variable: str, sample_dim: str, combine_by: str, combine_by_powers: List[Number] | None = None, combine_by_coeffs: List[Number] | None = None, params: str | None = None, constrain_same: List | None = None, constrain_different: List | None = None, name='CombineMetric')

Bases: PairMetric

Combines multiple similarity/distance matrices into a single matrix

This class allows for the combination of multiple similarity or distance matrices using either product or sum operations. Each matrix can be weighted differently using powers (for product) or coefficients (for sum).

Parameters:
  • input_variables (List[str]) – List of variable names containing similarity/distance matrices to combine

  • output_variable (str) – The name of the variable to be inserted into the dataset

  • sample_dim (str) – The dimension containing different samples

  • combine_by (str) – Method to combine matrices, either “prod” (product) or “sum”

  • combine_by_powers (Optional[List[Number]], default=None) – List of powers to apply to each matrix when using “prod” combination

  • combine_by_coeffs (Optional[List[Number]], default=None) – List of coefficients to multiply each matrix by when using “sum” combination

  • params (Optional[str], default=None) – Additional parameters

  • constrain_same (Optional[List], default=None) – List of pairs that should have perfect similarity

  • constrain_different (Optional[List], default=None) – List of pairs that should have zero similarity

  • name (str, default="CombineMetric") – The name to use when added to a Pipeline
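The two combination modes can be illustrated with plain NumPy. This is a minimal sketch, not the class's actual implementation: the helper names `combine_prod` and `combine_sum` are hypothetical, and it assumes the powers and coefficients are applied element-wise to each matrix before the matrices are multiplied or added.

```python
import numpy as np


def combine_prod(matrices, powers=None):
    """Element-wise product of matrices, each raised to its own power (assumed behavior)."""
    if powers is None:
        powers = [1.0] * len(matrices)
    result = np.ones_like(np.asarray(matrices[0], dtype=float))
    for matrix, power in zip(matrices, powers):
        result = result * np.asarray(matrix, dtype=float) ** power
    return result


def combine_sum(matrices, coeffs=None):
    """Weighted element-wise sum of matrices (assumed behavior)."""
    if coeffs is None:
        coeffs = [1.0] * len(matrices)
    result = np.zeros_like(np.asarray(matrices[0], dtype=float))
    for matrix, coeff in zip(matrices, coeffs):
        result = result + coeff * np.asarray(matrix, dtype=float)
    return result


# Two small similarity matrices over the same two samples
A = np.array([[1.0, 0.5], [0.5, 1.0]])
B = np.array([[1.0, 0.2], [0.2, 1.0]])

W_prod = combine_prod([A, B], powers=[2, 1])    # off-diagonal: 0.5**2 * 0.2
W_sum = combine_sum([A, B], coeffs=[0.5, 0.5])  # off-diagonal: 0.5*0.5 + 0.5*0.2
```

Under these assumptions, a larger power or coefficient gives the corresponding matrix more influence on the combined result.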

calculate(dataset: Dataset) → Self

Apply this PipelineOp to the supplied xarray.Dataset

prod(data_list: List) → ndarray

sum(data_list: List) → ndarray

class AFL.double_agent.PairMetric.Delaunay(input_variable: str, output_variable: str, sample_dim: str, params: Dict[str, Any] | None = None, constrain_same: List | None = None, constrain_different: List | None = None, name='DelaunayMetric')

Bases: PairMetric

Creates a similarity matrix based on Delaunay triangulation

This class constructs a binary adjacency matrix where samples that share an edge in the Delaunay triangulation have a similarity of 1.0, and all other pairs have a similarity of 0.0. This is useful for identifying natural neighbors in the data.

Parameters:
  • input_variable (str) – The name of the data variable to extract from the input dataset

  • output_variable (str) – The name of the variable to be inserted into the dataset

  • sample_dim (str) – The dimension containing different samples

  • params (Optional[Dict[str, Any]], default=None) – Additional parameters (not used in this class)

  • constrain_same (Optional[List], default=None) – List of pairs that should have perfect similarity

  • constrain_different (Optional[List], default=None) – List of pairs that should have zero similarity

  • name (str, default="DelaunayMetric") – The name to use when added to a Pipeline
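The idea of a Delaunay-based adjacency matrix can be sketched with `scipy.spatial.Delaunay`. This is an illustration under stated assumptions, not the class's code: the helper `delaunay_adjacency` is hypothetical, and leaving the diagonal at 0.0 is a choice made here, since the description above does not say how self-pairs are treated.

```python
from itertools import combinations

import numpy as np
from scipy.spatial import Delaunay


def delaunay_adjacency(points):
    """Binary matrix with 1.0 where two points share a Delaunay edge."""
    points = np.asarray(points, dtype=float)
    n_points = len(points)
    adjacency = np.zeros((n_points, n_points))
    triangulation = Delaunay(points)
    # Each simplex lists the vertex indices of one triangle (in 2D);
    # every pair of vertices within a simplex is connected by an edge.
    for simplex in triangulation.simplices:
        for i, j in combinations(simplex, 2):
            adjacency[i, j] = 1.0
            adjacency[j, i] = 1.0
    return adjacency


# Four corners of a unit square plus its center point (index 4)
points = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.5]])
adj = delaunay_adjacency(points)
```

In this example the center point is a natural neighbor of every corner, while opposite corners are not connected to each other.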

calculate(dataset: Dataset) → Self

Apply this PipelineOp to the supplied xarray.Dataset

class AFL.double_agent.PairMetric.Distance(input_variable: str, output_variable: str, sample_dim: str, params: Dict[str, Any] | None = None, constrain_same: List | None = None, constrain_different: List | None = None, name='DistanceMetric')

Bases: PairMetric

Computes pairwise distances between samples

This class uses scikit-learn’s pairwise_distances to compute distance matrices between samples. Various distance metrics can be specified through the params dictionary (e.g., ‘euclidean’, ‘manhattan’, ‘cosine’).

For details on available distance metrics and their parameters, see: https://scikit-learn.org/stable/modules/metrics.html#metrics

Parameters:
  • input_variable (str) – The name of the data variable to extract from the input dataset

  • output_variable (str) – The name of the variable to be inserted into the dataset

  • sample_dim (str) – The dimension containing different samples

  • params (Optional[Dict[str, Any]], default=None) – Parameters for the distance function, including ‘metric’ to specify the distance type

  • constrain_same (Optional[List], default=None) – List of pairs that should have perfect similarity

  • constrain_different (Optional[List], default=None) – List of pairs that should have zero similarity

  • name (str, default="DistanceMetric") – The name to use when added to a Pipeline
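The underlying scikit-learn call can be tried on its own. The sketch below calls `sklearn.metrics.pairwise_distances` directly; unpacking a `params` dictionary into keyword arguments is an assumption about how this class forwards its parameters, not something the description above confirms.

```python
import numpy as np
from sklearn.metrics import pairwise_distances

# Three samples (rows) with two features each
X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# Hypothetical params dictionary, unpacked as keyword arguments
params = {"metric": "manhattan"}

D_euclidean = pairwise_distances(X)            # scikit-learn's default metric is 'euclidean'
D_manhattan = pairwise_distances(X, **params)  # sum of absolute coordinate differences
```

Both results are square matrices with one row and one column per sample and zeros on the diagonal.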

calculate(dataset: Dataset) → Self

Apply this PipelineOp to the supplied xarray.Dataset

class AFL.double_agent.PairMetric.Dummy(input_variable: str, output_variable: str, sample_dim: str, params: Dict[str, Any] | None = None, constrain_same: List | None = None, constrain_different: List | None = None, name: str = 'DummyMetric')

Bases: PairMetric

PairMetric that returns only self-similarity (identity matrix)

This simple metric creates an identity matrix where diagonal elements (self-similarity) are 1.0 and all off-diagonal elements are 0.0. This can be useful as a baseline or for testing purposes.

Parameters:
  • input_variable (str) – The name of the data variable to extract from the input dataset

  • output_variable (str) – The name of the variable to be inserted into the dataset

  • sample_dim (str) – The dimension containing different samples

  • params (Optional[Dict[str, Any]], default=None) – Additional parameters for metric calculation (not used in this class)

  • constrain_same (Optional[List], default=None) – List of pairs that should have perfect similarity

  • constrain_different (Optional[List], default=None) – List of pairs that should have zero similarity

  • name (str, default="DummyMetric") – The name to use when added to a Pipeline

calculate(dataset: Dataset) → Self

Apply this PipelineOp to the supplied xarray.Dataset

class AFL.double_agent.PairMetric.PairMetric(input_variable: str | List[str], output_variable: str, sample_dim: str = 'sample', params: Dict[str, Any] | None = None, constrain_same: List | None = None, constrain_different: List | None = None, name: str = 'PairMetric')

Bases: PipelineOp

Base class for all PairMetrics that produce similarity or distance matrices

This abstract base class provides common functionality for computing and manipulating pairwise metrics between samples. It handles similarity constraints and normalization, and provides a framework for different metric implementations.

Parameters:
  • input_variable (str | List[str]) – The name, or list of names, of the data variable(s) to extract from the input dataset

  • output_variable (str) – The name of the variable to be inserted into the dataset

  • sample_dim (str, default="sample") – The dimension containing different samples

  • params (Optional[Dict[str, Any]], default=None) – Additional parameters for metric calculation

  • constrain_same (Optional[List], default=None) – List of pairs that should have perfect similarity

  • constrain_different (Optional[List], default=None) – List of pairs that should have zero similarity

  • name (str, default="PairMetric") – The name to use when added to a Pipeline

apply_constraints()

Constrain pairs in the similarity matrix to be perfectly similar (S[i,j]=1.0) or perfectly dissimilar (S[i,j]=0.0).
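The effect of the constraints can be sketched on a bare NumPy array. This is a hypothetical stand-alone helper, not the method itself: it assumes each pair is a tuple of integer sample indices and that the update is applied symmetrically, neither of which is stated above.

```python
import numpy as np


def apply_pair_constraints(S, constrain_same=None, constrain_different=None):
    """Return a copy of S with constrained pairs forced to 1.0 or 0.0."""
    S = np.array(S, dtype=float)  # np.array copies, so the input is left untouched
    for i, j in (constrain_same or []):
        S[i, j] = S[j, i] = 1.0   # perfectly similar
    for i, j in (constrain_different or []):
        S[i, j] = S[j, i] = 0.0   # perfectly dissimilar
    return S


# Start from a matrix where every distinct pair has similarity 0.5
S = np.full((3, 3), 0.5)
np.fill_diagonal(S, 1.0)

S_constrained = apply_pair_constraints(
    S,
    constrain_same=[(0, 1)],
    constrain_different=[(1, 2)],
)
```

Pairs that appear in neither list, such as (0, 2) here, keep their computed similarity.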

calculate(dataset: Dataset) → Self

Apply this PipelineOp to the supplied xarray.Dataset

normalize1() → ndarray

Normalize similarity matrix such that the diagonal values are all equal to 1

normalize2()

Normalize similarity matrix such that all values are between 0 and 1
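Two common ways to achieve the normalizations described for normalize1 and normalize2 are sketched below. The formulas are assumptions chosen for illustration (scaling by the square roots of the diagonal, and min-max rescaling); the methods themselves may use different formulas. The function names are hypothetical.

```python
import numpy as np


def normalize_unit_diagonal(S):
    """Scale S so every diagonal entry is 1: S[i, j] / sqrt(S[i, i] * S[j, j])."""
    S = np.asarray(S, dtype=float)
    d = np.sqrt(np.diag(S))  # assumes strictly positive diagonal entries
    return S / np.outer(d, d)


def normalize_min_max(S):
    """Rescale S linearly so its smallest value is 0 and its largest is 1."""
    S = np.asarray(S, dtype=float)
    lo, hi = S.min(), S.max()
    if hi == lo:
        # A constant matrix has no spread to rescale; return zeros by convention
        return np.zeros_like(S)
    return (S - lo) / (hi - lo)


S = np.array([[4.0, 2.0], [2.0, 9.0]])
S_unit_diag = normalize_unit_diagonal(S)  # off-diagonal: 2 / (2 * 3)
S_min_max = normalize_min_max(S)          # (S - 2) / 7
```

The first form preserves the relative structure of a kernel matrix, while the second only guarantees the value range.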

class AFL.double_agent.PairMetric.Similarity(input_variable: str, output_variable: str, sample_dim: str, params: Dict[str, Any] | None = None, constrain_same: List | None = None, constrain_different: List | None = None, name='SimilarityMetric')

Bases: PairMetric

Computes pairwise similarity between samples using kernel functions

This class uses scikit-learn’s pairwise_kernels to compute similarity matrices between samples. Various kernel functions can be specified through the params dictionary (e.g., ‘linear’, ‘rbf’, ‘polynomial’).

For details on available kernel functions and their parameters, see: https://scikit-learn.org/stable/modules/metrics.html#metrics

Parameters:
  • input_variable (str) – The name of the data variable to extract from the input dataset

  • output_variable (str) – The name of the variable to be inserted into the dataset

  • sample_dim (str) – The dimension containing different samples

  • params (Optional[Dict[str, Any]], default=None) – Parameters for the kernel function, including ‘metric’ to specify the kernel type

  • constrain_same (Optional[List], default=None) – List of pairs that should have perfect similarity

  • constrain_different (Optional[List], default=None) – List of pairs that should have zero similarity

  • name (str, default="SimilarityMetric") – The name to use when added to a Pipeline
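The underlying scikit-learn call can be tried on its own. The sketch below calls `sklearn.metrics.pairwise.pairwise_kernels` directly; unpacking a `params` dictionary into keyword arguments is an assumption about how this class forwards its parameters, and the `gamma` value is arbitrary.

```python
import numpy as np
from sklearn.metrics.pairwise import pairwise_kernels

# Three samples (rows) with two features each
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])

# Hypothetical params dictionary: kernel type plus a kernel-specific argument
params = {"metric": "rbf", "gamma": 0.5}

S_rbf = pairwise_kernels(X, **params)            # exp(-gamma * ||x - y||^2)
S_linear = pairwise_kernels(X, metric="linear")  # plain dot products, X @ X.T
```

The RBF kernel always yields 1.0 on the diagonal, whereas the linear kernel does not, which is one reason a normalization step can be useful afterwards.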

calculate(dataset: Dataset) → Self

Apply this PipelineOp to the supplied xarray.Dataset