AFL.double_agent.PairMetric module
PipelineOps for Pairwise Metrics
This module contains operations that compute pairwise relationships between samples. PairMetrics generate matrices that capture similarity, distance, or other relationships between pairs of data points.
These metrics are useful for:
- Measuring similarity or distance between samples
- Constructing adjacency matrices for graph-based algorithms
- Identifying clusters or patterns in data
- Quantifying relationships between different observations
Each PairMetric is implemented as a PipelineOp that can be composed with others in a processing pipeline.
- class AFL.double_agent.PairMetric.CombineMetric(input_variables: List[str], output_variable: str, sample_dim: str, combine_by: str, combine_by_powers: List[Number] | None = None, combine_by_coeffs: List[Number] | None = None, params: str | None = None, constrain_same: List | None = None, constrain_different: List | None = None, name='CombineMetric')
Bases: PairMetric
Combines multiple similarity/distance matrices into a single matrix
This class allows for the combination of multiple similarity or distance matrices using either product or sum operations. Each matrix can be weighted differently using powers (for product) or coefficients (for sum).
- Parameters:
input_variables (List[str]) – List of variable names containing similarity/distance matrices to combine
output_variable (str) – The name of the variable to be inserted into the dataset
sample_dim (str) – The dimension containing different samples
combine_by (str) – Method to combine matrices, either “prod” (product) or “sum”
combine_by_powers (Optional[List[Number]], default=None) – List of powers to apply to each matrix when using “prod” combination
combine_by_coeffs (Optional[List[Number]], default=None) – List of coefficients to multiply each matrix by when using “sum” combination
params (Optional[str], default=None) – Additional parameters
constrain_same (Optional[List], default=None) – List of pairs that should have perfect similarity
constrain_different (Optional[List], default=None) – List of pairs that should have zero similarity
name (str, default="CombineMetric") – The name to use when added to a Pipeline
- calculate(dataset: Dataset) → Self
Apply this PipelineOp to the supplied xarray.Dataset
- prod(data_list: List) → ndarray
- sum(data_list: List) → ndarray
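The product and sum combinations described for CombineMetric can be sketched with NumPy alone. This is a minimal illustration and not the AFL implementation: the helper names `combine_prod` and `combine_sum` are hypothetical, and the sketch assumes that powers are applied element-wise before multiplying and that coefficients scale each matrix before adding.

```python
import numpy as np

def combine_prod(matrices, powers=None):
    """Element-wise product of matrices, each raised to its own power (assumed semantics)."""
    if powers is None:
        powers = [1.0] * len(matrices)
    result = np.ones_like(np.asarray(matrices[0], dtype=float))
    for matrix, power in zip(matrices, powers):
        result = result * np.asarray(matrix, dtype=float) ** power
    return result

def combine_sum(matrices, coeffs=None):
    """Weighted element-wise sum of matrices (assumed semantics)."""
    if coeffs is None:
        coeffs = [1.0] * len(matrices)
    result = np.zeros_like(np.asarray(matrices[0], dtype=float))
    for matrix, coeff in zip(matrices, coeffs):
        result = result + coeff * np.asarray(matrix, dtype=float)
    return result

# Two small, made-up similarity matrices for three-line illustration
S1 = np.array([[1.0, 0.5], [0.5, 1.0]])
S2 = np.array([[1.0, 0.25], [0.25, 1.0]])

S_prod = combine_prod([S1, S2], powers=[1, 2])    # S1 * S2**2
S_sum = combine_sum([S1, S2], coeffs=[0.5, 0.5])  # 0.5*S1 + 0.5*S2
```

With a product, a pair must be similar in every input matrix to stay similar in the result; with a sum, a high value in one matrix can compensate for a low value in another.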
- class AFL.double_agent.PairMetric.Delaunay(input_variable: str, output_variable: str, sample_dim: str, params: Dict[str, Any] | None = None, constrain_same: List | None = None, constrain_different: List | None = None, name='DelaunayMetric')
Bases: PairMetric
Creates a similarity matrix based on Delaunay triangulation
This class constructs a binary adjacency matrix where samples that share an edge in the Delaunay triangulation have a similarity of 1.0, and all other pairs have a similarity of 0.0. This is useful for identifying natural neighbors in the data.
- Parameters:
input_variable (str) – The name of the data variable to extract from the input dataset
output_variable (str) – The name of the variable to be inserted into the dataset
sample_dim (str) – The dimension containing different samples
params (Optional[Dict[str, Any]], default=None) – Additional parameters (not used in this class)
constrain_same (Optional[List], default=None) – List of pairs that should have perfect similarity
constrain_different (Optional[List], default=None) – List of pairs that should have zero similarity
name (str, default="DelaunayMetric") – The name to use when added to a Pipeline
- calculate(dataset: Dataset) → Self
Apply this PipelineOp to the supplied xarray.Dataset
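The binary adjacency matrix described for Delaunay can be illustrated with `scipy.spatial.Delaunay`. This is a sketch under stated assumptions, not the AFL code: the helper `delaunay_adjacency` is hypothetical, and leaving the diagonal at zero is a choice made here for illustration only.

```python
from itertools import combinations

import numpy as np
from scipy.spatial import Delaunay

def delaunay_adjacency(points):
    """Return a 0/1 matrix marking pairs of points that share a Delaunay edge."""
    points = np.asarray(points, dtype=float)
    n_points = len(points)
    triangulation = Delaunay(points)
    adjacency = np.zeros((n_points, n_points))
    # Every pair of vertices within a simplex is joined by an edge
    for simplex in triangulation.simplices:
        for i, j in combinations(simplex, 2):
            adjacency[i, j] = 1.0
            adjacency[j, i] = 1.0
    return adjacency

# Four corners of a unit square plus its center (made-up coordinates)
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
A = delaunay_adjacency(points)
```

In this layout the center point is a neighbor of all four corners, while diagonally opposite corners are not neighbors of each other because the center lies between them.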
- class AFL.double_agent.PairMetric.Distance(input_variable: str, output_variable: str, sample_dim: str, params: Dict[str, Any] | None = None, constrain_same: List | None = None, constrain_different: List | None = None, name='DistanceMetric')
Bases: PairMetric
Computes pairwise distances between samples
This class uses scikit-learn’s pairwise_distances to compute distance matrices between samples. Various distance metrics can be specified through the params dictionary (e.g., ‘euclidean’, ‘manhattan’, ‘cosine’).
For details on available distance metrics and their parameters, see: https://scikit-learn.org/stable/modules/metrics.html#metrics
- Parameters:
input_variable (str) – The name of the data variable to extract from the input dataset
output_variable (str) – The name of the variable to be inserted into the dataset
sample_dim (str) – The dimension containing different samples
params (Optional[Dict[str, Any]], default=None) – Parameters for the distance function, including ‘metric’ to specify the distance type
constrain_same (Optional[List], default=None) – List of pairs that should have perfect similarity
constrain_different (Optional[List], default=None) – List of pairs that should have zero similarity
name (str, default="DistanceMetric") – The name to use when added to a Pipeline
- calculate(dataset: Dataset) → Self
Apply this PipelineOp to the supplied xarray.Dataset
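The underlying scikit-learn call can be tried on its own to see what a distance matrix looks like. The sketch below calls `sklearn.metrics.pairwise_distances` directly; unpacking a `params` dictionary into that call is an assumption about how Distance forwards its parameters, and the sample data are made up.

```python
import numpy as np
from sklearn.metrics import pairwise_distances

# Three samples (rows) with two features each; values are illustrative
X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# Assumed usage: the params dictionary carries the 'metric' keyword
params = {"metric": "euclidean"}
D = pairwise_distances(X, **params)

# The same samples under a different metric
D_manhattan = pairwise_distances(X, metric="manhattan")
```

The result is a square, symmetric matrix with one row and column per sample and zeros on the diagonal.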
- class AFL.double_agent.PairMetric.Dummy(input_variable: str, output_variable: str, sample_dim: str, params: Dict[str, Any] | None = None, constrain_same: List | None = None, constrain_different: List | None = None, name: str = 'DummyMetric')
Bases: PairMetric
PairMetric that returns only self-similarity (identity matrix)
This simple metric creates an identity matrix where diagonal elements (self-similarity) are 1.0 and all off-diagonal elements are 0.0. This can be useful as a baseline or for testing purposes.
- Parameters:
input_variable (str) – The name of the data variable to extract from the input dataset
output_variable (str) – The name of the variable to be inserted into the dataset
sample_dim (str) – The dimension containing different samples
params (Optional[Dict[str, Any]], default=None) – Additional parameters for metric calculation (not used in this class)
constrain_same (Optional[List], default=None) – List of pairs that should have perfect similarity
constrain_different (Optional[List], default=None) – List of pairs that should have zero similarity
name (str, default="DummyMetric") – The name to use when added to a Pipeline
- calculate(dataset: Dataset) → Self
Apply this PipelineOp to the supplied xarray.Dataset
- class AFL.double_agent.PairMetric.PairMetric(input_variable: str | List[str], output_variable: str, sample_dim: str = 'sample', params: Dict[str, Any] | None = None, constrain_same: List | None = None, constrain_different: List | None = None, name: str = 'PairMetric')
Bases: PipelineOp
Base class for all PairMetrics that produce similarity or distance matrices
This abstract base class provides common functionality for computing and manipulating pairwise metrics between samples. It handles similarity constraints, normalization, and provides a framework for different metric implementations.
- Parameters:
input_variable (str | List[str]) – The name, or list of names, of the data variable(s) to extract from the input dataset
output_variable (str) – The name of the variable to be inserted into the dataset
sample_dim (str, default="sample") – The dimension containing different samples
params (Optional[Dict[str, Any]], default=None) – Additional parameters for metric calculation
constrain_same (Optional[List], default=None) – List of pairs that should have perfect similarity
constrain_different (Optional[List], default=None) – List of pairs that should have zero similarity
name (str, default="PairMetric") – The name to use when added to a Pipeline
- apply_constraints()
Constrain pairs in the similarity matrix to be perfectly similar (S[i,j]=1.0) or perfectly dissimilar (S[i,j]=0.0).
- calculate(dataset: Dataset) → Self
Apply this PipelineOp to the supplied xarray.Dataset
- normalize1() → ndarray
Normalize similarity matrix such that the diagonal values are all equal to 1
- normalize2()
Normalize similarity matrix such that all values are between 0 and 1
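The constraint and normalization steps listed for PairMetric can be illustrated on a plain NumPy array. Everything in this sketch is an assumption rather than the AFL implementation: the helper names are hypothetical, constraint pairs are taken to be `(i, j)` index tuples applied symmetrically, the unit-diagonal normalization divides by `sqrt(S[i,i] * S[j,j])`, and the 0-to-1 normalization is a min-max rescaling. The library may use different formulas that satisfy the same stated goals.

```python
import numpy as np

def constrain_pairs(S, constrain_same=(), constrain_different=()):
    """Force listed pairs to similarity 1.0 or 0.0 (assumed: index pairs, symmetric)."""
    S = np.array(S, dtype=float)  # copy so the input is not modified
    for i, j in constrain_same:
        S[i, j] = S[j, i] = 1.0
    for i, j in constrain_different:
        S[i, j] = S[j, i] = 0.0
    return S

def unit_diagonal(S):
    """One possible way to make every diagonal entry equal to 1."""
    S = np.asarray(S, dtype=float)
    scale = np.sqrt(np.diag(S))
    return S / np.outer(scale, scale)

def min_max(S):
    """One possible way to map all entries into the range [0, 1]."""
    S = np.asarray(S, dtype=float)
    return (S - S.min()) / (S.max() - S.min())

# A made-up, unnormalized similarity matrix
S = np.array([[4.0, 2.0, 1.0],
              [2.0, 9.0, 3.0],
              [1.0, 3.0, 16.0]])

S_unit = unit_diagonal(S)
S_scaled = min_max(S)
S_constrained = constrain_pairs(S_unit, constrain_same=[(0, 1)], constrain_different=[(1, 2)])
```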
- class AFL.double_agent.PairMetric.Similarity(input_variable: str, output_variable: str, sample_dim: str, params: Dict[str, Any] | None = None, constrain_same: List | None = None, constrain_different: List | None = None, name='SimilarityMetric')
Bases: PairMetric
Computes pairwise similarity between samples using kernel functions
This class uses scikit-learn’s pairwise_kernels to compute similarity matrices between samples. Various kernel functions can be specified through the params dictionary (e.g., ‘linear’, ‘rbf’, ‘polynomial’).
For details on available kernel functions and their parameters, see: https://scikit-learn.org/stable/modules/metrics.html#metrics
- Parameters:
input_variable (str) – The name of the data variable to extract from the input dataset
output_variable (str) – The name of the variable to be inserted into the dataset
sample_dim (str) – The dimension containing different samples
params (Optional[Dict[str, Any]], default=None) – Parameters for the kernel function, including ‘metric’ to specify the kernel type
constrain_same (Optional[List], default=None) – List of pairs that should have perfect similarity
constrain_different (Optional[List], default=None) – List of pairs that should have zero similarity
name (str, default="SimilarityMetric") – The name to use when added to a Pipeline
- calculate(dataset: Dataset) → Self
Apply this PipelineOp to the supplied xarray.Dataset
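The kernel computation itself can be explored by calling `sklearn.metrics.pairwise.pairwise_kernels` directly. In this sketch the `params` dictionary is unpacked into the call, which is an assumption about how Similarity forwards its parameters; the sample data and the `gamma` value are made up for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import pairwise_kernels

# Three samples (rows) with two features each; values are illustrative
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])

# Assumed usage: 'metric' selects the kernel, remaining keys go to the kernel function
params = {"metric": "rbf", "gamma": 0.5}
S = pairwise_kernels(X, **params)

# For the RBF kernel, S[i, j] = exp(-gamma * ||x_i - x_j||^2)
expected_01 = np.exp(-0.5 * 1.0)
```

With the RBF kernel each sample has similarity 1 with itself, and similarity decays toward 0 as the squared distance between two samples grows.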