pymcr package¶

Subpackages¶

pymcr.tests package

Submodules¶

pymcr.condition module¶

Functions to condition / preprocess data

pymcr.condition.standardize(X, mean_ctr=True, with_std=True, axis=-1, copy=True)[source]¶

Standardization of data

Parameters

X (ndarray) – Data array
mean_ctr (bool) – Mean-center data
with_std (bool) – Normalize by the standard deviation of the data
axis (int) – Axis from which to calculate mean and standard deviation
copy (bool) – Copy data (X) if True, overwrite if False
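As a rough illustration of what these parameters describe, the following NumPy sketch mean-centers and scales along an axis. The function name standardize_sketch is hypothetical, and the use of the population standard deviation is an assumption; this is not pymcr's own code.

```python
import numpy as np

def standardize_sketch(X, mean_ctr=True, with_std=True, axis=-1, copy=True):
    """Hypothetical re-implementation for illustration only."""
    # With copy=False the input is modified in place, so it must already
    # be a floating-point ndarray.
    out = np.array(X, dtype=float, copy=True) if copy else X
    if mean_ctr:
        out -= out.mean(axis=axis, keepdims=True)
    if with_std:
        # Assumption: population standard deviation (ddof=0).
        # A constant slice would give a zero divisor here.
        out /= out.std(axis=axis, keepdims=True)
    return out

X = np.array([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]])
Z = standardize_sketch(X)  # each row now has mean 0 and unit std
```

Because copy=True is used here, X itself is left unchanged.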

pymcr.constraints module¶

Built-in constraints

All classes need a transform method. Note that, unlike sklearn, transform can copy or overwrite its input depending on the copy attribute.

class pymcr.constraints.Constraint(copy=True)[source]¶

Bases: abc.ABC

Abstract class for constraints

Parameters: copy (bool) – Make copy of input data, A; otherwise, overwrite (if mutable)

_abc_impl = <_abc_data object>¶

abstract transform(A)[source]¶: Transform A input based on constraint

class pymcr.constraints.ConstraintNonneg(copy=False)[source]¶

Bases: pymcr.constraints.Constraint

Non-negativity constraint. All negative entries made 0.

Parameters: copy (bool) – Make copy of input data, A; otherwise, overwrite (if mutable)

_abc_impl = <_abc_data object>¶

transform(A)[source]¶: Apply nonnegative constraint
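The copy-versus-overwrite behaviour described above can be sketched with a small stand-in class. NonnegSketch is a hypothetical name used only for illustration; it mirrors the documented idea (a transform method plus a copy flag) without claiming to match the library's implementation.

```python
import numpy as np

class NonnegSketch:
    """Hypothetical stand-in for a constraint with a transform method."""

    def __init__(self, copy=False):
        self.copy = copy

    def transform(self, A):
        if self.copy:
            return np.maximum(A, 0)  # new array, A is left untouched
        np.maximum(A, 0, out=A)      # overwrite A in place
        return A

A = np.array([[-1.0, 2.0], [3.0, -4.0]])
B = NonnegSketch(copy=True).transform(A)   # A still holds its negatives here
C = NonnegSketch(copy=False).transform(A)  # A itself is now clipped
```

With copy=False the returned array is the same object as the input, which matters when the caller keeps a reference to the original data.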

class pymcr.constraints.ConstraintCumsumNonneg(axis=-1, copy=False)[source]¶

Bases: pymcr.constraints.Constraint

Cumulative-Summation non-negativity constraint. All negative entries made 0.

Parameters: copy (bool) – Make copy of input data, A; otherwise, overwrite (if mutable)

_abc_impl = <_abc_data object>¶

transform(A)[source]¶: Apply cumsum nonnegative constraint

class pymcr.constraints.ConstraintZeroEndPoints(axis=-1, span=1, copy=False)[source]¶

Bases: pymcr.constraints.Constraint

Enforce that the endpoints (or the mean over a range) are zero

Parameters

copy (bool) – Make copy of input data, A; otherwise, overwrite (if mutable)
axis (int) – Axis to operate on
span (int) – Number of pixels along the ends to average.

_abc_impl = <_abc_data object>¶

transform(A)[source]¶: Apply zero-endpoints constraint

class pymcr.constraints.ConstraintZeroCumSumEndPoints(nodes=None, axis=-1, copy=False)[source]¶

Bases: pymcr.constraints.Constraint

Enforce that the endpoints of the cumsum (or the mean over a range) are near-zero. Note: this is an approximation.

Parameters

copy (bool) – Make copy of input data, A; otherwise, overwrite (if mutable)
nodes (list of int) – In addition to end-points, other points to ensure are approximately 0
axis (int) – Axis to operate on
span (int) – Number of pixels along the ends to average.

_abc_impl = <_abc_data object>¶

transform(A)[source]¶: Apply zero-cumsum-endpoints constraint

class pymcr.constraints.ConstraintNorm(axis=-1, fix=None, copy=False)[source]¶

Bases: pymcr.constraints.Constraint

Normalization constraint.

Parameters

axis (int) – Which axis of input matrix A to apply normalization across.
fix (list) – Keep fix-axes as-is and normalize the remaining axes based on the residual of the fixed axes.
set_zeros_to_feature (int) – Set all samples which sum-to-zero across axis to 1 for a particular feature (See Notes)
copy (bool) – Make copy of input data, A; otherwise, overwrite (if mutable)

Notes

For set_zeros_to_feature, assuming the data represents concentration with a matrix [n_samples, n_features] and the axis is across the features, every sample that sums to 0 across axis is replaced with a vector [n_features] of zeros except at set_zeros_to_feature, which equals 1. I.e., this pixel is now pure substance of index value set_zeros_to_feature.

_abc_impl = <_abc_data object>¶

transform(A)[source]¶: Apply normalization constraint
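The Notes above can be made concrete with a small NumPy sketch. It assumes that normalization means "each sample sums to 1" across the feature axis, and it handles only a 2-D [n_samples, n_features] array; normalize_rows is a hypothetical helper, not a library function.

```python
import numpy as np

def normalize_rows(C, set_zeros_to_feature=None):
    """Hypothetical sum-to-one normalization across the last axis."""
    C = np.array(C, dtype=float)            # this sketch always copies
    totals = C.sum(axis=-1, keepdims=True)
    zero_rows = totals[:, 0] == 0
    totals[zero_rows] = 1.0                 # avoid dividing by zero
    C = C / totals
    if set_zeros_to_feature is not None:
        # Samples that summed to zero become "pure" in the chosen feature
        C[zero_rows] = 0.0
        C[zero_rows, set_zeros_to_feature] = 1.0
    return C

C = np.array([[1.0, 3.0], [0.0, 0.0], [2.0, 2.0]])
out = normalize_rows(C, set_zeros_to_feature=1)
# rows: [0.25, 0.75], [0.0, 1.0], [0.5, 0.5]
```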

class pymcr.constraints.ConstraintCutBelow(value=0, axis_sumnz=None, exclude=None, exclude_axis=-1, copy=False)[source]¶

Bases: pymcr.constraints._CutExclude

Cut values below (and not-equal to) a certain threshold.

Parameters

value (float) – Cutoff value
axis_sumnz (int) – If not None, the cut is applied only where the sum across the specified axis would not go to 0 (i.e., where not every value would be cut).
exclude (int, list, tuple, ndarray) – Exclude targets
exclude_axis (int) – Along which axis to enumerate targets
copy (bool) – Make copy of input data, A; otherwise, overwrite (if mutable)

_abc_impl = <_abc_data object>¶

transform(A)[source]¶: Apply cut-below value constraint

class pymcr.constraints.ConstraintCutAbove(value=0, axis_sumnz=None, exclude=None, exclude_axis=-1, copy=False)[source]¶

Bases: pymcr.constraints._CutExclude

Cut values above (and not-equal to) a certain threshold

Parameters

value (float) – Cutoff value
axis_sumnz (int) – If not None, the cut is applied only where the sum across the specified axis would not go to 0 (i.e., where not every value would be cut).
exclude (int, list, tuple, ndarray) – Exclude targets
exclude_axis (int) – Along which axis to enumerate targets
copy (bool) – Make copy of input data, A; otherwise, overwrite (if mutable)

_abc_impl = <_abc_data object>¶

transform(A)[source]¶: Apply cut-above value constraint

class pymcr.constraints.ConstraintCompressBelow(value=0, copy=False)[source]¶

Bases: pymcr.constraints.Constraint

Compress values below (and not-equal to) a certain threshold (set to value)

Parameters

value (float) – Cutoff value
copy (bool) – Make copy of input data, A; otherwise, overwrite (if mutable)

_abc_impl = <_abc_data object>¶

transform(A)[source]¶: Apply compress-below value constraint
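The difference between "cut" and "compress" can be illustrated in a few lines of NumPy. The sketch assumes that cut values become 0 (as the axis_sumnz description suggests) and that compressed values are set to the threshold itself; it does not use the library classes.

```python
import numpy as np

A = np.array([0.05, 0.2, 0.5, 0.1])
value = 0.1

# "Cut below" sketch: entries strictly below the threshold are zeroed
cut = A * (A >= value)                      # [0.0, 0.2, 0.5, 0.1]

# "Compress below" sketch: entries strictly below the threshold are raised to it
compressed = np.where(A < value, value, A)  # [0.1, 0.2, 0.5, 0.1]
```

In both cases an entry exactly equal to the threshold is left alone, matching the "not-equal to" wording.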

class pymcr.constraints.ConstraintCompressAbove(value=0, copy=False)[source]¶

Bases: pymcr.constraints.Constraint

Compress values above (and not-equal to) a certain threshold (set to value)

Parameters

value (float) – Cutoff value
copy (bool) – Make copy of input data, A; otherwise, overwrite (if mutable)

_abc_impl = <_abc_data object>¶

transform(A)[source]¶: Apply compress-above value constraint

class pymcr.constraints.ConstraintReplaceZeros(axis=-1, feature=None, fval=1, copy=False)[source]¶

Bases: pymcr.constraints.Constraint

Samples that sum-to-zero across axis are replaced with a vector of 0’s except for a 1 at feature if a single value. In a concentration context, e.g., samples with no concentration are replaced with 100% concentration of a set feature. If multiple features given, equal amounts of each feature (summing to 1) are used.

Parameters

axis (int) – Which axis of input matrix A to apply normalization across.
feature (int, list, tuple) – Set all samples which sum-to-zero across axis to fval for a particular feature, or to equal fractions of fval for multiple features.
fval (float) – Value of summation across axis of replacement vector.
copy (bool) – Make copy of input data, A; otherwise, overwrite (if mutable)

_abc_impl = <_abc_data object>¶

transform(A)[source]¶: Apply constraint
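A minimal NumPy sketch of the replacement rule described for this class, restricted to a 2-D [n_samples, n_features] array with the sum taken across the last axis. The helper name replace_zero_rows is hypothetical and the code is only an illustration of the documented behaviour.

```python
import numpy as np

def replace_zero_rows(C, feature, fval=1.0):
    """Hypothetical helper: fill zero-sum samples with a fixed composition."""
    C = np.array(C, dtype=float)             # this sketch always copies
    features = np.atleast_1d(feature)
    replacement = np.zeros(C.shape[-1])
    replacement[features] = fval / features.size  # equal share per feature
    zero_rows = C.sum(axis=-1) == 0
    C[zero_rows] = replacement
    return C

C = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 1.0]])
single = replace_zero_rows(C, feature=1)      # first row -> [0.0, 1.0, 0.0]
multi = replace_zero_rows(C, feature=[0, 2])  # first row -> [0.5, 0.0, 0.5]
```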

class pymcr.constraints.ConstraintPlanarize(target, shape, use_vals_above=None, use_vals_below=None, lims_to_plane=True, scaler=None, recalc_scaler=False, copy=False)[source]¶

Bases: pymcr.constraints.Constraint

Set a particular target to a plane

Parameters

target (int, list, tuple) – Target numbers to set to a fitted plane
shape (tuple, list) – Shape of array (M,N) which is (Y,X)
use_vals_above (float) – Only calculate based on values above (not including)
use_vals_below (float) – Only calculate based on values below (not including)
lims_to_plane (bool) – The returned plane will be limited to the range of the optionally supplied use_vals_below and use_vals_above.
scaler (float) – A large value that is much bigger than any values in the input array. Needed to ensure SVD properly creates plane. If None, auto-calculates.
recalc_scaler (bool) – Auto-calculate for every new input (does not use previously provided or calculated value)
copy (bool) – Make copy of input data, A; otherwise, overwrite (if mutable)

Notes

This uses an SVD to calculate the vector normal to the plane that fits the input data. It assumes that the 3rd singular vector is the normal; thus, the x and y vectors for the data need to be larger than the variance of the input data. Scaler enables this by scaling the auto-generated x and y vectors to be much larger than the max-min of the input data.

_abc_impl = <_abc_data object>¶

_setup_xy(scaler)[source]¶

transform(A)[source]¶: Set targets, t, to fit planes
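The SVD idea in the Notes can be sketched with plain NumPy. This example fits a plane to exactly planar synthetic data by taking the last right-singular vector of the centered (x, y, z) points as the normal. The grid, the coefficients and the centering step are assumptions made for illustration; the scaler logic of the class is not reproduced.

```python
import numpy as np

M, N = 4, 5                                   # shape (Y, X)
yy, xx = np.mgrid[0:M, 0:N].astype(float)     # pixel coordinates
z = 0.5 * xx - 0.25 * yy + 2.0                # synthetic, exactly planar values

# Stack points as rows of (x, y, z) and remove the centroid
pts = np.column_stack([xx.ravel(), yy.ravel(), z.ravel()])
centroid = pts.mean(axis=0)
_, svals, Vt = np.linalg.svd(pts - centroid, full_matrices=False)

# The 3rd singular vector (least variance direction) is taken as the normal
normal = Vt[-1]

# Plane through the centroid: normal . (p - centroid) = 0, solved for z
z_fit = centroid[2] - (normal[0] * (xx - centroid[0])
                       + normal[1] * (yy - centroid[1])) / normal[2]
```

If the spread of the values were larger than the spread of the x and y coordinates, the least-variance direction would no longer be a reliable normal for noisy data, which is the situation the scaler parameter is described as guarding against.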

pymcr.mcr module¶

MCR Main Class for Computation

class pymcr.mcr.McrAR(c_regr=<pymcr.regressors.OLS object>, st_regr=<pymcr.regressors.OLS object>, c_fit_kwargs={}, st_fit_kwargs={}, c_constraints=[<pymcr.constraints.ConstraintNonneg object>], st_constraints=[<pymcr.constraints.ConstraintNonneg object>], max_iter=50, err_fcn=<function mse>, tol_increase=0.0, tol_n_increase=10, tol_err_change=None, tol_n_above_min=10)[source]¶

Bases: object

Multivariate Curve Resolution - Alternating Regression

D = CS^T

Parameters

c_regr (str, class) – Instantiated regression class (or string, see Notes) for calculating the C matrix
st_regr (str, class) – Instantiated regression class (or string, see Notes) for calculating the S^T matrix
c_fit_kwargs (dict) – kwargs sent to c_regr.fit method
st_fit_kwargs (dict) – kwargs sent to st_regr.fit method
c_constraints (list) – List of constraints applied to calculation of C matrix
st_constraints (list) – List of constraints applied to calculation of S^T matrix
max_iter (int) – Maximum number of iterations. One iteration calculates both C and S^T
err_fcn (function) – Function to calculate error/differences after each least squares calculation (i.e., twice per iteration). Outputs to err attribute.
tol_increase (float) – Factor increase to allow in err attribute. Set to 0 for no increase allowed. E.g., setting to 1.0 means the err can double per iteration.
tol_n_increase (int) – Number of consecutive iterations for which the err attribute can increase
tol_err_change (float) – If err changes less than tol_err_change, per iteration, break.
tol_n_above_min (int) – Number of half-iterations that can be performed without reaching a new error-minimum

err¶

List of calculated errors (from err_fcn) after each least squares calculation (i.e., twice per iteration)

Type: list

C_¶

Most recently calculated C matrix (that did not cause a tolerance failure)

Type: ndarray [n_samples, n_targets]

ST_¶

Most recently calculated S^T matrix (that did not cause a tolerance failure)

Type: ndarray [n_targets, n_features]

C_opt_¶

[Optimal] C matrix for lowest err attribute

Type: ndarray [n_samples, n_targets]

ST_opt_¶

[Optimal] ST matrix for lowest err attribute

Type: ndarray [n_targets, n_features]

n_iter¶

Total number of iterations performed

Type: int

n_features¶

Total number of features, e.g. spectral frequencies.

Type: int

n_samples¶

Total number of samples (e.g., pixels)

Type: int

n_targets¶

Total number of targets (e.g., pure analytes)

Type: int

n_iter_opt¶

Iteration when optimal C and ST calculated

Type: int

exit_max_iter_reached¶

Exited iterations due to maximum number of iteration reached (max_iter parameter)

Type: bool

exit_tol_increase¶

Exited iterations due to maximum fractional increase in error metric (via err_fcn)

Type: bool

exit_tol_n_increase¶

Exited iterations due to maximum number of consecutive increases in error metric (via err_fcn)

Type: bool

exit_tol_err_change¶

Exited iterations due to error metric change that is smaller than tol_err_change

Type: bool

exit_tol_n_above_min¶

Exited iterations due to maximum number of half-iterations for which the error metric increased above the minimum error

Type: bool

Notes

Built-in regressor classes (str can be used): OLS (ordinary least squares), NNLS (non-negatively constrained least squares). See pymcr.regressors.
Built-in regressor methods can be given as a string to c_regr, st_regr; though instantiating an imported class gives more flexibility.
Setting any tolerance to None turns that check off
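One plausible reading of the tol_increase description ("setting to 1.0 means the err can double") is a relative comparison against the previous error. The helper below is a hypothetical sketch of that reading, including the "None turns the check off" rule; it is not taken from the library source.

```python
def exceeds_tol_increase(err_prev, err_new, tol_increase):
    """Hypothetical check: did the error grow by more than the allowed factor?"""
    if tol_increase is None:   # a tolerance of None disables the check
        return False
    return err_new > err_prev * (1.0 + tol_increase)

no_increase_allowed = exceeds_tol_increase(1.0, 1.5, 0.0)   # True
may_double = exceeds_tol_increase(1.0, 1.5, 1.0)            # False
more_than_double = exceeds_tol_increase(1.0, 2.5, 1.0)      # True
check_disabled = exceeds_tol_increase(1.0, 2.5, None)       # False
```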

property D_¶: D matrix with current C and S^T matrices

property D_opt_¶: D matrix with optimal C and S^T matrices

_check_regr(mth)[source]¶: Check regressor method. If acceptable strings, instantiate and return object. If instantiated class, make sure it has a fit attribute.

_ismin_err(val)[source]¶: Is the current error the minimum

fit(D, C=None, ST=None, st_fix=None, c_fix=None, c_first=True, verbose=False, post_iter_fcn=None, post_half_fcn=None)[source]¶

Perform MCR-AR. D = CS^T. Solve for C and S^T iteratively.

Parameters

D (ndarray) – D matrix
C (ndarray) – Initial C matrix estimate. Only provide initial C OR S^T.
ST (ndarray) – Initial S^T matrix estimate. Only provide initial C OR S^T.
st_fix (list) – The spectral component numbers to keep fixed.
c_fix (list) – The concentration component numbers to keep fixed.
c_first (bool) – Calculate C first when both C and ST are provided. c_fix and st_fix must also be provided in this circumstance.
verbose (bool) – Log iteration and per-least squares err results. See Notes.
post_iter_fcn (function) – Function to perform after each iteration
post_half_fcn (function) – Function to perform after half-iteration
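To make the alternating scheme concrete, here is a NumPy-only sketch of the loop: solve for C with S^T fixed, apply a constraint, then solve for S^T with C fixed, and record the error after each half-iteration. It uses numpy.linalg.lstsq and simple clipping as stand-ins for the regressor and non-negativity constraint; the synthetic data and the fixed iteration count are assumptions, and no stopping criteria are implemented.

```python
import numpy as np

rng = np.random.default_rng(0)
C_true = rng.random((30, 2))               # [n_samples, n_targets]
ST_true = rng.random((2, 8))               # [n_targets, n_features]
D = C_true @ ST_true                       # D = C S^T

ST = ST_true + 0.1 * rng.random((2, 8))    # hypothetical initial S^T guess
err = []
for _ in range(20):
    # Half-iteration 1: solve ST.T @ C.T = D.T for C, then clip negatives
    C = np.linalg.lstsq(ST.T, D.T, rcond=None)[0].T
    C = np.maximum(C, 0)
    err.append(np.mean((D - C @ ST) ** 2))

    # Half-iteration 2: solve C @ ST = D for S^T, then clip negatives
    ST = np.linalg.lstsq(C, D, rcond=None)[0]
    ST = np.maximum(ST, 0)
    err.append(np.mean((D - C @ ST) ** 2))
```

The err list grows by two entries per iteration, mirroring the "twice per iteration" wording used for the err attribute.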

Notes

pyMCR (>= 0.3.1) uses the native Python logging module rather than print statements; thus, to see the messages, one will need to log-to-file or stream to stdout. More info is available in the docs.
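A minimal way to stream these messages to stdout with the standard logging module is sketched below. The logger name 'pymcr' is an assumption (it follows the package name); adjust it if the package logs under a different name.

```python
import logging
import sys

# Assumption: pyMCR emits its messages through a logger named 'pymcr'
logger = logging.getLogger('pymcr')
logger.setLevel(logging.DEBUG)

# Send log records to stdout with a bare message format
handler = logging.StreamHandler(stream=sys.stdout)
handler.setFormatter(logging.Formatter('%(message)s'))
logger.addHandler(handler)
```

Replacing the StreamHandler with a logging.FileHandler would write the same messages to a file instead.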

property n_features: Number of features

property n_samples: Number of samples

property n_targets: Number of targets

pymcr.metrics module¶

Metrics used in pyMCR

All functions must take C, ST, D_actual, D_calculated

pymcr.metrics.mse(C, ST, D_actual, D_calculated)[source]¶: Mean square error
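Any function with the four-argument signature above could serve as an error metric. The sketch below shows a mean-square-error function written with NumPy and a hypothetical alternative (max_abs_err) that keeps the same signature; both names are illustrative and neither is the library's code.

```python
import numpy as np

def mse_sketch(C, ST, D_actual, D_calculated):
    """Mean square error between measured and reconstructed data."""
    return np.mean((D_actual - D_calculated) ** 2)

def max_abs_err(C, ST, D_actual, D_calculated):
    """Hypothetical alternative metric with the same signature."""
    return np.max(np.abs(D_actual - D_calculated))

D_actual = np.array([[1.0, 2.0], [3.0, 4.0]])
D_calc = np.array([[1.0, 2.0], [3.0, 2.0]])

mse_value = mse_sketch(None, None, D_actual, D_calc)   # 1.0
max_value = max_abs_err(None, None, D_actual, D_calc)  # 2.0
```

C and ST are unused in these two metrics but remain in the signature so that a metric may use them if needed.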

pymcr.regressors module¶

Built-in least squares / regression methods.

All models will follow the formalism, AX = B, solve for X.

NOTE: coef_ will be X.T, which is the formalism that scikit-learn follows
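The AX = B convention and the transposed coef_ can be seen in a short sketch. LstsqSketch is a hypothetical class that uses numpy.linalg.lstsq (swapped in here for the SciPy routines named in this module) and stores coef_ as X.T.

```python
import numpy as np

class LstsqSketch:
    """Hypothetical regressor exposing fit(A, B) and coef_ = X.T."""

    def fit(self, A, B):
        X, *_ = np.linalg.lstsq(A, B, rcond=None)  # solves AX = B for X
        self.coef_ = X.T                           # scikit-learn style layout

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])       # shape (3, 2)
X_true = np.array([[2.0, 0.0, 1.0], [0.0, 3.0, 1.0]])    # shape (2, 3)
B = A @ X_true                                           # shape (3, 3)

reg = LstsqSketch()
reg.fit(A, B)
# reg.coef_ has shape (3, 2); its transpose recovers X
```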

class pymcr.regressors.LinearRegression[source]¶

Bases: abc.ABC

Abstract class for linear regression methods

_abc_impl = <_abc_data object>¶

property coef_¶: The transposed form of X. This is the formalism of scikit-learn

abstract fit(A, B)[source]¶: AX = B, solve for X

class pymcr.regressors.NNLS(*args, **kwargs)[source]¶

Bases: pymcr.regressors.LinearRegression

Non-negative constrained least squares regression

AX = B, solve for X (coefficients.T)

coef_¶

Regression coefficients

Type: ndarray

residual_¶

Residual (sum-of-squares)

Type: ndarray

Notes

This is simply a wrapped version of NNLS (scipy.optimize.nnls).

coef_ is X.T, which is the formalism of scikit-learn

_abc_impl = <_abc_data object>¶

fit(A, B)[source]¶: Solve for X: AX = B

class pymcr.regressors.OLS(*args, **kwargs)[source]¶

Bases: pymcr.regressors.LinearRegression

Ordinary least squares regression

AX = B, solve for X (coefficients.T)

coef_¶

Regression coefficients (X.T)

Type: ndarray

residual_¶

Residual (sum-of-squares)

Type: ndarray

rank_¶

Effective rank of matrix A

Type: int

svs_¶

Singular values of matrix A

Type: ndarray

Notes

This is simply a wrapped version of Ordinary Least Squares (scipy.linalg.lstsq).

coef_ is X.T, which is the formalism of scikit-learn

_abc_impl = <_abc_data object>¶

fit(A, B)[source]¶: Solve for X: AX = B

Module contents¶

pyMCR: Pythonic Multivariate Curve Resolution - Alternating Least Squares
