Routines to perform central moments reduction (reduction)
Functions:
| reduce_vals | Reduce values to central (co)moments. |
| reduce_data | Reduce central moments array along axis. |
| factor_by | Factor by into codes and groups. |
| reduce_data_grouped | Reduce data by group. |
| factor_by_to_index | Transform group_idx to quantities to be used with reduce_data_indexed(). |
| reduce_data_indexed | Reduce data by index. |
| | Resample using indexed reduction. |
- cmomy.reduction.reduce_vals(x, *y, mom, weight=None, axis=MISSING, order=None, parallel=None, dtype=None, out=None, dim=MISSING, mom_dims=None, keep_attrs=None)
Reduce values to central (co)moments.
- Parameters:
*y (array-like or DataArray) – Second value. Must be specified if len(mom) == 2.
mom (int or tuple of int) – Order of moments. If an integer or a length-one tuple, moments are for a single variable. If a length-two tuple, comoments of two variables are calculated.
weight (scalar or array-like or DataArray) – Weights for each point.
axis (int) – Axis to reduce along.
order ({"C", "F", "A", "K"}, optional) – Order argument to numpy.asarray().
out (ndarray) – Optional output array. If specified, output will be a reference to this array.
dim (hashable) – Dimension to reduce along.
mom_dims (hashable or tuple of hashable) – Name of moment dimensions. Defaults to ("mom_0",) for mom_ndim==1 and ("mom_0", "mom_1") for mom_ndim==2.
keep_attrs ({"drop", "identical", "no_conflicts", "drop_conflicts", "override"} or bool, optional) –
'drop' or False: empty attrs on returned xarray object.
'identical': all attrs must be the same on every object.
'no_conflicts': attrs from all objects are combined, any that have the same name must also have the same value.
'drop_conflicts': attrs from all objects are combined, any that have the same name but different values are dropped.
'override' or True: skip comparing and copy attrs from the first object to the result.
- Returns:
out (ndarray or DataArray) – Central moments array of same type as x. out.shape = (..., shape[axis-1], shape[axis+1], ..., mom0, ...), where shape = args[0].shape.
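To make the reduction concrete, here is a minimal NumPy-only sketch of what reducing values to central moments can look like for a single 1D variable. It does not call cmomy. The function name `central_moments_sketch` is hypothetical, and the output layout (total weight at moment index 0, mean at index 1, k-th central moment at index k >= 2) is an assumption based on the example outputs shown elsewhere on this page.

```python
import numpy as np


def central_moments_sketch(x, mom, weight=None):
    """Illustrative only: reduce 1D values to [weight, mean, central moments 2..mom]."""
    x = np.asarray(x, dtype=float)
    if weight is None:
        w = np.ones_like(x)
    else:
        w = np.broadcast_to(np.asarray(weight, dtype=float), x.shape)

    out = np.zeros(mom + 1)
    out[0] = w.sum()  # assumed layout: index 0 holds the total weight
    if mom >= 1:
        out[1] = (w * x).sum() / out[0]  # index 1 holds the weighted mean
    for k in range(2, mom + 1):
        # index k holds the weighted k-th central moment
        out[k] = (w * (x - out[1]) ** k).sum() / out[0]
    return out


x = np.array([1.0, 2.0, 3.0, 4.0])
moments = central_moments_sketch(x, mom=3)
print(moments)  # weight 4, mean 2.5, variance 1.25, third central moment 0
```

With unit weights the entry at index 2 is the population variance (`np.var(x)`), not the sample variance.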
- cmomy.reduction.reduce_data(data, *, mom_ndim, dim=MISSING, axis=MISSING, order=None, parallel=None, keep_attrs=None, out=None, dtype=None)
Reduce central moments array along axis.
- Parameters:
data (ndarray or DataArray) – Moments collection array. It is assumed moment dimensions are last.
mom_ndim ({1, 2}) – Indicates whether data holds moments (mom_ndim=1) or comoments (mom_ndim=2).
axis (int, optional) – Axis to reduce along. Note that negative values are relative to data.ndim - mom_ndim, because the last dimensions are assumed to be moment dimensions. For example, if data.shape == (1,2,3) with mom_ndim=1, axis = -1 would be equivalent to axis = 1. Defaults to axis=-1.
dim (hashable) – Dimension to reduce along.
order ({"C", "F", "A", "K"}, optional) – Order argument to numpy.asarray().
keep_attrs ({"drop", "identical", "no_conflicts", "drop_conflicts", "override"} or bool, optional) –
'drop' or False: empty attrs on returned xarray object.
'identical': all attrs must be the same on every object.
'no_conflicts': attrs from all objects are combined, any that have the same name must also have the same value.
'drop_conflicts': attrs from all objects are combined, any that have the same name but different values are dropped.
'override' or True: skip comparing and copy attrs from the first object to the result.
out (ndarray) – Optional output array. If specified, output will be a reference to this array.
- Returns:
out (ndarray or DataArray) – Reduced data array with shape data.shape with axis removed. Same type as input data.
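The following NumPy-only sketch illustrates the two ideas described above: combining moment entries along an axis, and resolving a negative axis relative to data.ndim - mom_ndim. It is not the library's implementation. The name `reduce_moments_sketch` is hypothetical, it handles only mom_ndim=1 with moments up to order 2, and it assumes the [weight, mean, variance] layout along the last dimension.

```python
import numpy as np


def reduce_moments_sketch(data, axis=-1):
    """Combine [weight, mean, variance] entries along ``axis`` (illustrative only)."""
    data = np.asarray(data, dtype=float)
    mom_ndim = 1  # this sketch only handles single-variable moments
    val_ndim = data.ndim - mom_ndim
    if axis < 0:
        # Negative axes count back from the last non-moment dimension.
        axis += val_ndim
    data = np.moveaxis(data, axis, 0)

    w, mean, var = data[..., 0], data[..., 1], data[..., 2]
    w_tot = w.sum(axis=0)
    safe_w = np.where(w_tot == 0, 1.0, w_tot)  # avoid dividing by zero weight
    mean_tot = (w * mean).sum(axis=0) / safe_w
    # Each block contributes its own variance plus the squared shift of its mean.
    var_tot = (w * (var + (mean - mean_tot) ** 2)).sum(axis=0) / safe_w
    return np.stack([w_tot, mean_tot, var_tot], axis=-1)


# Moments of two blocks of raw values, combined into moments of all values.
block_a = np.array([1.0, 2.0])
block_b = np.array([3.0, 4.0, 5.0])
data = np.array(
    [
        [len(block_a), block_a.mean(), block_a.var()],
        [len(block_b), block_b.mean(), block_b.var()],
    ]
)
combined = reduce_moments_sketch(data, axis=0)
everything = np.concatenate([block_a, block_b])
print(combined, everything.mean(), everything.var())

# With shape (1, 2, 3) and one moment dimension, axis=-1 refers to axis 1.
out = reduce_moments_sketch(np.ones((1, 2, 3)), axis=-1)
print(out.shape)  # (1, 3)
```

The combined entry matches the moments computed directly from the concatenated values, which is the property that makes reducing a moments array equivalent to reducing the underlying data.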
- cmomy.reduction.factor_by(by, sort=True)
Factor by into codes and groups.
- Parameters:
by (sequence) – Values to group by. Negative or None values indicate that the value should be skipped. Note that if by is a pandas.Index object, missing values should be marked with None only.
sort (bool, default True) – If True (default), sort groups. If False, return groups in order of first appearance.
- Returns:
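groups (list or pandas.Index) – Unique group names (excluding negative or None values).
codes (ndarray) – Integer code for each element of by: the position of that element's group in groups, or -1 for skipped values.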
Examples
>>> from cmomy.reduction import factor_by
>>> by = [1, 1, 0, -1, 0, 2, 2]
>>> groups, codes = factor_by(by, sort=False)
>>> groups
[1, 0, 2]
>>> codes
array([ 0,  0,  1, -1,  1,  2,  2])

Note that with sort=False, groups are in order of first appearance. With the default sort=True, they are sorted:
>>> groups, codes = factor_by(by)
>>> groups
[0, 1, 2]
>>> codes
array([ 1,  1,  0, -1,  0,  2,  2])

This also works for sequences of non-integers.
>>> by = ["a", "a", None, "c", "c", -1]
>>> groups, codes = factor_by(by)
>>> groups
['a', 'c']
>>> codes
array([ 0,  0, -1,  1,  1, -1])

And for pandas.Index objects:
>>> import pandas as pd
>>> by = pd.Index(["a", "a", None, "c", "c", None])
>>> groups, codes = factor_by(by)
>>> groups
Index(['a', 'c'], dtype='object')
>>> codes
array([ 0,  0, -1,  1,  1, -1])
- cmomy.reduction.reduce_data_grouped(data, *, mom_ndim, by, axis=MISSING, order=None, parallel=None, out=None, dtype=None, dim=MISSING, group_dim=None, groups=None, keep_attrs=None)
Reduce data by group.
- Parameters:
data (ndarray or DataArray) – Moments collection array. It is assumed moment dimensions are last.
mom_ndim ({1, 2}) – Indicates whether data holds moments (mom_ndim=1) or comoments (mom_ndim=2).
by (array-like of int) – Group-by values of same length as data along the sampled dimension. Negative values indicate no group (i.e., skip this index).
axis (int, optional) – Axis to reduce along. Note that negative values are relative to data.ndim - mom_ndim, because the last dimensions are assumed to be moment dimensions. For example, if data.shape == (1,2,3) with mom_ndim=1, axis = -1 would be equivalent to axis = 1. Defaults to axis=-1.
order ({"C", "F", "A", "K"}, optional) – Order argument to numpy.asarray().
out (ndarray) – Optional output array. If specified, output will be a reference to this array.
dim (hashable) – Dimension to reduce along.
group_dim (str, optional) – Name of the output group dimension. Defaults to dim.
groups (sequence, optional) – Sequence of length by.max() + 1 to assign as coordinates for group_dim.
keep_attrs ({"drop", "identical", "no_conflicts", "drop_conflicts", "override"} or bool, optional) –
'drop' or False: empty attrs on returned xarray object.
'identical': all attrs must be the same on every object.
'no_conflicts': attrs from all objects are combined, any that have the same name must also have the same value.
'drop_conflicts': attrs from all objects are combined, any that have the same name but different values are dropped.
'override' or True: skip comparing and copy attrs from the first object to the result.
- Returns:
out (ndarray or DataArray) – Reduced data of same type as input data. The last dimensions are “group”, followed by moments. out.shape = (..., shape[axis-1], shape[axis+1], ..., ngroup, mom0, ...), where shape = data.shape and ngroup = by.max() + 1.
Examples
>>> import numpy as np
>>> import xarray as xr
>>> from cmomy.reduction import factor_by, reduce_data_grouped
>>> data = np.ones((5, 3))
>>> by = [0, 0, -1, 1, -1]
>>> reduce_data_grouped(data, mom_ndim=1, axis=0, by=by)
array([[2., 1., 1.],
       [1., 1., 1.]])

This also works for DataArray objects. In this case, the groups are added as coordinates to group_dim.
>>> xdata = xr.DataArray(data, dims=["rec", "mom"])
>>> reduce_data_grouped(xdata, mom_ndim=1, dim="rec", by=by, group_dim="group")
<xarray.DataArray (group: 2, mom: 3)> Size: 48B
array([[2., 1., 1.],
       [1., 1., 1.]])
Dimensions without coordinates: group, mom

Note that if by skips some groups, they will still be included in the output. For example, the following by skips the value 0, so the first output row has zero weight.
>>> by = [1, 1, -1, 2, 2]
>>> reduce_data_grouped(xdata, mom_ndim=1, dim="rec", by=by)
<xarray.DataArray (rec: 3, mom: 3)> Size: 72B
array([[0., 0., 0.],
       [2., 1., 1.],
       [2., 1., 1.]])
Dimensions without coordinates: rec, mom

If you want to ensure that only included groups are used, use factor_by(). This has the added benefit of working with non-integer groups as well.
>>> by = ["a", "a", None, "b", "b"]
>>> groups, codes = factor_by(by)
>>> reduce_data_grouped(xdata, mom_ndim=1, dim="rec", by=codes, groups=groups)
<xarray.DataArray (rec: 2, mom: 3)> Size: 48B
array([[2., 1., 1.],
       [2., 1., 1.]])
Coordinates:
  * rec      (rec) <U1 8B 'a' 'b'
Dimensions without coordinates: mom
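The grouping semantics in these examples (negative codes skipped, one output row per integer in range(by.max() + 1), zero rows for unused codes) can be sketched in plain NumPy as follows. This is an illustration under assumptions, not the library's code: `reduce_grouped_sketch` is a hypothetical name, data is taken to be 2D with samples along axis 0, and each row is assumed to be [weight, mean, variance].

```python
import numpy as np


def reduce_grouped_sketch(data, by):
    """Illustrative grouped reduction of (nsample, 3) [weight, mean, variance] rows."""
    data = np.asarray(data, dtype=float)
    by = np.asarray(by)
    ngroup = int(by.max()) + 1  # every code in range(ngroup) gets a row
    out = np.zeros((ngroup, 3))
    for g in range(ngroup):
        rows = data[by == g]  # negative codes never match, so they are skipped
        w = rows[:, 0].sum()
        if w == 0:
            continue  # unused group: leave the row as zeros
        mean = (rows[:, 0] * rows[:, 1]).sum() / w
        var = (rows[:, 0] * (rows[:, 2] + (rows[:, 1] - mean) ** 2)).sum() / w
        out[g] = [w, mean, var]
    return out


data = np.ones((5, 3))
first = reduce_grouped_sketch(data, [0, 0, -1, 1, -1])
second = reduce_grouped_sketch(data, [1, 1, -1, 2, 2])
print(first)
print(second)
```

Both calls reproduce the arrays shown in the examples above, including the all-zero row for the unused code 0 in the second call.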
- cmomy.reduction.factor_by_to_index(by)
Transform group_idx to quantities to be used with reduce_data_indexed().
- Parameters:
by (array-like) – Values to factor.
exclude_missing (bool, default True) – If True (default), filter negative and None values from group_idx.
- Returns:
groups (list or pandas.Index) – Unique groups in group_idx (excluding negative or None values in group_idx if exclude_missing is True).
index (ndarray) – Indexing array. index[start[k]:end[k]] are the indices belonging to group groups[k].
start (ndarray) – See index.
end (ndarray) – See index.
Examples
>>> import pandas as pd
>>> from cmomy.reduction import factor_by_to_index
>>> factor_by_to_index([0, 1, 0, 1])
([0, 1], array([0, 2, 1, 3]), array([0, 2]), array([2, 4]))

>>> factor_by_to_index(["a", "b", "a", "b"])
(['a', 'b'], array([0, 2, 1, 3]), array([0, 2]), array([2, 4]))

Missing values (None or negative) are excluded:
>>> factor_by_to_index([None, "a", None, "b"])
(['a', 'b'], array([1, 3]), array([0, 1]), array([1, 2]))

You can also pass pandas.Index objects:
>>> factor_by_to_index(pd.Index([None, "a", None, "b"], name="my_index"))
(Index(['a', 'b'], dtype='object', name='my_index'), array([1, 3]), array([0, 1]), array([1, 2]))
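One way to read the index, start, and end outputs above is as a stable sort of sample positions by group code, followed by cumulative group sizes. The sketch below builds them from integer codes (such as those returned by factor_by) using NumPy only. The name `codes_to_index_sketch` is hypothetical and the approach is an assumption about how such arrays can be constructed, not a description of cmomy's internals.

```python
import numpy as np


def codes_to_index_sketch(codes):
    """Build (index, start, end) from integer codes; negative codes are dropped."""
    codes = np.asarray(codes, dtype=np.int64)
    keep = np.flatnonzero(codes >= 0)  # positions of non-missing samples
    # A stable sort keeps the original order of samples within each group.
    order = np.argsort(codes[keep], kind="stable")
    index = keep[order]
    counts = np.bincount(codes[keep])  # number of samples in each group
    end = np.cumsum(counts)
    start = end - counts
    return index, start, end


index, start, end = codes_to_index_sketch([0, 1, 0, 1])
print(index, start, end)

index_missing, start_missing, end_missing = codes_to_index_sketch([-1, 0, -1, 1])
print(index_missing, start_missing, end_missing)
```

These two calls reproduce the index, start, and end arrays in the first and third examples above, where the codes [-1, 0, -1, 1] correspond to by = [None, "a", None, "b"].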
- cmomy.reduction.reduce_data_indexed(data, *, mom_ndim, index, group_start, group_end, scale=None, axis=MISSING, order=None, parallel=None, out=None, dtype=None, dim=MISSING, coords_policy='first', group_dim=None, groups=None, keep_attrs=None)
Reduce data by index.
- Parameters:
data (ndarray) – Moments collection array. It is assumed moment dimensions are last.
mom_ndim ({1, 2}) – Indicates whether data holds moments (mom_ndim=1) or comoments (mom_ndim=2).
index (ndarray) – Indices into data along axis, with values in range(data.shape[axis]).
group_start (ndarray) – Start of the index slice for each group. index[group_start[group]:group_end[group]] are the indices for group group.
group_end (ndarray) – End of the index slice for each group. See group_start.
scale (ndarray, optional) – Weights of same size as index.
axis (int, optional) – Axis to reduce along. Note that negative values are relative to data.ndim - mom_ndim, because the last dimensions are assumed to be moment dimensions. For example, if data.shape == (1,2,3) with mom_ndim=1, axis = -1 would be equivalent to axis = 1. Defaults to axis=-1.
order ({"C", "F", "A", "K"}, optional) – Order argument to numpy.asarray().
out (ndarray) – Optional output array. If specified, output will be a reference to this array.
dim (hashable) – Dimension to reduce along.
coords_policy ({'first', 'last', 'group', None}) – Policy for handling coordinates along dim if by is specified for DataArray data. If there are no coordinates, do nothing; otherwise use:
'first': select first value of coordinate for each block.
'last': select last value of coordinate for each block.
'group': assign unique groups from group_idx to dim.
None: drop any coordinates.
Note that if coords_policy is one of 'first' or 'last', parameter groups will be ignored.
group_dim (str, optional) – Name of the output group dimension. Defaults to dim.
groups (sequence, optional) – Sequence of length by.max() + 1 to assign as coordinates for group_dim.
keep_attrs ({"drop", "identical", "no_conflicts", "drop_conflicts", "override"} or bool, optional) –
'drop' or False: empty attrs on returned xarray object.
'identical': all attrs must be the same on every object.
'no_conflicts': attrs from all objects are combined, any that have the same name must also have the same value.
'drop_conflicts': attrs from all objects are combined, any that have the same name but different values are dropped.
'override' or True: skip comparing and copy attrs from the first object to the result.
- Returns:
out (ndarray or DataArray) – Reduced data of same type as input data. The last dimensions are group and moments. out.shape = (..., shape[axis-1], shape[axis+1], ..., ngroup, mom0, ...), where shape = data.shape and ngroup = len(group_start).
Examples
This is a more general reduction than reduce_data_grouped(), but it can be used similarly.
>>> import numpy as np
>>> import xarray as xr
>>> from cmomy.reduction import factor_by_to_index, reduce_data_indexed
>>> data = np.ones((5, 3))
>>> by = ["a", "a", "b", "b", "c"]
>>> groups, index, start, end = factor_by_to_index(by)
>>> reduce_data_indexed(
...     data, mom_ndim=1, axis=0, index=index, group_start=start, group_end=end
... )
array([[2., 1., 1.],
       [2., 1., 1.],
       [1., 1., 1.]])

This also works for DataArray objects:
>>> xdata = xr.DataArray(data, dims=["rec", "mom"])
>>> reduce_data_indexed(
...     xdata,
...     mom_ndim=1,
...     dim="rec",
...     index=index,
...     group_start=start,
...     group_end=end,
...     group_dim="group",
...     groups=groups,
...     coords_policy="group",
... )
<xarray.DataArray (group: 3, mom: 3)> Size: 72B
array([[2., 1., 1.],
       [2., 1., 1.],
       [1., 1., 1.]])
Coordinates:
  * group    (group) <U1 12B 'a' 'b' 'c'
Dimensions without coordinates: mom
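As a rough illustration of why the indexed form is more general, the NumPy-only sketch below reduces each slice index[group_start[g]:group_end[g]] separately, which allows the same sample to appear several times or in several groups. That repeated indices are accepted by the library is an assumption here, suggested by the "Resample using indexed reduction" entry in the function list. The name `reduce_indexed_sketch` is hypothetical, the scale argument is not modeled, and rows are assumed to be [weight, mean, variance].

```python
import numpy as np


def reduce_indexed_sketch(data, index, group_start, group_end):
    """Illustrative indexed reduction of (nsample, 3) [weight, mean, variance] rows."""
    data = np.asarray(data, dtype=float)
    index = np.asarray(index)
    out = np.zeros((len(group_start), 3))
    for g, (s, e) in enumerate(zip(group_start, group_end)):
        rows = data[index[s:e]]  # samples (possibly repeated) for this group
        w = rows[:, 0].sum()
        if w == 0:
            continue  # empty group: leave the row as zeros
        mean = (rows[:, 0] * rows[:, 1]).sum() / w
        var = (rows[:, 0] * (rows[:, 2] + (rows[:, 1] - mean) ** 2)).sum() / w
        out[g] = [w, mean, var]
    return out


# Five single observations: weight 1, mean x_i, variance 0.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
data = np.column_stack([np.ones_like(x), x, np.zeros_like(x)])

# Two groups that draw samples with repetition, as in a bootstrap-style resample.
index = np.array([0, 0, 1, 2, 3, 4, 4])
start = np.array([0, 3])
end = np.array([3, 7])
resampled = reduce_indexed_sketch(data, index, start, end)
print(resampled)
```

Each output row equals the count, mean, and population variance of the values selected for that group, for example `x[index[0:3]]` for the first row.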