Routine to perform resampling (resample)
Classes:
IndexSampler | Wrapper around indices and freq resample arrays.
Functions:
factory_sampler | Factory method to create sampler.
freq_to_index_start_end_scales | Convert frequency table to index, group_start, group_end, scale arrays.
freq_to_indices | Convert a frequency array to indices array.
indices_to_freq | Convert indices to frequency array.
jackknife_data | Perform jackknife resampling on moments data.
jackknife_freq | Frequency array for jackknife resampling.
jackknife_vals | Jackknife by value.
random_freq | Create frequencies for random resampling (bootstrapping).
random_indices | Create indices for random resampling (bootstrapping).
resample_data | Resample and reduce data.
resample_vals | Resample and reduce values.
 | Determine ndat from array.
- class cmomy.resample.IndexSampler(*, indices=None, freq=None, ndat=None, parallel=None, shuffle=False, rng=None, fastpath=False)
Bases: Generic[SamplerArrayT]
Wrapper around indices and freq resample arrays
This is a convenience wrapper class to make working with resampling indices straightforward. cmomy primarily performs resampling using frequency tables instead of the more standard resampling indices arrays. This class keeps track of both.
- Parameters:
indices (ndarray, DataArray, or Dataset) – Indices resampling array.
freq (ndarray, DataArray, or Dataset) – Frequency resampling table.
ndat (int) – Size of data along resampled axis.
parallel (bool, optional) – If True, use parallel numba (numba.njit or numba.guvectorize) code if possible. If None, use a heuristic to determine whether to attempt the parallel method.
shuffle (bool) – If True, shuffle indices created from freq for each row.
rng (Union[int, Sequence[int], SeedSequence, BitGenerator, Generator, None], default: None) – Random number generator object. Defaults to the output of default_rng(). If a seed value is passed, a new Generator object is created with this seed.
fastpath (bool) – Internal variable.
Methods:
from_params(nrep, ndat[, nsamp, rng, ...]) – Create sampler from parameters.
from_data(data, *, nrep[, nsamp, axis, dim, ...]) – Create sampler for data.
- classmethod from_params(nrep, ndat, nsamp=None, rng=None, replace=True, parallel=None)
Create sampler from parameters
- Parameters:
nrep (int) – Number of resample replicates.
ndat (int) – Size of data along resampled axis.
nsamp (int) – Number of samples in a single resampled replicate. Defaults to size of data along sampled axis.
rng (Union[int, Sequence[int], SeedSequence, BitGenerator, Generator, None], default: None) – Random number generator object. Defaults to the output of default_rng(). If a seed value is passed, a new Generator object is created with this seed.
replace (bool) – If True, do resampling with replacement.
parallel (bool, optional) – If True, use parallel numba (numba.njit or numba.guvectorize) code if possible. If None, use a heuristic to determine whether to attempt the parallel method.
- Returns:
resample (IndexSampler) – Wrapped object will be an ndarray of integers.
- classmethod from_data(data, *, nrep, nsamp=None, axis=MISSING, dim=MISSING, mom_ndim=None, mom_axes=None, mom_dims=None, mom_params=None, rep_dim='rep', paired=True, rng=None, replace=True, parallel=None)
Create sampler for data.
- Parameters:
nrep (int) – Number of resample replicates.
nsamp (int) – Number of samples in a single resampled replicate. Defaults to size of data along sampled axis.
axis (int) – Axis to reduce/sample along.
dim (hashable) – Dimension to reduce/sample along.
mom_ndim ({1, 2}, optional) – If mom_ndim is not None, then wrap axis relative to mom_ndim. For example, with mom_ndim=2, axis = -1 will be transformed to axis = -3. If mom_dims is passed and data is an xarray object, infer mom_ndim from mom_dims.
mom_axes (int or tuple of int, optional) – Location of the moment dimensions. Defaults to (-mom_ndim, -mom_ndim+1, ...). If specified and mom_ndim is None, set mom_ndim to len(mom_axes). Note that if mom_axes is specified, negative values are relative to the end of the array. This is also the case for axes if mom_axes is specified.
mom_dims (hashable or tuple of hashable) – Name of moment dimensions. If specified, infer mom_ndim from mom_dims. If mom_ndim is also passed, check that mom_dims is consistent with mom_ndim. If not specified, defaults to data.dims[-mom_ndim:]. This is primarily used if data is a Dataset, or if mom_dims are not the last dimensions.
mom_params (MomParams or MomParamsDict or dict, optional) – Moment parameters. You can set moment parameters axes and dims using this option. For example, passing mom_params={"dims": ("a", "b")} is equivalent to passing mom_dims=("a", "b"). You can also pass a MomParams object with mom_params=cmomy.MomParams(dims=("a", "b")).
rep_dim (hashable) – Name of new 'replicated' dimension.
paired (bool) – If False and generating freq from nrep with data of type Dataset, generate a unique freq for each variable in data. If True, treat all variables in data as paired, and use the same freq for each.
rng (Union[int, Sequence[int], SeedSequence, BitGenerator, Generator, None], default: None) – Random number generator object. Defaults to the output of default_rng(). If a seed value is passed, a new Generator object is created with this seed.
replace (bool) – If True, do resampling with replacement.
parallel (bool, optional) – If True, use parallel numba (numba.njit or numba.guvectorize) code if possible. If None, use a heuristic to determine whether to attempt the parallel method.
- Returns:
sampler (IndexSampler) – Type of wrapped array depends on the passed parameters. In all cases, if data is an array, sampler will wrap an array; if data is a DataArray, sampler will wrap a DataArray. If data is a Dataset, return a wrapped DataArray if paired=True or if the resulting Dataset has only one variable, and a Dataset otherwise.
- cmomy.resample.factory_sampler(sampler=None, *, freq=None, indices=None, nrep=None, ndat=None, nsamp=None, paired=True, rng=None, replace=True, shuffle=False, data=None, axis=MISSING, dim=MISSING, mom_ndim=None, mom_axes=None, mom_dims=None, mom_params=None, rep_dim='rep', parallel=None)
Factory method to create sampler.
The main intent of this function is to be called by other functions/methods that need a sampler. For example, it is used in resample_data(). You can pass in a frequency array, an IndexSampler, or a mapping to create an IndexSampler. The order of evaluation is as follows:
- sampler is an IndexSampler: return sampler.
- sampler is None:
  - if ndat is specified: return IndexSampler.from_params(...)
  - if data is specified: return IndexSampler.from_data(...)
- sampler is array-like: return IndexSampler(freq=sampler, ...)
- sampler is an int: return IndexSampler.from_data(..., nrep=sampler)
- sampler is a mapping: return factory_sampler(**sampler, data=data, axis=axis, dim=dim, mom_ndim=mom_ndim, mom_dims=mom_dims, rep_dim=rep_dim).
- Parameters:
sampler (int or array-like or IndexSampler or mapping) – Passed through cmomy.resample.factory_sampler() to create an IndexSampler. Value can either be nrep (the number of replicates), freq (frequency array), an IndexSampler object, or a mapping of parameters. The mapping can have the form of FactoryIndexSamplerKwargs. Allowable keys are freq, indices, ndat, nrep, nsamp, paired, rng, replace, shuffle.
freq (array-like, DataArray, or Dataset of int) – Array of shape (nrep, size) where nrep is the number of replicates and size = self.shape[axis]. freq is the weight that each sample contributes to a replicate. If freq is an xarray object, it should have dimensions rep_dim and dim.
indices (array of int) – Array of shape (nrep, size). If passed, create freq from indices.
nrep (int) – Number of resample replicates.
ndat (int) – Size of data along resampled axis.
nsamp (int) – Number of samples in a single resampled replicate. Defaults to size of data along sampled axis.
paired (bool) – If False and generating freq from nrep with data of type Dataset, generate a unique freq for each variable in data. If True, treat all variables in data as paired, and use the same freq for each.
rng (RngTypes | None, default: None) – Random number generator object. Defaults to the output of default_rng(). If a seed value is passed, a new Generator object is created with this seed.
replace (bool) – If True, do resampling with replacement.
shuffle (bool) – If True, shuffle indices created from freq for each row.
data (array-like) – If needed, extract ndat from data. Also used if paired = True.
axis (int) – Axis to reduce/sample along.
dim (hashable) – Dimension to reduce/sample along.
mom_ndim ({1, 2}, optional) – If mom_ndim is not None, then wrap axis relative to mom_ndim. For example, with mom_ndim=2, axis = -1 will be transformed to axis = -3. If mom_dims is passed and data is an xarray object, infer mom_ndim from mom_dims.
mom_axes (int or tuple of int, optional) – Location of the moment dimensions. Defaults to (-mom_ndim, -mom_ndim+1, ...). If specified and mom_ndim is None, set mom_ndim to len(mom_axes). Note that if mom_axes is specified, negative values are relative to the end of the array. This is also the case for axes if mom_axes is specified.
mom_dims (hashable or tuple of hashable) – Name of moment dimensions. If specified, infer mom_ndim from mom_dims. If mom_ndim is also passed, check that mom_dims is consistent with mom_ndim. If not specified, defaults to data.dims[-mom_ndim:]. This is primarily used if data is a Dataset, or if mom_dims are not the last dimensions.
mom_params (MomParams or MomParamsDict or dict, optional) – Moment parameters. You can set moment parameters axes and dims using this option. For example, passing mom_params={"dims": ("a", "b")} is equivalent to passing mom_dims=("a", "b"). You can also pass a MomParams object with mom_params=cmomy.MomParams(dims=("a", "b")).
rep_dim (hashable) – Name of new 'replicated' dimension.
parallel (bool, optional) – If True, use parallel numba (numba.njit or numba.guvectorize) code if possible. If None, use a heuristic to determine whether to attempt the parallel method.
- Returns:
IndexSampler
Examples
>>> a = factory_sampler(nrep=3, ndat=2, rng=0)
>>> b = factory_sampler(dict(nrep=3, ndat=2, rng=0))
>>> c = factory_sampler(dict(freq=a.freq))
>>> d = factory_sampler(a)
>>> for other in [b, c, d]:
...     np.testing.assert_equal(a.freq, other.freq)
>>> assert d is a
To instead just pass indices, use:
>>> e = factory_sampler(dict(indices=a.indices))
>>> assert a.indices is e.indices
- cmomy.resample.freq_to_index_start_end_scales(freq)
Convert frequency table to index, group_start, group_end, scale arrays.
- Parameters:
freq (ndarray) – Frequency array.
- Returns:
index, start, end, scale (ndarray) – Arrays to be used with indexed routines.
- cmomy.resample.freq_to_indices(freq, *, shuffle=False, rng=None, parallel=None)
Convert a frequency array to indices array.
This creates an "indices" array that is compatible with the "freq" array. Note that by default, the indices for a single sample (along output[k, :]) are in sorted order (something like [[0, 0, …, 1, 1, …], …]). Pass shuffle = True to randomly shuffle indices along axis=1.
- Parameters:
freq (array-like, DataArray, or Dataset of int) – Array of shape (nrep, size) where nrep is the number of replicates and size = self.shape[axis]. freq is the weight that each sample contributes to a replicate. If freq is an xarray object, it should have dimensions rep_dim and dim.
shuffle (bool) – If True, shuffle indices created from freq for each row.
rng (Union[int, Sequence[int], SeedSequence, BitGenerator, Generator, None], default: None) – Random number generator object. Defaults to the output of default_rng(). If a seed value is passed, a new Generator object is created with this seed.
parallel (bool, optional) – If True, use parallel numba (numba.njit or numba.guvectorize) code if possible. If None, use a heuristic to determine whether to attempt the parallel method.
- Returns:
ndarray – Indices array of shape (nrep, nsamp), where nsamp = freq[k, :].sum() for any row k.
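The relationship between a frequency row and its sorted indices row can be illustrated in plain Python. This is only a minimal sketch of the idea, not cmomy's implementation; the helper name freq_row_to_indices is hypothetical and no shuffling is done.

```python
def freq_row_to_indices(freq_row):
    """Expand one frequency row into a sorted list of indices (hypothetical helper)."""
    indices = []
    for k, count in enumerate(freq_row):
        # Observation k appears `count` times in this replicate.
        indices.extend([k] * count)
    return indices


# Two replicates over ndat = 3 observations; each row sums to nsamp = 3.
freq = [[2, 0, 1], [1, 1, 1]]
indices = [freq_row_to_indices(row) for row in freq]
print(indices)  # [[0, 0, 2], [0, 1, 2]]
```

Each output row has length equal to the sum of the corresponding frequency row, which matches the documented (nrep, nsamp) shape.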
- cmomy.resample.indices_to_freq(indices, *, ndat=None, parallel=None)
Convert indices to frequency array.
It is assumed that indices.shape == (nrep, nsamp) with nsamp == ndat. For cases where nsamp != ndat, pass in ndat explicitly.
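The reason ndat must be given when nsamp != ndat is that the width of the frequency table cannot be inferred from the indices alone. A minimal plain-Python sketch of the counting step follows; the helper name indices_row_to_freq is hypothetical and this is not cmomy's implementation.

```python
def indices_row_to_freq(indices_row, ndat):
    """Count how often each of the ndat observations appears (hypothetical helper)."""
    freq_row = [0] * ndat
    for k in indices_row:
        freq_row[k] += 1
    return freq_row


# nsamp = 3 draws per replicate from ndat = 4 observations.
indices = [[0, 0, 2], [3, 1, 1]]
freq = [indices_row_to_freq(row, ndat=4) for row in indices]
print(freq)  # [[2, 0, 1, 0], [0, 2, 0, 1]]
```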
- cmomy.resample.jackknife_data(data, data_reduced=None, *, axis=MISSING, dim=MISSING, mom_ndim=None, mom_axes=None, mom_axes_reduced=None, mom_dims=None, mom_params=None, rep_dim='rep', out=None, dtype=None, casting='same_kind', order=None, parallel=None, axes_to_end=False, keep_attrs=None, apply_ufunc_kwargs=None)
Perform jackknife resampling on moments data.
This uses moments addition/subtraction to speed up jackknife resampling.
- Parameters:
data (ndarray or DataArray or Dataset) – Moments array(s). It is assumed moment dimensions are last.
data_reduced (array-like or DataArray, optional) – data reduced along axis or dim. This will be calculated using reduce_data() if not passed.
axis (int) – Axis to reduce/sample along.
dim (hashable) – Dimension to reduce/sample along.
mom_ndim ({1, 2}, optional) – Value indicates if moments (mom_ndim = 1) or comoments (mom_ndim = 2). If not specified and data is an xarray object, attempt to infer mom_ndim from mom_dims. Otherwise, default to mom_ndim = 1.
mom_axes (int or tuple of int, optional) – Location of the moment dimensions. Defaults to (-mom_ndim, -mom_ndim+1, ...). If specified and mom_ndim is None, set mom_ndim to len(mom_axes). Note that if mom_axes is specified, negative values are relative to the end of the array. This is also the case for axes if mom_axes is specified.
mom_axes_reduced (int or sequence of int) – Location(s) of moment dimensions in data_reduced. This option is only needed if data_reduced is passed in and is an array. Defaults to mom_axes, or the last dimensions of data_reduced.
mom_dims (hashable or tuple of hashable) – Name of moment dimensions. If specified, infer mom_ndim from mom_dims. If mom_ndim is also passed, check that mom_dims is consistent with mom_ndim. If not specified, defaults to data.dims[-mom_ndim:]. This is primarily used if data is a Dataset, or if mom_dims are not the last dimensions.
mom_params (MomParams or MomParamsDict or dict, optional) – Moment parameters. You can set moment parameters axes and dims using this option. For example, passing mom_params={"dims": ("a", "b")} is equivalent to passing mom_dims=("a", "b"). You can also pass a MomParams object with mom_params=cmomy.MomParams(dims=("a", "b")).
rep_dim (hashable) – Name of new 'replicated' dimension.
out (ndarray) – Optional output array. If specified, output will be a reference to this array. Note that if the method returns a Dataset, this option is ignored.
casting ({'no', 'equiv', 'safe', 'same_kind', 'unsafe'}, optional) – Controls what kind of data casting may occur.
’no’ means the data types should not be cast at all.
’equiv’ means only byte-order changes are allowed.
’safe’ means only casts which can preserve values are allowed.
’same_kind’ (default) means only safe casts or casts within a kind, like float64 to float32, are allowed.
’unsafe’ means any data conversions may be done.
order ({"C", "F", "A", "K"}, optional) – Order argument. See numpy.asarray().
parallel (bool, optional) – If True, use parallel numba (numba.njit or numba.guvectorize) code if possible. If None, use a heuristic to determine whether to attempt the parallel method.
axes_to_end (bool) – If True, place sampled dimension (if it exists in output) and moment dimensions at end of output. Otherwise, place sampled dimension (if it exists in output) at same position as input axis and moment dimensions at same position as input (if input does not contain moment dimensions, place them at end of array).
keep_attrs ({"drop", "identical", "no_conflicts", "drop_conflicts", "override"} or bool, optional) –
‘drop’ or False: empty attrs on returned xarray object.
’identical’: all attrs must be the same on every object.
’no_conflicts’: attrs from all objects are combined, any that have the same name must also have the same value.
’drop_conflicts’: attrs from all objects are combined, any that have the same name but different values are dropped.
’override’ or True: skip comparing and copy attrs from the first object to the result.
apply_ufunc_kwargs (dict-like) – Extra parameters to xarray.apply_ufunc(). One useful option is on_missing_core_dim, which can take the value "copy" (the default), "raise", or "drop" and controls what to do with variables of a Dataset missing core dimensions. Other options are join, dataset_join, dataset_fill_value, and dask_gufunc_kwargs. Unlisted options are handled internally.
- Returns:
out (ndarray or DataArray) – Jackknife resampled along axis. That is, out[..., axis=i, ...] is reduce_data(data[..., axis=[..., i-1, i+1, ...], ...]).
Examples
>>> import cmomy
>>> data = cmomy.default_rng(0).random((4, 3))
>>> out_jackknife = jackknife_data(data, mom_ndim=1, axis=0)
>>> out_jackknife
array([[1.5582, 0.7822, 0.2247],
       [2.1787, 0.6322, 0.22  ],
       [1.5886, 0.5969, 0.0991],
       [1.2601, 0.4982, 0.3478]])
Note that this is equivalent to (but typically faster than) resampling with a frequency table from cmomy.resample.jackknife_freq():
>>> freq = cmomy.resample.jackknife_freq(4)
>>> resample_data(data, sampler=dict(freq=freq), mom_ndim=1, axis=0)
array([[1.5582, 0.7822, 0.2247],
       [2.1787, 0.6322, 0.22  ],
       [1.5886, 0.5969, 0.0991],
       [1.2601, 0.4982, 0.3478]])
To speed up the calculation even further, pass in data_reduced:
>>> data_reduced = cmomy.reduce_data(data, mom_ndim=1, axis=0)
>>> jackknife_data(data, mom_ndim=1, axis=0, data_reduced=data_reduced)
array([[1.5582, 0.7822, 0.2247],
       [2.1787, 0.6322, 0.22  ],
       [1.5886, 0.5969, 0.0991],
       [1.2601, 0.4982, 0.3478]])
Also works with DataArray objects:
>>> import xarray as xr
>>> xdata = xr.DataArray(data, dims=["samp", "mom"])
>>> jackknife_data(xdata, mom_ndim=1, dim="samp", rep_dim="jackknife")
<xarray.DataArray (jackknife: 4, mom: 3)> Size: 96B
array([[1.5582, 0.7822, 0.2247],
       [2.1787, 0.6322, 0.22  ],
       [1.5886, 0.5969, 0.0991],
       [1.2601, 0.4982, 0.3478]])
Dimensions without coordinates: jackknife, mom
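The speed-up from subtraction can be seen in a simplified setting. The sketch below, in plain Python, covers only the mean of equally weighted values: it reduces once and then removes each observation from the total, rather than re-reducing the remaining values for every replicate. It is an illustration of the idea under these assumptions, not cmomy's algorithm for general central moments, and both function names are hypothetical.

```python
def jackknife_means_by_subtraction(values):
    """Leave-one-out means from one total: (total - x_i) / (n - 1)."""
    n = len(values)
    total = sum(values)
    return [(total - x) / (n - 1) for x in values]


def jackknife_means_direct(values):
    """Leave-one-out means by re-reducing the remaining values each time."""
    out = []
    for i in range(len(values)):
        rest = values[:i] + values[i + 1:]
        out.append(sum(rest) / len(rest))
    return out


values = [1.0, 2.0, 4.0, 7.0]
fast = jackknife_means_by_subtraction(values)
slow = jackknife_means_direct(values)
```

The subtraction version does O(n) work in total, while the direct version does O(n) work per replicate.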
- cmomy.resample.jackknife_freq(ndat)
Frequency array for jackknife resampling.
Use this frequency array to perform jackknife [1] resampling.
- Parameters:
ndat (int) – Size of data along resampled axis.
- Returns:
freq (ndarray) – Frequency array for jackknife resampling.
References
Examples
>>> jackknife_freq(4)
array([[0, 1, 1, 1],
       [1, 0, 1, 1],
       [1, 1, 0, 1],
       [1, 1, 1, 0]])
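The structure of this table (row i leaves out observation i and keeps every other observation once) can be reproduced with nested lists. This is a plain-Python sketch with a hypothetical function name, not the library code, which returns an ndarray.

```python
def jackknife_freq_sketch(ndat):
    """Row i has a 0 at column i and 1 elsewhere (hypothetical helper)."""
    return [[0 if i == j else 1 for j in range(ndat)] for i in range(ndat)]


table = jackknife_freq_sketch(4)
for row in table:
    print(row)
```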
- cmomy.resample.jackknife_vals(x, *y, data_reduced=None, mom, axis=MISSING, dim=MISSING, weight=None, mom_dims=None, mom_axes=None, mom_axes_reduced=None, mom_params=None, rep_dim='rep', out=None, dtype=None, casting='same_kind', order=None, parallel=None, axes_to_end=False, keep_attrs=None, apply_ufunc_kwargs=None)
Jackknife by value.
- Parameters:
x (array-like or DataArray or Dataset) – Values to reduce.
*y (array-like or DataArray or Dataset) – Additional values (needed if len(mom)==2). y has the same type restrictions and broadcasting rules as weight.
data_reduced (array-like or DataArray, optional) – data reduced along axis or dim. This will be calculated using reduce_vals() if not passed. Same type restrictions as weight.
mom (int or tuple of int) – Order of moments. If an integer or length-one tuple, then moments are for a single variable. If a length-two tuple, then comoments of two variables.
axis (int) – Axis to reduce/sample along.
dim (hashable) – Dimension to reduce/sample along.
weight (array-like or DataArray or Dataset) – Optional weight. The type of weight must be "less than" the type of x.
- x is Dataset: weight can be a Dataset, DataArray, or array-like
- x is array-like: weight can be array-like
In the case that weight is array-like, it must broadcast to x using usual broadcasting rules (see numpy.broadcast_to()), with the following exceptions: If weight is a 1d array of length x.shape[axis], it will be formatted to broadcast along the other dimensions of x. For example, if x has shape (10, 2, 3) and weight has shape (10,), then weight will be converted to the broadcastable shape (10, 1, 1). If weight is a scalar, it will be broadcast to x.shape.
mom_dims (hashable or tuple of hashable) – Name of moment dimensions. Defaults to ("mom_0",) for mom_ndim==1 and ("mom_0", "mom_1") for mom_ndim==2.
mom_axes (int or tuple of int, optional) – Location of the moment dimensions. Defaults to (-mom_ndim, -mom_ndim+1, ...). If specified and mom_ndim is None, set mom_ndim to len(mom_axes). Note that if mom_axes is specified, negative values are relative to the end of the array. This is also the case for axes if mom_axes is specified.
mom_axes_reduced (int or sequence of int) – Location(s) of moment dimensions in data_reduced. This option is only needed if data_reduced is passed in and is an array. Defaults to mom_axes, or the last dimensions of data_reduced.
mom_params (MomParams or MomParamsDict or dict, optional) – Moment parameters. You can set moment parameters axes and dims using this option. For example, passing mom_params={"dims": ("a", "b")} is equivalent to passing mom_dims=("a", "b"). You can also pass a MomParams object with mom_params=cmomy.MomParams(dims=("a", "b")).
rep_dim (hashable) – Name of new 'replicated' dimension.
out (ndarray) – Optional output array. If specified, output will be a reference to this array. Note that if the method returns a Dataset, this option is ignored.
casting ({'no', 'equiv', 'safe', 'same_kind', 'unsafe'}, optional) – Controls what kind of data casting may occur.
’no’ means the data types should not be cast at all.
’equiv’ means only byte-order changes are allowed.
’safe’ means only casts which can preserve values are allowed.
’same_kind’ (default) means only safe casts or casts within a kind, like float64 to float32, are allowed.
’unsafe’ means any data conversions may be done.
order ({"C", "F", "A", "K"}, optional) – Order argument. See numpy.asarray().
parallel (bool, optional) – If True, use parallel numba (numba.njit or numba.guvectorize) code if possible. If None, use a heuristic to determine whether to attempt the parallel method.
axes_to_end (bool) – If True, place sampled dimension (if it exists in output) and moment dimensions at end of output. Otherwise, place sampled dimension (if it exists in output) at same position as input axis and moment dimensions at same position as input (if input does not contain moment dimensions, place them at end of array).
keep_attrs ({"drop", "identical", "no_conflicts", "drop_conflicts", "override"} or bool, optional) –
‘drop’ or False: empty attrs on returned xarray object.
’identical’: all attrs must be the same on every object.
’no_conflicts’: attrs from all objects are combined, any that have the same name must also have the same value.
’drop_conflicts’: attrs from all objects are combined, any that have the same name but different values are dropped.
’override’ or True: skip comparing and copy attrs from the first object to the result.
apply_ufunc_kwargs (dict-like) – Extra parameters to xarray.apply_ufunc(). One useful option is on_missing_core_dim, which can take the value "copy" (the default), "raise", or "drop" and controls what to do with variables of a Dataset missing core dimensions. Other options are join, dataset_join, dataset_fill_value, and dask_gufunc_kwargs. Unlisted options are handled internally.
- Returns:
out (ndarray or DataArray) – Resampled central moments array. out.shape = (..., shape[axis-1], shape[axis+1], ..., shape[axis], mom0, ...) where shape = x.shape. That is, the resampled dimension is moved to the end, just before the moment dimensions.
Notes
Note that the resampled axis (resamp_axis) is at position -(len(mom) + 1), just before the moment axes. This is opposed to the behavior of resampling moments arrays (e.g., cmomy.resample_data()), where the resampled axis is the same as the argument axis. This is because the shape of the output array when resampling values depends on the result of broadcasting x, y, and weight.
- cmomy.resample.random_freq(nrep, ndat, nsamp=None, rng=None, replace=True, parallel=None)
Create frequencies for random resampling (bootstrapping).
- Parameters:
nrep (int) – Number of resample replicates.
ndat (int) – Size of data along resampled axis.
nsamp (int) – Number of samples in a single resampled replicate. Defaults to size of data along sampled axis.
rng (Union[int, Sequence[int], SeedSequence, BitGenerator, Generator, None], default: None) – Random number generator object. Defaults to the output of default_rng(). If a seed value is passed, a new Generator object is created with this seed.
replace (bool, default: True) – Whether to allow replacement.
parallel (bool, optional) – If True, use parallel numba (numba.njit or numba.guvectorize) code if possible. If None, use a heuristic to determine whether to attempt the parallel method.
- Returns:
freq (ndarray) – Frequency array. freq[rep, k] is the number of times to sample from the k-th observation for replicate rep.
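What such a frequency table contains can be sketched with the standard library. The example below draws nsamp indices per replicate and counts them; it is an assumption-laden illustration with a hypothetical function name and Python's random module, so it will not reproduce the values that cmomy produces from a NumPy Generator.

```python
import random


def random_freq_sketch(nrep, ndat, nsamp=None, seed=None, replace=True):
    """Build an (nrep, ndat) table of draw counts (hypothetical helper)."""
    nsamp = ndat if nsamp is None else nsamp
    rng = random.Random(seed)
    freq = []
    for _ in range(nrep):
        if replace:
            # Bootstrap: each draw is independent, so repeats are allowed.
            draws = [rng.randrange(ndat) for _ in range(nsamp)]
        else:
            # Without replacement: each observation is drawn at most once.
            draws = rng.sample(range(ndat), nsamp)
        row = [0] * ndat
        for k in draws:
            row[k] += 1
        freq.append(row)
    return freq


freq = random_freq_sketch(nrep=3, ndat=5, seed=0)
freq_norepl = random_freq_sketch(nrep=2, ndat=5, nsamp=3, seed=0, replace=False)
```

Every row sums to nsamp, and without replacement no entry exceeds 1.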
- cmomy.resample.random_indices(nrep, ndat, nsamp=None, rng=None, replace=True)
Create indices for random resampling (bootstrapping).
- Parameters:
nrep (int) – Number of resample replicates.
ndat (int) – Size of data along resampled axis.
nsamp (int) – Number of samples in a single resampled replicate. Defaults to size of data along sampled axis.
rng (Union[int, Sequence[int], SeedSequence, BitGenerator, Generator, None], default: None) – Random number generator object. Defaults to the output of default_rng(). If a seed value is passed, a new Generator object is created with this seed.
replace (bool, default: True) – Whether to allow replacement.
- Returns:
indices (ndarray) – Index array of integers of shape (nrep, nsamp).
- cmomy.resample.resample_data(data, *, sampler, axis=MISSING, dim=MISSING, mom_ndim=None, mom_axes=None, mom_dims=None, mom_params=None, rep_dim='rep', out=None, dtype=None, casting='same_kind', order=None, parallel=None, axes_to_end=False, keep_attrs=None, apply_ufunc_kwargs=None)
Resample and reduce data.
- Parameters:
data (ndarray or DataArray or Dataset) – Moments array(s). It is assumed moment dimensions are last.
sampler (int or array-like or IndexSampler or mapping) – Passed through cmomy.resample.factory_sampler() to create an IndexSampler. Value can either be nrep (the number of replicates), freq (frequency array), an IndexSampler object, or a mapping of parameters. The mapping can have the form of FactoryIndexSamplerKwargs. Allowable keys are freq, indices, ndat, nrep, nsamp, paired, rng, replace, shuffle.
axis (int) – Axis to reduce/sample along.
dim (hashable) – Dimension to reduce/sample along.
mom_ndim ({1, 2}, optional) – Value indicates if moments (mom_ndim = 1) or comoments (mom_ndim = 2). If not specified and data is an xarray object, attempt to infer mom_ndim from mom_dims. Otherwise, default to mom_ndim = 1.
mom_axes (int or tuple of int, optional) – Location of the moment dimensions. Defaults to (-mom_ndim, -mom_ndim+1, ...). If specified and mom_ndim is None, set mom_ndim to len(mom_axes). Note that if mom_axes is specified, negative values are relative to the end of the array. This is also the case for axes if mom_axes is specified.
mom_dims (hashable or tuple of hashable) – Name of moment dimensions. If specified, infer mom_ndim from mom_dims. If mom_ndim is also passed, check that mom_dims is consistent with mom_ndim. If not specified, defaults to data.dims[-mom_ndim:]. This is primarily used if data is a Dataset, or if mom_dims are not the last dimensions.
mom_params (MomParams or MomParamsDict or dict, optional) – Moment parameters. You can set moment parameters axes and dims using this option. For example, passing mom_params={"dims": ("a", "b")} is equivalent to passing mom_dims=("a", "b"). You can also pass a MomParams object with mom_params=cmomy.MomParams(dims=("a", "b")).
rep_dim (hashable) – Name of new 'replicated' dimension.
out (ndarray) – Optional output array. If specified, output will be a reference to this array. Note that if the method returns a Dataset, this option is ignored.
casting ({'no', 'equiv', 'safe', 'same_kind', 'unsafe'}, optional) – Controls what kind of data casting may occur.
’no’ means the data types should not be cast at all.
’equiv’ means only byte-order changes are allowed.
’safe’ means only casts which can preserve values are allowed.
’same_kind’ (default) means only safe casts or casts within a kind, like float64 to float32, are allowed.
’unsafe’ means any data conversions may be done.
order ({"C", "F", "A", "K"}, optional) – Order argument. See numpy.asarray().
parallel (bool, optional) – If True, use parallel numba (numba.njit or numba.guvectorize) code if possible. If None, use a heuristic to determine whether to attempt the parallel method.
axes_to_end (bool) – If True, place sampled dimension (if it exists in output) and moment dimensions at end of output. Otherwise, place sampled dimension (if it exists in output) at same position as input axis and moment dimensions at same position as input (if input does not contain moment dimensions, place them at end of array).
keep_attrs ({"drop", "identical", "no_conflicts", "drop_conflicts", "override"} or bool, optional) –
‘drop’ or False: empty attrs on returned xarray object.
’identical’: all attrs must be the same on every object.
’no_conflicts’: attrs from all objects are combined, any that have the same name must also have the same value.
’drop_conflicts’: attrs from all objects are combined, any that have the same name but different values are dropped.
’override’ or True: skip comparing and copy attrs from the first object to the result.
apply_ufunc_kwargs (dict-like) – Extra parameters to xarray.apply_ufunc(). One useful option is on_missing_core_dim, which can take the value "copy" (the default), "raise", or "drop" and controls what to do with variables of a Dataset missing core dimensions. Other options are join, dataset_join, dataset_fill_value, and dask_gufunc_kwargs. Unlisted options are handled internally.
- Returns:
out (ndarray or DataArray) – Resampled central moments. out.shape = (..., shape[axis-1], nrep, shape[axis+1], ...), where shape = data.shape and nrep = sampler.nrep.
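How a frequency row turns a set of moment records into one resampled record can be sketched for the lowest-order case. The example below assumes each observation is stored as a (weight, mean) pair, which is an assumption made for illustration and is not stated in this section; higher central moments are not handled, and the function name is hypothetical.

```python
def resample_weight_mean(data, freq):
    """Combine (weight, mean) pairs per replicate, weighting each by its count."""
    out = []
    for row in freq:
        # Assumes every replicate has positive total weight.
        w_tot = sum(f * w for f, (w, _) in zip(row, data))
        mean = sum(f * w * m for f, (w, m) in zip(row, data)) / w_tot
        out.append((w_tot, mean))
    return out


data = [(1.0, 2.0), (1.0, 4.0), (2.0, 5.0)]  # (weight, mean) per observation
freq = [[1, 1, 1], [2, 0, 1]]                # two replicates
out = resample_weight_mean(data, freq)
print(out)  # [(4.0, 4.0), (4.0, 3.5)]
```

The output has one record per row of freq, which corresponds to the nrep-sized axis in the documented output shape.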
- cmomy.resample.resample_vals(x, *y, sampler, mom, weight=None, axis=MISSING, dim=MISSING, mom_dims=None, mom_axes=None, mom_params=None, rep_dim='rep', out=None, dtype=None, casting='same_kind', order=None, parallel=None, axes_to_end=False, keep_attrs=None, apply_ufunc_kwargs=None)[source]#
Resample and reduce values.
- Parameters:
x (array-like or
DataArrayorDataset) – Values to reduce.*y (array-like or
DataArrayorDataset) – Additional values (needed iflen(mom)==2).yhas same type restrictions and broadcasting rules asweight.sampler (
intor array-like orIndexSampleror mapping) – Passed throughcmomy.resample.factory_sampler()to create anIndexSampler. Value can either benrep(the number of replicates),freq(frequency array), aIndexSamplerobject, or a mapping of parameters. The mapping can have form ofFactoryIndexSamplerKwargs. Allowable keys arefreq,indices,ndat,nrep,nsamp,paired,rng,replace,shuffle.mom (
intortupleofint) – Order or moments. If integer or length one tuple, then moments are for a single variable. If length 2 tuple, then comoments of two variablesaxis (
int) – Axis to reduce/sample along.dim (hashable) – Dimension to reduce/sample along.
weight (array-like or
DataArrayorDataset) –Optional weight. The type of
weightmust be “less than” the type ofx.xisDataset:weightcan be aDataset,DataArray, or array-likexis array-like:weightcan be array-like
In the case that
weightis array-like, it must broadcast toxusing usual broadcasting rules (seenumpy.broadcast_to()), with the following exceptions: Ifweightis a 1d array of lengthx.shape[axis]], it will be formatted to broadcast along the other dimensions ofx. For example, ifxhas shape(10, 2, 3)andweighthas shape(10,), thenweightwill be converted to the broadcastable shape(10, 1, 1). Ifweightis a scalar, it will be broadcast tox.shape.mom_dims (hashable or
tupleof hashable) – Name of moment dimensions. Defaults to("mom_0",)formom_ndim==1and(mom_0, mom_1)formom_ndim==2mom_axes (
intortupleofint, optional) – Location of the moment dimensions. Default to(-mom_ndim, -mom_ndim+1, ...). If specified andmom_ndimis None, setmom_ndimtolen(mom_axes). Note that ifmom_axesis specified, negative values are relative to the end of the array. This is also the case foraxesifmom_axesis specified.mom_params (
MomParamsorMomParamsDictordict, optional) – Moment parameters. You can set moment parametersaxesanddimsusing this option. For example, passingmom_params={"dim": ("a", "b")}is equivalent to passingmom_dims=("a", "b"). You can also pass as aMomParamsobject withmom_params=cmomy.MomParams(dims=("a", "b")).rep_dim (hashable) – Name of new ‘replicated’ dimension:
out (ndarray) – Optional output array. If specified, output will be a reference to this array. Note that if the method returns a Dataset, then this option is ignored.
casting ({'no', 'equiv', 'safe', 'same_kind', 'unsafe'}, optional) – Controls what kind of data casting may occur.
‘no’ means the data types should not be cast at all.
‘equiv’ means only byte-order changes are allowed.
‘safe’ means only casts which can preserve values are allowed.
‘same_kind’ means only safe casts or casts within a kind, like float64 to float32, are allowed.
‘unsafe’ (default) means any data conversions may be done.
order ({"C", "F"}, optional) – Order argument. See numpy.zeros().
parallel (bool, optional) – If True, use parallel numba (numba.njit or numba.guvectorize) code if possible. If None, use a heuristic to decide whether to attempt the parallel method.
axes_to_end (bool) – If True, place the sampled dimension (if it exists in the output) and the moment dimensions at the end of the output. Otherwise, place the sampled dimension (if it exists in the output) at the same position as the input axis and the moment dimensions at the same position as in the input (if the input does not contain moment dimensions, place them at the end of the array).
keep_attrs ({"drop", "identical", "no_conflicts", "drop_conflicts", "override"} or bool, optional) –
‘drop’ or False: empty attrs on returned xarray object.
‘identical’: all attrs must be the same on every object.
‘no_conflicts’: attrs from all objects are combined, any that have the same name must also have the same value.
‘drop_conflicts’: attrs from all objects are combined, any that have the same name but different values are dropped.
‘override’ or True: skip comparing and copy attrs from the first object to the result.
apply_ufunc_kwargs (dict-like) – Extra parameters to xarray.apply_ufunc(). One useful option is on_missing_core_dim, which can take the value "copy" (the default), "raise", or "drop" and controls what to do with variables of a Dataset missing core dimensions. Other options are join, dataset_join, dataset_fill_value, and dask_gufunc_kwargs. Unlisted options are handled internally.
- Returns:
out (ndarray or DataArray or Dataset) – Resampled central moments array. out.shape = (..., shape[axis-1], nrep, shape[axis+1], ...), where shape = x.shape and nrep = sampler.nrep. This can be overridden by setting axes_to_end.
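The 1d and scalar rules described for the weight parameter can be sketched in plain NumPy. This only illustrates the documented behavior and is not cmomy's implementation; the helper name expand_weight is hypothetical.

```python
import numpy as np


def expand_weight(weight, x, axis):
    """Hypothetical helper mimicking the documented weight rules."""
    w = np.asarray(weight, dtype=float)
    if w.ndim == 0:
        # Scalar weight: broadcast to the full shape of x.
        return np.broadcast_to(w, x.shape)
    if w.ndim == 1 and w.shape[0] == x.shape[axis]:
        # 1d weight of length x.shape[axis]: keep its length along
        # `axis` and use size-1 dimensions everywhere else.
        shape = [1] * x.ndim
        shape[axis] = -1
        return w.reshape(shape)
    # Anything else is left to the usual NumPy broadcasting rules.
    return w


x = np.zeros((10, 2, 3))
w = expand_weight(np.arange(10), x, axis=0)
print(w.shape)  # (10, 1, 1)
```

The reshaped weight multiplies elementwise against x without any further manipulation, which is the point of the special case.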
Notes
Note that the resampled axis (resamp_axis) is at position -(len(mom) + 1), just before the moment axes. This is opposed to the behavior of resampling moments arrays (e.g., cmomy.resample_data()), where the resampled axis is the same as the argument axis. This is because the shape of the output array when resampling values depends on the result of broadcasting x, y, and weight.
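To make the axis placement concrete, here is a minimal NumPy sketch of frequency-table resampling of values. It does not call cmomy. The layout of the trailing moment axis (weight, mean, variance) is an assumption made for illustration, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
ndat, nrep = 10, 5
x = rng.normal(size=(ndat, 2, 3))  # values, sampled along axis 0

# Bootstrap indices and the equivalent frequency table of shape (nrep, ndat).
indices = rng.integers(0, ndat, size=(nrep, ndat))
freq = np.stack([np.bincount(row, minlength=ndat) for row in indices])

# Frequency-weighted moments for every replicate.
wsum = freq.sum(axis=1)                             # (nrep,)
mean = np.einsum("rn,nab->abr", freq, x) / wsum     # (2, 3, nrep)
dx = x[..., None] - mean[None, ...]                 # (ndat, 2, 3, nrep)
var = np.einsum("rn,nabr->abr", freq, dx**2) / wsum

# Assumed layout along the moment axis: [weight, mean, variance].
out = np.stack([np.broadcast_to(wsum, mean.shape), mean, var], axis=-1)
print(out.shape)  # (2, 3, 5, 3)
```

With a single variable and order 2, len(mom) is 1, so the replicate axis sits at position -2, immediately before the moment axis, regardless of which axis of x was sampled.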
- cmomy.resample.select_ndat(data, *, axis=MISSING, dim=MISSING, mom_ndim=None, mom_axes=None, mom_dims=None, mom_params=None)[source]#
Determine ndat from array.
- Parameters:
axis (int) – Axis to reduce/sample along.
dim (hashable) – Dimension to reduce/sample along.
mom_ndim ({1, 2}, optional) – If mom_ndim is not None, then wrap axis relative to mom_ndim. For example, with mom_ndim=2, axis = -1 will be transformed to axis = -3. If mom_dims is passed and data is an xarray object, infer mom_ndim from mom_dims.
mom_axes (
int or tuple of int, optional) – Location of the moment dimensions. Defaults to (-mom_ndim, -mom_ndim+1, ...). If specified and mom_ndim is None, set mom_ndim to len(mom_axes). Note that if mom_axes is specified, negative values are relative to the end of the array. This is also the case for axes if mom_axes is specified.
mom_dims (hashable or tuple of hashable) – Name of moment dimensions. If specified, infer mom_ndim from mom_dims. If mom_ndim is also passed, check that mom_dims is consistent with mom_ndim. If not specified, defaults to data.dims[-mom_ndim:]. This is primarily used if data is a Dataset, or if mom_dims are not the last dimensions.
mom_params (MomParams or MomParamsDict or dict, optional) – Moment parameters. You can set moment parameters axes and dims using this option. For example, passing mom_params={"dims": ("a", "b")} is equivalent to passing mom_dims=("a", "b"). You can also pass a MomParams object with mom_params=cmomy.MomParams(dims=("a", "b")).
- Returns:
int – size of data along the specified axis or dim
Examples
>>> import numpy as np
>>> import xarray as xr
>>> from cmomy.resample import select_ndat
>>> data = np.zeros((2, 3, 4))
>>> select_ndat(data, axis=1)
3

To wrap relative to the last mom_ndim dimensions of data, use complex axes:

>>> select_ndat(data, axis=-1j, mom_ndim=2)
2

>>> xdata = xr.DataArray(data, dims=["x", "y", "mom"])
>>> select_ndat(xdata, dim="y")
3
>>> select_ndat(xdata, dim="mom", mom_ndim=1)
Traceback (most recent call last):
...
ValueError: Cannot select moment dimension. dim='mom', axis=2.
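The axis handling shown in these examples can be approximated with a small pure-Python sketch. It assumes the moment dimensions are the trailing ones, and it models only the complex-axis convention and the moment-dimension check seen above. The helper resolve_axis is hypothetical and is not part of cmomy.

```python
def resolve_axis(axis, ndim, mom_ndim=0):
    """Hypothetical helper: map `axis` to a non-negative index.

    A complex axis such as -1j is counted relative to the non-moment
    dimensions (assumed here to be the leading ndim - mom_ndim ones).
    """
    n_val = ndim - mom_ndim  # number of non-moment dimensions
    if isinstance(axis, complex):
        pos = int(axis.imag)
        if not -n_val <= pos < n_val:
            raise ValueError(f"axis={axis} out of bounds")
        return pos % n_val
    if not -ndim <= axis < ndim:
        raise ValueError(f"axis={axis} out of bounds")
    pos = axis % ndim
    if pos >= n_val:
        # Mirrors the error in the last example above.
        raise ValueError(f"Cannot select moment dimension. axis={pos}.")
    return pos


shape = (2, 3, 4)
ndat_plain = shape[resolve_axis(1, len(shape))]
ndat_wrapped = shape[resolve_axis(-1j, len(shape), mom_ndim=2)]
print(ndat_plain, ndat_wrapped)  # 3 2
```

Using the imaginary part as the marker keeps ordinary integer axes unambiguous while still allowing "last non-moment axis" to be written compactly.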