Routine to perform resampling (cmomy.resample)
Functions:
freq_to_indices | Convert a frequency array to indices array.
indices_to_freq | Convert indices to frequency array.
random_indices | Create indices for random resampling (bootstrapping).
random_freq | Create frequencies for random resampling (bootstrapping).
randsamp_freq | Produce a random sample for bootstrapping.
resample_data | Resample data according to frequency table.
resample_vals | Resample data according to frequency table.
bootstrap_confidence_interval | Calculate the error bounds.
xbootstrap_confidence_interval | Bootstrap xarray object.
- cmomy.resample.freq_to_indices(freq, shuffle=True, rng=None)
Convert a frequency array to indices array.
This creates an “indices” array that is compatible with “freq” array. Note that by default, the indices for a single sample (along output[k, :]) are randomly shuffled. If you pass shuffle=False, then the output will be something like [[0,0,…, 1,1,…, 2,2, …]].
- Parameters:
freq (array of int) – Array of shape (nrep, size), where nrep is the number of replicates and size = self.shape[axis]. freq is the weight that each sample contributes to the resampled values. See randsamp_freq().
shuffle (bool, default: True) – If True (default), shuffle values for each row.
rng (Generator) – Random number generator object. Defaults to output of default_rng().
- Returns:
ndarray – Indices array of shape (nrep, nsamp), where nsamp = freq[k, :].sum() and k is any row.
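The relationship between a frequency row and its indices row can be sketched in plain NumPy. The helper name `freq_to_indices_sketch` is hypothetical, and the code only illustrates the documented behaviour; it is not taken from the cmomy source.

```python
import numpy as np


def freq_to_indices_sketch(freq, shuffle=True, rng=None):
    """Hypothetical NumPy-only illustration, not cmomy's implementation."""
    freq = np.asarray(freq)
    nsamp = freq.sum(axis=1)
    if not (nsamp == nsamp[0]).all():
        raise ValueError("every row of freq must sum to the same nsamp")
    positions = np.arange(freq.shape[1])
    # Repeat each observation index by its count, e.g. [2, 0, 1] -> [0, 0, 2].
    indices = np.stack([np.repeat(positions, row) for row in freq])
    if shuffle:
        rng = np.random.default_rng() if rng is None else rng
        for row in indices:
            rng.shuffle(row)  # in-place shuffle of a single replicate
    return indices


freq = np.array([[2, 0, 1], [1, 1, 1]])
ordered = freq_to_indices_sketch(freq, shuffle=False)
# ordered rows: [0, 0, 2] and [0, 1, 2]
shuffled = freq_to_indices_sketch(freq, rng=np.random.default_rng(0))
```

With shuffle=False the output is grouped by observation; shuffling only changes the order inside each row, so both arrays describe the same resample.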
- cmomy.resample.indices_to_freq(indices, ndat=None)
Convert indices to frequency array.
It is assumed that indices.shape == (nrep, nsamp) with nsamp == ndat. For cases where nsamp != ndat, pass in ndat.
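A minimal sketch of the reverse conversion, assuming each row of indices is simply counted per observation. The name `indices_to_freq_sketch` is hypothetical and the use of `numpy.bincount` is an illustration, not necessarily what the library does.

```python
import numpy as np


def indices_to_freq_sketch(indices, ndat=None):
    """Hypothetical NumPy-only illustration, not cmomy's implementation."""
    indices = np.asarray(indices)
    nrep, nsamp = indices.shape
    # Fall back to nsamp when ndat is not given, as the text above assumes.
    ndat = nsamp if ndat is None else ndat
    if indices.min() < 0 or indices.max() >= ndat:
        raise ValueError("indices must lie in [0, ndat)")
    # Count how often each observation appears in each replicate.
    return np.stack([np.bincount(row, minlength=ndat) for row in indices])


indices = np.array([[0, 0, 2], [0, 1, 2]])
freq = indices_to_freq_sketch(indices)               # shape (2, 3)
freq_wide = indices_to_freq_sketch(indices, ndat=5)  # shape (2, 5)
```

Passing ndat matters when a replicate draws fewer (or more) samples than there are observations, because the frequency array always has one column per observation.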
- cmomy.resample.random_indices(nrep, ndat, nsamp=None, rng=None, replace=True)
Create indices for random resampling (bootstrapping).
- Parameters:
nrep (int) – Number of resample replicates.
ndat (int) – Size of data along resampled axis.
nsamp (int) – Number of samples in a single resampled replicate. Defaults to size of data along sampled axis.
rng (Generator) – Random number generator object. Defaults to output of default_rng().
replace (bool, default: True) – Whether to allow replacement.
- Returns:
indices (ndarray) – Index array of integers of shape (nrep, nsamp).
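As a rough illustration of what such an index array looks like, the following sketch draws indices with NumPy's Generator. The function name `random_indices_sketch` is hypothetical, and the sampling calls are an assumption about one reasonable way to do this, not the library's code.

```python
import numpy as np


def random_indices_sketch(nrep, ndat, nsamp=None, rng=None, replace=True):
    """Hypothetical NumPy-only illustration, not cmomy's implementation."""
    rng = np.random.default_rng() if rng is None else rng
    nsamp = ndat if nsamp is None else nsamp
    if replace:
        # Each entry is an independent draw from 0..ndat-1.
        return rng.integers(0, ndat, size=(nrep, nsamp))
    # Without replacement, each replicate is drawn separately.
    return np.stack(
        [rng.choice(ndat, size=nsamp, replace=False) for _ in range(nrep)]
    )


with_repl = random_indices_sketch(4, 10, rng=np.random.default_rng(0))
no_repl = random_indices_sketch(4, 10, rng=np.random.default_rng(0), replace=False)
```

When replace=False and nsamp equals ndat, every row is a permutation of the observation indices.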
- cmomy.resample.random_freq(nrep, ndat, nsamp=None, rng=None, replace=True)
Create frequencies for random resampling (bootstrapping).
- Parameters:
nrep (int) – Number of resample replicates.
ndat (int) – Size of data along resampled axis.
nsamp (int) – Number of samples in a single resampled replicate. Defaults to size of data along sampled axis.
rng (Generator) – Random number generator object. Defaults to output of default_rng().
replace (bool, default: True) – Whether to allow replacement.
- Returns:
freq (ndarray) – Frequency array. freq[rep, k] is the number of times to sample from the k-th observation for replicate rep.
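For sampling with replacement, the counts of nsamp uniform draws over ndat observations follow a multinomial distribution, so a frequency table can be sketched directly. The name `random_freq_sketch` is hypothetical; whether cmomy builds the table this way or by counting random indices is not stated here, so treat the approach as an assumption.

```python
import numpy as np


def random_freq_sketch(nrep, ndat, nsamp=None, rng=None):
    """Hypothetical illustration (with replacement only), not cmomy's code."""
    rng = np.random.default_rng() if rng is None else rng
    nsamp = ndat if nsamp is None else nsamp
    # Every observation is equally likely to be picked on each draw.
    pvals = np.full(ndat, 1.0 / ndat)
    # Row rep holds how many times each observation is used in replicate rep.
    return rng.multinomial(nsamp, pvals, size=nrep)


freq = random_freq_sketch(nrep=3, ndat=5, rng=np.random.default_rng(0))
```

Each row of the result sums to nsamp, which is the property the index-based functions above rely on.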
- cmomy.resample.randsamp_freq(ndat, nrep=None, nsamp=None, indices=None, freq=None, check=False, rng=None)
Produce a random sample for bootstrapping.
In order of precedence, the return value will be one of: freq, frequencies computed from indices, or a new sample from random_freq().
- Parameters:
nrep (int) – Number of resample replicates.
ndat (int) – Size of data along resampled axis.
nsamp (int) – Number of samples in a single resampled replicate. Defaults to size of data along sampled axis.
freq (array of int) – Array of shape (nrep, size), where nrep is the number of replicates and size = self.shape[axis]. freq is the weight that each sample contributes to the resampled values. See randsamp_freq().
indices (array of int) – Array of shape (nrep, size). If passed, create freq from indices. See randsamp_freq().
check (bool, default False) – If check is True, then check freq and indices against ndat and nrep.
- Returns:
freq (ndarray) – Frequency array.
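The precedence described above (use freq if given, else convert indices, else draw a new sample) can be sketched as follows. The function `randsamp_freq_sketch` is hypothetical, it omits the check option, and its random branch assumes sampling with replacement.

```python
import numpy as np


def randsamp_freq_sketch(ndat, nrep=None, nsamp=None, indices=None, freq=None, rng=None):
    """Hypothetical illustration of the precedence only, not cmomy's code."""
    if freq is not None:
        # 1. An explicit frequency table wins.
        return np.asarray(freq)
    if indices is not None:
        # 2. Otherwise count occurrences in the supplied indices.
        rows = np.asarray(indices)
        return np.stack([np.bincount(row, minlength=ndat) for row in rows])
    if nrep is None:
        raise ValueError("nrep is required when neither freq nor indices is given")
    # 3. Otherwise draw a fresh random frequency table.
    rng = np.random.default_rng() if rng is None else rng
    nsamp = ndat if nsamp is None else nsamp
    return rng.multinomial(nsamp, np.full(ndat, 1.0 / ndat), size=nrep)


given = randsamp_freq_sketch(3, freq=[[1, 1, 1]])
from_idx = randsamp_freq_sketch(3, indices=[[0, 0, 2]])
fresh = randsamp_freq_sketch(3, nrep=4, rng=np.random.default_rng(0))
```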
- cmomy.resample.resample_data(data, freq, mom, axis=0, dtype=None, order=None, parallel=True, out=None)
Resample data according to frequency table.
- Parameters:
data (array-like) – Central moments array to be resampled.
freq (array of int) – Array of shape (nrep, size), where nrep is the number of replicates and size = self.shape[axis]. freq is the weight that each sample contributes to the resampled values. See randsamp_freq().
mom (int or tuple of int) – Order of moments. If integer or length one tuple, then moments are for a single variable. If length 2 tuple, then comoments of two variables.
out (ndarray, optional) – Optional output array.
- Returns:
output (array) – output shape is (nrep,) + shape + mom, where shape is the shape of data less axis, and mom is the shape of the resulting mom.
- cmomy.resample.resample_vals(x, freq, mom, axis=0, w=None, mom_ndim=None, broadcast=False, dtype=None, order=None, parallel=True, out=None)
Resample data according to frequency table.
- Parameters:
freq (array of int) – Array of shape (nrep, size), where nrep is the number of replicates and size = self.shape[axis]. freq is the weight that each sample contributes to the resampled values. See randsamp_freq().
mom (int or tuple of int) – Order of moments. If integer or length one tuple, then moments are for a single variable. If length 2 tuple, then comoments of two variables.
axis (int) – Axis to reduce along.
w (ndarray[Any, dtype[Any]] | None, default: None) – Weights array.
mom_ndim ({1, 2}) – Value indicates if moments (mom_ndim = 1) or comoments (mom_ndim = 2).
broadcast (bool) – If True, and x=(x0, x1), then perform ‘smart’ broadcasting. In this case, if x1.ndim = 1 and len(x1) == x0.shape[axis], then broadcast x1 to x0.shape.
dtype (dtype) – Optional dtype for output data.
order (Literal['C', 'F', 'A', 'K', None], default: None) – Parameter order to numpy.asarray().
out (ndarray) – Optional output array.
- Returns:
ndarray – Resampled central moments array.
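To see why a frequency table is enough to resample values, the sketch below computes the mean and central variance of each replicate in two ways: by weighting the original values with freq, and by explicitly gathering values with an equivalent indices array. It uses only NumPy, skips the weights w and higher moments, and does not reproduce the layout of the array that resample_vals returns.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=8)
nrep, ndat = 5, x.shape[0]

# Build one resample as indices, then count them into a frequency table.
indices = rng.integers(0, ndat, size=(nrep, ndat))
freq = np.stack([np.bincount(row, minlength=ndat) for row in indices])

# Moments from the frequency table: freq acts as a per-observation weight.
nsamp = freq.sum(axis=1)
mean_from_freq = freq @ x / nsamp
var_from_freq = (freq * (x - mean_from_freq[:, None]) ** 2).sum(axis=1) / nsamp

# The same moments from explicitly gathered samples.
mean_from_indices = x[indices].mean(axis=1)
var_from_indices = x[indices].var(axis=1)
```

Both routes agree to floating-point precision, but the frequency route never materializes the (nrep, nsamp) array of gathered values.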
- cmomy.resample.bootstrap_confidence_interval(distribution, stats_val='mean', axis=0, alpha=0.05, style=None, **kwargs)
Calculate the error bounds.
- Parameters:
distribution (array-like) – Distribution of values to consider.
stats_val (array-like, {None, 'mean', 'median'}, optional) –
array: perform pivotal error bounds (correct) with this as value.
percentile: percentiles, with value as median.
mean: pivotal error bounds with mean as value.
median: pivotal error bounds with median as value.
axis (int, default 0) – Axis to analyze along.
alpha (float) – Alpha value for confidence interval. Percent confidence = 100 * (1 - alpha).
style ({None, 'delta', 'pm'}) – Controls style of output.
**kwargs (Any) – Extra arguments to numpy.percentile.
- Returns:
out (array) – The first dimension holds the statistics. Other dimensions have the shape of the input less the axis reduced over. Depending on style, the first dimension will be (note val is either stats_val or median):
None: [val, low, high]
delta: [val, val-low, high - val]
pm : [val, (high - low) / 2]
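The three output styles can be illustrated with a small NumPy sketch. It assumes the textbook pivotal (basic bootstrap) interval, [2*val - q_high, 2*val - q_low], with the mean as val; the function name `bootstrap_ci_sketch` is hypothetical, and the exact formulas used by the library are not confirmed by this page.

```python
import numpy as np


def bootstrap_ci_sketch(distribution, alpha=0.05, style=None, axis=0):
    """Hypothetical illustration using the mean as val; not cmomy's code."""
    distribution = np.asarray(distribution, dtype=float)
    val = distribution.mean(axis=axis)
    # Percentiles of the bootstrap distribution at alpha/2 and 1 - alpha/2.
    q_low, q_high = np.percentile(
        distribution, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=axis
    )
    # Pivotal bounds reflect the percentiles around val (assumed definition).
    low = 2 * val - q_high
    high = 2 * val - q_low
    if style is None:
        return np.array([val, low, high])
    if style == "delta":
        return np.array([val, val - low, high - val])
    if style == "pm":
        return np.array([val, (high - low) / 2])
    raise ValueError(f"unknown style: {style!r}")


dist = np.arange(101.0)
plain = bootstrap_ci_sketch(dist, alpha=0.1)
delta = bootstrap_ci_sketch(dist, alpha=0.1, style="delta")
pm = bootstrap_ci_sketch(dist, alpha=0.1, style="pm")
```

For this symmetric toy distribution the pivotal and percentile bounds coincide; they differ when the bootstrap distribution is skewed.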
- cmomy.resample.xbootstrap_confidence_interval(x, stats_val='mean', axis=0, dim=None, alpha=0.05, style=None, bootstrap_dim='bootstrap', bootstrap_coords=None, **kwargs)
Bootstrap xarray object.
- Parameters:
dim (str) – If passed, reduce along this dimension.
bootstrap_dim (str, default 'bootstrap') – Name of new dimension. If bootstrap_dim conflicts, then new_name = dim + new_name.
bootstrap_coords (array-like or str) – Coords of new dimension. If None, use default names. If string, use this for the ‘values’ name.