Routines to calculate confidence intervals from resampled data (cmomy.confidence_interval)

Functions:

bootstrap_confidence_interval(theta_boot[, ...])

Create the bootstrap confidence interval.

Exceptions:

InstabilityWarning([msg])

Issued when results may be unstable.

cmomy.confidence_interval.bootstrap_confidence_interval(theta_boot, theta_hat=None, theta_jack=None, *, alpha=0.05, axis=MISSING, method='bca', dim=MISSING, ci_dim='alpha', keep_attrs=None, apply_ufunc_kwargs=None)

Create the bootstrap confidence interval.

The general idea is to analyze some statistic \(\theta\) that is a function of the central moments.

Parameters:
  • theta_boot (ndarray) – Bootstrap resampled values of \(\theta\).

  • theta_hat (ndarray) – Value of \(\theta\) from the original data set. Needed for methods 'basic' and 'bca'. Note that this array should have the same shape as theta_boot, with axis either removed or of size 1.

  • theta_jack (ndarray) – Jackknife resampled data. Needed for method 'bca'. Note that this array should have the same shape as theta_boot except along axis.

  • alpha (float or iterable of float) – The quantiles to use for the confidence interval. If alpha is a float, use (alpha/2, 1-alpha/2) for the confidence interval (e.g., pass alpha=0.05 for the two-sided 95% confidence interval). If alpha is an iterable, use these values directly as the quantiles.

  • axis (int) – Axis to reduce/sample along.

  • method ({'percentile', 'basic', 'bca'}, default 'bca') – Whether to return the percentile bootstrap confidence interval ('percentile'), the basic (AKA reverse) bootstrap confidence interval ('basic'), or the bias-corrected and accelerated bootstrap confidence interval ('bca').

  • dim (hashable) – Dimension to reduce/sample along.

  • ci_dim (str, default "alpha") – Name of confidence level dimension of DataArray output.

  • keep_attrs ({"drop", "identical", "no_conflicts", "drop_conflicts", "override"} or bool, optional) –

    • 'drop' or False: empty attrs on returned xarray object.

    • 'identical': all attrs must be the same on every object.

    • 'no_conflicts': attrs from all objects are combined, any that have the same name must also have the same value.

    • 'drop_conflicts': attrs from all objects are combined, any that have the same name but different values are dropped.

    • 'override' or True: skip comparing and copy attrs from the first object to the result.

Returns:

confidence_interval (ndarray) – Array of confidence intervals. confidence_interval[i, ...] corresponds to alphas[i], the i-th quantile. That is, confidence_interval.shape = (nalpha, shape[0], ..., shape[axis-1], shape[axis+1], ...), where shape = theta_boot.shape.
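
The quantile convention and the output shape described above can be illustrated with plain NumPy. This is only a sketch: alpha_to_quantiles is a hypothetical helper that is not part of cmomy, and np.quantile stands in for the percentile method, so the library's internals may differ.

```python
from numbers import Real

import numpy as np


def alpha_to_quantiles(alpha):
    """Hypothetical helper: map alpha to the quantile levels described above."""
    if isinstance(alpha, Real):
        # Scalar alpha: two-sided interval (alpha/2, 1 - alpha/2).
        return np.array([alpha / 2, 1 - alpha / 2])
    # Iterable alpha: use the values as given.
    return np.asarray(list(alpha), dtype=float)


rng = np.random.default_rng(0)
theta_boot = rng.normal(size=(50, 3, 4))  # 50 replicates along axis 0

q = alpha_to_quantiles(0.05)
ci = np.quantile(theta_boot, q, axis=0)

print(q)         # quantile levels for the two-sided 95% interval
print(ci.shape)  # quantile axis first, then the axes of theta_boot other than axis 0
```

Here the reduced axis (axis=0) disappears and a new leading axis of length nalpha=2 takes its place, which matches the shape rule stated in the Returns entry.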

See also

reduce_data

Create theta_hat from moments data

resample_data

Create theta_boot from moments data

jackknife_data

Create theta_jack from moments data

reduce_vals

Create theta_hat from values

resample_vals

Create theta_boot from values

jackknife_vals

Create theta_jack from values

scipy.stats.bootstrap

Scipy analog function

Examples

Calculate the bootstrap statistics of the log of the mean.

>>> import numpy as np
>>> import cmomy
>>> from cmomy.confidence_interval import bootstrap_confidence_interval
>>> x = cmomy.default_rng(0).random((20))
>>> sampler = cmomy.resample.factory_sampler(nrep=50, ndat=20, rng=0)
>>> theta_boot = np.log(
...     cmomy.resample_vals(x, mom=1, axis=0, sampler=sampler)[..., 1]
... )
>>> bootstrap_confidence_interval(
...     theta_boot=theta_boot, axis=0, method="percentile"
... )
array([-1.0016, -0.4722])
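
For intuition, the percentile interval can be approximated by hand with plain NumPy. The sketch below draws resampling indices directly instead of using cmomy's sampler, so its numbers are not expected to match the output above. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(20)
nrep, ndat = 50, x.size

# Each row of idx selects one bootstrap resample (drawn with replacement).
idx = rng.integers(0, ndat, size=(nrep, ndat))

# Statistic of interest: log of the mean, evaluated once per replicate.
theta_boot = np.log(x[idx].mean(axis=1))

# Percentile interval: read the quantiles straight off the bootstrap values.
alpha = 0.05
ci_percentile = np.quantile(theta_boot, [alpha / 2, 1 - alpha / 2])
print(ci_percentile)
```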

To use the basic method, you must also pass in theta_hat.

>>> theta_hat = np.log(cmomy.reduce_vals(x, mom=1, axis=0)[..., 1])
>>> bootstrap_confidence_interval(
...     theta_boot=theta_boot, theta_hat=theta_hat, axis=0, method="basic"
... )
array([-0.8653, -0.3359])
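
In the textbook definition, the basic (reverse) interval reflects the percentile endpoints about theta_hat: (2*theta_hat - q_hi, 2*theta_hat - q_lo). The sketch below checks that the two intervals printed above are consistent with that relation. The helper basic_from_percentile is hypothetical, and whether cmomy computes the interval this way internally is an assumption.

```python
import numpy as np

# Intervals printed in the examples above (rounded to four decimals).
ci_percentile = np.array([-1.0016, -0.4722])
ci_basic = np.array([-0.8653, -0.3359])


def basic_from_percentile(ci_pct, theta_hat):
    """Reflect the percentile endpoints about theta_hat (textbook definition)."""
    lo, hi = ci_pct
    return np.array([2 * theta_hat - hi, 2 * theta_hat - lo])


# If the reflection relation holds, both endpoint pairs imply the same theta_hat.
theta_hat_implied = (ci_basic + ci_percentile[::-1]) / 2
print(theta_hat_implied)

# Rebuilding the basic interval from the percentile one should recover ci_basic.
print(basic_from_percentile(ci_percentile, theta_hat_implied[0]))
```

Both entries of theta_hat_implied agree to the printed precision, so the basic interval above is the percentile interval mirrored about a single value of theta_hat.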

To use the bca method, you also need jackknife resampled data.

>>> theta_jack = np.log(cmomy.resample.jackknife_vals(x, mom=1, axis=0)[..., 1])
>>> bootstrap_confidence_interval(
...     theta_boot=theta_boot,
...     theta_hat=theta_hat,
...     theta_jack=theta_jack,
...     axis=0,
...     method="bca",
... )
array([-0.986 , -0.4517])
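
To show why the bca method needs both theta_hat and theta_jack, here is a minimal sketch of the textbook BCa construction for one-dimensional inputs. The function bca_interval_sketch is hypothetical and is not cmomy's implementation. It uses NumPy plus statistics.NormalDist from the standard library, and its resampling differs from cmomy's sampler, so the numbers are not expected to match the output above.

```python
from statistics import NormalDist

import numpy as np


def bca_interval_sketch(theta_boot, theta_hat, theta_jack, alpha=0.05):
    """Textbook BCa interval for 1D inputs (illustrative only)."""
    theta_boot = np.asarray(theta_boot, dtype=float)
    theta_jack = np.asarray(theta_jack, dtype=float)
    norm = NormalDist()

    # Bias correction: normal score of the fraction of replicates below theta_hat.
    # inv_cdf raises StatisticsError if that fraction is exactly 0 or 1.
    z0 = norm.inv_cdf(float(np.mean(theta_boot < theta_hat)))

    # Acceleration: skewness-like term built from the jackknife values.
    d = theta_jack.mean() - theta_jack
    accel = np.sum(d**3) / (6.0 * np.sum(d**2) ** 1.5)

    # Shift the nominal quantile levels, then read them off the bootstrap values.
    levels = []
    for q in (alpha / 2, 1 - alpha / 2):
        z = z0 + norm.inv_cdf(q)
        levels.append(norm.cdf(z0 + z / (1 - accel * z)))
    return np.quantile(theta_boot, levels)


rng = np.random.default_rng(0)
x = rng.random(20)
idx = rng.integers(0, x.size, size=(200, x.size))

theta_boot = np.log(x[idx].mean(axis=1))  # bootstrap replicates
theta_hat = np.log(x.mean())              # estimate from the original data
# Leave-one-out means, (sum - x_i) / (n - 1), give the jackknife values.
theta_jack = np.log((x.sum() - x) / (x.size - 1))

ci_bca = bca_interval_sketch(theta_boot, theta_hat, theta_jack)
print(ci_bca)
```

When the bias correction and the acceleration are both zero, the adjusted levels reduce to (alpha/2, 1-alpha/2) and the result coincides with the percentile interval.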

These results match those from scipy.stats.bootstrap(), but the calculation here should be faster.

>>> from scipy.stats import bootstrap
>>> out = bootstrap(
...     [x],
...     lambda x, axis=None: np.log(np.mean(x, axis=axis)),
...     n_resamples=50,
...     axis=0,
...     random_state=np.random.default_rng(0),
...     method="bca",
... )
>>> np.array((out.confidence_interval.low, out.confidence_interval.high))
array([-0.986 , -0.4517])

Moreover, you can use pre-averaged data.

exception cmomy.confidence_interval.InstabilityWarning(msg=None)

Bases: UserWarning

Issued when results may be unstable.

Methods:

add_note

Exception.add_note(note) -- add a note to the exception

with_traceback

Exception.with_traceback(tb) -- set self.__traceback__ to tb and return self.
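
Because InstabilityWarning derives from UserWarning, the standard warnings module can record it or promote it to an error. The sketch below uses a stand-in class and a hypothetical function that emits it, so that it runs without cmomy installed. In practice you would import the class from cmomy.confidence_interval. The conditions under which the library issues the warning are not specified here.

```python
import warnings


class InstabilityWarning(UserWarning):
    """Stand-in for cmomy.confidence_interval.InstabilityWarning (assumption)."""


def unstable_calculation():
    # Hypothetical function that emits the warning, for demonstration only.
    warnings.warn("results may be unstable", InstabilityWarning, stacklevel=2)
    return 0.0


# Option 1: record the warning so it can be inspected programmatically.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    unstable_calculation()
n_instability = sum(issubclass(w.category, InstabilityWarning) for w in caught)

# Option 2: promote it to an error so unstable results cannot pass silently.
with warnings.catch_warnings():
    warnings.simplefilter("error", category=InstabilityWarning)
    try:
        unstable_calculation()
        raised = False
    except InstabilityWarning:
        raised = True

print(n_instability, raised)
```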
