ParticlePDF class

class optbayesexpt.particlepdf.ParticlePDF(prior, a_param=0.98, resample_threshold=0.5, auto_resample=True, scale=True, use_jit=True)[source]

Bases: object

A probability distribution function.

A probability distribution \(P(\theta_0, \theta_1, \ldots, \theta_{n\_dims - 1})\) over parameter variables \(\theta_i\) is represented by a large number of samples from the distribution, each with a weight value. The distribution can be visualized as a cloud of particles in parameter space, with each particle corresponding to a weighted random draw from the distribution. The methods implemented here largely follow the algorithms published in Christopher E. Granade et al., 2012, New J. Phys. 14, 103013.

Warning

The number of samples (i.e. particles) required for good performance will depend on the application. Too many samples will slow down the calculations, but too few samples can produce incorrect results. With too few samples, the probability distribution can become overly narrow, and it may not include the “true” parameter values. See the resample() method documentation for details.

Arguments:

prior (2D array-like) – The Bayesian prior, which initializes the ParticlePDF distribution. Each of n_dims sub-arrays contains n_particles values of a single parameter, so that the j-th elements of the sub-arrays determine the coordinates of a point in parameter space. Users are encouraged to experiment with different n_particles sizes to ensure consistent results.
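As a hedged illustration of the layout described above, the following sketch builds a prior for two hypothetical parameters with NumPy. The parameter names, ranges, and particle count are assumptions chosen for the example, not values recommended by the package.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n_particles = 5000  # assumed size; tune for your application

# Hypothetical parameters: a uniformly distributed offset and a
# normally distributed amplitude.
offset = rng.uniform(-1.0, 1.0, n_particles)
amplitude = rng.normal(10.0, 2.0, n_particles)

# One sub-array per parameter: shape (n_dims, n_particles).
prior = np.array([offset, amplitude])

# Column j holds the coordinates of particle j in parameter space.
first_particle = prior[:, 0]
```

An array built this way has n_dims as its leading dimension and n_particles as its trailing dimension, which matches the attribute descriptions further down.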

Keyword Arguments:
  • a_param (float) – In resampling, determines the scale of random diffusion relative to the distribution covariance. After weighted sampling, some parameter values may have been chosen multiple times. To make the new distribution smoother, the parameters are given small ‘nudges’: random displacements much smaller than the overall parameter distribution, but with the same shape as the overall distribution. More precisely, the covariance of the nudge distribution is (1 - a_param ** 2) times the covariance of the parameter distribution. Default 0.98.

  • scale (bool) – Determines whether resampling includes a contraction of the parameter distribution toward the distribution mean. The contraction compensates for the overall expansion of the distribution that is a by-product of the random displacements. If True, each particle's distance from the distribution mean is scaled by a_param, so that the particle moves a fraction (1 - a_param) of the way toward the mean. Default is True, but False is recommended.

  • resample_threshold (float) – Sets a threshold for automatic resampling. Resampling is triggered when the effective fraction of particles, \(1 / (N \sum_{i=1}^{N} w_i^2)\), is smaller than resample_threshold. Here \(N\) is n_particles and \(w_i\) are the particle weights. Default 0.5.

  • auto_resample (bool) – Determines whether threshold testing and resampling are performed when bayesian_update() is called. Default True.

  • use_jit (bool) – Allows precompilation of some methods for a modest increase in speed. Only effective on systems where numba is installed. Default True.

Attributes:

bayesian_update(likelihood)[source]

Performs a Bayesian update on the probability distribution.

Multiplies particle_weights by the likelihood and renormalizes the probability distribution. After the update, the distribution is tested for resampling depending on self.tuning_parameters['auto_resample'].

Parameters:

likelihood (ndarray) – An array of size n_particles giving the Bayesian likelihood of a measurement result, calculated for each particle (i.e. each parameter combination).
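The multiply-and-renormalize step can be sketched in plain NumPy as follows. The helper name bayes_update is hypothetical and the zero-likelihood check is an added assumption. This is an illustration of the arithmetic, not the package's implementation.

```python
import numpy as np

def bayes_update(weights, likelihood):
    """Return renormalized posterior weights (illustrative helper)."""
    posterior = weights * likelihood
    total = posterior.sum()
    if total <= 0:
        # Assumed guard: no particle is compatible with the measurement.
        raise ValueError("likelihood is zero for every particle")
    return posterior / total

weights = np.full(4, 0.25)                    # uniform prior weights
likelihood = np.array([0.1, 0.2, 0.3, 0.4])   # one value per particle
new_weights = bayes_update(weights, likelihood)
```

With uniform prior weights, the posterior weights are simply the likelihood values rescaled to sum to 1.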

covariance()[source]

Calculates the covariance of the probability distribution.

Returns:

The covariance of the parameter distribution as an n_dims x n_dims array. See also mean() and std().
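For readers who want to see how weighted statistics of a particle cloud can be computed, here is a minimal NumPy sketch. The variable names are illustrative, and the choice of ddof=0 is an assumption of this example. The package may normalize differently.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
particles = rng.normal(size=(2, 1000))   # shape (n_dims, n_particles)
weights = rng.random(1000)
weights /= weights.sum()                 # weights must sum to 1

# Weighted mean: one value per parameter.
mean = np.average(particles, axis=1, weights=weights)

# Weighted covariance; rows are variables, columns are observations.
# ddof=0 (assumed here) gives the plain weighted second central moment.
cov = np.cov(particles, aweights=weights, ddof=0)

# Standard deviations are the square roots of the diagonal elements.
std = np.sqrt(np.diag(cov))
```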

just_resampled

A flag set by the resample_test() function. True if the last bayesian_update() resulted in resampling, else False.

Type:

bool

mean()[source]

Calculates the mean of the probability distribution.

The weighted mean of the parameter distribution. See also std() and covariance().

Returns:

Size n_dims array.

n_dims

The number of parameters, i.e. the dimensionality of parameter space. Determined from the leading dimension of prior.

Type:

int

n_particles

The number of parameter samples representing the probability distribution. Determined from the trailing dimension of prior.

Type:

int

particle_weights

Array of probability weights corresponding to the particles.

Type:

ndarray of float64

particles

Together with particle_weights, these n_particles points represent the parameter probability distribution. Initialized by the prior argument.

Type:

n_dims x n_particles ndarray of float64

randdraw(n_draws=1)[source]

Provides random parameter draws from the distribution.

Particles are selected randomly with probabilities given by self.particle_weights.

Parameters:

n_draws (int) – the number of draws requested. Default 1.

Returns:

An n_dims x n_draws ndarray of parameter draws.
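A weighted draw of this kind can be sketched with NumPy's Generator.choice. The tiny particle array and the degenerate weights below are made up so that the outcome is easy to check. This is an illustration, not the package's code.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
particles = np.array([[0.0, 1.0, 2.0],
                      [10.0, 11.0, 12.0]])   # (n_dims, n_particles)
weights = np.array([0.0, 0.0, 1.0])          # all weight on particle 2

n_draws = 5
# Pick particle indices with probability equal to their weights.
indices = rng.choice(particles.shape[1], size=n_draws, p=weights)
draws = particles[:, indices]                # shape (n_dims, n_draws)
```

Because all the weight sits on the last particle, every column of draws equals that particle's coordinates.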

resample()[source]

Performs a resampling of the distribution.

Resampling refreshes the random draws that represent the probability distribution. As Bayesian updates are made, the weights of low-probability particles can become very small. These particles consume memory and computation time, and they contribute little to values that are determined from the distribution. Resampling abandons some low-probability particles while allowing high-probability particles to multiply in higher-probability regions.

Sample impoverishment can occur if there are too few particles. In this phenomenon, a resampling step fails to sample particles from an important, but low-probability region, effectively removing that region from future consideration. The symptoms of this sample impoverishment phenomenon include:

  • Inconsistent results from repeated runs. Standard deviations from individual final distributions will be too small to explain the spread of individual mean values.

  • Sudden changes in the standard deviations or other measures of the distribution on resampling. The resampling is not supposed to change the distribution, just refresh its representation.
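The resampling steps described in this documentation (weighted sampling, small nudges, optional contraction) can be sketched as below. The function resample_sketch is hypothetical. It assumes that the contraction takes the form a_param * x + (1 - a_param) * mean, which keeps the overall covariance approximately unchanged when combined with nudges of covariance (1 - a_param ** 2) times the distribution covariance. The package's actual code may differ in detail.

```python
import numpy as np

def resample_sketch(particles, weights, a_param=0.98, scale=True, rng=None):
    """Illustrative resampling step; not the package's code."""
    rng = np.random.default_rng() if rng is None else rng
    n_dims, n_particles = particles.shape

    # 1. Weighted sampling with replacement.
    idx = rng.choice(n_particles, size=n_particles, p=weights)
    chosen = particles[:, idx]

    # 2. Nudges with covariance (1 - a_param**2) * covariance.
    cov = np.atleast_2d(np.cov(chosen))
    nudges = rng.multivariate_normal(
        np.zeros(n_dims), (1 - a_param**2) * cov, size=n_particles).T

    # 3. Optional contraction toward the mean to offset the spreading
    #    (assumed form, see the text above).
    if scale:
        center = chosen.mean(axis=1, keepdims=True)
        chosen = a_param * chosen + (1 - a_param) * center

    # 4. All particles are equally weighted after resampling.
    new_weights = np.full(n_particles, 1.0 / n_particles)
    return chosen + nudges, new_weights

rng = np.random.default_rng(seed=3)
particles = rng.normal(size=(2, 2000))
weights = np.full(2000, 1 / 2000)
new_particles, new_weights = resample_sketch(particles, weights, rng=rng)
```

Comparing np.cov(particles) with np.cov(new_particles) before and after such a step is one simple way to watch for the sudden changes listed among the symptoms above.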

resample_test()[source]

Tests the distribution and performs a resampling if required.

If the effective number of particles falls below self.tuning_parameters['resample_threshold'] * n_particles, performs a resampling. Sets the just_resampled flag.
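The threshold test can be illustrated with a few lines of NumPy. The helper effective_fraction is a hypothetical name, and the weight arrays are made-up examples. The formula is the effective fraction \(1 / (N\sum_i w_i^2)\) given for resample_threshold.

```python
import numpy as np

def effective_fraction(weights):
    """Return 1 / (N * sum(w_i**2)) for normalized weights."""
    weights = np.asarray(weights, dtype=float)
    return 1.0 / (weights.size * np.sum(weights**2))

resample_threshold = 0.5

uniform = np.full(100, 0.01)         # every particle contributes equally
skewed = np.zeros(100)
skewed[:10] = 0.1                    # only 10 particles carry weight

needs_resample_uniform = effective_fraction(uniform) < resample_threshold
needs_resample_skewed = effective_fraction(skewed) < resample_threshold
```

Uniform weights give an effective fraction of 1, so no resampling is triggered. When only 10 of 100 particles carry weight, the fraction drops to 0.1 and the test would request a resampling.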

set_pdf(samples, weights=None)[source]

Re-initializes the probability distribution.

Also resets n_particles and n_dims deduced from the dimensions of samples.

Parameters:
  • samples (array-like) – A representation of the new distribution comprising n_dims sub-arrays of n_particles samples of each parameter.

  • weights (ndarray) – If None, all particles are assigned equal weights. Otherwise, an array of length n_particles.
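The bookkeeping described here, deducing n_dims and n_particles from samples and defaulting to uniform weights, might look roughly like the following sketch. The function init_pdf_sketch and its length check are hypothetical and are not taken from the package.

```python
import numpy as np

def init_pdf_sketch(samples, weights=None):
    """Return (particles, weights, n_dims, n_particles); illustrative only."""
    particles = np.atleast_2d(np.asarray(samples, dtype=float))
    n_dims, n_particles = particles.shape
    if weights is None:
        # Uniform probability: every particle gets weight 1 / n_particles.
        weights = np.full(n_particles, 1.0 / n_particles)
    else:
        weights = np.asarray(weights, dtype=float)
        if weights.shape != (n_particles,):
            raise ValueError("weights must have length n_particles")
    return particles, weights, n_dims, n_particles

samples = [[0.0, 1.0, 2.0, 3.0],
           [5.0, 6.0, 7.0, 8.0]]
particles, weights, n_dims, n_particles = init_pdf_sketch(samples)
```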

std()[source]

Calculates the standard deviation of the distribution.

Calculates the square root of the diagonal elements of the covariance matrix. See also covariance() and mean().

Returns:

The standard deviation as an n_dims array.

tuning_parameters

A package of parameters affecting the resampling algorithm.

  • 'a_param' (float): Initially, the value of the a_param keyword argument. Default 0.98.

  • 'scale' (bool): Initially, the value of the scale keyword argument. Default True.

  • 'resample_threshold' (float): Initially, the value of the resample_threshold keyword argument. Default 0.5.

  • 'auto_resample' (bool): Initially, the value of the auto_resample keyword argument. Default True.

Type:

dict