ParticlePDF class
- class optbayesexpt.particlepdf.ParticlePDF(prior, a_param=0.98, resample_threshold=0.5, auto_resample=True, scale=True, use_jit=True)
Bases: object
A probability distribution function.
A probability distribution \(P(\theta_0, \theta_1, \ldots, \theta_{n\_dims - 1})\) over parameter variables \(\theta_i\) is represented by a large number of samples from the distribution, each with a weight value. The distribution can be visualized as a cloud of particles in parameter space, with each particle corresponding to a weighted random draw from the distribution. The methods implemented here largely follow the algorithms published in Christopher E Granade et al. 2012 New J. Phys. 14 103013.
Warning
The number of samples (i.e. particles) required for good performance depends on the application. Too many samples slow down the calculations, but too few can produce incorrect results: the probability distribution can become overly narrow and may exclude the “true” parameter values. See the resample() method documentation for details.
- Arguments:
prior (2D array-like) – The Bayesian prior, which initializes the ParticlePDF distribution. Each of n_dims sub-arrays contains n_particles values of a single parameter, so that the j-th elements of the sub-arrays determine the coordinates of a point in parameter space. Users are encouraged to experiment with different n_particles sizes to ensure consistent results.
- Keyword Arguments:
a_param (float) – In resampling, determines the scale of random diffusion relative to the distribution covariance. After weighted sampling, some parameter values may have been chosen multiple times. To make the new distribution smoother, the parameters are given small ‘nudges’: random displacements much smaller than the overall parameter distribution, but with the same shape. More precisely, the covariance of the nudge distribution is (1 - a_param ** 2) times the covariance of the parameter distribution. Default 0.98.
scale (bool) – Determines whether resampling includes a contraction of the parameter distribution toward the distribution mean. The contraction compensates for the overall expansion of the distribution that is a by-product of the random displacements. If True, parameter samples (particles) move a fraction a_param of the distance to the distribution mean. Default is True, but False is recommended.
resample_threshold (float) – Sets a threshold for automatic resampling. Resampling is triggered when the effective fraction of particles, \(1 / (N\sum_i^N w_i^2)\), is smaller than resample_threshold. Default 0.5.
auto_resample (bool) – Determines whether threshold testing and resampling are performed when bayesian_update() is called. Default True.
use_jit (bool) – Allows precompilation of some methods for a modest increase in speed. Only effective on systems where numba is installed. Default True.
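The prior layout described under Arguments can be assembled with NumPy. The sketch below is illustrative only: the two parameters (a center and a width), their distributions, and the particle count are assumptions chosen for the example, not recommendations.

```python
import numpy as np

rng = np.random.default_rng(0)
n_particles = 5000  # assumed value; tune for the application

# Hypothetical two-parameter model: one sub-array of draws per parameter.
center_draws = rng.normal(loc=2.0, scale=0.5, size=n_particles)
width_draws = rng.uniform(low=0.1, high=1.0, size=n_particles)

# Shape (n_dims, n_particles): column j holds the coordinates of particle j.
prior = np.vstack((center_draws, width_draws))
n_dims = prior.shape[0]
```

An array built this way has the shape expected for the prior argument, so it could be passed as ParticlePDF(prior) together with any of the keyword arguments listed above.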
Attributes:
- bayesian_update(likelihood)
Performs a Bayesian update on the probability distribution.
Multiplies particle_weights by the likelihood and renormalizes the probability distribution. After the update, the distribution is tested for resampling, depending on self.tuning_parameters['auto_resample'].
- Parameters:
likelihood (ndarray) – An array of length n_particles describing the Bayesian likelihood of a measurement result calculated for each parameter combination.
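As a rough illustration of the update described above, the following NumPy sketch multiplies the weights by a likelihood and renormalizes. It is not the library's implementation; the Gaussian measurement model, the values of y_measured and sigma, and the variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n_particles = 1000

# One-parameter distribution: particles has shape (n_dims, n_particles).
particles = rng.uniform(0.0, 10.0, size=(1, n_particles))
particle_weights = np.full(n_particles, 1.0 / n_particles)

# Hypothetical measurement: the model predicts y = theta_0, and the
# observed value carries Gaussian noise with standard deviation sigma.
y_measured, sigma = 4.0, 0.5
likelihood = np.exp(-0.5 * ((y_measured - particles[0]) / sigma) ** 2)

# Bayesian update: multiply by the likelihood, then renormalize.
particle_weights = particle_weights * likelihood
particle_weights /= particle_weights.sum()

# The weighted mean should now sit near the measured value.
posterior_mean = np.sum(particle_weights * particles[0])
```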
- just_resampled
A flag set by the resample_test() function. True if the last bayesian_update() resulted in resampling, else False.
- Type:
bool
- mean()
Calculates the mean of the probability distribution.
Returns the weighted mean of the parameter distribution. See also std() and covariance().
- Returns:
An array of size n_dims.
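A weighted mean over particles can be sketched in a few lines of NumPy. The three-particle arrays below are made-up values for illustration, and the code shows the calculation only, not the library's internals.

```python
import numpy as np

# Three particles in a two-parameter space: shape (n_dims, n_particles).
particles = np.array([[1.0, 2.0, 3.0],
                      [10.0, 20.0, 30.0]])
particle_weights = np.array([0.2, 0.3, 0.5])  # normalized to sum to 1

# Averaging along the particle axis gives one value per parameter.
weighted_mean = np.average(particles, axis=1, weights=particle_weights)
```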
- n_dims
The number of parameters, i.e. the dimensionality of parameter space. Determined from the leading dimension of prior.
- Type:
int
- n_particles
The number of parameter samples representing the probability distribution. Determined from the trailing dimension of prior.
- Type:
int
- particle_weights
Array of probability weights corresponding to the particles.
- Type:
ndarray of float64
- particles
Together with particle_weights, these n_particles points represent the parameter probability distribution. Initialized by the prior argument.
- Type:
n_dims x n_particles ndarray of float64
- randdraw(n_draws=1)
Provides random parameter draws from the distribution.
Particles are selected randomly with probabilities given by self.particle_weights.
- Parameters:
n_draws (int) – The number of draws requested. Default 1.
- Returns:
An n_dims x n_draws ndarray of parameter draws.
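Weighted selection of particles can be sketched with NumPy's random generator. The arrays are made-up values, and the snippet illustrates the idea of a weighted draw rather than reproducing the library's code.

```python
import numpy as np

rng = np.random.default_rng(2)

particles = np.array([[1.0, 2.0, 3.0],
                      [10.0, 20.0, 30.0]])
particle_weights = np.array([0.2, 0.3, 0.5])

n_draws = 4
# Pick particle indices with probability equal to their weights,
# then gather the matching columns.
indices = rng.choice(particles.shape[1], size=n_draws, p=particle_weights)
draws = particles[:, indices]  # shape (n_dims, n_draws)
```

Each column of draws is one of the existing particles, so the coordinates of a draw always stay paired.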
- resample()
Performs a resampling of the distribution.
Resampling refreshes the random draws that represent the probability distribution. As Bayesian updates are made, the weights of low-probability particles can become very small. These particles consume memory and computation time, and they contribute little to values that are determined from the distribution. Resampling abandons some low-probability particles while allowing high-probability particles to multiply in higher-probability regions.
Sample impoverishment can occur if there are too few particles. In this phenomenon, a resampling step fails to sample particles from an important but low-probability region, effectively removing that region from future consideration. The symptoms of sample impoverishment include:
- Inconsistent results from repeated runs. Standard deviations from individual final distributions will be too small to explain the spread of individual mean values.
- Sudden changes in the standard deviations or other measures of the distribution on resampling. Resampling is not supposed to change the distribution, just refresh its representation.
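The resampling idea can be sketched as weighted sampling followed by small random nudges whose covariance is (1 - a_param ** 2) times the distribution covariance, as described for the a_param keyword argument. The function below is a hypothetical illustration, not the library's implementation. In particular, the optional contraction is written in the Liu-West form, where each particle keeps a fraction a_param of its offset from the mean; that form is an assumption made here because it leaves the overall covariance unchanged.

```python
import numpy as np

def resample_sketch(particles, weights, a_param=0.98, scale=True, rng=None):
    """Hypothetical resampling sketch; returns new particles and weights."""
    rng = np.random.default_rng() if rng is None else rng
    n_dims, n_particles = particles.shape

    # Weighted mean and covariance of the current distribution.
    mean = np.average(particles, axis=1, weights=weights)
    cov = np.atleast_2d(np.cov(particles, aweights=weights, ddof=0))

    # Weighted sampling with replacement: likely particles get duplicated.
    indices = rng.choice(n_particles, size=n_particles, p=weights)
    new_particles = particles[:, indices]

    if scale:
        # Assumed Liu-West contraction toward the mean.
        new_particles = a_param * new_particles + (1 - a_param) * mean[:, None]

    # Nudges shaped like the distribution but much narrower.
    nudges = rng.multivariate_normal(
        np.zeros(n_dims), (1 - a_param ** 2) * cov, size=n_particles
    ).T
    new_particles = new_particles + nudges

    # After resampling, every particle carries the same weight.
    new_weights = np.full(n_particles, 1.0 / n_particles)
    return new_particles, new_weights
```

Comparing the mean and standard deviation before and after a call like this is one way to watch for the sudden changes listed above as a symptom of sample impoverishment.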
- resample_test()
Tests the distribution and performs a resampling if required.
If the effective number of particles falls below self.tuning_parameters['resample_threshold'] * n_particles, performs a resampling. Sets the just_resampled flag.
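The threshold test can be illustrated with the effective-fraction formula \(1 / (N\sum_i^N w_i^2)\) given for resample_threshold. The helper name effective_fraction and the example weights are assumptions made for this sketch.

```python
import numpy as np

def effective_fraction(weights):
    """Return 1 / (N * sum(w_i ** 2)) for normalized weights."""
    weights = np.asarray(weights, dtype=float)
    return 1.0 / (weights.size * np.sum(weights ** 2))

resample_threshold = 0.5

# Uniform weights: every particle contributes equally.
uniform = np.full(4, 0.25)
# Skewed weights: one particle dominates after several updates.
skewed = np.array([0.97, 0.01, 0.01, 0.01])

needs_resampling_uniform = effective_fraction(uniform) < resample_threshold
needs_resampling_skewed = effective_fraction(skewed) < resample_threshold
```

With these values the uniform weights give an effective fraction of 1 and pass the test, while the skewed weights give roughly 0.27 and would trigger resampling at the default threshold.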
- set_pdf(samples, weights=None)
Re-initializes the probability distribution.
Also resets n_particles and n_dims, deduced from the dimensions of samples.
- Parameters:
samples (array-like) – A representation of the new distribution comprising n_dims sub-arrays of n_particles samples of each parameter.
weights (ndarray) – If None, weights will be assigned uniform probability. Otherwise, an array of length n_particles.
- std()
Calculates the standard deviation of the distribution.
Calculates the square root of the diagonal elements of the covariance matrix. See also covariance() and mean().
- Returns:
The standard deviation as an n_dims array.
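The relation between the covariance matrix and the standard deviation can be sketched as follows. The arrays are made-up values, and the covariance here is the plain weighted second moment about the mean; whether the library applies a bias correction is not stated above, so treat that choice as an assumption.

```python
import numpy as np

particles = np.array([[1.0, 2.0, 3.0],
                      [10.0, 20.0, 30.0]])
particle_weights = np.array([0.2, 0.3, 0.5])  # normalized to sum to 1

# Weighted mean, then deviations of each particle from it.
mean = np.average(particles, axis=1, weights=particle_weights)
deviations = particles - mean[:, np.newaxis]

# Weighted covariance matrix, shape (n_dims, n_dims).
covariance = (particle_weights * deviations) @ deviations.T

# Standard deviations are the square roots of the diagonal elements.
std = np.sqrt(np.diag(covariance))
```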
- tuning_parameters
A package of parameters affecting the resampling algorithm.
'a_param' (float): Initially, the value of the a_param keyword argument. Default 0.98.
'scale' (bool): Initially, the value of the scale keyword argument. Default True.
'resample_threshold' (float): Initially, the value of the resample_threshold keyword argument. Default 0.5.
'auto_resample' (bool): Initially, the value of the auto_resample keyword argument. Default True.
- Type:
dict