OptBayesExpt class¶
- class optbayesexpt.obe_base.OptBayesExpt(measurement_model, setting_values, parameter_samples, constants, n_draws=30, choke=None, use_jit=True, utility_method='variance_approx', selection_method='optimal', pickiness=15, default_noise_std=1.0, **kwargs)[source]¶
Bases:
ParticlePDF
An implementation of sequential Bayesian experiment design.
OptBayesExpt is a manager that calculates strategies for efficient measurement runs. OptBayesExpt incorporates measurement data, and uses that information to select settings for measurements with high predicted benefit / cost ratios.
The use cases are situations where the goal is to find the parameters of a parametric model.
The primary functions of this class are to interpret measurement data and to calculate effective settings. The corresponding methods that perform these functions are
OptBayesExpt.pdf_update()
for interpretation of new data and either OptBayesExpt.opt_setting()
or OptBayesExpt.good_setting()
for calculation of effective settings. Instances of OptBayesExpt may be used for cases where
Reported measurement data includes measurement uncertainty,
Every measurement is assumed to cost the same amount.
The measurement noise is assumed to be constant, independent of parameters and settings.
OptBayesExpt may be inherited by child classes to allow additional flexibility. Examples in the
demos
folder show several extensions, including unknown noise and setting-dependent costs.- Arguments:
measurement_model (
function
) – Evaluates the experimental model from (settings
,parameters
,constants
) arguments, returning single values or arrays depending on the arguments. The model_function
is very similar to the fit function in a least-squares regression. The model_function()
must allow evaluation in both of the following forms: model_function(tuple_of_single_settings, tuple_of_parameter_arrays, tuple_of_constants)
, returning an array with the same size as one of the parameter arrays. model_function(tuple_of_setting_arrays, tuple_of_single_parameters, tuple_of_constants)
, returning an array with the same size as one of the setting arrays.
The broadcasting feature of numpy arrays provides a convenient way to write this type of function for simple analytical models.
Version 1.1.0 and later support model functions that return multiple output channels, e.g. real and imaginary parts or vectors, expressed as tuples, lists or arrays. The number of output channels,
n_channels
is deduced by evaluating the measurement model function. setting_values (
tuple
of ndarray
) – Each array in the setting_values
tuple contains the allowed discrete values of a measurement setting. Applied voltage, excitation frequency, and a knob that goes to eleven are all examples of settings. For computational speed, it is important to keep setting arrays appropriately sized. Setting arrays that cover unused setting values, or that use overly fine discretization, will slow the calculations. Settings that are held constant belong in the constants
array. parameter_samples (
tuple
of ndarray
) – In a simple example model, \(y = m * x + b\), the parameters are \(m\) and \(b\). Each array in the parameter_samples
tuple contains samples from the prior distribution of a parameter. Traditionally, the prior is described as expressing the state of belief about the parameter value before measurement, so the prior can be used to include results of other measurements. For a mostly independent measurement, the prior samples should cover the full range of plausible values. Parameters that can be assumed constant belong in the constants
array. constants (
tuple
of float
) – Model constants. Examples include experimental settings that are rarely changed, and model parameters that are well-known from previous measurement results.
- Keyword Arguments:
n_draws (
int
) – specifies the number of parameter samples used in the utility calculation. Default 30. choke (
float
) – If choke
is specified, the likelihood will be raised to the choke
power. Occasionally, simulated measurement runs will “get stuck,” and converge to incorrect parameter values. The choke
argument provides a heuristic fix for better reliability at the expense of speed. For values 0.0 < choke < 1.0
, choking reduces the max/min ratio of the likelihood and allows more data to influence the parameter distribution between resampling events. Default None
. use_jit (
Boolean
) – If numba
is installed, pre-compile the likelihood calculation for faster execution. Arg use_jit
is also passed as a keyword arg to ParticlePDF. Default True
. utility_method (
string
) – ['variance_approx'
| 'pseudo_utility'
| 'full_kld_utility'
| 'max_min'
]: Specifies the utility algorithm as described in [1]. With 'max_min'
, n_draws=2 is recommended. Default 'variance_approx'
. selection_method (
string
) – ['optimal'
| 'good'
| 'random'
]: Specifies how the setting is selected based on the utility. If 'optimal'
, the setting at maximum utility is selected. If 'good'
, the utility is raised to a power given by the pickiness
parameter and normalized. The setting is selected with probability proportional to utility
**pickiness
. If 'random'
, the utility is disregarded and the setting is chosen randomly from the allowed settings. pickiness (
float
) – When selection_method is 'good'
, this parameter affects the probability of picking a setting near a maximum in the utility function. Default 15. default_noise_std (
float
orndarray
) – Measurement noise standard deviation used in utility calculations. If float
, the value populates entries of an \(n_{channels} \times 1\) ndarray
, where \(n_{channels}\) corresponds to the number of measurement channels, e.g. 2 if data is collected from \(X\) and \(Y\) outputs of an instrument. If an \(n_{channels} \times 1\) ndarray
, entries are noise standard deviations corresponding to the measurement channels. **kwargs – Keyword arguments passed to the parent ParticlePDF class.
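As an illustration of the two required evaluation forms, here is a minimal sketch using a hypothetical Lorentzian-peak model; the function name, parameter names, and value ranges are invented for this example and are not part of the package.

```python
import numpy as np

# A model function that satisfies both required evaluation forms via
# numpy broadcasting.  The Lorentzian model, its parameters, and the
# ranges below are illustrative assumptions only.
def lorentzian_model(settings, parameters, constants):
    """Peak amplitude as a function of probe frequency."""
    (frequency,) = settings            # one setting: probe frequency
    center, amplitude = parameters     # two unknown parameters
    (linewidth,) = constants           # a well-known constant
    return amplitude / (1.0 + ((frequency - center) / linewidth) ** 2)

rng = np.random.default_rng(0)

# Form 1: a single setting with arrays of parameter samples
centers = rng.uniform(2.0, 4.0, size=1000)
amplitudes = rng.uniform(0.5, 1.5, size=1000)
y = lorentzian_model((3.0,), (centers, amplitudes), (0.1,))
print(y.shape)        # (1000,) -- same size as a parameter array

# Form 2: an array of settings with a single parameter sample
frequencies = np.linspace(1.0, 5.0, 200)
y = lorentzian_model((frequencies,), (3.0, 1.0), (0.1,))
print(y.shape)        # (200,)  -- same size as a setting array
```

Broadcasting handles both forms with no extra code because a scalar combined with an array yields an array of the same shape.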
Attributes:
- N_DRAWS¶
Stores the
n_draws
argument.- Type:
int
- allsettings¶
Arrays containing all possible combinations of the setting values provided in the
setting_values
argument.- Type:
list
ofndarray
- choke¶
Stores the
choke
argument.- Type:
float
- cons¶
Stores the
constants
argument.- Type:
tuple
of float
- cost_estimate()[source]¶
A stub for estimating the cost of prospective measurements
An estimate of the cost of measurement resources (e.g. setup time + data collection time). This estimate goes in the denominator of the utility function, yielding a benefit/cost ratio. Returns a single float if cost is the same for all settings, or an array with dimensions of
self.setting_indices
.- Returns:
1.0, the default.- Type:
float
or ndarray
- default_noise_std¶
A noise level estimate for each channel, used in setting selection by
yvar_noise_model()
.- Type:
ndarray
- enforce_parameter_constraints()[source]¶
A stub for enforcing constraints on parameters
for example:
# find the particles with disallowed parameter values
# (negative parameter values in this example)
bad_ones = np.argwhere(self.parameters[3] < 0)
for index in bad_ones:
    # setting a weight = 0 effectively eliminates the particle
    self.particle_weights[index] = 0
# renormalize
self.particle_weights = self.particle_weights / np.sum(self.particle_weights)
- eval_over_all_parameters(onesettingset)[source]¶
Evaluates the experimental model.
Evaluates the model for one combination of measurement settings and all parameter combinations in
self.parameters
. Called by pdf_update()
for likelihood()
and Bayesian inference processing of measurement results. This method and
eval_over_all_settings()
both call model_function()
, but with different argument types. If the broadcasting properties of numpy arrays are not able to resolve this polymorphism, this method may be replaced by a separate method for model evaluation.- Parameters:
onesettingset (
tuple
of float
) – a single set of measurement settings.- Returns:
(
ndarray
) array of model values with dimensions of one element of self.allparams
.
- eval_over_all_settings(oneparamset)[source]¶
Evaluates the experimental model.
Evaluates the model for all combinations of measurement settings in
self.allsettings
and one set of parameters. Called N_DRAWS
times by yvar_from_parameter_draws()
as part of the utility()
calculation.- Parameters:
oneparamset (
tuple
of float
) – a set of single model parameter values.- Returns:
(
ndarray
) array of model values with dimensionsself.setting_indices
.
- get_setting()[source]¶
Selects settings for the next measurement.
A wrapper for the method selected by the
selection_method
argument. See opt_setting()
, good_setting()
and random_setting()
.- Returns:
A settings tuple.
- good_setting(pickiness=None)[source]¶
Calculate a setting with a good utility
Selects settings by weighted random selection, using the utility function to calculate a weight. The weight function is
utility()
raised to the pickiness
power. In comparison to the opt_setting()
method, which selects only the very best setting, good_setting()
yields a more diverse series of settings. Selected by the selection_method='good'
argument.- Parameters:
pickiness (float) – A setting selection tuning parameter. Pickiness=0 produces random settings. With pickiness values greater than about 10, the behavior is similar to
opt_setting()
.- Returns:
A settings tuple.
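The weighted selection described above can be sketched in a few lines of plain numpy; the utility values, pickiness, and random seed below are invented for illustration.

```python
import numpy as np

# Sketch of 'good' setting selection: raise the utility to the
# pickiness power, normalize, and use the result as selection
# probabilities.  The utility array here is made up.
rng = np.random.default_rng(1)
utility = np.array([0.1, 0.2, 1.0, 0.4])   # hypothetical utility per setting
pickiness = 15

weights = utility ** pickiness
probabilities = weights / weights.sum()
chosen_index = rng.choice(len(utility), p=probabilities)
print(chosen_index)   # almost always 2, the utility maximum
```

With pickiness this high, the choice is nearly always the utility maximum; lowering pickiness toward 0 flattens the probabilities toward a uniform random choice.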
- last_setting_index¶
The most recent setting choice as an index into the allsettings arrays.
- Type:
int
- likelihood(y_model, measurement_record)[source]¶
Calculates the likelihood of a measurement result.
For each parameter combination, estimate the probability of obtaining the results provided in
measurement_record
. This default method relies on several assumptions:
The uncertainty in measurement results is well-described by normally-distributed (Gaussian) noise.
The standard deviation of the noise, \(\sigma\), is known.
Under these assumptions, and model values \(y_{model}\) as a function of parameters, the likelihood is a Gaussian function proportional to \(\sigma^{-1} \exp [-(y_{model} - y_{meas})^2 / (2 \sigma^2)]\).
- Parameters:
y_model (
ndarray
) – model_function()
results evaluated for all parameters. measurement_record (
tuple
) – The measurement conditions and results, supplied by the user to
pdf_update()
. The elements of measurement_record
are:
settings (tuple)
measurement value (float or tuple)
std uncertainty (float or tuple)
- Returns:
an array of probabilities corresponding to the parameters in
self.allparameters
.
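A minimal numpy sketch of this Gaussian likelihood, with invented numbers standing in for the model outputs and the measurement record:

```python
import numpy as np

# Hypothetical model outputs, one per parameter sample, and a made-up
# measurement record (measured value and its standard uncertainty).
y_model = np.array([0.8, 1.0, 1.15, 2.0])
y_meas, sigma = 1.1, 0.2

# Gaussian likelihood, proportional to
# sigma**-1 * exp(-(y_model - y_meas)**2 / (2 * sigma**2))
likelihood = np.exp(-(y_model - y_meas) ** 2 / (2 * sigma ** 2)) / sigma
print(likelihood.argmax())   # 2: the sample whose model value is closest to y_meas
```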
- measurement_results¶
Records of accumulated measurement results for output to data files and/or plotting.
- Type:
list
- model_function¶
Equal to the measurement_model argument above.
- Type:
function
- n_channels¶
The number of measurement values per experiment, e.g. 2 for an experiment that reports two voltages. Deduced from model outputs.
- Type:
int
- opt_setting()[source]¶
Find the setting with maximum utility
Selects settings based on the maximum value of the utility. Calls
utility()
for an estimate of the benefit/cost ratio for all allowed settings, and returns the settings corresponding to the maximum value. Selected by the selection_method='optimal'
argument.- Returns:
A settings tuple.
- parameters¶
The most recent set of parameter samples from the parameter distribution.
self.parameters
is a view of ParticlePDF.particles
.- Type:
ndarray
ofndarray
- pdf_update(measurement_record, y_model_data=None)[source]¶
Refines the parameters’ probability distribution function given a measurement result.
This is where measurement results are entered. As an implementation of Bayesian inference, this method uses the model to calculate the likelihood of obtaining the measurement result as a function of parameter values, and uses that likelihood to generate a refined posterior (after-measurement) distribution from the prior (pre-measurement) parameter distribution.
Warning
OptBayesExpt
requires the input data to contain good estimates of measurement uncertainty. The uncertainty values entered here can influence both mean values and widths of the inferred parameter distribution. When measurement uncertainty is not well-known, OptBayesExptNoiseParameter
is recommended to determine measurement uncertainty from the measured values.- Parameters:
measurement_record (
tuple
) – The measurement conditions and results, supplied by the user to
pdf_update()
. The elements of measurement_record
are:
- settings (tuple): the settings used for the measurement. May be different from the requested settings.
- measurement result (float or tuple): use a tuple for multi-channel measurements.
- std uncertainty (float or tuple): an uncertainty estimate for the measurement result.
y_model_data (
ndarray
) – The result of self.eval_over_all_parameters()
. This argument allows model evaluation to run before measurement data is available, e.g. while measurements are being made. Default None
.
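The inference step can be sketched outside the library in plain numpy: particle weights are multiplied by the likelihood of the new result and renormalized. The toy linear model and all numbers here are invented for illustration.

```python
import numpy as np

# A minimal sketch of the Bayesian update performed by pdf_update(),
# written against plain numpy rather than the library.
rng = np.random.default_rng(2)
particles = rng.uniform(0.0, 2.0, size=5000)   # samples of one parameter
weights = np.full(5000, 1.0 / 5000)            # uniform prior weights

def model(setting, parameter):
    return parameter * setting                 # a toy linear model

setting, y_meas, sigma = 1.0, 1.2, 0.1         # a made-up measurement record

# multiply prior weights by the Gaussian likelihood, then renormalize
likelihood = np.exp(-(model(setting, particles) - y_meas) ** 2
                    / (2 * sigma ** 2))
weights = weights * likelihood
weights = weights / weights.sum()

posterior_mean = np.sum(weights * particles)
print(round(posterior_mean, 1))   # 1.2 -- the posterior concentrates near y_meas
```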
- pickiness¶
Stores the pickiness argument
- Type:
float
- random_setting()[source]¶
Pick a random setting for the next measurement
Randomly selects a setting from all possible setting combinations. Selected by
selection_method='random'
argument.- Returns:
A settings tuple.
- set_n_draws(n_draws=None)[source]¶
Sets OptBayesExpt.N_DRAWS attribute.
Sets or queries the number of parameter samples to use in the utility calculation.
- Parameters:
n_draws (int or 'default' or None) – An integer argument sets N_DRAWS, ‘default’ restores the default value of 30, and with the default None,
set_n_draws()
returns the current value.
Returns: N_DRAWS
- setting_indices¶
Indices into the allsettings arrays. Used in
opt_setting()
and good_setting()
.- Type:
ndarray
ofint
- setting_values¶
A record of the setting_values argument.
- Type:
tuple
ofndarray
- utility()[source]¶
Estimate the utility as a function of setting options
The utility \(U(d)\) is the predicted benefit/cost ratio of proposed measurement designs \(d\).
Note
Traditionally, utility is given in terms of a change in the information entropy. However, information entropy is a logarithmic quantity, and we are accustomed to thinking about cost on a linear scale. To facilitate estimating benefit/cost, the utility algorithms below return a ‘linearized’ utility: \(\exp(U(d)) - 1.0\)
- The
utility()
function is a wrapper for the algorithm selected by the
utility_method
argument.
Returns: linearized utility
- utility_full_kld()[source]¶
Estimate the utility as a function of settings.
Used in selecting measurement settings. The utility is the predicted benefit/cost ratio of a new measurement where the benefit is given in terms of a change in the information entropy of the parameter distribution. This algorithm corresponds to the “full-KLD algorithm” of [1].
Among the provided utility algorithms,
utility_full_kld()
comes closest to the information-theoretic analytical result.- Returns:
Approximate utility as an
ndarray
with dimensions ofself.setting_indices
.
- utility_max_min()[source]¶
Estimate utility using the max-min algorithm
This algorithm corresponds to the “max-min algorithm” of [1].
In this algorithm, we use the maximum and minimum of the modeled outputs produced by
N_DRAWS
samples of the parameter distribution; the variance of the measurement noise is calculated separately. This algorithm provides slightly lower quality setting choices than the other utility algorithms, but it executes very fast. Speed and quality of choices are both best when
N_DRAWS = 2
.- Returns:
Linearized utility as an
ndarray
with dimensions ofself.setting_indices
.
- utility_pseudo()[source]¶
Estimate the utility as a function of settings.
Used in selecting measurement settings. The utility is the predicted benefit/cost ratio of a new measurement where the benefit is given in terms of a change in the information entropy of the parameter distribution. This algorithm corresponds to the “pseudo-H algorithm” of [1], and it is included here mostly for historical reasons.
In this algorithm, the idea is to mimic the
utility_full_kld()
algorithm more closely than utility_variance()
. We calculate the differential entropy of the model outputs produced by N_DRAWS
samples of the parameter distribution. We then compute the variance of a normal (Gaussian) distribution that has the same information entropy. This effective variance is combined with the noise variance as in utility_variance()
.- Returns:
Approximate utility as an
ndarray
with dimensions ofself.setting_indices
.
- utility_variance()[source]¶
Estimate the utility as a function of settings.
The utility is the predicted benefit/cost ratio of a new measurement where the benefit is given in terms of a change in the information entropy of the parameter distribution. This algorithm corresponds to the “variance algorithm” of [1].
In this algorithm, we use the logarithm of variance as an approximation for the information entropy. The variance of model outputs produced by
N_DRAWS
samples of the parameter distribution and the variance of the measurement noise are calculated separately. Execution of
utility_variance
is faster than utility_full_kld
and utility_pseudo
, and the decision quality is very similar to utility_full_kld
.- Returns:
Approximate utility as an
ndarray
with dimensions ofself.setting_indices
.
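As a rough, plain-numpy sketch of the variance idea (not the library's internal implementation): for an invented sinusoidal model, the variance of model outputs over parameter draws is compared against the noise variance, and their ratio stands in for a linearized benefit with unit cost.

```python
import numpy as np

# A variance-based utility sketch with an invented model
# y = sin(phase + setting), where phase is the unknown parameter.
rng = np.random.default_rng(3)
settings = np.linspace(0.0, np.pi, 100)     # allowed setting values
noise_var = 0.1 ** 2                        # plays the role of default_noise_std ** 2

n_draws = 30                                # a stand-in for N_DRAWS
phases = rng.normal(1.0, 0.3, size=n_draws) # draws from the parameter distribution
y_draws = np.sin(phases[:, np.newaxis] + settings[np.newaxis, :])

signal_var = np.var(y_draws, axis=0)        # variance over draws, per setting
utility = signal_var / noise_var            # signal-to-noise variance ratio
best_setting = settings[np.argmax(utility)]
print(utility.shape)                        # (100,)
```

The utility peaks where the parameter uncertainty produces the largest spread in predicted outputs, which is where a new measurement is most informative.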
- yvar_from_entropy()[source]¶
Models the entropy of the model values due to the parameter distributions
Evaluates the effect of the distribution of parameter values on the distribution of model outputs for every setting combination. This calculation is done as part of the utility calculation as an approximation to the information entropy. For each of
self.N_DRAWS
samples from the parameter distribution, this method models a noise-free experimental output for all setting combinations and returns the entropy of the model values for each setting combination, cast as a variance.- Returns:
ndarray
with shape ofself.setting_indices
- yvar_from_parameter_draws()[source]¶
Models the measurement variance solely due to parameter distributions.
Evaluates the effect of the distribution of parameter values on the distribution of model outputs for every setting combination. This calculation is done as part of the utility calculation as an approximation to the information entropy. For each of
self.N_DRAWS
samples from the parameter distribution, this method models a noise-free experimental output for all setting combinations and returns the variance of the model values for each setting combination.- Returns:
ndarray
with shape ofself.setting_indices
- yvar_max_min()[source]¶
Crudely approximates the signal variance using max - min.
Returns:
ndarray
with shape ofself.setting_indices
- yvar_noise_model()[source]¶
A stub for models of the measurement noise
A model of measurement variance (noise) as a function of settings, averaged over parameters if parameter-dependent. Used in the utility calculation.
In general, the measurement noise could depend on both settings and parameters, and the model would require evaluation of the noise model over all parameters, averaged over draws from the parameter distribution. Measurement noise that depends on the measurement value, like the \(\sqrt{N}\) noise of Poisson-like counting, is an example of such a situation. Fortunately, this noise estimate only affects the utility function, which only affects setting choices, where the “runs good” philosophy of the project allows a little approximation.
- Returns:
If measurement noise is independent of settings, a
float
, otherwise anndarray
with the shape of an element of allsettings. Default:default_noise_std ** 2
.
References