Module Reference
Function Summary
cross_validate([xdata, ydata, PLS_model, …]) | Conducts a cross-validation analysis on a set of data using a regression algorithm.
simple_bootstrap([xdata, ydata, PLS_model, …]) | Conducts a simple residual bootstrap analysis on a set of data.
bootstrap([xdata, ydata, validdata, …]) | Conducts a simple residual bootstrap analysis on a set of data.
bootstrap_unc([xdata, ydata, valid_data, …]) | Computes the uncertainty in a bootstrap analysis by leave-one-out cross-validation.
pca_bootstrap([xdata, ydata, groups, …]) | Conducts a residual bootstrap analysis on a set of data.
misclass_probability(probability_zero, …) | Estimates the misclassification probability of a sample, based on the confidence level of the prediction compared to the true value.
Function documentation
ml_uncertainty.cross_validate(xdata=None, ydata=None, PLS_model=None, cv_object=None, sk_model=<class 'sklearn.cross_decomposition.pls_.PLSRegression'>, PLS_kw=None, class_value=0.5)
Conducts a cross-validation analysis on a set of data using a regression algorithm.
This function is essentially a pass-through to sklearn.model_selection.cross_val_predict, followed by PLS-DA class assignment.
Key xdata: The X data used to fit the model (default None)
Key ydata: The Y data used to fit the model (default None)
Key PLS_model: The scikit-learn model that will be fit using X and Y
Key cv_object: The cross-validation model that will be used for calculating cross-validation statistics
Key sk_model: If PLS_model is None, this scikit-learn model will be used to create it
Key PLS_kw: The keyword arguments that will be passed to sk_model
Key class_value: The value separating the classes in PLS-DA
Returns: class_assigned_cv, which is the dummy variable array in PLS-DA, and class_predicted_cv, the array of PLS predictions
ml_uncertainty.simple_bootstrap(xdata=None, ydata=None, PLS_model=None, PLS_cv=None, PLS_bootstrap=None, sk_model=<class 'sklearn.cross_decomposition.pls_.PLSRegression'>, cv_object=None, class_value=0.5, samples=1000, PLS_kw=None, return_boot=False)
Conducts a simple residual bootstrap analysis on a set of data. Computes cross-validation uncertainty.
This function requires the Y data being bootstrapped to be one-dimensional. It also requires the model to accept two-dimensional Y data. The bootstrapping is done by generating a number of random variations of the Y data equal to samples and then concatenating them into a two-dimensional array.
If PLS_model is None, then PLS_cv and PLS_bootstrap are ignored. The function will create independent instances of sk_model for each of PLS_model, PLS_cv, and PLS_bootstrap.
If PLS_model is not None, then it will be reused for PLS_cv and PLS_bootstrap.
Key xdata: The X data used to fit the model (default None)
Key ydata: The Y data used to fit the model (default None)
Key PLS_model: The scikit-learn model that will be fit using X and Y
Key PLS_cv: The scikit-learn model that will be used for cross-validation
Key PLS_bootstrap: The scikit-learn model that will be used for bootstrapping
Key sk_model: If PLS_model, PLS_cv, or PLS_bootstrap is None, this scikit-learn model will be used to create them
Key cv_object: The cross-validation model that will be used for calculating cross-validation statistics
Key class_value: The value separating the classes in PLS-DA
Key samples: The number of samples for bootstrapping
Key PLS_kw: The keyword arguments that will be passed to sk_model
Key return_boot: If True, returns the PLS_bootstrap model as part of the output
ml_uncertainty.bootstrap(xdata=None, ydata=None, validdata=None, PLS_model=None, PLS_cv=None, PLS_bootstrap=None, sk_model=<class 'sklearn.cross_decomposition.pls_.PLSRegression'>, regression=False, cv_object=None, class_value=0.5, samples=1000, PLS_kw=None, return_scores=False, return_loadings=False, tq=True)
Conducts a simple residual bootstrap analysis on a set of data. Computes cross-validation uncertainty.
This function performs a full bootstrap and makes no assumption about the shape or structure of the Y data. Each bootstrap sample will have an independent model fit to it.
If PLS_model is None, then PLS_cv and PLS_bootstrap are ignored. The function will create independent instances of sk_model for each of PLS_model, PLS_cv, and PLS_bootstrap.
If PLS_model is not None, then it will be reused for PLS_cv and PLS_bootstrap.
Key xdata: The X data used to fit the model (default None)
Key ydata: The Y data used to fit the model (default None)
Key validdata: Additional data not used to fit the model but for which uncertainty will be calculated
Key PLS_model: The scikit-learn model that will be fit using X and Y. If None, a new model will be created from sk_model
Key PLS_cv: The scikit-learn model that will be used for cross-validation. If None, same as PLS_model.
Key PLS_bootstrap: The scikit-learn model that will be used for bootstrapping. If None, same as PLS_model.
Key sk_model: If PLS_model, PLS_cv, or PLS_bootstrap is None, this scikit-learn model will be used to create them
Key cv_object: The cross-validation model that will be used for calculating cross-validation statistics
Key class_value: The value separating the classes in PLS-DA
Key samples: The number of samples for bootstrapping
Key PLS_kw: The keyword arguments that will be passed to sk_model
Key return_scores: If True, returns the scores of the PLS_bootstrap model as part of the output
Key return_loadings: If True, returns the loadings of the PLS_bootstrap model as part of the output
ml_uncertainty.bootstrap_unc(xdata=None, ydata=None, valid_data=None, cv_object=None, samples=1000, class_value=0.5, PLS_kw=None, return_scores=False, tq=True)
Computes the uncertainty in a bootstrap analysis by leave-one-out cross-validation.
For each sample, the uncertainty is calculated by fitting the model to the other samples, calculating the bootstrap uncertainty, and then calculating the uncertainty in the held-out sample.
Key xdata: The X data used to fit the model (default None)
Key ydata: The Y data used to fit the model (default None)
Key valid_data: Additional data not used to fit the model but for which uncertainty will be calculated
Key cv_object: The cross-validation model that will be used for calculating cross-validation statistics
Key samples: The number of samples for bootstrapping
Key class_value: The value separating the classes in PLS-DA
Key PLS_kw: The keyword arguments that will be passed to sk_model
Key return_scores: If True, returns the scores of the PLS_bootstrap model as part of the output
ml_uncertainty.pca_bootstrap(xdata=None, ydata=None, groups=None, validdata=None, PCA_model=None, PCA_cv=None, PCA_bootstrap=None, skmodel=<class 'sklearn.decomposition.pca.PCA'>, scaler=None, cv_object=None, samples=1000, PCA_kw=None, tq=True)
Conducts a residual bootstrap analysis on a set of data. Computes cross-validation uncertainty.
This function is the same as bootstrap() but works for unsupervised models such as PCA.
If PCA_model is None, then PCA_cv and PCA_bootstrap are ignored. The function will create independent instances of skmodel for each of PCA_model, PCA_cv, and PCA_bootstrap.
If PCA_model is not None, then it will be reused for PCA_cv and PCA_bootstrap.
Key xdata: The X data used to fit the model (default None)
Key ydata: The Y data used to fit the model (default None)
Key PCA_model: The scikit-learn model that will be fit using X
Key PCA_cv: The scikit-learn model that will be used for cross-validation
Key PCA_bootstrap: The scikit-learn model that will be used for bootstrapping
Key skmodel: If PCA_model, PCA_cv, or PCA_bootstrap is None, this scikit-learn model will be used to create them
Key scaler: The scikit-learn preprocessing object used to preprocess the data. This will be put into a
Key cv_object: The cross-validation model that will be used for calculating cross-validation statistics
Key samples: The number of samples for bootstrapping
Key PCA_kw: The keyword arguments that will be passed to skmodel
Key return_boot: If True, returns the PCA_bootstrap model as part of the output
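A residual bootstrap for an unsupervised model can be sketched by resampling the reconstruction residuals of a PCA fit. This is one plausible reading, not a description of what pca_bootstrap does internally: the synthetic data, the row-wise residual resampling, and the sign alignment step are all assumptions made for this example.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
latent = rng.normal(size=(40, 1))
xdata = latent @ np.array([[2.0, 1.0, -1.0]]) + rng.normal(scale=0.2, size=(40, 3))

# Base model: reconstruct X from one component and keep the residuals.
base = PCA(n_components=1).fit(xdata)
recon = base.inverse_transform(base.transform(xdata))
residuals = xdata - recon

samples = 100
loadings = np.empty((samples, xdata.shape[1]))
for i in range(samples):
    # Resample whole residual rows and add them back to the reconstruction.
    rows = rng.integers(0, xdata.shape[0], size=xdata.shape[0])
    x_star = recon + residuals[rows]
    comp = PCA(n_components=1).fit(x_star).components_[0]
    # PCA components are defined only up to sign; align with the base model.
    if comp @ base.components_[0] < 0:
        comp = -comp
    loadings[i] = comp

# Spread of each loading across the bootstrap fits.
loading_std = loadings.std(axis=0)
```

Without the sign alignment, components that flip direction between fits would inflate the apparent spread of the loadings.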