This module exposes the pseudocount (higher-order) function, which the associations module uses for general-purpose additive smoothing.
Note that it returns another function, which itself will take numerator and denominator arrays for applying additive smoothing.
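The "function that returns a function" pattern can be sketched as follows. This is a minimal illustration rather than the module's actual implementation: the name `make_smoother`, the plain-list array type, and the exact smoothing formula (the standard beta-binomial posterior mean) are all assumptions made for the example.

```python
from typing import Callable, List, Sequence

Array = Sequence[float]  # stand-in for the module's array type (assumption)


def make_smoother(a: float, b: float) -> Callable[[Array, Array], List[float]]:
    """Hypothetical helper: build a smoother for a Beta(a, b) prior."""

    def smooth(numerator: Array, denominator: Array) -> List[float]:
        # Assumed formula, the beta-binomial posterior mean:
        # (successes + a) / (trials + a + b), applied elementwise.
        return [(k + a) / (n + a + b) for k, n in zip(numerator, denominator)]

    return smooth


# Configure once with a symmetric a = b = 0.5 prior, then reuse on any arrays.
jeffreys = make_smoother(0.5, 0.5)
print(jeffreys([0, 3, 10], [10, 10, 10]))
```

Separating the prior configuration from the array arguments means the returned function can be passed around wherever a two-array reducer is expected.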
priors
Functions:
- pseudocount – Additive binomial smoothing via beta prior (beta-binomial)
Attributes:
- ElemReduceFunc (TypeAlias) – merges two arrays into one, e.g. a numerator and denominator
- ElemWise (TypeAlias) – generic batched typed array
ElemReduceFunc
module-attribute
merges two arrays into one e.g. a numerator and denominator
pseudocount
pseudocount(prior: PsdCts) -> ElemReduceFunc
Additive binomial smoothing via beta prior (beta-binomial)
Accepts a variety of prior settings, described below and handled via plum dispatch.
- single float (e.g. 0.5): symmetric beta prior (a=b). Common choices of symmetric beta prior include a=0.0 (Haldane), a=1.0 (Laplace), and a=0.5 (Jeffreys). affinis tends to use the 0.5 Jeffreys prior as default, unless otherwise noted (see here for a deeper discussion).
- pair of floats (e.g. (0.1, 1.5)): asymmetric prior (see Beta Distribution Shapes).
- 'zero-sum': passed along with a float (e.g. ('zero-sum', 0.2)). Indicates that b should be derived from the provided parameter, such that a+b=1. This is a generalized arcsine distribution, Beta(a, 1-a). This can be useful to model expected fractions of a whole (ergo, "zero-sum"); see What is the Arcsine Law.
- 'min-connect': can be used instead of a single float. This gives a different a parameter depending on the size of the feat dimension. Intuitively, for network recovery, a maximally sparse (connected) graph should be a tree, which has a number of edges linear in node count (n-1), while the number of possible edges is quadratic (n choose 2).
NOTE: Unlike the other options, 'min-connect' assumes that the passed arrays will have a shape that can be folded as a lower-triangle of a square matrix (i.e. a triangular number, n-choose-2)
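To make the four prior settings concrete, here is a rough sketch of how each could be resolved to an explicit (a, b) pair. It uses plain `isinstance` checks rather than plum dispatch, and the names `resolve_prior`, `n_nodes_from_edges`, and `n_feat` are hypothetical, introduced only for this example. For 'min-connect', the node count n is recovered by inverting the triangular-number relation m = n(n-1)/2, and a = 2/n is taken from the note further down this page.

```python
import math
from typing import Optional, Tuple, Union

# Hypothetical type covering the four documented prior settings.
PriorSpec = Union[float, Tuple[float, float], Tuple[str, float], str]


def n_nodes_from_edges(m: int) -> int:
    """Invert m = n*(n-1)/2 to recover n; reject non-triangular sizes."""
    if m < 1:
        raise ValueError("need at least one possible edge")
    n = (1 + math.isqrt(1 + 8 * m)) // 2
    if n * (n - 1) // 2 != m:
        raise ValueError(f"{m} is not of the form n-choose-2")
    return n


def resolve_prior(spec: PriorSpec, n_feat: Optional[int] = None) -> Tuple[float, float]:
    """Turn a prior setting into explicit Beta parameters (a, b)."""
    if spec == "min-connect":
        if n_feat is None:
            raise ValueError("'min-connect' needs the array length")
        n = n_nodes_from_edges(n_feat)
        return 2 / n, 1 - 2 / n  # implies a zero-sum prior, a + b = 1
    if isinstance(spec, tuple):
        first, second = spec
        if first == "zero-sum":
            return float(second), 1 - float(second)  # Beta(a, 1 - a)
        return float(first), float(second)  # explicit asymmetric prior
    return float(spec), float(spec)  # symmetric prior, a = b


print(resolve_prior(0.5))
print(resolve_prior(("zero-sum", 0.2)))
print(resolve_prior("min-connect", n_feat=6))  # 6 possible edges -> 4 nodes
```

The explicit triangular-number check mirrors the note above: an array whose length is not n-choose-2 cannot be folded into a lower triangle, so the sketch raises an error instead of guessing n.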
Parameters:
- prior (PsdCts) – PsdCts: beta priors (a, b), either explicit or implicitly derived.
Returns:
- ElemReduceFunc – ElemReduceFunc
Source code in affinis/priors.py
A note on zero-sum & min-connect priors
By design, min-connect will imply a zero-sum prior, with Beta parameters a=2/n,b=1-2/n for n feature dimensions.
To understand why we might want this, first consider the zero-sum option: recall that if observations are trials over an array of graph edges, then the number of edges that are on or off in a graph is "zero sum" (one extra "on" means one fewer "off").
So, the proportion of time we will be observing an "on" edge might be thought of as a Wiener process, and thus follows a (generalized) arcsine distribution.
This means we need a "bathtub" prior with a+b=1.
As it happens, the expected value of this prior will be a, since the mean of Beta(a, b) is a/(a+b).
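As a quick check, the mean of a Beta distribution reduces to a under the zero-sum constraint. The second expression assumes the usual additive-smoothing estimate with k "on" observations out of N trials; the symbols k, N, and θ are introduced here for illustration and do not appear in the module.

$$
\mathbb{E}\left[\mathrm{Beta}(a, b)\right] = \frac{a}{a+b} = a \quad \text{when } a+b=1,
\qquad
\mathbb{E}\left[\theta \mid k, N\right] = \frac{k+a}{N+a+b} = \frac{k+a}{N+1}.
$$

With no observations (k = N = 0), the smoothed estimate falls back to a.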
Meanwhile, a complete graph has n(n-1)/2 edges, while a minimally connected one has n-1. We can therefore bias toward sparsity such that the expected ratio of edges to possible edges (and therefore the expected value of our bathtub prior) should be:

$$
\frac{n-1}{n(n-1)/2}
$$
This comes out to a=2/n, b=1-2/n, so