Helpers

gsum.helpers.cartesian(*arrays)[source]

Makes the Cartesian product of arrays.

Parameters
*arrays : array group, shapes = (N_1,), (N_2,), …, (N_p,)

1D arrays where earlier arrays loop more slowly than later ones

Returns
array, shape = (N_1 * N_2 * … * N_p, p)

The Cartesian product
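The ordering convention can be illustrated without gsum. The sketch below relies on `itertools.product` from the standard library, which also advances the last input fastest, so earlier arrays loop more slowly than later ones. The name `cartesian_rows` is hypothetical and not part of gsum, and it returns nested lists rather than an array.

```python
from itertools import product


def cartesian_rows(*arrays):
    """Hypothetical stand-in: rows of the Cartesian product as lists."""
    # product() varies the last iterable fastest, matching the
    # "earlier arrays loop more slowly" convention described above.
    return [list(row) for row in product(*arrays)]


rows = cartesian_rows([0, 1], [10, 20, 30])
# N_1 * N_2 = 6 rows, each with p = 2 entries
print(len(rows), len(rows[0]))
for row in rows:
    print(row)
```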

gsum.helpers.coefficients(y, ratio, ref=1, orders=None)[source]

Returns the coefficients of a power series

Parameters
y : array, shape = (n_samples, n_curves)
ratio : scalar or array, shape = (n_samples,)
ref : scalar or array, shape = (n_samples,)
orders : 1d array, optional

The orders at which y was computed. Defaults to 0, 1, …, n_curves-1

Returns
array, shape = (n_samples, n_curves)

The extracted coefficients

gsum.helpers.partials(coeffs, ratio, ref=1, orders=None)[source]

Returns the partial sums of a power series given the coefficients

The ``k``th partial sum is given by

\[y_k = y_{\mathrm{ref}} \sum_{n=0}^k c_n Q^n\]

where \(Q\) is ratio and \(y_{\mathrm{ref}}\) is ref.

Parameters
coeffs : (n_samples, n_curves) array

The n lowest order coefficients in a power series

ratio : scalar or (n_samples,) array

The ratio variable that is raised to the nth power in the nth term of the sum

ref : scalar or (n_samples,) array, optional

The overall multiplicative scale of the series, default is 1

orders : (n_curves,) array, optional

The orders corresponding to the given coefficients. Any order that is not given is assumed to have a coefficient equal to zero. The default assumes that the n lowest order coefficients are given: [0, 1, ..., n_curves-1].

Returns
(n_samples, n_curves) array

The partial sums
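As a minimal sketch of the formula above for a single sample point, the following plain-Python functions build the partial sums and then invert them to recover the coefficients. The names `partial_sums` and `extract_coefficients` are hypothetical, and the inversion rule \(c_n = (y_k - y_{k-1}) / (y_{\mathrm{ref}} Q^n)\) is inferred from the partial-sum definition rather than taken from the gsum source.

```python
def partial_sums(coeffs, ratio, ref=1.0, orders=None):
    """y_k = ref * sum of c_n * ratio**n over the orders up to the k-th one."""
    if orders is None:
        orders = range(len(coeffs))
    total, sums = 0.0, []
    for c, n in zip(coeffs, orders):
        total += c * ratio ** n
        sums.append(ref * total)
    return sums


def extract_coefficients(y, ratio, ref=1.0, orders=None):
    """Invert partial_sums (assumed rule): difference, then divide by ref * ratio**n."""
    if orders is None:
        orders = range(len(y))
    coeffs, previous = [], 0.0
    for y_k, n in zip(y, orders):
        coeffs.append((y_k - previous) / (ref * ratio ** n))
        previous = y_k
    return coeffs


coeffs = [1.0, -2.0, 0.5]
y = partial_sums(coeffs, ratio=0.5, ref=2.0)
print(y)
print(extract_coefficients(y, ratio=0.5, ref=2.0))

# Orders that are skipped (here order 1) contribute nothing to the sum
print(partial_sums([1.0, 4.0], ratio=0.5, orders=[0, 2]))
```

Passing the same `orders` to both functions keeps the round trip consistent when some orders are omitted.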

gsum.helpers.geometric_sum(x, start, end, excluded=None)[source]

The geometric sum of x from i=start to i=end (inclusive)

\[S = \sum_{i=\mathrm{start}}^{\mathrm{end}} x^i\]

with any indices i listed in excluded omitted from the sum.

Parameters
x : array

The value to be summed

start : int

The start index of the sum

end : int

The end index of the sum (inclusive)

excluded : int or 1d array, optional

The indices to exclude from the sum

Returns
S : array

The geometric sum
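The definition can be checked against the closed form of a geometric series. The sketch below works on a scalar x for simplicity; both function names are hypothetical, and nothing here is meant to describe how gsum evaluates the sum internally.

```python
def _as_index_set(excluded):
    """Normalize None, a single int, or an iterable of ints to a set."""
    if excluded is None:
        return set()
    if isinstance(excluded, int):
        return {excluded}
    return set(excluded)


def geometric_sum_direct(x, start, end, excluded=None):
    """Term-by-term sum of x**i for i = start..end (inclusive)."""
    skip = _as_index_set(excluded)
    return sum(x ** i for i in range(start, end + 1) if i not in skip)


def geometric_sum_closed(x, start, end, excluded=None):
    """Same sum from the closed form, minus the excluded terms."""
    if x == 1:
        full = end - start + 1  # every term equals 1
    else:
        full = (x ** start - x ** (end + 1)) / (1 - x)
    # Only excluded indices inside [start, end] were part of the sum
    removed = sum(x ** i for i in _as_index_set(excluded) if start <= i <= end)
    return full - removed


print(geometric_sum_direct(0.5, 0, 3))
print(geometric_sum_direct(0.5, 0, 3, excluded=2))
print(geometric_sum_closed(0.5, 0, 3, excluded=[2, 7]))
```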

gsum.helpers.predictions(dist, dob=None)[source]

Return the mean and set of degree of belief intervals for a distribution

Parameters
dist : distribution object
dob : scalar or 1D array, optional

Returns
array or tuple

If dob is None, just the mean is returned, else a tuple of the mean and degree of belief intervals is returned. The interval array is shaped (len(dob), 2, len(mean)) and is then squeezed to remove all axes of length 1.

gsum.helpers.gaussian(X, Xp=None, ls=1)[source]

A Gaussian correlation function

Parameters
X : (N, d) array
Xp : (M, d) array, optional
ls : scalar

gsum.helpers.hpd(dist, alpha, *args)[source]

Returns the highest probability density interval of a scipy distribution.

Inspired by this answer https://stackoverflow.com/a/25777507

gsum.helpers.kl_gauss(mu0, cov0, mu1, cov1=None, chol1=None)[source]

The Kullback-Leibler divergence between two multivariate Gaussians

The divergence from \(\mathcal{N}_1\) to \(\mathcal{N}_0\) is given by

\[D_\text{KL}(\mathcal{N}_0 \| \mathcal{N}_1) = \frac{1}{2} \left[ \mathrm{Tr} \left( \Sigma_1^{-1} \Sigma_0 \right) + \left( \mu_1 - \mu_0\right)^\text{T} \Sigma_1^{-1} ( \mu_1 - \mu_0 ) - k + \ln \left( \frac{\det \Sigma_1}{\det \Sigma_0} \right) \right],\]

which can be thought of as the amount of information lost when \(\mathcal{N}_1\) is used to approximate \(\mathcal{N}_0\).

Parameters
mu0 : scalar or 1d array

The mean of the posterior

cov0 : scalar or 2d array

The covariance of the posterior

mu1 : scalar or 1d array

The mean of the prior

cov1 : scalar or 2d array, optional

The covariance of the prior

chol1 : scalar or 2d array, optional

The Cholesky decomposition of the prior covariance

Returns
number

The KL divergence

Raises
ValueError

Raised unless exactly one of cov1 or chol1 is given
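For intuition, the formula can be evaluated in the univariate case (k = 1), where the trace, the quadratic form, and the determinants all reduce to scalars. This is a sketch of the equation only; `kl_gauss_1d` is a hypothetical name, it takes variances directly, and it does not reproduce the cov1/chol1 handling of the gsum function.

```python
import math


def kl_gauss_1d(mu0, var0, mu1, var1):
    """D_KL(N_0 || N_1) for two univariate Gaussians with variances var0, var1."""
    trace_term = var0 / var1              # Tr(Sigma_1^{-1} Sigma_0)
    quadratic = (mu1 - mu0) ** 2 / var1   # (mu_1 - mu_0)^T Sigma_1^{-1} (mu_1 - mu_0)
    log_det = math.log(var1 / var0)       # ln(det Sigma_1 / det Sigma_0)
    return 0.5 * (trace_term + quadratic - 1 + log_det)


# Identical distributions lose no information
print(kl_gauss_1d(0.0, 1.0, 0.0, 1.0))
# The divergence is not symmetric in its two arguments
print(kl_gauss_1d(0.0, 1.0, 0.0, 4.0))
print(kl_gauss_1d(0.0, 4.0, 0.0, 1.0))
```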

gsum.helpers.default_attributes(**kws)[source]

Sets None or empty *args/**kwargs arguments to attributes already stored in a class.

This is a handy decorator to avoid if statements at the beginning of method calls:

def func(self, x=None):
    if x is None:
        x = self.x

but is particularly useful when the function uses a cache to avoid unnecessary computations. A cache does not notice when the attributes change, so it could return stale, incorrect values. For that reason, this decorator must be placed outside of the cache decorator, so that the attribute value is filled in before the cache is consulted.

Parameters
kws : dict

The key must match the parameter name in the decorated function, and the value corresponds to the name of the attribute to use as the default
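The interaction with a cache can be sketched as follows. This is a simplified, hypothetical re-implementation named `default_from_attributes`, not the gsum decorator itself: it only replaces arguments that are None and ignores the empty *args/**kwargs case. `Model` and `double` are likewise made up for illustration.

```python
import functools
import inspect


def default_from_attributes(**kws):
    """Simplified sketch: replace None arguments by attributes of self."""
    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            bound = sig.bind(self, *args, **kwargs)
            bound.apply_defaults()
            for param, attr in kws.items():
                if bound.arguments[param] is None:
                    bound.arguments[param] = getattr(self, attr)
            # The (possibly cached) function now sees the actual value
            return func(*bound.args, **bound.kwargs)
        return wrapper
    return decorator


class Model:
    def __init__(self, x):
        self.x = x

    @default_from_attributes(x='x')    # outside: fills x before the cache
    @functools.lru_cache(maxsize=None)  # inside: keyed on (self, x)
    def double(self, x=None):
        return 2 * x


m = Model(3)
print(m.double())     # x falls back to m.x
m.x = 4
print(m.double())     # the new attribute value becomes part of the cache key
print(m.double(x=5))  # an explicit argument is left untouched
```

If the two decorators were swapped, the cache would be keyed on the call without x and could not see that the attribute had changed.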

class gsum.helpers.VariogramFourthRoot(X, z, bin_bounds)[source]

Computes the empirical semivariogram and uncertainties via the fourth root transformation.

Based mostly on the theory developed in Bowman & Crujeiras (2013) and Cressie & Hawkins (1980). Their original code is implemented in the sm R package; it was rewritten here in Python as a check and to gain a better understanding of the implementation. To date, there are unresolved discrepancies between the two implementations.

Parameters
X : array, shape = (n_samples, n_features)

The input locations of the observed function.

z : array, shape = (n_samples, [n_curves])

The function values

bin_bounds : array, shape = (n_bins-1,)

The boundaries of the bins for the distances between the inputs. The bin location is computed as the average of all distances within the bin.

compute(self, rt_scale=False)[source]

Returns the mean semivariogram and approximate 68% confidence intervals.

Can be given on the 4th root scale or the variogram scale (default).

Parameters
rt_scale : bool

Returns results on 4th root scale if True (default is False)

Returns
gamma, lower, upper

The semivariogram estimate and its lower and upper 68% bands

corr_ijkl(self, i, j, k, l)[source]

The correlation between \(\sqrt{|Z_i - Z_j|}\) and \(\sqrt{|Z_k - Z_l|}\), estimated by gamma tilde

This is estimated using gamma tilde, the estimate of the variogram via the 4th root transform. Because the estimate can fall outside the bounds [-1, 1], any correlation outside this range is set to the nearest bound, -1 or +1.

cov_ijkl(self, i, j, k, l)[source]

The covariance between \(\sqrt{|Z_i - Z_j|}\) and \(\sqrt{|Z_k - Z_l|}\), estimated by gamma tilde

Only estimates the correlation when (i,j) != (k,l), otherwise uses 1.

rho_ijkl(self, i, j, k, l)[source]

The correlation between \((Z_i - Z_j)\) and \((Z_k - Z_l)\), estimated by gamma tilde

var_ij(self, i, j)[source]

The variance of \(\sqrt{|Z_i - Z_j|}\), estimated by gamma tilde

gsum.helpers.median_pdf(pdf, x)[source]

Returns the median given the pdf.
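One way to obtain a median from a tabulated density is to integrate it to a cumulative distribution and find where that crosses one half. The sketch below does this with the trapezoid rule and linear interpolation; `median_from_pdf` is a hypothetical name, and the approach is an assumption rather than a description of the gsum implementation.

```python
import math


def median_from_pdf(pdf, x):
    """Median of a density tabulated as pdf[i] at sorted grid points x[i]."""
    # Cumulative trapezoid rule gives an unnormalized CDF on the grid
    cdf = [0.0]
    for i in range(1, len(x)):
        cdf.append(cdf[-1] + 0.5 * (pdf[i] + pdf[i - 1]) * (x[i] - x[i - 1]))
    half = 0.5 * cdf[-1]  # using the total tolerates unnormalized pdfs
    for i in range(1, len(x)):
        if cdf[i] >= half:
            # Linear interpolation inside the interval where the CDF crosses
            t = (half - cdf[i - 1]) / (cdf[i] - cdf[i - 1])
            return x[i - 1] + t * (x[i] - x[i - 1])
    return x[-1]


# Exponential density exp(-x), whose median is ln(2)
grid = [0.001 * k for k in range(20001)]
density = [math.exp(-v) for v in grid]
print(median_from_pdf(density, grid), math.log(2))
```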

gsum.helpers.hpd_pdf(pdf, alpha, x, x0=None, **kwargs)[source]

Returns the highest probability density interval given the pdf.

Inspired by this answer https://stackoverflow.com/a/22290087
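A grid-based sketch of the idea, for a unimodal density on a uniform grid: keep the highest-density grid points until they hold the requested probability, then report the range they span. The name `hpd_from_grid` and its `mass` argument (the probability contained in the interval) are hypothetical; how `mass` relates to the `alpha` argument of the gsum functions is not specified here, and this is not necessarily the method used by gsum or by the linked answer.

```python
import math


def hpd_from_grid(pdf, x, mass):
    """Interval spanned by the highest-density points holding `mass`.

    Assumes a unimodal density tabulated on a uniformly spaced grid,
    so every grid point carries the same integration weight.
    """
    total = sum(pdf)
    # Visit grid points from highest to lowest density
    order = sorted(range(len(x)), key=lambda i: pdf[i], reverse=True)
    kept, accumulated = [], 0.0
    for i in order:
        kept.append(i)
        accumulated += pdf[i] / total
        if accumulated >= mass:
            break
    return x[min(kept)], x[max(kept)]


# Standard normal density on [-5, 5]
grid_x = [-5 + 0.01 * k for k in range(1001)]
normal_pdf = [math.exp(-0.5 * v * v) / math.sqrt(2 * math.pi) for v in grid_x]
print(hpd_from_grid(normal_pdf, grid_x, mass=0.95))
```

The endpoints are only resolved to within one grid spacing, so a finer grid gives a sharper interval.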