p_m1: Non-principal branch probability

View source: R/p_m1.R

p_m1R Documentation

Non-principal branch probability


Computes the probability that (at least) one (out of n) observation(s) of the latent variable U lies in the non-principal branch region. The 'm1' in p_m1 stands for 'minus 1', i.e, the non-principal branch.

See Goerg (2011) and Details for mathematical derivations.


p_m1(gamma, beta, distname, n = 1, use.mean.variance = TRUE)



scalar; skewness parameter.


numeric vector (deprecated); parameter \boldsymbol β of the input distribution. See check_beta on how to specify beta for each distribution.


character; name of input distribution; see get_distnames.


number of RVs/observations.


logical; if TRUE it uses mean and variance implied by \boldsymbol β to do the transformation (Goerg 2011). If FALSE, it uses the alternative definition from Goerg (2016) with location and scale parameter.


The probability that one observation of the latent RV U lies in the non-principal region equals at most

p_{-1}(γ, n=1) = P≤ft(U < -\frac{1}{|γ|}\right),

where U is the zero-mean, unit variance version of the input X \sim F_X(x \mid \boldsymbol β) – see References.

For N independent RVs U_1, …, U_N, the probability that at least one data point came from the non-principal region equals

p_{-1}(γ, n=N) = P≤ft(U_i < -\frac{1}{|γ|} \; for \; at \; least \; one \; i \right)

This equals (assuming independence)

P≤ft(U_i < -\frac{1}{|γ|} \; for \; at \; least \; one \; i \right) = 1 - P≤ft(U_i ≥q -\frac{1}{|γ|}, \forall i \right) = 1 - ∏_{i=1}^{N} P≤ft(U_i ≥q -\frac{1}{|γ|} \right)

= 1 - ∏_{i=1}^{N} ≤ft(1 - p_{-1}(γ, n=1) \right) = 1 - (1-p_{-1}(γ, n=1))^N.

For improved numerical stability the cdf of a geometric RV (pgeom) is used to evaluate the last expression. Nevertheless, numerical problems can occur for |γ| < 0.03 (returns 0 due to rounding errors).

Note that 1 - (1-p_{-1}(γ, n=1))^N reduces to p_{-1}(γ) for N=1.


non-negative float; the probability p_{-1} for n observations.


beta.01 <- c(mu = 0, sigma = 1)
# for n=1 observation
p_m1(0, beta = beta.01, distname = "normal") # identical to 0
# in theory != 0; but machine precision too low
p_m1(0.01, beta = beta.01, distname = "normal") 
p_m1(0.05, beta = beta.01, distname = "normal") # extremely small
p_m1(0.1, beta = beta.01, distname = "normal") # != 0, but very small
# 1 out of 4 samples is a non-principal input;
p_m1(1.5, beta = beta.01, distname = "normal") 
# however, gamma=1.5 is not common in practice

# for n=100 observations
p_m1(0, n=100, beta = beta.01, distname = "normal") # == 0
p_m1(0.1, n=100, beta = beta.01, distname = "normal") # still small
p_m1(0.3, n=100, beta = beta.01, distname = "normal") # a bit more likely
p_m1(1.5, n=100, beta = beta.01, distname = "normal") 
# Here we can be almost 100% sure (rounding errors) that at least one
# y_i was caused by an input in the non-principal branch.

LambertW documentation built on Sept. 22, 2022, 5:07 p.m.