# propagate: Propagation of uncertainty using higher-order Taylor... In propagate: Propagation of Uncertainty

## Description

A general function for the calculation of uncertainty propagation by first-/second-order Taylor expansion and Monte Carlo simulation including covariances. Input data can be any symbolic/numeric differentiable expression and data based on summaries (mean & s.d.) or sampled from distributions. Uncertainty propagation is based completely on matrix calculus accounting for full covariance structure. Monte Carlo simulation is conducted using a multivariate t-distribution with covariance structure. Propagation confidence intervals are calculated from the expanded uncertainties by means of the degrees of freedom obtained from WelchSatter, or from the [\frac{α}{2}, 1-\frac{α}{2}] quantiles of the MC evaluations.

## Usage

 1 2 propagate(expr, data, second.order = TRUE, do.sim = TRUE, cov = TRUE, df = NULL, nsim = 1000000, alpha = 0.05, ...) 

## Arguments

 expr an expression, such as expression(x/y). data a dataframe or matrix containing either a) the means μ_i, standard deviations σ_i and degrees of freedom ν_i (optionally) in the first, second and third (optionally) row, or b) sampled data generated from any of R's distributions or those implemented in this package (rDistr). If nrow(data) > 3, sampled data is assumed. The column names must match the variable names. second.order logical. If TRUE, error propagation will be calculated with first- and second-order Taylor expansion. See 'Details'. do.sim logical. Should Monte Carlo simulation be applied? cov logical or variance-covariance matrix with the same column names as data. See 'Details'. df an optional scalar with the total degrees of freedom ν_{\mathrm{tot}} of the system. nsim the number of Monte Carlo simulations to be performed, minimum is 10000. alpha the 1 - confidence level. ... other parameters to be supplied to future methods.

## Details

The implemented methods are:

1) Monte Carlo simulation:
For each variable m in data, simulated data X = [x_1, x_2, …, x_n] with n = nsim samples is generated from a multivariate t-distribution X_{m, n} \sim t(μ, Σ, ν) using means μ_i and covariance matrix \boldsymbol{Σ} constructed from the standard deviations σ_i of each variable. All data is coerced into a new dataframe that has the same covariance structure as the initial data: \boldsymbol{Σ}(\mathtt{data}) = \boldsymbol{Σ}(X_{m, n}). Each row i = 1, …, n of the simulated dataset X_{m, n} is evaluated with expr, y_i = f(x_{m, i}), and summary statistics (mean, sd, median, mad, quantile-based confidence interval based on [\frac{α}{2}, 1-\frac{α}{2}]) are calculated on y.

2) Error propagation:
The propagated error is calculated by first-/second-order Taylor expansion accounting for full covariance structure using matrix algebra.
The following transformations based on two variables x_1, x_2 illustrate the equivalence of the matrix-based approach with well-known classical notations:
First-order mean: \rm{E[y]} = f(\bar{x}_i)
First-order variance: σ_y^2 = {\color{red} \nabla \mathbf{Σ} \nabla^T}:

{ \color{red}[\rm{j_1}\; \rm{j_2}] ≤ft[ \begin{array}{cc} σ_1^2 & σ_1σ_2 \\ σ_2σ_1 & σ_2^2 \end{array} \right] ≤ft[ \begin{array}{c} \rm{j_1} \\ \rm{j_2} \end{array} \right]} = \rm{j_1}^2 σ_1^2 + \rm{2 j_1 j_2} σ_1 σ_2 + \rm{j_2}^2 σ_2^2

= \underbrace{∑_{i=1}^2 \rm{j_i}^2 σ_i^2 + 2∑_{i=1\atop i \neq k}^2∑_{k=1\atop k \neq i}^2 \rm{j_i j_k} σ_{ik}}_{\rm{classical\;notation}} = \frac{1}{1!} ≤ft(∑_{i=1}^2 \frac{\partial f}{\partial x_i} σ_i \right)^2

Second-order mean: \rm{E}[y] = f(\bar{x}_i) + {\color{blue} \frac{1}{2}\rm{tr}(\mathbf{HΣ)}}:

{ \color{blue} \frac{1}{2} \rm{tr} ≤ft[ \begin{array}{cc} \rm{h_1} & \rm{h_2} \\ \rm{h_3} & \rm{h_4} \end{array} \right] ≤ft[ \begin{array}{cc} σ_1^2 & σ_1σ_2 \\ σ_2σ_1 & σ_2^2 \end{array} \right]} = \frac{1}{2} \rm{tr} ≤ft[ \begin{array}{cc} \rm{h_1} σ_1^2 + \rm{h_2}σ_1σ_2 & \rm{h_1}σ_1σ_2 + \rm{h_2}σ_2^2 \\ \rm{h_3} σ_1^2 + \rm{h_4} σ_1σ_2 & \rm{h_3} σ_1σ_2 + \rm{h_4} σ_2^2 \end{array} \right]

= \frac{1}{2}(\rm{h_1}σ_1^2 + \rm{h_2}σ_1σ_2 + \rm{h_3}σ_1σ_2 + \rm{h_4}σ_2^2) = \frac{1}{2!} ≤ft(∑_{i=1}^2 \frac{\partial}{\partial x_i} σ_i \right)^2 \it f

Second-order variance: σ_y^2 = {\color{red} \nabla\mathbf{Σ}\nabla^T} + {\color{blue} \frac{1}{2}\rm{tr}(\mathbf{HΣ HΣ)}}:

{\color{blue}\frac{1}{2} \rm{tr} ≤ft[ \begin{array}{cc} \rm{h_1} & \rm{h_2} \\ \rm{h_3} & \rm{h_4} \end{array} \right] ≤ft[ \begin{array}{cc} \rm{σ_1^2} & \rm{σ_1σ_2} \\ \rm{σ_2σ_1} & \rm{σ_2^2} \end{array} \right] ≤ft[ \begin{array}{cc} \rm{h_1} & \rm{h_2} \\ \rm{h_3} & \rm{h_4} \end{array} \right] ≤ft[ \begin{array}{cc} \rm{σ_1^2} & \rm{σ_1σ_2} \\ \rm{σ_2σ_1} & \rm{σ_2^2} \end{array} \right]} = …

= \frac{1}{2} (\rm{h_1}^2σ_1^4 + \rm{2h_1h_2}σ_1^3σ_2 + \rm{2h_1h_3}σ_1^3σ_2 + \rm{h_2}^2σ_1^2σ_2^2 + \rm{2h_2h_3}σ_1^2σ_2^2 + \rm{h_3}^2σ_1^2σ_2^2 + \rm{2h_1h_4}σ_1^2σ_2^2

+ \rm{2h_2h_4}σ_1σ_2^3 + \rm{2h_3h_4}σ_1σ_2^3 + \rm{h_4}^2σ_2^4 = \frac{1}{2} (\rm{h_1}σ_1^2 + \rm{h_2}σ_1σ_2 + \rm{h_3}σ_1σ_2 + \rm{h_4}σ_2^2)^2

= \frac{1}{2!} ≤ft( ≤ft(∑_{i=1}^2 \frac{\partial}{\partial x_i} σ_i \right)^2 \it f \right)^2

with \mathrm{E}(y) = expectation of y, \mathbf{σ_y^2} = variance of y, {\color{red} \nabla} = the p x n gradient matrix with all partial first derivatives {\color{red} \rm{j_i}}, \mathbf{Σ} = the p x p covariance matrix, {\color{blue}\mathbf{H}} the Hessian matrix with all partial second derivatives {\color{blue} \rm{h_i}}, σ_i = the uncertainties and \rm{tr}(\cdot) = the trace (sum of diagonal) of a matrix. Note that because the Hessian matrices are symmetric, {\color{blue} \rm{h_2}} = {\color{blue} \rm{h_3}}. For a detailed derivation, see 'References'.
The second-order Taylor expansion corrects for bias in nonlinear expressions as the first-order Taylor expansion assumes linearity around \bar{x}_i. There is also a Python library available for second-order error propagation ('soerp', https://pypi.python.org/pypi/soerp). The 'propagate' package gives exactly the same results, see last example under "Examples".
Depending on the input expression, the uncertainty propagation may result in an error that is not normally distributed. The Monte Carlo simulation, starting with a symmetric t-distributions of the variables, can clarify this. For instance, a high tendency from deviation of normality is encountered in formulas in which the error of the denominator is relatively large or in exponential models with a large error in the exponent.

For setups in which there is no symbolic derivation possible (i.e. e <- expression(abs(x)) => "Function 'abs' is not in the derivatives table") the function automatically switches from symbolic (using makeGrad or makeHess) to numeric (numGrad or numHess) differentiation.

The function will try to evaluate the expression in an environment using eval which results in a significant speed enhancement (~ 10-fold). If that fails, evaluation is done over the rows of the simulated data using apply.

cov is used in the following ways:
1) If μ_i, σ_i are supplied, a covariance matrix is built with diagonals σ_i^2, independent of cov = TRUE, FALSE.
2) When simulated data is supplied, a covariance matrix is constructed that either has (cov = TRUE) or has not (cov = FALSE) off-diagonal covariances.
3) The user can supply an own covariance matrix Σ, with the same column/row names as in data.

The expanded uncertainty used for constructing the confidence interval is calculated from the Welch-Satterthwaite degrees of freedom ν_{\mathrm{WS}} of the WelchSatter function.

## Value

A list with the following components:

 gradient the symbolic gradient vector \nabla of partial first-order derivatives. evalGrad the evaluated gradient vector \nabla of partial first-order derivatives, also known as the "sensitivity". See summary.propagate. hessian the symbolic Hessian matrix \mathbf{H} of partial second-order derivatives. evalHess the evaluated Hessian matrix \mathbf{H} of partial second-order derivatives. rel.contr the relative contribution matrix, see summary.propagate. covMat the covariance matrix \mathbf{Σ} used for Monte Carlo simulation and uncertainty propagation. ws.df the Welch-Satterthwaite degrees of freedom ν_{\mathrm{ws}}, as obtained from WelchSatter. k the coverage factor k, as calculated by t(1-(α/2), ν_{\mathrm{ws}}). u.exp the expanded uncertainty, kσ(y), where σ(y) is derived either from the second-order uncertainty, if successfully calculated, or first-order otherwise. resSIM a vector containing the nsim values obtained from the row-wise expression evaluations f(x_{m, i}) of the simulated data in datSIM. datSIM a vector containing the nsim simulated multivariate values for each variable in column format. prop a summary vector containing first-/second-order expectations and uncertainties as well as the confidence interval based on alpha. sim a summary vector containing the mean, standard deviation, median, MAD as well as the confidence interval based on alpha. expr the original expression expr. data the original data data. alpha the otiginal alpha.

## Author(s)

Andrej-Nikolai Spiess

## References

Error propagation (in general):
An Introduction to error analysis.
Taylor JR.
University Science Books (1996), New York.

Evaluation of measurement data - Guide to the expression of uncertainty in measurement.
JCGM 100:2008 (GUM 1995 with minor corrections).
http://www.bipm.org/utils/common/documents/jcgm/JCGM_100_2008_E.pdf.

Evaluation of measurement data - Supplement 1 to the Guide to the expression of uncertainty in measurement - Propagation of distributions using a Monte Carlo Method.
JCGM 101:2008.
http://www.bipm.org/utils/common/documents/jcgm/JCGM_101_2008_E.pdf.

Higher-order Taylor expansion:
On higher-order corrections for propagating uncertainties.
Wang CM & Iyer HK.
Metrologia (2005), 42: 406-410.

Propagation of uncertainty: Expressions of second and third order uncertainty with third and fourth moments.
Mekid S & Vaja D.
Measurement (2008), 41: 600-609.

Matrix algebra for error propagation:
An Introduction to Error Propagation: Derivation, Meaning and Examples of Equation Cy = FxCxFx^t.

Second order nonlinear uncertainty modeling in strapdown integration using MEMS IMUs.
Zhang M, Hol JD, Slot L, Luinge H.
2011 Proceedings of the 14th International Conference on Information Fusion (FUSION) (2011).

Uncertainty propagation in non-linear measurement equations.
Mana G & Pennecchi F.
Metrologia (2007), 44: 246-251.

A compact tensor algebra expression of the law of propagation of uncertainty.
Bouchot C, Quilantan JLC, Ochoa JCS.
Metrologia (2011), 48: L22-L28.

Nonlinear error propagation law.
Kubacek L.
Appl Math (1996), 41: 329-345.

Monte Carlo simulation (normal- and t-distribution):
MUSE: computational aspects of a GUM supplement 1 implementation.
Mueller M, Wolf M, Roesslein M.
Metrologia (2008), 45: 586-594.

Copulas for uncertainty analysis.
Possolo A.
Metrologia (2010), 47: 262-271.

Multivariate normal distribution:
Stochastic Simulation.
Ripley BD.
Stochastic Simulation (1987). Wiley. Page 98.

Testing for normal distribution:
Testing for Normality.
Thode Jr. HC.
Marcel Dekker (2002), New York.

Approximating the Shapiro-Wilk W-test for non-normality.
Royston P.
Stat Comp (1992), 2: 117-119.

## Examples

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 ## In these examples, 'nsim = 100000' to save ## Rcmd check time (CRAN). It is advocated ## to use at least 'nsim = 1000000' though... ## Example without given degrees-of-freedom. EXPR1 <- expression(x/y) x <- c(5, 0.01) y <- c(1, 0.01) DF1 <- cbind(x, y) RES1 <- propagate(expr = EXPR1, data = DF1, type = "stat", do.sim = TRUE, verbose = TRUE, nsim = 100000) RES1 ## Same example with given degrees-of-freedom ## => third row in input 'data'. EXPR2 <- expression(x/y) x <- c(5, 0.01, 12) y <- c(1, 0.01, 5) DF2 <- cbind(x, y) RES2 <- propagate(expr = EXPR2, data = DF2, type = "stat", do.sim = TRUE, verbose = TRUE, nsim = 100000) RES2 ## With the 'summary' function, we can get the ## Welch-Satterthwaite DF's, coverage, expanded uncertainty, ## Gradient and Hessian matrix etc. summary(RES2) ## Example using a recursive function: ## no Taylor expansion possible, only Monte-Carlo. a <- c(5, 0.1) b <- c(100, 2) DAT <- cbind(a, b) f <- function(a, b) { N <- 0 for (i in 1:100) { N <- N + i * log(a) + b^(1/i) } return(N) } propagate(f, DAT, nsim = 100000) ## Not run: ################# GUM 2008 (1) ######################## ## Example in Annex H.1 from the GUM 2008 manual ## (see 'References'), an end gauge calibration ## study. We use only first-order error propagation, ## with total df = 16 and alpha = 0.01, ## as detailed in GUM H.1.6. EXPR3 <- expression(ls + d - ls * (da * the + as * dt)) ls <- c(50000623, 25) d <- c(215, 9.7) da <- c(0, 0.58E-6) the <- c(-0.1, 0.41) as <- c(11.5E-6, 1.2E-6) dt <- c(0, 0.029) DF3 <- cbind(ls, d, da, the, as, dt) RES3 <- propagate(expr = EXPR3, data = DF3, second.order = FALSE, df = 16, alpha = 0.01) RES3 ## propagate: sd.1 = 31.71 ## GUM H.1.4/H.6c: u = 32 ## Expanded uncertainty, from summary function. summary(RES3) ## propagate: 92.62 ## GUM H.1.6: 93 ## Proof that covariance of Monte-Carlo ## simulated dataset is "fairly"" the same ## as from initial data. RES3$covMat cov(RES3$datSIM) all.equal(RES3$covMat, cov(RES3$datSIM)) ## Now using second-order Taylor expansion. RES4 <- propagate(expr = EXPR3, data = DF3) RES4 ## propagate: sd.2 = 33.91115 ## GUM H.1.7: u = 34. ## Also similar to the non-matrix-based approach ## in Wang et al. (2005, page 408): u1 = 33.91115. ## NOTE: After second-order correction ("sd.2"), ## uncertainty is more similar to the uncertainty ## obtained from Monte Carlo simulation! #################### GUM 2008 (2) ################# ## Example in Annex H.2 from the GUM 2008 manual ## (see 'References'), simultaneous resistance ## and reactance measurement. data(H.2) ## This gives exactly the means, uncertainties and ## correlations as given in Table H.2: colMeans(H.2) sqrt(colVarsC(H.2))/sqrt(5) cor(H.2) ## H.2.3 Approach 1 using mean values and ## standard uncertainties: EXPR6a <- expression((V/I) * cos(phi)) ## R EXPR6b <- expression((V/I) * sin(phi)) ## X EXPR6c <- expression(V/I) ## Z MEAN6 <- colMeans(H.2) SD6 <- sqrt(colVarsC(H.2)) DF6 <- rbind(MEAN6, SD6) COV6ab <- cov(H.2) ## covariance matrix of V, I, phi COV6c <- cov(H.2[, 1:2]) ## covariance matrix of V, I RES6a <- propagate(expr = EXPR6a, data = DF6, cov = COV6ab) RES6b <- propagate(expr = EXPR6b, data = DF6, cov = COV6ab) RES6c <- propagate(expr = EXPR6c, data = DF6[, 1:2], cov = COV6c) ## This gives exactly the same values of mean and sd/sqrt(5) ## as given in Table H.4. RES6a$prop # 0.15892/sqrt(5) = 0.071 RES6b$prop # 0.66094/sqrt(5) = 0.296 RES6c\$prop # 0.52846/sqrt(5) = 0.236 ######### GUM 2008 Supplement 1 (1) ####################### ## Example from 9.2.2 of the GUM 2008 Supplement 1 ## (see 'References'), normally distributed input ## quantities. Assign values as in 9.2.2.1. EXPR7 <- expression(X1 + X2 + X3 + X4) X1 <- c(0, 1) X2 <- c(0, 1) X3 <- c(0, 1) X4 <- c(0, 1) DF7 <- cbind(X1, X2, X3, X4) RES7 <- propagate(expr = EXPR7, data = DF7, nsim = 1E5) ## This will give exactly the same results as in ## 9.2.2.6, Table 2. RES7 ######### GUM 2008 Supplement 1 (2) ####################### ## Example from 9.3 of the GUM 2008 Supplement 1 ## (see 'References'), mass calibration. ## Formula 24 in 9.3.1.3 and values as in 9.3.1.4, Table 5. EXPR8 <- expression((Mrc + dMrc) * (1 + (Pa - Pa0) * ((1/Pw) - (1/Pr))) - Mnom) Mrc <- rnorm(1E5, 100000, 0.050) dMrc <- rnorm(1E5, 1.234, 0.020) Pa <- runif(1E5, 1.10, 1.30) ## E(Pa) = 1.2, (b-a)/2 = 0.1 Pw <- runif(1E5, 7000, 9000) ## E(Pw) = 8000, (b-a)/2 = 1000 Pr <- runif(1E5, 7950, 8050) ## E(Pr) = 8000, (b-a)/2 = 50 Pa0 <- 1.2 Mnom <- 100000 DF8 <- cbind(Mrc, dMrc, Pa, Pw, Pr, Pa0, Mnom) RES8 <- propagate(expr = EXPR8, data = DF8, nsim = 1E5) ## This will give exactly the same results as in ## 9.3.2.3, Table 6 RES8 RES8 ######### GUM 2008 Supplement 1 (3) ####################### ## Example from 9.4 of the GUM 2008 Supplement 1 ## (see 'References'), comparioson loss in microwave ## power meter calibration, zero covariance. ## Formula 28 in 9.4.1.5 and values as in 9.4.1.7. EXPR9 <- expression(X1^2 - X2^2) X1 <- c(0.050, 0.005) X2 <- c(0, 0.005) DF9 <- cbind(X1, X2) RES9a <- propagate(expr = EXPR9, data = DF9, nsim = 1E5) ## This will give exactly the same results as in ## 9.4.2.2.7, Table 8, x1 = 0.050. RES9a ## Using covariance matrix with r(x1, x2) = 0.9 ## We convert to covariances using cor2cov. COR9 <- matrix(c(1, 0.9, 0.9, 1), nrow = 2) COV9 <- cor2cov(COR9, c(0.005^2, 0.005^2)) colnames(COV9) <- c("X1", "X2") rownames(COV9) <- c("X1", "X2") RES9b <- propagate(expr = EXPR9, data = DF9, cov = COV9) ## This will give exactly the same results as in ## 9.4.3.2.1, Table 9, x1 = 0.050. RES9b ######### GUM 2008 Supplement 1 (4) ####################### ## Example from 9.5 of the GUM 2008 Supplement 1 ## (see 'References'), gauge block calibration. ## Assignment of PDF's as in Table 10 of 9.5.2.1. EXPR10 <- expression(Ls + D + d1 + d2 - Ls *(da *(t0 + Delta) + as * dt) - Lnom) Lnom <- 50000000 Ls <- propagate:::rst(1000000, mean = 50000623, sd = 25, df = 18) D <- propagate:::rst(1000000, mean = 215, sd = 6, df = 25) d1 <- propagate:::rst(1000000, mean = 0, sd = 4, df = 5) d2 <- propagate:::rst(1000000, mean = 0, sd = 7, df = 8) as <- runif(1000000, 9.5E-6, 13.5E-6) t0 <- rnorm(1000000, -0.1, 0.2) Delta <- propagate:::rarcsin(1000000, -0.5, 0.5) da <- propagate:::rctrap(1000000, -1E-6, 1E-6, 0.1E-6) dt <- propagate:::rctrap(1000000, -0.050, 0.050, 0.025) DF10 <- cbind(Ls, D, d1, d2, as, t0, Delta, da, dt, Lnom) RES10 <- propagate(expr = EXPR10, data = DF10, cov = FALSE, alpha = 0.01) RES10 ## This gives the same results as in 9.5.4.2, Table 11. ## However: results are exacter than in the GUM 2008 ## manual, especially when comparing sd(Monte Carlo) with sd.2! ## GUM 2008 gives 32 and 36, respectively. RES10 ########## Comparison to Pythons 'soerp' ################### ## Exactly the same results as under ## https://pypi.python.org/pypi/soerp ! EXPR11 <- expression(C * sqrt((520 * H * P)/(M *(t + 460)))) H <- c(64, 0.5) M <- c(16, 0.1) P <- c(361, 2) t <- c(165, 0.5) C <- c(38.4, 0) DAT11 <- makeDat(EXPR11) RES11 <- propagate(expr = EXPR11, data = DAT11) RES11 ## End(Not run) 

propagate documentation built on May 7, 2018, 1:03 a.m.