sens.dynaTree: Monte Carlo Sensitivity Analysis (R Documentation)
Description

A Monte Carlo sensitivity analysis using random Latin hypercube samples (LHSs) or bootstrap resamples for each particle to estimate main effects as well as 1st-order and total sensitivity indices.
Usage

## S3 method for class 'dynaTree'
sens(object, class = NULL, nns = 1000, nME = 100,
     span = 0.3, method = c("lhs", "boot"),
     lhs = NULL, categ = NULL, verb = 0)
Arguments

object
    a "dynaTree"-class object built by dynaTree

class
    only valid for object$model = "class"; a vector of class labels to which the sensitivity analysis should be restricted; the default of NULL uses all classes

nns
    a positive integer scalar indicating the size of each LHS or bootstrap drawn for use in the Monte Carlo integration scheme underlying the sensitivity analysis; the total number of locations is nns*(ncol(object$X)+2)

nME
    a positive integer scalar indicating the number of grid points, in each input dimension, upon which main effects will be estimated

span
    a positive real-valued scalar giving the smoothing parameter for main effects integration: the fraction of the nns locations included in the moving-average window used to estimate the main effects at each of the nME grid points

method
    indicates whether a Latin hypercube sample ("lhs") or a bootstrap resample of object$X ("boot") should be used

lhs
    if method = "lhs", a list describing the marginal distributions of the Latin hypercube, as for the lhs argument to sens in the tgp package; ignored when method = "boot"

categ
    a vector of logicals of length ncol(object$X) indicating which, if any, input dimensions should be treated as categorical; the default of NULL treats all inputs as real-valued

verb
    a positive scalar integer indicating how many predictive locations (iterations) after which a progress statement should be printed to the console; a (default) value of verb = 0 is quiet
Details

Saltelli (2002) describes a Latin hypercube sampling based method for estimation of the Sobol sensitivity indices:

1st Order for input i:

    S(i) = \mbox{Var}(E[f|x_i])/\mbox{Var}(f),

where x_i is the i-th input.

Total Effect for input i:

    T(i) = E[\mbox{Var}(f|x_{-i})]/\mbox{Var}(f),

where x_{-i} is all inputs except the i-th.

All moments are with respect to the appropriate marginals of the uncertainty distribution U, that is, the probability distribution on the inputs with respect to which sensitivity is being investigated. Under this approach, the integrals involved are approximated through averages over properly chosen samples based on two LH samples proportional to U. If nns is the sample size for the Monte Carlo estimate, this scheme requires nns*(ncol(X)+2) function evaluations.
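The sampling scheme above can be sketched for a known test function. This is a minimal, hedged illustration, not code from the dynaTree package: it uses plain uniform Monte Carlo draws in place of a genuine Latin hypercube, and the later Saltelli/Jansen estimator forms rather than the 2002 formulas, but the nns*(ncol(X)+2) evaluation structure is the same.

```r
## toy function with an interaction and an inert third input
f <- function(X) X[,1] + 2*X[,2] + X[,1]*X[,2]

nns <- 10000; d <- 3
A <- matrix(runif(nns*d), nns, d)  ## first sample
B <- matrix(runif(nns*d), nns, d)  ## second, independent sample
fA <- f(A); fB <- f(B)
V <- var(c(fA, fB))                ## total variance of f

S <- Tt <- numeric(d)
for(i in 1:d) {
  ABi <- A; ABi[,i] <- B[,i]       ## A with column i taken from B
  fABi <- f(ABi)                   ## d extra blocks: nns*(d+2) evaluations total
  S[i]  <- mean(fB * (fABi - fA)) / V   ## 1st-order index estimate
  Tt[i] <- mean((fA - fABi)^2) / (2*V)  ## total-effect index estimate
}
round(rbind(S, Tt), 2)  ## the inert x3 gives exactly zero in both rows
```

Because f has an interaction between x1 and x2, each 1st-order index is strictly smaller than the corresponding total-effect index, while the inert x3 scores zero on both.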
The sens.dynaTree function implements the method for unknown functions f, through prediction via one of the tgp regression models conditional on an observed set of X locations. For each particle, treated as a sample from the dynaTree model posterior, the nns*(ncol(X)+2) locations are drawn randomly from the LHS scheme and realizations of the sensitivity indices are calculated. Thus we obtain a posterior sample of the indices, incorporating variability from both the Monte Carlo estimation and uncertainty about the function output. Since a subset of the predictive locations are actually an LHS proportional to the uncertainty distribution, we can also estimate the main effects through simple non-parametric regression (a moving average). See the Gramacy, Taddy & Wild (2011) reference below for more details.

If method = "boot" is used then simply replace the LHS above with a bootstrap resample of the object$X locations.
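For example, assuming a fitted "dynaTree" object named out (the name is hypothetical here), the bootstrap variant is requested via the method argument:

```r
## sensitivity analysis with bootstrap resamples of out$X
## in place of fresh Latin hypercube draws
out <- sens(out, method = "boot")
```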
As with prediction, the dynaTrees function enables repeated calls to sens.dynaTree.
Value

The object returned is of class "dynaTree", which includes a copy of the list elements from the object passed in, with the following (sensitivity-analysis specific) additions.

MEgrid
    an nME-by-ncol(object$X) matrix containing the main-effects predictive grid at which the following MEmean, MEq1 and MEq2 quantities were obtained

MEmean
    a matrix with nME rows and ncol(object$X) columns containing the estimated mean main effects for each input dimension

MEq1
    same as MEmean, but containing the 5% quantiles

MEq2
    same as MEmean, but containing the 95% quantiles

S
    a matrix with object$N rows and ncol(object$X) columns containing the posterior sample of the 1st-order Sobol sensitivity indices, one realization per particle

T
    same as S, but containing the total-effect indices

In the case of object$model = "class" the entries listed above will themselves be lists with an entry for each class specified on input, or all classes as is the default.
Note

The quality of the sensitivity analysis depends on the size of the LHSs used for integral approximation; as with any Monte Carlo integration scheme, the sample size (nns) must increase with the dimensionality of the problem. The total sensitivity indices T are forced non-negative; if negative values occur it is necessary to increase nns. Postprocessing replaces negative values with NA.
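A quick hedged check for this situation, again assuming a fitted object named out with sensitivity results attached:

```r
## NA entries in out$T indicate the Monte Carlo sample was too small;
## re-run the analysis with a larger nns
if(any(is.na(out$T))) out <- sens(out, nns = 10000)
```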
Author(s)

Robert B. Gramacy (rbg@vt.edu), Matt Taddy, and Christoforos Anagnostopoulos
References

Saltelli, A. (2002). Making best use of model evaluations to compute sensitivity indices. Computer Physics Communications, 145, 280-297.

Gramacy, R.B., Taddy, M.A., and Wild, S. (2011). "Variable Selection and Sensitivity Analysis via Dynamic Trees with an Application to Computer Code Performance Tuning." arXiv:1108.4739.
https://bobby.gramacy.com/r_packages/dynaTree/
See Also

dynaTree, predict.dynaTree, relevance.dynaTree, varpropuse, varproptotal
Examples

## friedman data
if(require("tgp")) {
f <- friedman.1.data(1000)
X <- f[,1:6]
Z <- f$Y
## fit the model and do the sensitivity analysis
N <- 100 ## use N >= 1000 for better results
## small N is for fast CRAN checks
out <- dynaTree(X=X, y=Z, N=N, ab=c(0.01,2))
## also try with model="linear"
## gather relevance statistics
out <- relevance(out)
boxplot(out$relevance)
abline(h=0, col=2, lty=2)
## relevance stats are not as useful when model="linear"
## since it will appear that x4 and x5 not helpful; these
## interact linearly with the response
## full simulation-based sensitivity analysis, the dynaTree::
## part is only needed if the tgp package is loaded
out <- dynaTree::sens(out, verb=100)
## plot the main effects
r <- range(c(out$MEmean, out$MEq1, out$MEq2))
par(mfrow=c(1,ncol(out$X)), mar=c(5,3,2,1))
plot(out$MEgrid[,1], out$MEmean[,1], type="l", ylim=r, lwd=2,
ylab="", xlab=colnames(out$MEmean)[1])
lines(out$MEgrid[,1], out$MEq1[,1], lty=2, lwd=2)
lines(out$MEgrid[,1], out$MEq2[,1], lty=2, lwd=2)
if(ncol(out$X) > 1) {
for(d in 2:ncol(out$X)) {
plot(out$MEgrid[,d], out$MEmean[,d], col=d, type="l", ylim=r,
lwd=2, xlab=colnames(out$MEmean)[d], ylab="")
lines(out$MEgrid[,d], out$MEq1[,d], col=d, lty=2)
lines(out$MEgrid[,d], out$MEq2[,d], col=d, lty=2)
}
}
## Sobol indices
par(mfrow=c(1,2), mar=c(5,4,4,2))
boxplot(out$S, main="first order indices", xlab="inputs")
boxplot(out$T, main="total indices", xlab="inputs")
## these look better when model="linear"
## clean up
deletecloud(out)
## for a classification example using the sensitivity hooks
## in the dynaTrees function, see the class2d demo
## i.e., demo("class2d")
}