uncertainty | R Documentation |
Quantifying uncertainty for fitted objects.
uncertainty(object, ...) ## S3 method for class 'opticut' uncertainty(object, which = NULL, type = c("asymp", "boot", "multi"), B = 99, cl = NULL, ...) ## S3 method for class 'multicut' uncertainty(object, which = NULL, type = c("asymp", "boot"), B = 99, cl = NULL, ...) check_strata(x, mat) ## S3 method for class 'uncertainty' strata(object, ...) ## S3 method for class 'uncertainty' subset(x, subset=NULL, ...) ## S3 method for class 'uncertainty' bestpart(object, ...) ## S3 method for class 'uncertainty1' bestpart(object, ...) ## S3 method for class 'uncertainty1' print(x, ...) ## S3 method for class 'uncertainty' print(x, ...) ## S3 method for class 'summary.uncertainty' print(x, sort, digits, ...) ## S3 method for class 'uncertainty' summary(object, level = 0.95, ...) ## S3 method for class 'uncertainty' as.data.frame(x, row.names = NULL, optional = FALSE, sort, ...) ## S3 method for class 'summary.uncertainty' as.data.frame(x, row.names = NULL, optional = FALSE, sort, ...) ## S3 method for class 'uncertainty1' bsmooth(object, ...) ## S3 method for class 'uncertainty' bsmooth(object, ...)
object |
fitted model object
(which should not contain extra arguments as part of |
which |
numeric or character (can be a vector) defining
a subset of species from the fitted object,
or or |
type |
character, describing the type of uncertainty calculation. See Details. |
B |
numeric, number of iterations. For |
cl |
a cluster object, or an integer for multiple cores in parallel computations (integer value for forking is ignored on Windows). |
x |
an object to be printed. |
level |
the confidence level required. |
sort |
logical value indicating if species
should be meaningfully sorted, the default is |
digits |
numeric, number of significant digits in output. |
mat |
a matrix with resampling indices (rows as samples, columns as iterations). |
row.names |
|
optional |
logical. If |
subset |
logical, numeric, or character index indicating species to keep, missing values are not accepted. |
... |
other arguments passed to the underlying functions. |
Uncertainty is calculated for
indicator potential I
, and expected values
(mu0
, and mu1
for opticut, and mu_*
for multicut objects).
"asymp"
: asymptotic distribution is based on best supported model
(this option is unavailable for custom distribution functions because
it requires the Hessian matrix).
This type is available for both opticut and multicut objects.
"boot"
: non-parametric bootstrap distribution
based on best partition found for the input object.
This type is available for both opticut and multicut objects.
"multi"
: non-parametric bootstrap distribution
based on best partition found for the bootstrap data (i.e.
the model ranking is re-evaluated each time).
"multi"
works only if comb = "rank"
in the
opticut
call.
This type is not available for multicut objects.
uncertainty
returns an object of class 'uncertainty'.
The uncertainty
element of the object is a list with species specific
output as elements (object class 'uncertainty1').
Each 'uncertainty1' output is a data frame with columns:
best
partition, indicator potential I
,
and expected values
(mu0
, and mu1
for opticut, and mu_*
for multicut objects).
check_strata
returns a logical vector checking if
all original strata from the input object are represented
by resampling indices. Number of strata are attached as attributes
for further diagnostics.
The summary method prints the name of the best supported split,
selection frequency (R, reliability), indicator values (I, based on
the distribution of values within the best supported split with highest
reliability) and confidence interval for I (based on level
).
The subset
method subsets the species in the uncertainty object.
bestpart
finds the selection frequencies for
strata as best partitions (number of strata x number of species).
The coercion method as.data.frame
returns a data frame.
The bsmooth
method returns bootstrap smoothed results
for each strata (not available for multicut based uncertainty objects,
check uncertainty results instead).
Resampling methods can lead to complete exclusion of certain strata
when sample size is small. Try revising the stratification of the
input object, or provide custom resampling indices via the B
argument using stratified (block) bootstrap, jackknife (leave-one-out),
or similar techniques. Finding a suitable random seed
via set.seed
or dropping unsuitable iterations
can also resolve the issue.
Peter Solymos <solymos@ualberta.ca>
opticut
and multicut
for the
user interface of the input objects.
set.seed(2345) n <- 50 x0 <- sample(1:4, n, TRUE) x1 <- ifelse(x0 %in% 1:2, 1, 0) x2 <- rnorm(n, 0.5, 1) x3 <- ifelse(x0 %in% 2:4, 1, 0) lam1 <- exp(0.5 + 1*x1 + -0.2*x2) Y1 <- rpois(n, lam1) lam2 <- exp(1 + 0.5*x3) Y2 <- rpois(n, lam2) Y3 <- rpois(n, exp(0)) Y <- cbind(Spp1=Y1, Spp2=Y2, Spp3=Y3) oc <- opticut(Y ~ x2, strata=x0, dist="poisson", comb="rank") ## asymptotic confidence intervals (uc1 <- uncertainty(oc, type="asymp", B=999)) summary(uc1) ## bootstrap-based confidence intervals (uc2 <- uncertainty(oc, type="boot", B=19)) summary(uc2) ## use user-supplied indices ## multi-model bootstrap based uncertainties B <- replicate(25, sample.int(n, replace=TRUE)) check_strata(oc, B) # check representation (uc3 <- uncertainty(oc, type="multi", B=B)) summary(uc3) ## best partitions: ## selection frequencies for strata and species bestpart(uc3) heatmap(bestpart(uc3), scale="none", col=occolors()(25)) ## bootstrap smoothed predictions per strata bsmooth(uc3) heatmap(bestpart(uc3), scale="none", col=occolors()(25)) ## individual species results uc3$uncertainty bestpart(uc3$uncertainty[[1]]) bsmooth(uc3$uncertainty[[1]]) ## Not run: ## block bootstrap block_fun <- function() unlist(lapply(unique(x0), function(z) if (sum(x0==z) < 2) which(x0==z) else sample(which(x0==z), sum(x0==z), replace=TRUE))) B <- replicate(25, block_fun()) check_strata(oc, B) # check representation summary(uncertainty(oc, type="multi", B=B)) ## jackknife B <- sapply(1:n, function(i) which((1:n) != i)) check_strata(oc, B) # check representation summary(uncertainty(oc, type="multi", B=B)) ## multicut based uncertainty mc <- multicut(Y ~ x2, strata=x0, dist="poisson") ## asymptotic confidence intervals (muc1 <- uncertainty(mc, type="asymp", B=999)) summary(muc1) bestpart(muc1) ## bootstrap-based confidence intervals (muc2 <- uncertainty(mc, type="boot", B=19)) summary(muc2) bestpart(muc2) ## dolina example data(dolina) ## stratum as ordinal dolina$samp$stratum <- as.integer(dolina$samp$stratum) ## filter species to speed up things a bit Y <- ifelse(dolina$xtab[,colSums(dolina$xtab > 0) >= 20] > 0, 1, 0) ## opticut results, note the cloglog link function dol <- opticut(Y ~ stratum + lmoist + method, data=dolina$samp, strata=dolina$samp$mhab, dist="binomial:cloglog") ## parallel computing for uncertainty library(parallel) cl <- makeCluster(2) ucdol <- uncertainty(dol, type="multi", B=25, cl=cl) stopCluster(cl) bestpart(ucdol) heatmap(t(bestpart(ucdol)), scale="none", col=occolors()(25), distfun=function(x) dist(x, "manhattan")) ## See how indicator value changes with different partitions ## (and why it is the wrong metric to use in this calse) with(ucdol$uncertainty[["pvic"]], boxplot(I ~ best, col="gold", ylab="Indicator value")) ## What we should calculate is the bootstrap smoothed mean of the ## expected value and its confidence intervals bs <- bsmooth(ucdol$uncertainty[["pvic"]]) boxplot(t(bs), ylab="Expected value") cbind(Mean=rowMeans(bs), t(apply(bs, 1, quantile, probs=c(0.025, 0.975)))) ## A more interesting simulated example for bootstrap smoothing ## and comparing opticut vs. multicut set.seed(1) n <- 2000 x <- sort(runif(n, -8, 8)) p <- plogis(0.5 + -0.1 * x + -0.2 * x^2) y <- rbinom(n, 1, p) d <- diff(range(x))/10 br <- seq(min(x), max(x), by=d) g <- cut(x, br, include.lowest=TRUE) levels(g) <- LETTERS[1:nlevels(g)] o <- opticut(y ~ 1, strata=g, dist="binomial") m <- multicut(y ~ 1, strata=g, dist="binomial") library(parallel) cl <- makeCluster(2) uo <- uncertainty(o, type="multi", B=99, cl=cl) um <- uncertainty(m, type="boot", B=99, cl=cl) stopCluster(cl) ## bootstrap average for opticut bs <- bsmooth(uo$uncertainty[[1]]) stat <- cbind(Mean=rowMeans(bs), t(apply(bs, 1, quantile, probs=c(0.025, 0.975)))) ## bootstrap average for multicut bsm <- as.matrix(um$uncertainty[[1]][,-(1:2)]) statm <- cbind(Mean=colMeans(bsm), t(apply(bsm, 2, quantile, probs=c(0.025, 0.975)))) op <- par(mfrow=c(2,1)) plot(p ~ x, type="l", ylim=c(0,1), main="Binary partitions (opticut)") abline(v=br, col="grey", lty=3) lines(br[-1]-0.5*d, stat[,1], col=4) lines(br[-1]-0.5*d, stat[,2], col=4, lty=2) lines(br[-1]-0.5*d, stat[,3], col=4, lty=2) lines(br[-1]-0.5*d, bs[,1], col=2) legend("topright", bty="n", lty=c(1,1,2,1), col=c(1,4,4,2), legend=c("True response","bsmooth","0.95 CI","Best partition")) plot(p ~ x, type="l", ylim=c(0,1), main="Multi-level model (multicut)") abline(v=br, col="grey", lty=3) lines(br[-1]-0.5*d, statm[,1], col=4) lines(br[-1]-0.5*d, statm[,2], col=4, lty=2) lines(br[-1]-0.5*d, statm[,3], col=4, lty=2) legend("topright", bty="n", lty=c(1,1,2), col=c(1,4,4), legend=c("True response","bsmooth","0.95 CI")) par(op) ## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.