gjamPredict | R Documentation |
Predicts data from a gjam object, including conditional and out-of-sample prediction.
gjamPredict(output, newdata = NULL, y2plot = NULL, ylim = NULL, FULL = FALSE)
output |
object of |
newdata |
a |
y2plot |
|
ylim |
vector of lower and upper bounds for prediction plot |
FULL |
will return full chains for predictions as |
If newdata
is not specified, the response is predicted from xdata
as an in-sample prediction. If newdata
is specified, prediction is either conditional or out-of-sample.
Conditional prediction on a new set of y
values is done if newdata
includes the matrix ycondData
, which holds columns to condition on. ycondData
must be a matrix
and have column names matching those in y
that it will replace. ycondData
must have at least one column, but fewer than ncol(y)
columns. Columns not included in ycondData
will be predicted conditionally. Note that conditional prediction can be erratic if the observations on which the prediction is conditioned are unlikely given the model.
Alternatively, the list newdata
can include a new version of xdata
for out-of-sample prediction. The version of xdata
passed in newdata
has the columns with the same names and variable types as xdata
passed to gjam
. Note that factor levels must also match those included when fitting the model. All columns in y
will be predicted out-of-sample.
For count composition data the effort (total count) is 1000.
Because there is no out-of-sample effort for 'CC'
data, values are predicted on the [0, 1] scale.
See examples below.
|
design matrix. |
|
|
|
predictive intervals, only generated if |
|
|
|
effort |
|
full prediction chains if |
James S Clark, jimclark@duke.edu
Clark, J.S., D. Nemergut, B. Seyednasrollah, P. Turner, and S. Zhang. 2016. Generalized joint attribute modeling for biodiversity analysis: Median-zero, multivariate, multifarious data. Ecological Monographs, 87, 34-56.
gjamSimData
simulates data
A more detailed vignette is can be obtained with:
browseVignettes('gjam')
web site 'http://sites.nicholas.duke.edu/clarklab/code/'.
## Not run: S <- 10 f <- gjamSimData(n = 200, S = S, Q = 3, typeNames = 'CC') ml <- list(ng = 1500, burnin = 50, typeNames = f$typeNames, holdoutN = 10) out <- gjam(f$formula, f$xdata, f$ydata, modelList = ml) # predict data cols <- c('#018571', '#a6611a') par(mfrow=c(1,3),bty='n') gjamPredict(out, y2plot = colnames(f$ydata)) # predict the data in-sample title('full sample') # out-of-sample prediction xdata <- f$xdata[1:20,] xdata[,3] <- mean(f$xdata[,3]) # mean for x[,3] xdata[,2] <- seq(-2,2,length=20) # gradient x[,2] newdata <- list(xdata = xdata, nsim = 50 ) p1 <- gjamPredict( output = out, newdata = newdata) # plus/minus 1 prediction SE, default effort = 1000 x2 <- p1$x[,2] ylim <- c(0, max(p1$sdList$yMu[,1] + p1$sdList$yPe[,1])) plot(x2, p1$sdList$yMu[,1],type='l',lwd=2, ylim=ylim, xlab='x2', ylab = 'Predicted', col = cols[1]) lines(x2, p1$sdList$yMu[,1] + p1$sdList$yPe[,1], lty=2, col = cols[1]) lines(x2, p1$sdList$yMu[,1] - p1$sdList$yPe[,1], lty=2, col = cols[1]) # upper 0.95 prediction error lines(x2, p1$piList$yLo[,1], lty=3, col = cols[1]) lines(x2, p1$piList$yHi[,1], lty=3, col = cols[1]) title('SE and prediction, Sp 1') # conditional prediction, assume first species is absent ydataCond <- out$inputs$y[,1,drop=FALSE]*0 # set first column to zero newdata <- list(ydataCond = ydataCond, nsim=50) p0 <- gjamPredict(output = out, newdata = newdata) ydataCond <- ydataCond + 10 # first column is 10 newdata <- list(ydataCond = ydataCond, nsim=50) p1 <- gjamPredict(output = out, newdata = newdata) s <- 4 # species chosen at random to compare ylim <- range( p0$sdList$yMu[,s], p1$sdList$yMu[,s] ) plot(out$inputs$y[,s],p0$sdList$yMu[,s], cex=.2, col=cols[1], xlab = 'Observed', ylab = 'Predicted', ylim = ylim) abline(0,1,lty=2) points(out$inputs$y[,s],p1$sdList$yMu[,s], cex=.2, col=cols[2]) title('Cond. on 1st Sp') legend( 'topleft', c('first species absent', 'first species = 10'), text.col = cols, bty = 'n') # conditional, out-of-sample n <- 1000 S <- 10 f <- gjamSimData(n = n, S = S, Q = 3, typeNames = 'CA') holdOuts <- sort( sample(n, 50) ) xdata <- f$xdata[-holdOuts,] # fitted data ydata <- f$ydata[-holdOuts,] xx <- f$xdata[holdOuts,] # use for prediction yy <- f$ydata[holdOuts,] ml <- list(ng = 2000, burnin = 500, typeNames = f$typeNames) # fit the non-holdouts out <- gjam(f$formula, xdata, ydata, modelList = ml) cdex <- sample(S, 4) # condition on 4 species ndex <- c(1:S)[-cdex] # conditionally predict others newdata <- list(xdata = xx, ydataCond = yy[,cdex], nsim = 200) # conditionally predict out-of-sample p2 <- gjamPredict(output = out, newdata = newdata) par(bty='n', mfrow=c(1,1)) plot( as.matrix(yy[,ndex]), p2$sdList$yMu[,ndex], xlab = 'Observed', ylab = 'Predicted', cex=.3, col = cols[1]) abline(0,1,lty=2) title('RMSPE') mspeC <- sqrt( mean( (as.matrix(yy[,ndex]) - p2$sdList$yMu[,ndex])^2 ) ) #predict unconditionally, out-of-sample newdata <- list(xdata = xx, nsim = 200 ) p1 <- gjamPredict(out, newdata = newdata) points( as.matrix(yy[,ndex]), p1$sdList$yMu[,ndex], col=cols[2], cex = .3) mspeU <- sqrt( mean( (as.matrix(yy[,ndex]) - p1$sdList$yMu[,ndex])^2 ) ) e1 <- paste( 'cond, out-of-sample =', round(mspeC, 2) ) e2 <- paste( 'uncond, out-of-sample =', round(mspeU, 2) ) legend('topleft', c(e1, e2), text.col = cols, bty = 'n') ## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.