View source: R/predict.spmeshed.r
predict.spmeshed | R Documentation |
Sample from the posterior predictive distribution of the outcomes at new spatial or spatiotemporal locations after MCMC.
## S3 method for class 'spmeshed' predict(object, newx, newcoords, n_threads=4, verbose=FALSE, ...)
object |
Object output from |
newx |
matrix of covariate values at the new coordinates. |
newcoords |
matrix of new coordinates. |
n_threads |
integer number of OpenMP threads. This is ineffective if |
verbose |
boolean for progress messagging. |
... |
other arguments (unused). |
While this function can always be used to make predictions, in most cases it is more efficient to just include the prediction locations in the main data as NA
values; spmeshed
will sample from the posterior predictive distribution at those locations while doing MCMC. The predict
method is only recommended when all 4 of the following are true:
(1) spmeshed
was run with settings$forced_grid=FALSE
and
(2) the prediction locations are uniformly scattered on the domain (or rather, they are not clustered as a large empty area) and
(3) the number of prediction locations is a large portion of the number of observed data points and
(4) the prediction locations are not on a grid.
In all other cases the main spmeshed
function is setup to be more efficient in automatically performing predictions during MCMC.
coords_out |
matrix with the prediction location coordinates (order updated after predictions). |
preds_out |
array of dimension (n_{o}, q, m) where n_{o} is the number of prediction locations, q is the output dimension, m is the number of MCMC samples. |
Michele Peruzzi michele.peruzzi@duke.edu
Peruzzi, M., Banerjee, S., and Finley, A.O. (2020) Highly Scalable Bayesian Geostatistical Modeling via Meshed Gaussian Processes on Partitioned Domains. Journal of the American Statistical Association, in press. doi: 10.1080/01621459.2020.1833889
Peruzzi, M., Banerjee, S., Dunson, D.B., and Finley, A.O. (2021) Grid-Parametrize-Split (GriPS) for Improved Scalable Inference in Spatial Big Data Analysis. https://arxiv.org/abs/2101.03579
# toy example with tiny dataset and short MCMC # on a univariate outcome library(magrittr) library(dplyr) library(meshed) set.seed(2021) SS <- 12 n <- SS^2 # total n. locations, including missing ones coords <- data.frame(Var1=runif(n), Var2=runif(n)) %>% as.matrix() # generate data sigmasq <- 2.3 phi <- 6 tausq <- .1 B <- c(-1,.5,1) CC <- sigmasq * exp(-phi * as.matrix(dist(coords))) LC <- t(chol(CC)) w <- LC %*% rnorm(n) p <- length(B) X <- rnorm(n * p) %>% matrix(ncol=p) y_full <- X %*% B + w + tausq^.5 * rnorm(n) set_missing <- rbinom(n, 1, 0.1) simdata <- data.frame(coords, y_full = y_full, w_latent = w) %>% mutate(y_observed = ifelse(set_missing==1, NA, y_full)) # MCMC setup mcmc_keep <- 500 mcmc_burn <- 100 mcmc_thin <- 2 y <- simdata$y_observed ybar <- mean(y, na.rm=TRUE) # training set y_in <- (y-ybar)[!is.na(y)] X_in <- X[!is.na(y),] coords_in <- coords[!is.na(y),] # suppose we dont want to have gridded knots # i.e. we are fixing the MGP reference set at the observed locations # (this may be inefficient in big data settings) meshout <- spmeshed(y_in, X_in, coords_in, axis_partition=c(4,4), n_samples = mcmc_keep, n_burn = mcmc_burn, n_thin = mcmc_thin, settings = list(forced_grid=FALSE, cache=FALSE), prior=list(phi=c(1,15)), verbose = 0, n_threads = 1) # test set coords_out <- coords[is.na(y),] X_out <- X[is.na(y),] df_predict <- predict(meshout, newx=X_out, newcoords=coords_out) y_posterior_predictive_mean <- df_predict$preds_out[,1,] %>% apply(1, mean) %>% add(ybar) df_predicted <- df_predict$coords_out %>% cbind(y_posterior_predictive_mean)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.