knitr::opts_chunk$set( collapse = TRUE, fig.width = 8, fig.height = 6, comment = "#>" )
require(stAirPol) require(spTimer) require(ggplot2) require(data.table)
We load the daily data of Munich for PM2.5
data("muc_airPol_p2")
remove outliers in the data
data <- clean_model_data(muc_airPol_p2, timesIQR = 1.5)
Now we want to fit a Gaussian Process model, and analyze the parameters and the model himself. We use the following formula to specify the covariables in the linear part of the model
formula = value ~ humi + temp + rainhist + windhist + trafficvol + log(sensor_age)
There are tree most used model types, - Gaussian Process short GP - Autoregressive Gaussian Process short AR - Gaussian Predictive Process short GPP First we fit only the Gaussian Process For more model types, see ?spT.Gibbs
model.gp.p2 <- fit_sp_model(data = data, formula = formula, model = 'GP')
With summary we get a brief overview of the fitted parameters, and the PMCC of the model
summary(model.gp.p2)
Trace plots of the MCMC-Iterations Note: make the plot window in RStudio as big as possible
# plot(model.gp.p2) # dev.off()
Now we can do a deeper specification of the Model First, we add our own Prioris to the model, for more details see ?spT.priors
priors <- spT.priors(model = "GP", inv.var.prior = Gamm(a = 2, b = 1), beta.prior = Norm(0, 10^4))
Covariance function for the spatial effects we want to use the exponential covariance, for more choices see ?spT.Gibbs
cov.fnc = "exponential"
We change the method how the parameter for the spatial decay should be fitted, see ?spT.decay for more details We choose a Gamma(2,1) Prior distribution with the tuning parameter 0.1, the tuning parameter should be chosen that the acceptance rate for phi is between 20%-40%.
spatial.decay = spT.decay(distribution = Gamm(a = 2, b = 1), tuning = 0.25)
If it's wanted to get some feedback during the fitting process, we can choose a number of feedback messages we want to arrive
report = 5
We are able to perform a on the fly transformations of the target variable
scale.transform = "SQRT" model.gp.p2.mod <- fit_sp_model(data = data, formula = formula, model = 'GP', priors = priors, cov.fnc = cov.fnc, report = report, scale.transform = scale.transform, spatial.decay = spatial.decay) summary(model.gp.p2.mod)
Now we have created two different kinds of Gaussian Process Models. With a crossvalidatioin we want to the them against each other. Therefore, we specify a train and a test set with 75% random training sensors
training_set <- get_test_and_training_set(data, sampel_size = 0.75, random.seed = 220292)
Now we are able the train the models only the the chosen training sensors by adding training_set = training_set into the function
model.gp.p2 <- fit_sp_model(data = data, formula = formula, model = 'GP', training_set = training_set) model.gp.p2.mod <- fit_sp_model(data = data, formula = formula, model = 'GP', priors = priors, cov.fnc = cov.fnc, report = report, training_set = training_set, scale.transform = scale.transform, spatial.decay = spatial.decay)
we are able to perform with the well-known predict function predictions for that models, by adding the training_set, only predictions for the test sensors will be calculated.
pred.gp.p2 <- predict(model.gp.p2, data, training_set) pred.gp.p2.mod <- predict(model.gp.p2.mod, data, training_set)
To evaluate the predictive performance ,we use evaluate_prediction()
evaluate_prediction(pred.gp.p2) evaluate_prediction(pred.gp.p2.mod)
There are also some validation plots availible Fitted vs. predicted:
plot(pred.gp.p2.mod)
Fitted vs. predicted separated by time:
plot(pred.gp.p2.mod, time_dimension = TRUE)
There are more Models besides the Gaussian Process. For example, the autoregressive Gaussian Process:
priors.ar <- spT.priors(model = "AR", inv.var.prior = Gamm(a = 2, b = 1), beta.prior = Norm(0, 10^4), rho.prior=Norm(0,10^10)) model.ar.p2 <- fit_sp_model(data = data, formula = formula, model = 'AR', priors = priors.ar, cov.fnc = cov.fnc, report = report, training_set = training_set, scale.transform = scale.transform, spatial.decay = spatial.decay) pred.ar.p2 <- predict(model.ar.p2, data, training_set) evaluate_prediction(pred.ar.p2)
The last model is the Gaussian Predictive Process, here we have to specify two more things: - How much knots should be used? - How should the knots be placed? Detailed options for that question are shown in 4_advanced_modelling.R
priors.gpp <- spT.priors(model = "GPP", inv.var.prior = Gamm(a = 2, b = 1), beta.prior = Norm(0, 10^4)) model.gpp.p2 <- fit_sp_model(data = data, formula = formula, model = 'GPP', priors = priors.gpp, cov.fnc = cov.fnc, knots_count = 4, report = report, training_set = training_set, scale.transform = scale.transform, spatial.decay = spatial.decay) pred.gpp.p2 <- predict(model.gpp.p2, data, training_set) evaluate_prediction(pred.gpp.p2)
evaluate_prediction_table(list('pred.gp.p2' = pred.gp.p2, 'pred.gp.p2.mod' = pred.gp.p2.mod, 'pred.ar.p2' = pred.ar.p2, 'pred.gpp.p2' = pred.gpp.p2)) gridExtra::grid.arrange(grobs = list( plot(pred.gp.p2) + ggtitle('GP'), plot(pred.gp.p2.mod) + ggtitle('mod GP'), plot(pred.ar.p2) + ggtitle('AR'), plot(pred.gpp.p2) + ggtitle('GPP') ))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.