The package performs Bayesian geostatistical spatial mapping of disease prevalence using prevalence survey data and environmental covariate raster files. Using an integrated nested Laplace approximation (INLA) framework, the outputs include mapped prevalence in the form of raster files, and measures of in-sample-fit and cross-validation.

```
devtools::install_github(repo = "suyunkang/PrevalenceMapping")
```

```
devtools::install_github(repo = "suyunkang/PrevalenceMapping", ref = "Documentation")
```

Below is an example using mock data and covariates which are included in the *"Documentation"* branch of the package. The full script may be found on GitHub at `inst/doc/Spatial_model_mock_data.R`

. Here I present a brief bare-bones example

- Load a response dataset
- Create a mesh for it
- Load in covariate raster files
- Fit a model with parameter selection (minimum DIC wins)
- Perform cross-validation
- Visualise the cross-validation results

First, specify working directory and load packages. We'll make use of a few `raster`

functions at the top level, so it's good to load it as we go as well. You may choose to use your own working directory, we're going to store the model plots and output in a new folder `Outputs`

.

```
dir.create("Outputs", showWarnings = FALSE)
setwd("Outputs")
library(PrevalenceMapping)
library(raster)
library(INLA)
```

Load the response data and do any pre-procsessing.

```
Resp = read.csv(file.path(.libPaths(), "PrevalenceMapping/doc/ResponseData/MOZ_mock_data.csv"))
Resp$examined = round(Resp$tested, 0)
Resp$n_positive = round(Resp$positive, 0)
```

The mock data has columns `latitude`

, `longitude`

, `tested`

and `positive`

, eg, the first five rows are

| latitude | longitude | tested | positive| |-|-|-|-| | -14.24 | 39.82 | 37.58 | 15.37 | | -24.93 | 31.82 | 6.11 | 0.00 | | -11.00 | 35.42 | 5.59 | 2.93 | | -15.95 | 33.48 | 20.54 | 11.56 | | -25.52 | 33.11 | 13.56 | 0.00 | | ... | | | |

With the response we make a mesh and build an SPDE model. The parameters in `control`

, `"convex"`

, `"concave"`

, `"max_edge"`

, and `"cutoff"`

may require some tuning, it's common to try a broad range of meshes before moving on to prediction later.

```
out_mesh = makeMeshSPDE(response = Resp,
control = list(convex = -0.1,
concave = -0.1,
max_edge = c(0.2, 0.4),
cutoff = 0.3),
plot_mesh = TRUE)
```

We have our dataset `Resp`

and our mesh `out_mesh`

, now stack the environmental covariate rasters using the example covariates `.tif`

files. We set any `NA`

's to be `-9999`

. The covariates in this data are useful malaria prevalence predictors like `distance to water`

, and several `land surface temperature`

among others.

```
folder = file.path(.libPaths(), "PrevalenceMapping/doc/Covariates")
lsf = grep("*.tif$", list.files(folder, full.names = TRUE), value = T)
covariate_stack = stack(lsf)
NAvalue(covariate_stack) = -9999
```

With the response, the mesh and the covariates loaded we fit and, with parameter selection, re-fit the model, choosing the "best" model as the one with the smallest deviance information criterion (DIC). This function produces `"Fixed_effects.txt"`

-- a table of coefficients for the final set of covariates.

```
mod_final = findModelWithSmallestDIC(response = Resp,
raster_stack = covariate_stack,
A_mat = out_mesh$A_mat,
spde = out_mesh$spde,
family = "binomial")
```

Using the model we predict prevalence on a grid/raster and compute posterior samples. Use the `nsamp`

parameter to choose number of desired realisations of the posterior distribution and `write_posterior`

to choose whether to write the posterior samples into `.tif`

files.

```
nsamp = 1000
mod_final = read.table("Fixed_effects.txt")
predict = predictionOnAGrid(response = Resp,
finalmodel = mod_final,
A_mat = out_mesh$A_mat,
spde = out_mesh$spde,
mesh = out_mesh$mesh,
family = "binomial",
raster_stack = covariate_stack,
nsamp = nsamp,
int.strategy = "eb",
write_posterior = TRUE)
```

Cross-validation statistics are simple to produce, use parameter `n_reps`

to choose number of desired replicates and inspect the output `validation_result`

.

```
n_reps = 100
test_pct = c(0.1, 0.2, 0.3, 0.4)
validation_result = vector("list", 4)
names(validation_result) = paste0(test_pct*100, "pct")
for (k in 1:length(test_pct)){
fn <- paste0("Cross_validation_", test_pct[k]*100, "pct.txt")
if (file.exists(fn)) file.remove(fn)
validation_result[[k]] = crossValidation(response = Resp,
finalmodel = mod_final,
A_mat = out_mesh$A_mat,
spde = out_mesh$spde,
family = "binomial",
raster_stack = covariate_stack,
int.strategy = "eb",
n_reps = n_reps,
pct_out = test_pct[k])
}
validation_result
```

Finally, we may also choose to visualise the validation results.

```
# pdf("Cross_validation_result.pdf")
par(mfrow = c(2,1), mar = c(3.5,4,3,1), mgp = c(2,1,0))
## Pearson correlation
DF = data.frame(Cor10 = validation_result$`10pct`[,1],
Cor20 = validation_result$`20pct`[,1],
Cor30 = validation_result$`30pct`[,1],
Cor40 = validation_result$`40pct`[,1])
boxplot(DF,
names = c("10%", "20%", "30%", "40%"),
ylab = expression(rho),
xlab = "Percentage of validation set",
col = 2:6,
main = "(A) Pearson correlation coefficient",
cex.main = 1)
## Cross-validated R-squared
DF2 = data.frame(Cor10 = validation_result$`10pct`[,2],
Cor20 = validation_result$`20pct`[,2],
Cor30 = validation_result$`30pct`[,2],
Cor40 = validation_result$`40pct`[,2])
boxplot(DF2,
names = c("10%", "20%", "30%", "40%"),
ylab = expression(R^2),
xlab = "Percentage of validation set",
col = 2:6,
main = "(B) Cross-validated R-squared",
cex.main = 1)
# dev.off()
```

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.