Description Usage Arguments Details Value Author(s) See Also Examples
View source: R/simdatafunctions.R
Simulate a multivariate response matrix, given parameters such as but not necessarily all of: family, number of latent variables and related coefficients, an matrix of explanatory variables and related coefficients, row effects, cutoffs for cumulative probit regression of ordinal responses.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
create.life(true.lv = NULL, lv.coefs,
lv.control = list(num.lv = 0, type = "independent",
lv.covparams = NULL, distmat = NULL),
X = NULL, X.coefs = NULL,
traits = NULL, traits.coefs = NULL, family, row.eff = "none",
row.params = NULL, row.ids = NULL, offset = NULL,
trial.size = 1, cutoffs = NULL, powerparam = NULL,
manual.dim = NULL, save.params = FALSE)
## S3 method for class 'boral'
simulate(object, nsim = 1, seed = NULL, new.lvs = FALSE, distmat = NULL,
est = "median", ...)
|
object |
An object of class "boral". |
nsim |
Number of multivariate response matrices to simulate. Defaults to 1. |
seed |
Seed for dataset simulation. Defaults to |
new.lvs |
If |
distmat |
A distance matrix required to calculate correlations across sites when a non-independent correlation structure on the latent variables is imposed, when |
est |
A choice of either the posterior median ( |
true.lv |
A matrix of true latent variables. With multivariate abundance data in ecology for instance, each row corresponds to the true site ordination coordinates. If supplied, then simulation is based of these true latent variables. If |
lv.coefs |
A matrix containing column-specific intercepts, latent variable coefficients relating to |
lv.control |
This argument is utilized if
Please see |
X |
An model matrix of covariates, which can be included as part of the data generation. Defaults to |
X.coefs |
The coefficients relating to |
traits |
A model matrix of species traits, which can be included as part of the model. Defaults to |
traits.coefs |
A matrix of coefficients that are used to generate "new" column-specific intercepts and How this argument works is as follows: when both It is important that highlight then with in this data generation mechanism, the new column-specific intercepts and Defaults to |
family |
Either a single element, or a vector of length equal to the number of columns in Please see |
row.eff |
Single element indicating whether row effects are included as fixed effects ("fixed"), random effects ("random") or not included ("none") in the fitted model. If fixed effects, then for parameter identifiability the first row effect is set to zero, which analogous to acting as a reference level when dummy variables are used. If random effects, they are drawn from a normal distribution with mean zero and standard deviation given by |
row.params |
Parameters corresponding to the row effect from the fitted model. If |
row.ids |
A matrix with the number of rows equal to the number of rows in |
offset |
A matrix with the same dimensions as the response matrix |
trial.size |
Either equal to a single element, or a vector of length equal to the number of columns in y. If a single element, then all columns assumed to be binomially distributed will have trial size set to this. If a vector, different trial sizes are allowed in each column of y. The argument is ignored for all columns not assumed to be binomially distributed. Defaults to 1, i.e. Bernoulli distribution. |
cutoffs |
A vector of common common cutoffs for proportional odds regression when any of |
powerparam |
A common power parameter for tweedie regression when any of |
manual.dim |
A vector of length 2, containing the number of rows (n) and columns (p) for the multivariate response matrix. This is a "backup" argument only required when |
save.params |
If |
... |
Not used. |
create.life
gives the user full capacity to control the true parameters of the model from which the multivariate responses matrices are generated from. If true.lv
is supplied, then the data generation mechanism is based on this. If true.lv = NULL
, then the function looks to lv.control
to determine whether and how the true latent variables are to be simulated.
simulate
makes use of the generic function of the same name in R
: it takes a fitted model, treats either the posterior medians and mean estimates from the model as the true parameters, and generates response matrices based off that.
If create.life
is used, then: 1) if save.params
= FALSE, a n by p multivariate response matrix is returned only, 2) if save.params = TRUE
, then a list containing the element resp
which is a n times p multivariate response matrix, as well as other elements for the parameters used in the true model, e.g. true.lv, lv.coefs = lv.coefs, traits.coef
, is returned.
If simulate
is used, then a three dimensional array of dimension n by p by nsim
is returned. The same latent variables can be used each time (new.lvs = FALSE
), or new true latent variables can be generated each time (new.lvs = TRUE
).
NA
Maintainer: NA
boral
for the default function for model fitting.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 | ## NOTE: The values below MUST NOT be used in a real application;
## they are only used here to make the examples run quick!!!
example_mcmc_control <- list(n.burnin = 10, n.iteration = 100,
n.thin = 1)
## Example 1a - Simulate a response matrix of normally distributed data
library(mvtnorm)
## 30 rows (sites) with two latent variables
true.lv <- rbind(rmvnorm(n=15,mean=c(1,2)),rmvnorm(n=15,mean=c(-3,-1)))
## 30 columns (species)
lv.coefs <- cbind(matrix(runif(30*3),30,3),1)
X <- matrix(rnorm(30*4),30,4)
## 4 explanatory variables
X.coefs <- matrix(rnorm(30*4),30,4)
simy <- create.life(true.lv = true.lv, lv.coefs = lv.coefs,
X = X, X.coefs = X.coefs, family = "normal")
## Not run:
fit.boral <- boral(simy, X = X, family = "normal", lv.control = list(num.lv = 2),
mcmc.control = example_mcmc_control)
summary(fit.boral)
## End(Not run)
## Example 1b - Include a nested random row effect
## 30 subregions nested within six regions
example_row_ids <- cbind(1:30, rep(1:6,each=5))
## Subregion has a small std deviation; region has a larger one
true.row.sigma <- list(ID1 = 0.5, ID2 = 2)
simy <- create.life(true.lv = true.lv, lv.coefs = lv.coefs,
X = X, X.coefs = X.coefs, row.eff = "random",
row.params = true.row.sigma, row.ids = example_row_ids, family = "normal",
save.params = TRUE)
## Example 1c - Same as example 1b except new, true latent variables are generated
simy <- create.life(true.lv = NULL, lv.coefs = lv.coefs,
X = X, X.coefs = X.coefs, row.eff = "random",
row.params = true.row.sigma, row.ids = example_row_ids, family = "normal",
save.params = TRUE)
## Example 1d - Same as example 1a except new, true latent variables are generated
## with a non-independent correlation structure using a fake distance matrix
makedistmat <- as.matrix(dist(1:30))
simy <- create.life(true.lv = NULL, lv.coefs = lv.coefs,
lv.control = list(num.lv = 2, type = "exponential", lv.covparams = 5, distmat = makedistmat),
X = X, X.coefs = X.coefs, row.eff = "random",
row.params = true.row.sigma, row.ids = example_row_ids, family = "normal",
save.params = TRUE)
## Example 2 - Simulate a response matrix of ordinal data
## 30 rows (sites) with two latent variables
true.lv <- rbind(rmvnorm(15,mean=c(-2,-2)),rmvnorm(15,mean=c(2,2)))
## 10 columns (species)
true.lv.coefs <- rmvnorm(10,mean = rep(0,3));
## Cutoffs for proportional odds regression (must be in increasing order)
true.ordinal.cutoffs <- seq(-2,10,length=10-1)
simy <- create.life(true.lv = true.lv, lv.coefs = true.lv.coefs,
family = "ordinal", cutoffs = true.ordinal.cutoffs, save.params = TRUE)
## Not run:
fit.boral <- boral(y = simy$resp, family = "ordinal", lv.control = list(num.lv = 2),
mcmc.control = example_mcmc_control)
## End(Not run)
## Not run:
## Example 3 - Simulate a response matrix of count data based off
## a fitted model involving traits (ants data from mvabund)
library(mvabund)
data(antTraits)
y <- antTraits$abun
X <- as.matrix(antTraits$env)
## Include only traits 1, 2, and 5, plus an intercept
traits <- as.matrix(antTraits$traits[,c(1,2,5)])
## Please see help file for boral regarding the use of which.traits
example_which_traits <- vector("list",ncol(X)+1)
for(i in 1:length(example_which_traits))
example_which_traits[[i]] <- 1:ncol(traits)
fit_traits <- boral(y, X = X, traits = traits, which.traits = example_which_traits,
family = "negative.binomial", lv.control = list(num.lv = 2),
mcmc.control = example_mcmc_control)
## The hard way
simy <- create.life(true.lv = fit_traits$lv.mean,
lv.coefs = fit_traits$lv.coefs.median, X = X,
X.coefs = fit_traits$X.coefs.median, traits = traits,
traits.coefs = fit_traits$traits.coefs.median, family = "negative.binomial")
## The easy way, using the same latent variables as the fitted model
simy <- simulate(object = fit_traits)
## The easy way, generating new latent variables
simy <- simulate(object = fit_traits, new.lvs = TRUE)
## End(Not run)
## Example 4 - simulate Bernoulli data, based on a model with two latent variables,
## no site variables, with two traits and one environmental covariates
## This example is a proof of concept that traits can used
## to explain environmental responses
library(mvtnorm)
n <- 100; s <- 50
X <- as.matrix(scale(1:n))
colnames(X) <- c("elevation")
traits <- cbind(rbinom(s,1,0.5), rnorm(s))
## one categorical and one continuous variable
colnames(traits) <- c("thorns-dummy","SLA")
simfit <- list(lv.coefs = cbind(rnorm(s), rmvnorm(s, mean = rep(0,2))),
lv.control = list(num.lv = 2, type = "independent"),
traits.coefs = matrix(c(0.1,1,-0.5,1,0.5,0,-1,1), 2, byrow = TRUE))
rownames(simfit$traits.coefs) <- c("beta0","elevation")
colnames(simfit$traits.coefs) <- c("kappa0","thorns-dummy","SLA","sigma")
simy <- create.life(lv.control = simfit$lv.control, lv.coefs = simfit$lv.coefs,
X = X, traits = traits, traits.coefs = simfit$traits.coefs,
family = "binomial")
## Not run:
## NOTE: The values below MUST NOT be used in a real application;
## they are only used here to make the examples run quick!!!
example_mcmc_control <- list(n.burnin = 10, n.iteration = 100,
n.thin = 1)
example_which_traits <- vector("list",ncol(X)+1);
for(i in 1:length(example_which_traits))
example_which_traits[[i]] <- 1:ncol(traits)
fit_traits <- boral(y = simy, X = X, traits = traits,
which.traits = example_which_traits, family = "binomial",
lv.control = list(num.lv = 2), save.model = TRUE,
mcmc.control = example_mcmc_control)
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.