In greta-dev/greta: Simple and Scalable Statistical Modelling in R

Multiple species with sub-model for regression coefficients

An example of a logistic regression being used to estimate the probability of multiple species' presences along a number of environmental gradients. Instead of assuming independence of species regression coefficients, or partial pooling in shared distributions, we use a sub-model to estimate species regression coefficients. In this case, we're using species traits to estimate their response to different environmental gradients.

Because we're building a sub-model, it's more efficient to simply add a column of ones to dataframes for the base model and sub-model. This is simply to prevent our code from becoming too cumbersome. If we didn't want to use our sub-model to estimate the intercept, we would not need to include the column of ones in the environmental dataframe.

data

# make fake data
n_species <- 3
n_env <- 1
n_sites <- 5
n_traits <- 1

# n_sites x n_env matrix of environmental variables
env <- matrix(rnorm(n_sites * n_env), nrow = n_sites)
# n_species * n_traits matix of trait variables
traits <- matrix(rnorm(n_species * n_traits), nrow = n_species)
# n_sites * n_species matrix of observed occupancy
occupancy <- matrix(rbinom(n_sites * n_species, 1, 0.5), nrow = n_sites)

greta code

# include a column of 1's for intercept estimation in the sub-model (traits) and base model
traits <- cbind(rep(1, n_species), traits)
env <- cbind(rep(1, n_sites), env)

# redefine n_env and n_traits after adding columns for intercepts
n_env <- ncol(env)
n_traits <- ncol(traits)

# sub-model parameters have normal prior distributions
g <- normal(0, 10, dim = c(n_env, n_traits))
# parameters of the base model are a function of the parameters of the sub-model
beta <-  g %*% t(traits) 

# use the coefficients to get the model linear predictor
linear_predictor <- env %*% beta 

# use the logit link to get probabilities of occupancy
p <- ilogit(linear_predictor)

# data are bernoulli distributed
distribution(occupancy) <- bernoulli(p)