An example of a logistic regression being used to estimate the probability of multiple species' presences along a number of environmental gradients. Instead of assuming independence of species regression coefficients, or partial pooling in shared distributions, we use a sub-model to estimate species regression coefficients. In this case, we're using species traits to estimate their response to different environmental gradients.
Because we're building a sub-model, it's more efficient to simply add a column of ones to dataframes for the base model and sub-model. This is simply to prevent our code from becoming too cumbersome. If we didn't want to use our sub-model to estimate the intercept, we would not need to include the column of ones in the environmental dataframe.
# make fake data n_species <- 3 n_env <- 1 n_sites <- 5 n_traits <- 1 # n_sites x n_env matrix of environmental variables env <- matrix(rnorm(n_sites * n_env), nrow = n_sites) # n_species * n_traits matix of trait variables traits <- matrix(rnorm(n_species * n_traits), nrow = n_species) # n_sites * n_species matrix of observed occupancy occupancy <- matrix(rbinom(n_sites * n_species, 1, 0.5), nrow = n_sites)
# include a column of 1's for intercept estimation in the sub-model (traits) and base model traits <- cbind(rep(1, n_species), traits) env <- cbind(rep(1, n_sites), env) # redefine n_env and n_traits after adding columns for intercepts n_env <- ncol(env) n_traits <- ncol(traits) # sub-model parameters have normal prior distributions g <- normal(0, 10, dim = c(n_env, n_traits)) # parameters of the base model are a function of the parameters of the sub-model beta <- g %*% t(traits) # use the coefficients to get the model linear predictor linear_predictor <- env %*% beta # use the logit link to get probabilities of occupancy p <- ilogit(linear_predictor) # data are bernoulli distributed distribution(occupancy) <- bernoulli(p)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.