# mcmcFD: First Differences of a Bayesian Logit or Probit model In BayesPostEst: Generate Postestimation Quantities for Bayesian MCMC Estimation

## Description

R function to calculate first differences after a Bayesian logit or probit model. First differences are a method to summarize effects across covariates. This quantity represents the difference in predicted probabilities for each covariate for cases with low and high values of the respective covariate. For each of these differences, all other variables are held constant at their median. For more, see Long (1997, Sage Publications) and King, Tomz, and Wittenberg (2000, American Journal of Political Science 44(2): 347-361).

## Usage

 ```1 2 3 4 5 6 7 8``` ```mcmcFD( modelmatrix, mcmcout, link = "logit", ci = c(0.025, 0.975), percentiles = c(0.25, 0.75), fullsims = FALSE ) ```

## Arguments

 `modelmatrix` model matrix, including intercept (if the intercept is among the parameters estimated in the model). Create with model.matrix(formula, data). Note: the order of columns in the model matrix must correspond to the order of columns in the matrix of posterior draws in the `mcmcout` argument. See the `mcmcout` argument for more. `mcmcout` posterior distributions of all logit coefficients, in matrix form. This can be created from rstan, MCMCpack, R2jags, etc. and transformed into a matrix using the function as.mcmc() from the coda package for `jags` class objects, as.matrix() from base R for `mcmc`, `mcmc.list`, `stanreg`, and `stanfit` class objects, and `object\$sims.matrix` for `bugs` class objects. Note: the order of columns in this matrix must correspond to the order of columns in the model matrix. One can do this by examining the posterior distribution matrix and sorting the variables in the order of this matrix when creating the model matrix. A useful function for sorting column names containing both characters and numbers as you create the matrix of posterior distributions is `mixedsort()` from the gtools package. `link` type of generalized linear model; a character vector set to `"logit"` (default) or `"probit"`. `ci` the bounds of the credible interval. Default is `c(0.025, 0.975)` for the 95% credible interval. `percentiles` values of each predictor for which the difference in Pr(y = 1) is to be calculated. Default is `c(0.25, 0.75)`, which will calculate the difference between Pr(y = 1) for the 25th percentile and 75th percentile of the predictor. For binary predictors, the function automatically calculates the difference between Pr(y = 1) for x = 0 and x = 1. `fullsims` logical indicator of whether full object (based on all MCMC draws rather than their average) will be returned. Default is `FALSE`.

## Value

An object of class `mcmcFD`. If `fullsims = FALSE` (default), a data frame with five columns:

• median_fd: median first difference

• lower_fd: lower bound of credible interval of the first difference

• upper_fd: upper bound of credible interval of the first difference

• VarName: name of the variable as found in `modelmatrix`

• VarID: identifier of the variable, based on the order of columns in `modelmatrix` and `mcmcout`. Can be adjusted for plotting

If `fullsims = TRUE`, a matrix with as many columns as predictors in the model. Each row is the first difference for that variable based on one set of posterior draws. Column names are taken from the column names of `modelmatrix`.

## References

• King, Gary, Michael Tomz, and Jason Wittenberg. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation.” American Journal of Political Science 44 (2): 347–61. http://www.jstor.org/stable/2669316

• Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks: Sage Publications

## Examples

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54``` ```if (interactive()) { ## simulating data set.seed(1234) b0 <- 0.2 # true value for the intercept b1 <- 0.5 # true value for first beta b2 <- 0.7 # true value for second beta n <- 500 # sample size X1 <- runif(n, -1, 1) X2 <- runif(n, -1, 1) Z <- b0 + b1 * X1 + b2 * X2 pr <- 1 / (1 + exp(-Z)) # inv logit function Y <- rbinom(n, 1, pr) df <- data.frame(cbind(X1, X2, Y)) ## formatting the data for jags datjags <- as.list(df) datjags\$N <- length(datjags\$Y) ## creating jags model model <- function() { for(i in 1:N){ Y[i] ~ dbern(p[i]) ## Bernoulli distribution of y_i logit(p[i]) <- mu[i] ## Logit link function mu[i] <- b[1] + b[2] * X1[i] + b[3] * X2[i] } for(j in 1:3){ b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity } } params <- c("b") inits1 <- list("b" = rep(0, 3)) inits2 <- list("b" = rep(0, 3)) inits <- list(inits1, inits2) ## fitting the model with R2jags set.seed(123) fit <- R2jags::jags(data = datjags, inits = inits, parameters.to.save = params, n.chains = 2, n.iter = 2000, n.burnin = 1000, model.file = model) ## running function with logit xmat <- model.matrix(Y ~ X1 + X2, data = df) mcmc <- coda::as.mcmc(fit) mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xmat)] object <- mcmcFD(modelmatrix = xmat, mcmcout = mcmc_mat) object } ```

BayesPostEst documentation built on Nov. 11, 2021, 9:07 a.m.