gibbs_logit | R Documentation |
Applies the Collapsed Gibbs Sampler to a Logistic regression with dirac spike and slab distribution, as detailed in Reversible Jump PDMP Samplers for Variable Selection, 2020.
For included variables an independent Gaussian prior is assumed with variance prior_sigma2
and mean zero, variables are given prior probabilities of inclusion ppi
.
Code makes use of the package set-up for Polya-Gamma simulation available at https://github.com/jgscott/helloPG
.
gibbs_logit( dataX, datay, beta, gamma, ppi = 0.5, nsamples = 100000L, maxTime = 1e+08, prior_sigma2 = 10 )
dataX |
Matrix of all covariates where the i-th row corresponds to all p covariates x_i,1, ..., x_i,p of the i-th observation. |
datay |
Vector of n observations of a 0, 1-valued variable y. |
beta |
Initial position of the regression parameter |
gamma |
Initial model for the sampler. Enteries should either be 1s or 0s. |
ppi |
Double for the prior probability of inclusion (ppi) for each parameter. |
nsamples |
Maximum number of samples. Default value is 10^5, lower values should be chosen for memory constraints if less samples are desired. |
maxTime |
Maximum runtime (in Seconds) of the algorithm; will terminate the code after a given computation time or nmax iterations of the algorithm is reached. |
prior_sigma2 |
Double for the prior variance for included variables. Default 10. |
Returns a list with the following objects:
beta
: Matrix of regression parameter samples, columns are samples.
gamma
: Matrix of model parameter samples columns are samples.
times
: computation times at sampled events - Useful for plotting computational efficiency.
generate.logistic.data <- function(beta, n.obs, Sig) { p <- length(beta) dataX <- MASS::mvrnorm(n=n.obs,mu=rep(0,p),Sigma=Sig) vals <- dataX %*% as.vector(beta) generateY <- function(p) { rbinom(1, 1, p)} dataY <- sapply(1/(1 + exp(-vals)), generateY) return(list(dataX = dataX, dataY = dataY)) } n <- 15 p <- 25 beta <- c(1, rep(0, p-1)) Siginv <- diag(1,p,p) Siginv[1,2] <- Siginv[2,1] <- 0.9 set.seed(1) data <- generate.logistic.data(beta, n, solve(Siginv)) ppi <- 2/p zigzag_fit <- zigzag_logit(maxTime = 1, dataX = data$dataX, datay = data$dataY, prior_sigma2 = 10, theta0 = rep(0, p), x0 = rep(0, p), rj_val = 0.6, ppi = ppi) gibbs_fit <- gibbs_logit(maxTime = 1, dataX = data$dataX, datay =data$dataY, prior_sigma2 = 10,beta = rep(0,p), gamma =rep(0,p), ppi = ppi) ## Not run: plot_pdmp(zigzag_fit, coords = 1:2, inds = 1:1e3,burn = .1, nsamples = 1e4, mcmc_samples =t(gibbs_fit$beta*gibbs_fit$gamma)) ## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.