gibbs_logit: gibbs_logit
In rjpdmp: Reversible Jump PDMP Samplers

gibbs_logit

R Documentation

gibbs_logit

Description

Applies the Collapsed Gibbs Sampler to a Logistic regression with dirac spike and slab distribution, as detailed in Reversible Jump PDMP Samplers for Variable Selection, 2020. For included variables an independent Gaussian prior is assumed with variance prior_sigma2 and mean zero, variables are given prior probabilities of inclusion ppi. Code makes use of the package set-up for Polya-Gamma simulation available at https://github.com/jgscott/helloPG.

Usage

gibbs_logit(
  dataX,
  datay,
  beta,
  gamma,
  ppi = 0.5,
  nsamples = 100000L,
  maxTime = 1e+08,
  prior_sigma2 = 10
)

Arguments

`dataX`	Matrix of all covariates where the i-th row corresponds to all p covariates x_i,1, ..., x_i,p of the i-th observation.
`datay`	Vector of n observations of a 0, 1-valued variable y.
`beta`	Initial position of the regression parameter
`gamma`	Initial model for the sampler. Enteries should either be 1s or 0s.
`ppi`	Double for the prior probability of inclusion (ppi) for each parameter.
`nsamples`	Maximum number of samples. Default value is 10^5, lower values should be chosen for memory constraints if less samples are desired.
`maxTime`	Maximum runtime (in Seconds) of the algorithm; will terminate the code after a given computation time or nmax iterations of the algorithm is reached.
`prior_sigma2`	Double for the prior variance for included variables. Default 10.

Value

Returns a list with the following objects:

beta: Matrix of regression parameter samples, columns are samples.

gamma: Matrix of model parameter samples columns are samples.

times: computation times at sampled events - Useful for plotting computational efficiency.

Examples


generate.logistic.data <- function(beta, n.obs, Sig) {
p <- length(beta)
dataX <- MASS::mvrnorm(n=n.obs,mu=rep(0,p),Sigma=Sig)
vals <- dataX %*% as.vector(beta)
generateY <- function(p) { rbinom(1, 1, p)}
dataY <- sapply(1/(1 + exp(-vals)), generateY)
return(list(dataX = dataX, dataY = dataY))
}

n <- 15
p <- 25
beta <- c(1, rep(0, p-1))
Siginv <- diag(1,p,p)
Siginv[1,2] <- Siginv[2,1] <- 0.9
set.seed(1)
data <- generate.logistic.data(beta, n, solve(Siginv))
ppi <- 2/p

zigzag_fit <- zigzag_logit(maxTime = 1, dataX = data$dataX,
                           datay = data$dataY, prior_sigma2 = 10,
                           theta0 = rep(0, p), x0 = rep(0, p), rj_val = 0.6,
                           ppi = ppi)

gibbs_fit <- gibbs_logit(maxTime = 1, dataX = data$dataX, datay =data$dataY,
                         prior_sigma2 = 10,beta = rep(0,p), gamma =rep(0,p),
                         ppi = ppi)
## Not run: 
plot_pdmp(zigzag_fit, coords = 1:2, inds = 1:1e3,burn = .1,
          nsamples = 1e4, mcmc_samples =t(gibbs_fit$beta*gibbs_fit$gamma))

## End(Not run)

rjpdmp documentation built on March 18, 2022, 7:52 p.m.