gibbs_logit: gibbs_logit

View source: R/RcppExports.R

gibbs_logitR Documentation

gibbs_logit

Description

Applies the Collapsed Gibbs Sampler to a Logistic regression with dirac spike and slab distribution, as detailed in Reversible Jump PDMP Samplers for Variable Selection, 2020. For included variables an independent Gaussian prior is assumed with variance prior_sigma2 and mean zero, variables are given prior probabilities of inclusion ppi. Code makes use of the package set-up for Polya-Gamma simulation available at https://github.com/jgscott/helloPG.

Usage

gibbs_logit(
  dataX,
  datay,
  beta,
  gamma,
  ppi = 0.5,
  nsamples = 100000L,
  maxTime = 1e+08,
  prior_sigma2 = 10
)

Arguments

dataX

Matrix of all covariates where the i-th row corresponds to all p covariates x_i,1, ..., x_i,p of the i-th observation.

datay

Vector of n observations of a 0, 1-valued variable y.

beta

Initial position of the regression parameter

gamma

Initial model for the sampler. Enteries should either be 1s or 0s.

ppi

Double for the prior probability of inclusion (ppi) for each parameter.

nsamples

Maximum number of samples. Default value is 10^5, lower values should be chosen for memory constraints if less samples are desired.

maxTime

Maximum runtime (in Seconds) of the algorithm; will terminate the code after a given computation time or nmax iterations of the algorithm is reached.

prior_sigma2

Double for the prior variance for included variables. Default 10.

Value

Returns a list with the following objects:

beta: Matrix of regression parameter samples, columns are samples.

gamma: Matrix of model parameter samples columns are samples.

times: computation times at sampled events - Useful for plotting computational efficiency.

Examples


generate.logistic.data <- function(beta, n.obs, Sig) {
p <- length(beta)
dataX <- MASS::mvrnorm(n=n.obs,mu=rep(0,p),Sigma=Sig)
vals <- dataX %*% as.vector(beta)
generateY <- function(p) { rbinom(1, 1, p)}
dataY <- sapply(1/(1 + exp(-vals)), generateY)
return(list(dataX = dataX, dataY = dataY))
}

n <- 15
p <- 25
beta <- c(1, rep(0, p-1))
Siginv <- diag(1,p,p)
Siginv[1,2] <- Siginv[2,1] <- 0.9
set.seed(1)
data <- generate.logistic.data(beta, n, solve(Siginv))
ppi <- 2/p

zigzag_fit <- zigzag_logit(maxTime = 1, dataX = data$dataX,
                           datay = data$dataY, prior_sigma2 = 10,
                           theta0 = rep(0, p), x0 = rep(0, p), rj_val = 0.6,
                           ppi = ppi)

gibbs_fit <- gibbs_logit(maxTime = 1, dataX = data$dataX, datay =data$dataY,
                         prior_sigma2 = 10,beta = rep(0,p), gamma =rep(0,p),
                         ppi = ppi)
## Not run: 
plot_pdmp(zigzag_fit, coords = 1:2, inds = 1:1e3,burn = .1,
          nsamples = 1e4, mcmc_samples =t(gibbs_fit$beta*gibbs_fit$gamma))

## End(Not run)

rjpdmp documentation built on March 18, 2022, 7:52 p.m.