graper: Fit a regression model with graper
In cvraut/iBAGpkg: integrative Bayesian Analysis of the Genome

graper

R Documentation

Description

Fit a regression model with graper given a matrix of predictors (X), a response vector (y) and a vector of group memberships for each predictor in X (annot). For each group a different strength of penalization is determined adaptively.

Usage

graper(
  X,
  y,
  annot,
  factoriseQ = TRUE,
  spikeslab = TRUE,
  intercept = TRUE,
  family = "gaussian",
  standardize = TRUE,
  n_rep = 1,
  max_iter = 3000,
  th = 0.01,
  d_tau = 0.001,
  r_tau = 0.001,
  d_gamma = 0.001,
  r_gamma = 0.001,
  r_pi = 1,
  d_pi = 1,
  calcELB = TRUE,
  verbose = TRUE,
  freqELB = 1,
  nogamma = FALSE,
  init_psi = 1
)

Arguments

`X`	design matrix of size n (samples) x p (features)
`y`	response vector of size n
`annot`	factor of length p indicating group membership of each feature (column) in X
`factoriseQ`	if set to TRUE, the variational distribution is assumed to fully factorize across features (faster, default). If FALSE, a multivariate variational distribution is used.
`spikeslab`	if set to TRUE, a spike and slab prior is used on the coefficients (default).
`intercept`	whether to include an intercept in the model
`family`	Likelihood model for the response, either "gaussian" for linear regression or "binomial" for logistic regression
`standardize`	whether to standardize the predictors to unit variance
`n_rep`	number of repetitions with different random initializations to be fit
`max_iter`	maximum number of iterations
`th`	convergence threshold for the evidence lower bound (ELB)
`d_tau`	hyper-parameters for prior of tau (noise precision)
`r_tau`	hyper-parameters for prior of tau (noise precision)
`d_gamma`	hyper-parameters for prior of gamma (coefficients' prior precision)
`r_gamma`	hyper-parameters for prior of gamma (coefficients' prior precision)
`r_pi`	hyper-parameters for Beta prior of the mixture probabilities in the spike and slab prior
`d_pi`	hyper-parameters for Beta prior of the mixture probabilities in the spike and slab prior
`calcELB`	whether to calculate the evidence lower bound (ELB)
`verbose`	whether to print out intermediate messages during fitting
`freqELB`	frequency at which the evidence lower bound (ELB) is to be calculated, i.e. each freqELB-th iteration
`nogamma`	if TRUE, the normal prior will have the same variance for all groups (only relevant for spikeslab = TRUE)
`init_psi`	initial value for the spike variables
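
The shapes expected for X, y and annot can be illustrated with simulated data in base R. This is a minimal sketch: the sizes, group names and simulated effects are assumptions chosen for illustration, and the final graper call is left commented out because it requires the package that provides graper to be installed.

```r
set.seed(1)

n <- 50   # number of samples (illustrative)
p <- 12   # number of features (illustrative)

# design matrix of size n x p
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# one group label per column of X; the group names are made up
annot <- factor(rep(c("groupA", "groupB", "groupC"), each = 4))

# response: only the features in groupA carry signal in this toy example
beta_true <- c(rep(1, 4), rep(0, 8))
y <- as.vector(X %*% beta_true + rnorm(n))

# basic consistency checks before fitting
stopifnot(nrow(X) == length(y), ncol(X) == length(annot))

# fit <- graper(X, y, annot)   # requires the package providing graper
```

The checks mirror the argument descriptions above: y has one entry per row of X, and annot has one entry per column of X.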

Details

The function trains the graper model given a matrix of predictors (X), a response vector (y) and a vector of group memberships for each predictor in X (annot). For each feature group specified in annot, a penalty factor and a sparsity level are learnt.

By default, it uses a spike-and-slab prior on the coefficients and a fully factorized variational distribution for inference. This provides a fast way to train the model. With spikeslab = FALSE, a ridge-regression-like model is fitted, using a normal prior instead of the spike-and-slab prior. Setting factoriseQ = FALSE gives a more exact inference scheme based on a multivariate variational distribution, but can be much slower.

As the optimization is non-convex, it can be helpful to use multiple random initializations by setting n_rep to a value larger than 1. The returned model is then chosen as the optimal fit with respect to the evidence lower bound (ELB).
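
The selection rule for multiple initializations can be sketched in base R as picking the candidate with the largest ELB. This only illustrates the idea and is not the package's internal code: the list `fits`, its `id` field and the ELB values are hypothetical stand-ins for fitted models.

```r
# stand-ins for several fitted models; only the ELB field matters here
# (ids and ELB values are invented for illustration)
fits <- list(
  list(id = 1, ELB = -152.3),
  list(id = 2, ELB = -148.9),
  list(id = 3, ELB = -150.1)
)

# collect the final ELB of each candidate fit
elbs <- vapply(fits, function(f) f$ELB, numeric(1))

# keep the fit with the largest evidence lower bound
best <- fits[[which.max(elbs)]]
best$id
```

With n_rep larger than 1, this choice is made by graper itself, so a manual step like this is only needed when comparing separately fitted models.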

Depending on the response vector, a linear regression model (family = "gaussian") or a logistic regression model (family = "binomial") is fitted. Note that the implementation of logistic regression is still experimental.

Value

A graper object containing

EW_beta: estimated model coefficients in linear/logistic regression
EW_s: estimated posterior-inclusion probabilities for each feature
intercept: estimated intercept term
annot: annotation vector of features to the groups as specified when calling graper
EW_gamma: estimated penalty factor per group
EW_pi: estimated sparsity level per group (from 1 (dense) to 0 (sparse))
EW_tau: estimated noise precision
sigma2_tildebeta_s1, EW_tildebeta_s1, alpha_gamma, alpha_tau, beta_tau, Sigma_beta, alpha_pi, beta_pi: parameters of the variational distributions of beta, gamma, tau and pi
ELB: final value of the evidence lower bound
ELB_trace: values of the evidence lower bound for all iterations
Options: other options used when calling graper
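
The per-feature and per-group fields can be combined with base R tools. The following is a minimal sketch, assuming EW_s and EW_beta can be treated as plain numeric vectors aligned with annot. The object `fit_like` is a hand-built stand-in with invented values, not output from graper, and the 0.5 cutoff is an arbitrary choice rather than a package default.

```r
# stand-in for a fitted object; values are invented for illustration
fit_like <- list(
  annot   = factor(c("groupA", "groupA", "groupB", "groupB")),
  EW_beta = c(0.9, 0.0, 0.1, -1.2),
  EW_s    = c(0.98, 0.03, 0.10, 0.95)
)

# features whose posterior inclusion probability exceeds the cutoff
cutoff <- 0.5
selected <- which(fit_like$EW_s > cutoff)

# number of selected features per group
n_selected <- tapply(fit_like$EW_s > cutoff, fit_like$annot, sum)

# mean absolute coefficient per group
mean_abs_beta <- tapply(abs(fit_like$EW_beta), fit_like$annot, mean)
```

On a real fit, as.vector() may be needed first if the fields are stored as one-column matrices.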

Examples

# create data
dat <- makeExampleData()

# fit a sparse model with spike and slab prior
fit <- graper(dat$X, dat$y, dat$annot)
fit # print fitted object
beta <- coef(fit, include_intercept=FALSE) # model coefficients
pips <- getPIPs(fit) # posterior inclusion probabilities
pf <- fit$EW_gamma # penalty factors per group
sparsities <- fit$EW_pi # sparsity levels per group

# fit a dense model without spike and slab prior
fit <- graper(dat$X, dat$y, dat$annot, spikeslab=FALSE)

# fit a dense model using a multivariate variational distribution
# (factoriseQ=FALSE selects the multivariate distribution, see Details)
fit <- graper(dat$X, dat$y, dat$annot, factoriseQ=FALSE,
      spikeslab=FALSE)

cvraut/iBAGpkg documentation built on July 26, 2022, 9:55 p.m.