graper: Fit a regression model with graper

graper R Documentation

Description

Fit a regression model with graper given a matrix of predictors (X), a response vector (y) and a vector of group memberships for each predictor in X (annot). For each group a different strength of penalization is determined adaptively.

Usage

graper(
  X,
  y,
  annot,
  factoriseQ = TRUE,
  spikeslab = TRUE,
  intercept = TRUE,
  family = "gaussian",
  standardize = TRUE,
  n_rep = 1,
  max_iter = 3000,
  th = 0.01,
  d_tau = 0.001,
  r_tau = 0.001,
  d_gamma = 0.001,
  r_gamma = 0.001,
  r_pi = 1,
  d_pi = 1,
  calcELB = TRUE,
  verbose = TRUE,
  freqELB = 1,
  nogamma = FALSE,
  init_psi = 1
)

Arguments

X

design matrix of size n (samples) x p (features)

y

response vector of size n

annot

factor of length p indicating group membership of each feature (column) in X

factoriseQ

if set to TRUE, the variational distribution is assumed to fully factorize across features (faster, default). If FALSE, a multivariate variational distribution is used.

spikeslab

if set to TRUE, a spike and slab prior is placed on the coefficients (default). If FALSE, a normal prior is used instead.

intercept

whether to include an intercept in the model

family

Likelihood model for the response, either "gaussian" for linear regression or "binomial" for logistic regression

standardize

whether to standardize the predictors to unit variance

n_rep

number of repetitions with different random initializations to be fit

max_iter

maximum number of iterations

th

convergence threshold for the evidence lower bound (ELB)

d_tau

hyper-parameters for prior of tau (noise precision)

r_tau

hyper-parameters for prior of tau (noise precision)

d_gamma

hyper-parameters for prior of gamma (coefficients' prior precision)

r_gamma

hyper-parameters for prior of gamma (coefficients' prior precision)

r_pi

hyper-parameters for Beta prior of the mixture probabilities in the spike and slab prior

d_pi

hyper-parameters for Beta prior of the mixture probabilities in the spike and slab prior

calcELB

whether to calculate the evidence lower bound (ELB)

verbose

whether to print out intermediate messages during fitting

freqELB

frequency at which the evidence lower bound (ELB) is to be calculated, i.e. each freqELB-th iteration

nogamma

if TRUE, the normal prior will have the same variance for all groups (only relevant for spikeslab = TRUE)

init_psi

initial value for the spike variables
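The shapes expected for X, y and annot can be illustrated with simulated data. The following is a minimal sketch in base R; the sizes, group names and simulated values are illustrative assumptions, not package defaults.

```r
set.seed(1)

n <- 50                       # number of samples (hypothetical)
group_sizes <- c(10, 20, 30)  # features per group (hypothetical)
p <- sum(group_sizes)

# design matrix: n rows (samples), p columns (features)
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# one group label per column of X, in column order
annot <- factor(rep(c("A", "B", "C"), times = group_sizes))

# response driven only by the features of the first group
beta_true <- c(rnorm(group_sizes[1]), rep(0, p - group_sizes[1]))
y <- as.numeric(X %*% beta_true + rnorm(n))

# the dimensions must agree before calling graper(X, y, annot)
stopifnot(nrow(X) == length(y), ncol(X) == length(annot))
```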

Details

The function trains the graper model given a matrix of predictors (X), a response vector (y) and a vector of group memberships for each predictor in X (annot). For each feature group as specified in annot, a penalty factor and a sparsity level are learnt.

By default, it uses a spike-and-slab prior on the coefficients and a fully factorized variational distribution for inference, which provides a fast way to train the model. With spikeslab = FALSE, a ridge-regression-like model is fitted using a normal prior instead of the spike-and-slab prior. Setting factoriseQ = FALSE gives a more exact inference scheme based on a multivariate variational distribution, but can be much slower.
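For orientation, the default model for family = "gaussian" can be written roughly as follows, with g(j) denoting the group of feature j. This is a sketch under assumptions: this page does not state which of r and d acts as shape or rate in the Gamma priors, nor the order of the Beta parameters, so the argument order shown here is assumed.

```latex
\begin{aligned}
y_i &\sim \mathcal{N}\!\left(\beta_0 + x_i^\top \beta,\ \tau^{-1}\right), \\
\beta_j &= s_j\, b_j, \qquad
b_j \sim \mathcal{N}\!\left(0,\ \gamma_{g(j)}^{-1}\right), \qquad
s_j \sim \mathrm{Bernoulli}\!\left(\pi_{g(j)}\right), \\
\gamma_k &\sim \mathrm{Gamma}(r_\gamma, d_\gamma), \qquad
\tau \sim \mathrm{Gamma}(r_\tau, d_\tau), \qquad
\pi_k \sim \mathrm{Beta}(d_\pi, r_\pi).
\end{aligned}
```

In this notation, spikeslab = FALSE corresponds to fixing every s_j at 1, which leaves only the group-wise normal prior on the coefficients.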

As the optimization is non-convex, it can be helpful to use multiple random initializations by setting n_rep to a value larger than 1. The returned model is then chosen as the optimal fit with respect to the evidence lower bound (ELB).
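The selection rule (keep the repetition with the highest ELB) can be sketched in base R. Here `pick_best_fit` and the mock fits are hypothetical and exist only for illustration; graper performs this selection itself when n_rep is larger than 1.

```r
# Hypothetical helper: given a list of fits that each carry an `ELB`
# element, return the fit with the largest ELB.
pick_best_fit <- function(fits) {
  elbs <- vapply(fits, function(f) f$ELB, numeric(1))
  fits[[which.max(elbs)]]
}

# Mock fits standing in for runs from different random initializations
# (the ELB values are invented for illustration)
mock_fits <- list(
  list(id = 1, ELB = -1520.3),
  list(id = 2, ELB = -1498.7),
  list(id = 3, ELB = -1510.1)
)

best <- pick_best_fit(mock_fits)
best$id  # index of the run with the highest (least negative) ELB
```

With the function itself, a call such as graper(X, y, annot, n_rep = 5) requests five initializations in a single call.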

Depending on the response vector, a linear regression model (family = "gaussian") or a logistic regression model (family = "binomial") is fitted. Note that the implementation of logistic regression is still experimental.
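For the logistic case, a binary response can be simulated as below. This is a sketch in base R with hypothetical sizes and group names; coding the response as 0/1 is an assumption, since this page does not state the expected encoding. The graper call is guarded with exists() so that the block also runs when no package providing graper is attached.

```r
set.seed(2)

n <- 80
p <- 20
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
annot <- factor(rep(c("g1", "g2"), each = p / 2))

# only features in the first group affect the outcome
beta_true <- c(rep(1, p / 2), rep(0, p / 2))
prob <- plogis(as.numeric(X %*% beta_true))  # inverse logit of the linear predictor
y <- rbinom(n, size = 1, prob = prob)        # 0/1 response (assumed coding)

# fit only if a graper function is available in the session
if (exists("graper", mode = "function")) {
  fit_bin <- graper(X, y, annot, family = "binomial", verbose = FALSE)
}
```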

Value

A graper object containing

EW_beta

estimated model coefficients in linear/logistic regression

EW_s

estimated posterior-inclusion probabilities for each feature

intercept

estimated intercept term

annot

annotation vector of features to the groups as specified when calling graper

EW_gamma

estimated penalty factor per group

EW_pi

estimated sparsity level per group (from 1 (dense) to 0 (sparse))

EW_tau

estimated noise precision

sigma2_tildebeta_s1, EW_tildebeta_s1, alpha_gamma, alpha_tau, beta_tau, Sigma_beta, alpha_pi, beta_pi

parameters of the variational distributions of beta, gamma, tau and pi

ELB

final value of the evidence lower bound

ELB_trace

values of the evidence lower bound for all iterations

Options

other options used when calling graper
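The per-feature elements of the returned object can be summarised per group using annot. A minimal sketch in base R follows: `mock_fit` is a hand-made list that only mimics the documented element names (EW_s, annot), and its values and the 0.5 cutoff are invented for illustration, not output or defaults from graper.

```r
# Hand-made stand-in for a fitted object (values are illustrative only)
mock_fit <- list(
  annot = factor(c("A", "A", "A", "B", "B", "B")),
  EW_s  = c(0.95, 0.10, 0.85, 0.05, 0.02, 0.08)  # per-feature PIPs
)

# average posterior inclusion probability within each group
mean_pip <- tapply(mock_fit$EW_s, mock_fit$annot, mean)

# number of features per group with PIP above a hypothetical cutoff of 0.5
selected <- mock_fit$EW_s > 0.5
n_selected <- table(mock_fit$annot[selected])

mean_pip
n_selected
```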

Examples

# create data
dat <- makeExampleData()

# fit a sparse model with spike and slab prior
fit <- graper(dat$X, dat$y, dat$annot)
fit # print fitted object
beta <- coef(fit, include_intercept=FALSE) # model coefficients
pips <- getPIPs(fit) # posterior inclusion probabilities
pf <- fit$EW_gamma # penalty factors per group
sparsities <- fit$EW_pi # sparsity levels per group

# fit a dense model without spike and slab prior
fit <- graper(dat$X, dat$y, dat$annot, spikeslab=FALSE)

# fit a dense model using a multivariate variational distribution
fit <- graper(dat$X, dat$y, dat$annot, factoriseQ=FALSE,
      spikeslab=FALSE)

cvraut/iBAGpkg documentation built on July 26, 2022, 9:55 p.m.