RunTGGLMixSelection: RunTGGLMixSelection is a convenience function which runs...

View source: R/ms_tggl_mixture.R

Description

Fit a tree-guided group lasso mixture model. The fit is restarted from different random initializations, and the model with the lowest objective value is kept. Optionally, the best model is evaluated on a validation set. By default, all data are used for training. If validation.ids is not NULL, the corresponding indices are excluded from training and used to validate the parameters instead.

Usage

RunTGGLMixSelection(X = NULL, task.specific.features = list(), Y, M.vec,
  validation.ids = NULL, num.starts = 1, num.threads = NULL, groups,
  weights, lambda.vec, verbose = 0, gam = 1, homoscedastic = FALSE,
  EM.max.iter = 200, EM.epsilon = 1e-05, EM.verbose = 0,
  sample.data = FALSE, TGGL.mu = 1e-05, TGGL.epsilon = 1e-05,
  TGGL.iter = 25, shrink.mu = TRUE)
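
A minimal sketch of a call, assuming the function is exported by the linearMTL package and that simulated Gaussian data is an acceptable stand-in for real inputs. The sizes, the group structure (one group per task plus one group holding all tasks), the unit weights and the parameter grids are illustrative assumptions, not recommended settings.

```r
# Simulated inputs; all sizes and values are illustrative assumptions.
set.seed(1)
N <- 60; J <- 5; K <- 3
X <- matrix(rnorm(N * J), nrow = N, ncol = J)   # N by J1 common features
Y <- matrix(rnorm(N * K), nrow = N, ncol = K)   # N by K outputs

# Groups: one single-task group per task plus one group with all tasks.
groups <- rbind(diag(K), rep(1, K))             # V by K, here V = K + 1
weights <- rep(1, nrow(groups))                 # one weight per group

# Hold out 20% of the rows; required here because several
# (M, lambda) pairs are passed.
validation.ids <- sample(N, size = N / 5)

# Only attempt the fit if the package is installed.
if (requireNamespace("linearMTL", quietly = TRUE)) {
  fit <- linearMTL::RunTGGLMixSelection(
    X = X, Y = Y, M.vec = c(1, 2),
    validation.ids = validation.ids,
    groups = groups, weights = weights,
    lambda.vec = c(0.01, 0.1)
  )
  # fit$results holds the models per parameter setting,
  # fit$top.model the selected model.
}
```

All remaining arguments are left at the defaults shown in the usage above.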

Arguments

X

N by J1 matrix of features common to all tasks.

task.specific.features

List of features which are specific to each task. Each entry contains an N by J2 matrix for one particular task (where columns are features). List has to be ordered according to the columns of Y.
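
As a sketch of the expected layout, the list can be built by iterating over the columns of Y so that the ordering matches by construction. The task names, the sizes and the use of the same J2 for every task are assumptions made for illustration; naming the list entries is only for readability.

```r
set.seed(1)
N <- 10; K <- 3; J2 <- 2
Y <- matrix(rnorm(N * K), nrow = N,
            dimnames = list(NULL, c("taskA", "taskB", "taskC")))

# One N by J2 matrix per task, created in the column order of Y.
task.specific.features <- lapply(colnames(Y), function(task) {
  matrix(rnorm(N * J2), nrow = N, ncol = J2)
})
names(task.specific.features) <- colnames(Y)
```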

Y

N by K output matrix for every task.

M.vec

Vector with numbers of clusters.

validation.ids

(Optional) Indices of data points to be used for validation. Needs to be supplied if more than one parameter pair (M, lambda) is given. Default is to use all data for training.

num.starts

(Optional) Number of starts. Default is 1 (no restarts).

num.threads

(Optional) Number of threads to be used. Default is 1.

groups

Binary V by K matrix determining group membership: Task k in group v iff groups[v,k] == 1.
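
One way to build such a matrix is to list, for every node of the task tree, the tasks below it and convert each list to a 0/1 row. The tree below (four tasks, two internal nodes and a root) and all of its names are hypothetical.

```r
K <- 4
task.names <- paste0("task", 1:K)

# Hypothetical tree: each node lists the indices of the tasks below it.
tree.nodes <- list(
  leaf1 = 1, leaf2 = 2, leaf3 = 3, leaf4 = 4,
  left = c(1, 2), right = c(3, 4), root = 1:4
)

# Row v has a 1 in column k iff task k belongs to group v.
groups <- t(sapply(tree.nodes, function(members) {
  as.integer(seq_len(K) %in% members)
}))
colnames(groups) <- task.names
```

Here V = 7, and the matching weights argument would need one entry per row of groups.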

weights

V dimensional vector with group weights.

lambda.vec

Vector with regularization parameters.

verbose

(Optional) Integer in 0,1,2,3. verbose = 0: No output. verbose = 1: Print final summary. verbose = 2: Print summary for each parameter. verbose = 3: Print summary for each restart.

gam

(Optional) Exponent for component-wise regularization. The regularization parameter for component m is lambda multiplied by the prior of component m raised to the power gam.
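
The arithmetic can be illustrated as follows, assuming that the prior of a component is its mixing proportion; the numbers are made up.

```r
lambda <- 0.5
gam <- 1
prior <- c(0.7, 0.2, 0.1)        # hypothetical mixing proportions

# Per-component regularization strength: lambda * prior^gam.
lambda.m <- lambda * prior^gam

# With gam = 0 every component is penalized by lambda itself.
lambda.m.flat <- lambda * prior^0
```

With gam = 1, components with a smaller prior receive a proportionally smaller penalty.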

homoscedastic

(Optional) Force variance to be the same for all tasks in a component. Default is FALSE.

EM.max.iter

(Optional) Maximum number of iterations for EM algorithm.

EM.epsilon

(Optional) Desired accuracy. Algorithm will terminate if change in penalized negative log-likelihood drops below EM.epsilon.

EM.verbose

(Optional) Integer in 0,1,2. EM.verbose = 0: No output. EM.verbose = 1: Print summary at the end of the optimization. EM.verbose = 2: Print progress during optimization.

sample.data

(Optional) Sample data according to posterior probability or not.

TGGL.mu

(Optional) Mu parameter for TGGL.

TGGL.epsilon

(Optional) Epsilon parameter for TGGL.

TGGL.iter

(Optional) Initial number of iterations for TGGL. Will be increased incrementally to ensure convergence. When the number of samples is much larger than the dimensionality, it can be beneficial to use a large initial number of iterations for TGGL. This is because every run of TGGL requires precomputation of multiple n-by-n matrix products.

shrink.mu

(Optional) Multiply mu by min(lambda, 1).
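
A small sketch of the scaling described here, applied to a few example values of lambda. The variable name effective.mu is hypothetical, and how the package applies the result internally is not shown.

```r
TGGL.mu <- 1e-05
lambda.vec <- c(0.01, 0.5, 2)

# mu is shrunk only when lambda is below 1.
effective.mu <- TGGL.mu * pmin(lambda.vec, 1)
```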

Value

List containing

results

List of TGGLMix models for each parameter setting.

top.model

Model with highest predictive likelihood.

See Also

TGGLMix


tohein/linearMTL documentation built on May 17, 2019, 8:22 a.m.