FLLat.BIC: Optimal Tuning Parameters for the Fused Lasso Latent Feature Model

Description

Returns the optimal values of the fused lasso tuning parameters for the Fused Lasso Latent Feature (FLLat) model by minimizing the BIC. Also returns the fitted FLLat model for the optimal values of the tuning parameters.

Usage

FLLat.BIC(Y, J=min(15,floor(ncol(Y)/2)), B="pc", thresh=10^(-4), maxiter=100,
          maxiter.B=1, maxiter.T=1)

Arguments

Y

A matrix of data from an aCGH experiment (usually in the form of log intensity ratios) or some other type of copy number data. Rows correspond to the probes and columns correspond to the samples.

J

The number of features in the FLLat model. The default is the smaller of either 15 or the number of samples divided by 2.

B

The initial values for the features. Can be one of "pc" (the first J principal components of Y), "rand" (a random selection of J columns of Y), or a user specified matrix of initial values, where rows correspond to the probes and columns correspond to the features. The default is "pc".

thresh

The threshold for determining when the solutions have converged. The default is 10^(-4).

maxiter

The maximum number of iterations for the outer loop of the FLLat algorithm. The default is 100.

maxiter.B

The maximum number of iterations for the inner loop of the FLLat algorithm for estimating the features B. The default is 1. Increasing this may decrease the number of iterations for the outer loop but may still increase total run time.

maxiter.T

The maximum number of iterations for the inner loop of the FLLat algorithm for estimating the weights Θ. The default is 1. Increasing this may decrease the number of iterations for the outer loop but may still increase total run time.
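
As a rough illustration of the J and B arguments, the base R sketch below computes the default number of features for a small made-up matrix and builds a user-specified matrix of initial features by choosing J columns of Y at random, which mirrors the description of the "rand" option. The matrix Y.demo and the variable names are hypothetical, and this is not the package's internal code.

```r
## Hypothetical toy data: 50 probes (rows) by 8 samples (columns).
set.seed(1)
Y.demo <- matrix(rnorm(50 * 8), nrow = 50, ncol = 8)

## Default number of features, as given in the Usage section.
J.default <- min(15, floor(ncol(Y.demo) / 2))

## A user-specified matrix of initial features: J randomly chosen columns
## of Y (rows are probes, columns are features).
B.init <- Y.demo[, sample(ncol(Y.demo), J.default), drop = FALSE]

dim(B.init)
```

A matrix built this way has the shape described for B, so it should be usable in a call such as FLLat.BIC(Y.demo, J=J.default, B=B.init).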

Details

This function returns the optimal values of the fused lasso tuning parameters, λ_1 and λ_2, for the FLLat model. The optimal values are chosen by first re-parameterizing λ_1 and λ_2 in terms of λ_0 and a proportion α such that λ_1=α*λ_0 and λ_2=(1-α)*λ_0. The values of α are fixed at {0.1, 0.3, 0.5, 0.7, 0.9}, and for each value of α a range of λ_0 values is considered. The optimal values of λ_0 and α (and consequently of λ_1 and λ_2) are chosen by minimizing the following BIC-type criterion over this two-dimensional grid:

(SL)*log(RSS/(SL)) + k_{α, λ_0}*log(SL),

where S is the number of samples, L is the number of probes, RSS denotes the residual sum of squares and k_{α, λ_0} denotes the sum over all the features of the number of unique non-zero elements in each estimated feature.
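
The re-parameterization and the criterion above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: the helper names (lambdas.from, count.unique.nonzero, bic.fllat) and the toy inputs are hypothetical, and unique non-zero values are counted by exact comparison, whereas a real implementation may apply a numerical tolerance.

```r
## Map (lam0, alpha) to the fused lasso tuning parameters.
lambdas.from <- function(lam0, alpha) {
  list(lam1 = alpha * lam0, lam2 = (1 - alpha) * lam0)
}

## k: sum over features (columns of B) of the number of unique
## non-zero values in each feature.
count.unique.nonzero <- function(B) {
  sum(apply(B, 2, function(b) length(unique(b[b != 0]))))
}

## BIC-type criterion: (SL)*log(RSS/(SL)) + k*log(SL).
bic.fllat <- function(rss, k, S, L) {
  n <- S * L
  n * log(rss / n) + k * log(n)
}

## Hypothetical toy feature matrix: 6 probes by 2 features.
B.toy <- cbind(c(0, 0, 1, 1, 1, 0), c(2, 2, 0, -1, -1, 0))
k.toy <- count.unique.nonzero(B.toy)   # 1 + 2 = 3

## Hypothetical grid point and residual sum of squares.
lam <- lambdas.from(lam0 = 2, alpha = 0.3)
bic.toy <- bic.fllat(rss = 12, k = k.toy, S = 4, L = 6)
```

In a grid search of the kind described, a value like bic.toy would be computed for every (α, λ_0) pair and the pair with the smallest value kept.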

Note that for extremely large data sets, this function may take some time to run.

For more details, please see Nowak and others (2011) and the package vignette.

Value

A list with components:

lam0

The optimal value of λ_0.

alpha

The optimal value of α.

lam1

The optimal value of λ_1.

lam2

The optimal value of λ_2.

opt.FLLat

The fitted FLLat model for the optimal values of the tuning parameters.

Author(s)

Gen Nowak <gen.nowak@gmail.com>, Trevor Hastie, Jonathan R. Pollack, Robert Tibshirani and Nicholas Johnson.

References

G. Nowak, T. Hastie, J. R. Pollack and R. Tibshirani. A Fused Lasso Latent Feature Model for Analyzing Multi-Sample aCGH Data. Biostatistics, 2011, doi: 10.1093/biostatistics/kxr012

See Also

FLLat

Examples

## Load simulated aCGH data.
data(simaCGH)

## Run FLLat.BIC to choose optimal tuning parameters for J = 5 features.
result.bic <- FLLat.BIC(simaCGH,J=5)

## Plot the features for the optimal FLLat model.
plot(result.bic$opt.FLLat)

## Plot a heatmap of the weights for the optimal FLLat model.
plot(result.bic$opt.FLLat,type="weights")
