
Generate block diagonal matrices to allow for fused L2 optimization with glmnet.

```
generateBlockDiagonalMatrices(X, Y, groups, G, intercept = FALSE,
  penalty.factors = rep(1, dim(X)[2]), scaling = TRUE)
```

`X`
covariates matrix (n by p).

`Y`
response vector (length n).

`groups`
vector of group indicators (ideally a factor, length n).

`G`
matrix representing the fusion strengths between pairs of groups (K by K). Pairs with a zero entry are assumed to be independent.

`intercept`
whether to include a (per-group) intercept in the model.

`penalty.factors`
vector of weights for the penalization of each covariate (length p).

`scaling`
whether to scale each subgroup by its size. See Details for an explanation.

We use the `glmnet` package to perform fused subgroup regression.
To do so, we reformulate the problem as Y' = X'beta', where Y' is the
concatenation of the responses Y and a vector of zeros, and X' stacks two
matrices: the n by pK block-diagonal matrix X, in which each block contains
the covariates for one subgroup, and the choose(K,2)*p by pK matrix encoding
the fusion penalties between pairs of groups. The vector of parameters beta'
of length pK can be rearranged as a p by K matrix giving the parameters for
each subgroup. The lasso penalty on the parameters is handled by glmnet.
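The reformulation can be illustrated with a small base-R sketch. This is not the package's implementation: the dense matrices, the row ordering, the hypothetical fusion hyperparameter `gamma`, and the choice of `sqrt(gamma * G[a, b])` as the entry for each fused pair are assumptions made for illustration. Under these assumptions, the squared residual of the augmented problem equals the ordinary squared error plus `gamma` times the weighted sum of squared differences between the coefficients of each pair of groups.

```r
# Illustrative sketch only: toy sizes and a hypothetical gamma.
set.seed(1)
K <- 3                                   # number of groups
p <- 2                                   # number of covariates
n.k <- 4                                 # samples per group
n <- K * n.k
groups <- rep(1:K, each = n.k)
X <- matrix(rnorm(n * p), n, p)
Y <- rnorm(n)
G <- matrix(1, K, K)                     # pairwise fusion strengths
gamma <- 0.5                             # assumed fusion hyperparameter

# Block-diagonal design (n by pK): block k holds the covariates of group k.
X.block <- matrix(0, n, p * K)
for (k in 1:K) {
  rows <- which(groups == k)
  cols <- (k - 1) * p + 1:p
  X.block[rows, cols] <- X[rows, ]
}

# Fusion matrix (choose(K,2)*p by pK): one block of p rows per pair (a, b),
# with +w in the columns of group a and -w in the columns of group b.
pairs <- combn(K, 2)
X.fused <- matrix(0, ncol(pairs) * p, p * K)
for (j in seq_len(ncol(pairs))) {
  a <- pairs[1, j]
  b <- pairs[2, j]
  rows <- (j - 1) * p + 1:p
  w <- sqrt(gamma * G[a, b])
  X.fused[cbind(rows, (a - 1) * p + 1:p)] <- w
  X.fused[cbind(rows, (b - 1) * p + 1:p)] <- -w
}

# Augmented problem Y' = X' beta': stack the two matrices, pad Y with zeros.
X.aug <- rbind(X.block, X.fused)
Y.aug <- c(Y, rep(0, nrow(X.fused)))
```

Here `X.aug` has n + choose(K,2)*p rows and pK columns, matching the dimensions given above, so a lasso fit on `(X.aug, Y.aug)` would return a coefficient vector of length pK that can be reshaped with `matrix(beta, p, K)`.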

One weakness of the approach described above is that larger subgroups will
have a larger influence on the global parameters lambda and gamma.
To mitigate this, we introduce the `scaling` parameter. If `scaling=TRUE`,
we scale the responses and covariates for each subgroup by the number of
samples in that group.
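A minimal sketch of the scaling step follows. It reads "scale by the number of samples" as dividing each row by the size of its group, which is an assumption: the package may use a different convention (for example the square root of the group size), so treat this only as an illustration of per-group scaling.

```r
# Illustrative sketch only: divide each row of X and Y by its group's size.
groups <- factor(c("a", "a", "a", "b"))
X <- matrix(1:8, nrow = 4, ncol = 2)
Y <- c(1, 2, 3, 4)

# Size of the group each sample belongs to (length n).
group.size <- as.numeric(table(groups)[as.character(groups)])

# A length-n vector recycles down the columns, so row i is divided by group.size[i].
X.scaled <- X / group.size
Y.scaled <- Y / group.size
```

With this convention, the three samples of group "a" are each divided by 3 while the single sample of group "b" is left unchanged, so a large group no longer dominates the fit purely through its sample count.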

A list with components X, Y, X.fused and penalty, where X is an n by pK block-diagonal bigmatrix, Y is a re-arranged bigvector of length n, and X.fused is a choose(K,2)*p by pK bigmatrix encoding the fusion penalties.

```
set.seed(123)

# Generate simple heterogeneous dataset
k = 4 # number of groups
p = 100 # number of covariates
n.group = 15 # number of samples per group
sigma = 0.05 # observation noise sd
groups = rep(1:k, each=n.group) # group indicators

# sparse linear coefficients
beta = matrix(0, p, k)
nonzero.ind = rbinom(p*k, 1, 0.025/k) # Independent coefficients
nonzero.shared = rbinom(p, 1, 0.025) # shared coefficients
beta[which(nonzero.ind==1)] = rnorm(sum(nonzero.ind), 1, 0.25)
beta[which(nonzero.shared==1),] = rnorm(sum(nonzero.shared), -1, 0.25)

X = lapply(1:k,
           function(k.i) matrix(rnorm(n.group*p),
                                n.group, p)) # covariates
y = sapply(1:k,
           function(k.i) X[[k.i]] %*% beta[,k.i] +
                         rnorm(n.group, 0, sigma)) # response
X = do.call('rbind', X)

# Pairwise Fusion strength hyperparameters (tau(k,k'))
# Same for all pairs in this example
G = matrix(1, k, k)

# Generate block diagonal matrices
transformed.data = generateBlockDiagonalMatrices(X, y, groups, G)
```
