Efficient procedure for solving the Lasso or SCAD penalized soft maximin problem for large scale data. This software implements two proximal gradient based algorithms (NPG and FISTA) to solve different forms of the soft maximin problem from Lund et al., 2021: i) for a general group specific design the soft maximin problem is solved using the NPG algorithm; ii) for a fixed identical d-array (tensor) design across groups, where d = 1, 2, 3, the estimation procedure uses either the FISTA or the NPG algorithm and, for d = 2, 3, avoids forming the full tensor design matrix. Multithreading is possible when openMP is available for R.
Note that this package, SMME, replaces the SMMA package.
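A call-signature sketch can be assembled from the arguments documented below; the argument order is an assumption, and only the defaults stated in this documentation (`log = TRUE`, `nthreads = 4`) and the possible values of `penalty` and `alg` are taken from the text:

```r
## Sketch of the call signature, assembled from the documented arguments.
## Argument order and unspecified defaults are assumptions.
softmaximin(x, y, zeta, penalty = c("lasso", "scad"), alg = c("npg", "fista"),
            nlambda, lambda.min.ratio, lambda, penalty.factor, reltol,
            maxiter, steps, btmax, c, tau, M, nu, Lmin, log = TRUE,
            nthreads = 4)
```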
x 
list containing the G group specific design matrices of sizes
n_i \times p_i. Alternatively, for a model with identical tensor design
across G groups, a list containing the d (d = 1, 2, 3) marginal design
matrices (tensor components).
y 
list containing the G group specific response vectors of sizes
n_i \times 1. Alternatively, for a model with identical tensor design
across G groups, an array of size n_1 \times \cdots \times n_d \times G
containing the group specific response values.
zeta 
vector of strictly positive floats controlling the softmaximin
approximation accuracy. Large values give a closer approximation to the
(hard) maximin loss, while values near zero pool the observations across
groups.
penalty 
string specifying the penalty type. Possible values are "lasso" and "scad".
alg 
string specifying the optimization algorithm. Possible values are "npg" and "fista".
nlambda 
positive integer giving the number of lambda values used in the penalty sequence.
lambda.min.ratio 
strictly positive float giving the smallest value for lambda, as a
fraction of the largest value in the computed sequence.
lambda 
sequence of strictly positive floats used as penalty parameters. 
penalty.factor 
array of size p_1 \times \cdots \times p_d of
positive floats. Is multiplied with each element in lambda to allow
differential penalization of the coefficients.
reltol 
strictly positive float giving the convergence tolerance for the inner loop. 
maxiter 
positive integer giving the maximum number of iterations
allowed for each lambda value.
steps 
strictly positive integer giving the number of steps used in the
multistep adaptive lasso algorithm for nonconvex penalties. Automatically
set to 1 when penalty = "lasso".
btmax 
strictly positive integer giving the maximum number of backtracking
steps allowed in each iteration.
c 
strictly positive float used in the NPG algorithm.
tau 
strictly positive float used to control the stepsize for the NPG algorithm.
M 
positive integer giving the look-back length for the NPG algorithm's nonmonotone line search.
nu 
strictly positive float used to control the stepsize. A value less
than 1 will decrease the stepsize and a value larger than 1 will increase it.
Lmin 
non-negative float used by the NPG algorithm to control the stepsize.
log 
logical variable indicating whether to use the log-loss. TRUE is default and yields the loss given under Details.
nthreads 
integer giving the number of threads to use when openMP is available. Default is 4. 
Consider modeling heterogeneous data y_1, …, y_n by dividing it into G groups \mathbf{y}_g = (y_1, …, y_{n_g}), g \in \{1, …, G\}, and then using a linear model
\mathbf{y}_g = \mathbf{X}_g b_g + ε_g, \quad g \in \{1, …, G\},
to model the group response. Here b_g is a group specific p \times 1 coefficient vector, \mathbf{X}_g an n_g \times p group design matrix and ε_g an n_g \times 1 error term. The objective is to estimate a common coefficient vector β such that \mathbf{X}_g β is a robust and good approximation to \mathbf{X}_g b_g across groups.
Following Lund et al., 2021, this objective may be accomplished by solving the soft maximin estimation problem
\min_{β}\frac{1}{ζ}\log\bigg(∑_{g = 1}^G \exp(-ζ \hat V_g(β))\bigg) + λ \Vertβ\Vert_1, \quad ζ > 0, λ ≥ 0.
Here ζ essentially controls the amount of pooling across groups (ζ \sim 0 ignores grouping and pools observations) and
\hat V_g(β) := \frac{1}{n_g}(2β^\top \mathbf{X}_g^\top \mathbf{y}_g - β^\top \mathbf{X}_g^\top \mathbf{X}_gβ),
is the empirical explained variance from Meinshausen and Bühlmann, 2015. See Lund et al., 2021 for more details and references.
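For concreteness, the objective above can be evaluated directly in base R. The function below is an illustrative sketch, not part of the SMME API: it computes the empirical explained variance for each group and combines them through the log-sum-exp term plus a lasso penalty.

```r
## Sketch of the soft maximin objective (illustrative, not the SMME API):
## log-sum-exp of the negated explained variances plus a lasso penalty.
soft_maximin_obj <- function(beta, x, y, zeta, lambda) {
  ## empirical explained variance Vhat_g for each group g
  Vhat <- sapply(seq_along(x), function(g) {
    n_g <- length(y[[g]])
    as.numeric(2 * crossprod(beta, crossprod(x[[g]], y[[g]])) -
                 crossprod(beta, crossprod(x[[g]]) %*% beta)) / n_g
  })
  log(sum(exp(-zeta * Vhat))) / zeta + lambda * sum(abs(beta))
}

## two small groups with a shared design dimension p = 2
set.seed(1)
x <- list(matrix(rnorm(20), 10, 2), matrix(rnorm(20), 10, 2))
y <- list(rnorm(10), rnorm(10))
soft_maximin_obj(c(0.5, -0.2), x, y, zeta = 1, lambda = 0.1)
```

At β = 0 every \hat V_g is zero, so the objective reduces to log(G)/ζ, which is a quick sanity check on the implementation.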
The function softmaximin solves the soft maximin estimation problem in large scale settings for a sequence of penalty parameters λ_{max} > … > λ_{min} > 0 and a sequence of strictly positive softmaximin parameters ζ_1, ζ_2, …. The implementation also solves the problem above with the SCAD penalty in place of the lasso penalty, using the multistep adaptive lasso procedure to loop over the inner proximal algorithm.
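The multistep adaptive lasso idea is to repeatedly solve a weighted lasso problem, with per-coefficient weights given by the SCAD penalty derivative (Fan and Li, 2001) evaluated at the previous estimate. A sketch of the weight update (the inner solver is omitted; `scad_deriv` is illustrative and a = 3.7 is the conventional SCAD choice):

```r
## SCAD penalty derivative, used to reweight the lasso penalty at each
## step of a multistep adaptive lasso loop (illustrative sketch).
scad_deriv <- function(beta, lambda, a = 3.7) {
  t <- abs(beta)
  lambda * (t <= lambda) +
    pmax(a * lambda - t, 0) / (a - 1) * (t > lambda)
}

## one step: penalty weight per coefficient, relative to lambda
beta_prev <- c(0, 0.05, 0.5, 2)
w <- scad_deriv(beta_prev, lambda = 0.1) / 0.1
w  # coefficients well above a * lambda get weight 0 (no further shrinkage)
```

Small coefficients keep the full lasso weight 1, while coefficients above aλ are no longer shrunk, which is what removes the lasso's bias on large signals.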
Two optimization algorithms are implemented in the SMME package: a nonmonotone proximal gradient (NPG) algorithm and a fast iterative soft thresholding algorithm (FISTA). The implementation is particularly efficient for applications where the design is identical across groups, i.e. \mathbf{X}_g = \mathbf{X} for all g \in \{1, …, G\}, and where \mathbf{X} has tensor structure, i.e.
\mathbf{X} = \bigotimes_{i=1}^d \mathbf{M}_i,
for marginal n_i \times p_i design matrices \mathbf{M}_1, …, \mathbf{M}_d.
For d \in \{1, 2, 3\}, provided only with the marginal matrices and the group response vectors, softmaximin solves the soft maximin problem with a minimal memory footprint using tensor optimized arithmetic; see the function RH.
Note that when multiple values for ζ are provided, the computations can be distributed across CPUs if openMP is available.
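To see why the tensor structure avoids forming \mathbf{X}, a rotated H-transform can be sketched in a few lines of base R. The `rh` helper below is illustrative, not the package's RH implementation; chaining it d times reproduces multiplication by the full Kronecker product without ever building that matrix:

```r
## Illustrative rotated H-transform: multiply a d-array A by M along its
## first dimension, then rotate the dimensions cyclically. Chaining the
## transform d times equals multiplying by the full Kronecker product.
rh <- function(M, A) {
  dims <- dim(A)
  X <- M %*% matrix(A, dims[1], prod(dims[-1]))          # act on first mode
  aperm(array(X, c(nrow(M), dims[-1])), c(2:length(dims), 1))  # rotate
}

set.seed(1)
M1 <- matrix(rnorm(6), 3, 2)
M2 <- matrix(rnorm(8), 4, 2)
M3 <- matrix(rnorm(10), 5, 2)
B  <- array(rnorm(2 * 2 * 2), c(2, 2, 2))

full <- (M3 %x% M2 %x% M1) %*% as.vector(B)  # explicit tensor design matrix
tens <- rh(M3, rh(M2, rh(M1, B)))            # tensor optimized version
all.equal(as.vector(full), as.vector(tens))
```

The explicit design matrix here has prod(n) x prod(p) entries, while the chained transforms only ever touch the marginal matrices, which is the memory saving exploited for the array case.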
An object with S3 Class "SMME".
spec 
A string indicating the array dimension (1, 2 or 3) and the penalty. 
coef 
A list with one element for each value of ζ; each element is a p_1\cdots p_d \times nlambda matrix containing the estimated model coefficients, one column per lambda value.
lambda 
A vector containing the sequence of penalty values used
in the estimation procedure for which the procedure converged.
Obj 
A matrix containing the objective values for each
iteration and each model for which the procedure converged.
df 
A vector giving the number of nonzero model coefficients for each
value of lambda.
dimcoef 
An integer giving the number p of model parameters. For array data, a vector giving the dimension of the model coefficient array β.
dimobs 
An integer giving the number of observations. For array data, a
vector giving the dimension of the observation (response) array.
iter 
A vector containing the number of iterations used for each lambda value.
Adam Lund
Maintainer: Adam Lund, adam.lund@math.ku.dk
Lund, A., S. W. Mogensen and N. R. Hansen (2021). Soft Maximin Estimation for Heterogeneous Data. Preprint. url = https://arxiv.org/abs/1805.02407
Meinshausen, N. and P. Bühlmann (2015). Maximin effects in inhomogeneous large-scale data. The Annals of Statistics, 43(4), 1801-1830. url = https://doi.org/10.1214/15-AOS1325.
#Non-array data
##size of example
set.seed(42)
G <- 10; n <- sample(100:500, G); p <- 60
x <- y <- list()
##group design matrices
for(g in 1:G){x[[g]] <- matrix(rnorm(n[g] * p), n[g], p)}
##common features and effects
common_features <- rbinom(p, 1, 0.1)
common_effects <- rnorm(p) * common_features
##group response
for(g in 1:G){
bg <- rnorm(p, 0, 0.5) * (1 - common_features) + common_effects
mu <- x[[g]] %*% bg
y[[g]] <- rnorm(n[g]) + mu
}
##fit model for range of lambda and zeta
system.time(fit <- softmaximin(x, y, zeta = c(0.1, 1), penalty = "lasso", alg = "npg"))
betahat <- fit$coef
##estimated common effects for specific lambda and zeta
modelno <- 6; zetano <- 2
m <- min(betahat[[zetano]][ , modelno], common_effects)
M <- max(betahat[[zetano]][ , modelno], common_effects)
plot(common_effects, type = "p", ylim = c(m, M), col = "red")
lines(betahat[[zetano]][ , modelno], type = "h")
#Array data
##size of example
set.seed(42)
G <- 5; n <- c(65, 26, 13); p <- c(13, 5, 4)
##marginal design matrices (Kronecker components)
x <- list()
for(i in 1:length(n)){x[[i]] <- matrix(rnorm(n[i] * p[i]), n[i], p[i])}
##common features and effects
common_features <- rbinom(prod(p), 1, 0.1)
common_effects <- rnorm(prod(p), 0, 0.1) * common_features
##group response
y <- array(NA, c(n, G))
for(g in 1:G){
bg <- rnorm(prod(p), 0, 0.1) * (1 - common_features) + common_effects
Bg <- array(bg, p)
mu <- RH(x[[3]], RH(x[[2]], RH(x[[1]], Bg)))
y[,,, g] <- array(rnorm(prod(n)), dim = n) + mu
}
##fit model for range of lambda and zeta
system.time(fit <- softmaximin(x, y, zeta = c(0.1, 1, 10, 100), penalty = "lasso", alg = "npg"))
Betahat <- fit$coef
##estimated common effects for specific lambda and zeta
modelno <- 10; zetano <- 3
m <- min(Betahat[[zetano]][, modelno], common_effects)
M <- max(Betahat[[zetano]][, modelno], common_effects)
plot(common_effects, type = "h", ylim = c(m, M), col = "red")
lines(Betahat[[zetano]][, modelno], type = "h")
