maximin | R Documentation |
Efficient procedure for solving the maximin estimation problem with identical design across groups, see (Lund, 2022).
maximin(y, x, penalty = "lasso", alg ="aradmm", kappa = 0.99, nlambda = 30, lambda_min_ratio = 1e-04, lambda = NULL, penalty_factor = NULL, standardize = TRUE, tol = 1e-05, maxiter = 1000, delta = 1, gamma = 1, eta = 0.1, aux_par = NULL)
y |
Array of size n_1 \times\cdots\times n_d \times G containing the response values. |
x |
Either i) the design matrix, ii) a list containing the Kronecker
components (2 or 3) if the design matrix has Kronecker structure or iii) a
string indicating the name of the wavelet to use (see |
penalty |
string specifying the penalty type. Possible values are
|
alg |
string specifying the optimization algorithm. Possible values are
|
kappa |
Strictly positive float controlling the maximum sparsity in the solution. Only used with ADMM type algorithms. Should be between 0 and 1. |
nlambda |
Positive integer giving the number of |
lambda_min_ratio |
strictly positive float giving the smallest value for
|
lambda |
Sequence of strictly positive floats used as penalty parameters. |
penalty_factor |
A vector of length p containing positive floats that are
multiplied with each element in |
standardize |
Boolean indicating if response |
tol |
Strictly positive float controlling the convergence tolerance. |
maxiter |
Positive integer giving the maximum number of iterations
allowed for each |
delta |
Positive float controlling the step size in the algorithm. |
gamma |
Positive float controlling the relaxation parameter in the algorithm. Should be between 0 and 2. |
eta |
Scaling parameter for the step size in the accelerated TOS algorithm. Should be between 0 and 1. |
aux_par |
Auxiliary parameters for the algorithms. |
For n heterogeneous data points divided into G equal sized groups with m<n data points in each, let y_g=(y_{g,1},…,y_{g,m}) denote the vector of observations in group g. For a m\times p design matrix X consider the model
y_g=Xb_g+ε_g
for b_g a random group specific coefficient vector and ε_g an error term, see Meinshausen and Buhlmann (2015). For the model above following Lund (2022) this package solves the maximin estimation problem
\min_{β} -\hat V_g(β)) + λ\Vertβ\Vert_1,λ ≥ 0
where
\hat V_g(β):=\frac{1}{n}(2β^\top X^\top y_g - β^\top X^\top Xβ),
is the empirical explained variance in group g. See Lund, 2022 for more details and references.
The package solves the problem using different algorithms depending on X:
i) If X is orthogonal (e.g. the inverse wavelet transform) either an ADMM algorithm (standard or relaxed) or an adaptive relaxed ADMM (ARADMM) with auto tuned step size is used, see Xu et al (2017).
ii) For general X, a three operator splitting (TOS) algorithm is implemented, see Damek and Yin (2017). Note if the design is tensor structured, X = \bigotimes_{i=1}^d X_i for d\in\{1, 2,3\}, the procedure accepts a list containing the tensor components (matrices).
An object with S3 Class "FRESHD".
spec |
A string indicating the array dimension (1, 2 or 3) and the penalty. |
coef |
A p_1\cdots p_d \times |
lambda |
A vector containing the sequence of penalty values used in the estimation procedure. |
df |
The number of nonzero coefficients for each value of |
dimcoef |
A vector giving the dimension of the model coefficient array β. |
dimobs |
A vector giving the dimension of the observation (response) array |
dim |
Integer indicating the dimension of of the array model. Equal to 1 for non array. |
wf |
A string indicating the wavelet name if used. |
diagnostics |
A list where item |
Adam Lund
Maintainer: Adam Lund, adam.lund@math.ku.dk
Lund, Adam (2022). Fast Robust Signal Estimation for Heterogeneous data. In preparation.
Meinshausen, N and P. Buhlmann (2015). Maximin effects in inhomogeneous large-scale data. The Annals of Statistics. 43, 4, 1801-1830.
Davis, Damek and Yin, Wotao, (2017). A three-operator splitting scheme and its optimization applications. Set-valued and variational analysis. 25, 4, 829-858.
Xu, Zheng and Figueiredo, Mario AT and Yuan, Xiaoming and Studer, Christoph and Goldstein, Tom (2017). Adaptive relaxed admm: Convergence theory and practical implementation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 7389-7398.
## general 3d tensor design matrix set.seed(42) G <- 20; n <- c(65, 26, 13)*3; p <- c(13, 5, 4)*3 sigma <- 1 ##marginal design matrices (Kronecker components) x <- list() for(i in 1:length(n)){x[[i]] <- matrix(rnorm(n[i] * p[i], 0, sigma), n[i], p[i])} ##common features and effects common_features <- rbinom(prod(p), 1, 0.1) common_effects <- rnorm(prod(p), 0, 0.1) * common_features ##simulate group response y <- array(NA, c(n, G)) for(g in 1:G){ bg <- rnorm(prod(p), 0, 0.1) * (1 - common_features) + common_effects Bg <- array(bg, p) mu <- RH(x[[3]], RH(x[[2]], RH(x[[1]], Bg))) y[,,, g] <- array(rnorm(prod(n), 0, var(mu)), dim = n) + mu } ##fit model for range of lambda system.time(fit <- maximin(y, x, penalty = "lasso", alg = "tosacc")) ##estimated common effects for specific lambda modelno <- 10 betafit <- fit$coef[, modelno] plot(common_effects, type = "h", ylim = range(betafit, common_effects), col = "red") lines(betafit, type = "h") ##size of example set.seed(42) G <- 50; p <- n <- c(2^6, 2^5, 2^6); ##common features and effects common_features <- rbinom(prod(p), 1, 0.1) #sparsity of comm. feat. common_effects <- rnorm(prod(p), 0, 1) * common_features ##group response y <- array(NA, c(n, G)) for(g in 1:G){ bg <- rnorm(prod(p), 0, 0.1) * (1 - common_features) + common_effects Bg <- array(bg, p) mu <- iwt(Bg) y[,,, g] <- array(rnorm(prod(n),0, 0.5), dim = n) + mu } ##orthogonal wavelet design with 1d data G = 50; N1 = 2^10; p = 101; J = 2; amp = 20; sigma2 = 10 y <- matrix(0, N1, G) z <- seq(0, 2, length.out = N1) sig <- cos(10 * pi * z) + 1.5 * sin(5 * pi * z) for (i in 1:G){ freqs <- sample(1:100, size = J, replace = TRUE) y[, i] <- sig * 2 + rnorm(N1, sd = sqrt(sigma2)) for (j in 1:J){ y[, i] <- y[, i] + amp * sin(freqs[j] * pi * z + runif(1, -pi, pi)) } } system.time(fit <- maximin(y, "la8", alg = "aradmm", kappa = 0.9)) mmy <- predict(fit, "la8") plot(mmy[,1], type = "l") lines(sig, col = "red")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.