Fit the penalized log-contrast regression with functional compositional predictors
proposed by Sun et al. (2020) <arXiv:1808.02403>. The model is estimated by
minimizing a linearly constrained group lasso criterion. The regularization
paths are computed for the group lasso penalty over a grid of values of the regularization
parameter lam and the degrees of freedom of the basis function k.
FuncompCGL(y, X, Zc = NULL, intercept = TRUE, ref = NULL,
           k, degree = 3, basis_fun = c("bs", "OBasis", "fourier"),
           insert = c("FALSE", "X", "basis"), method = c("trapezoidal", "step"),
           interval = c("Original", "Standard"), Trange,
           T.name = "TIME", ID.name = "Subject_ID",
           W = rep(1,times = p - length(ref)),
           dfmax = p - length(ref), pfmax = min(dfmax * 1.5, p - length(ref)),
           lam = NULL, nlam = 100, lambda.factor = ifelse(n < p1, 0.05, 0.001),
           tol = 1e-8, mu_ratio = 1.01,
           outer_maxiter = 1e+6, outer_eps = 1e-8,
           inner_maxiter = 1e+4, inner_eps = 1e-8)
y |
response vector with length n. |
X |
data frame or matrix.
|
Zc |
an n*p_c design matrix of unpenalized variables. Default is NULL. |
intercept |
Boolean, specifying whether to include an intercept. Default is TRUE. |
ref |
reference level (baseline), either an integer between [1,p] or
|
k |
an integer, degrees of freedom of the basis function. |
degree |
degree of the piecewise polynomial used for the spline basis. Default value is 3. |
basis_fun |
method to generate basis:
|
insert |
a character string specifying the method used to perform functional interpolation.
If |
method |
a character string specifying the method used to approximate the integral.
|
interval |
a character string specifying the domain of the integral.
|
Trange |
range of time points |
T.name, ID.name |
a character string specifying names of the time variable and the Subject
ID variable in |
W |
a vector of length p (the total number of groups),
or a matrix with dimension p1*p1, where
|
dfmax |
the maximum number of groups allowed in the model. Useful for very large p when only a partial path is desired. Default is p - length(ref). |
pfmax |
the maximum number of groups ever allowed to be nonzero along the path. For example, once a group enters the model,
it is counted only once, no matter how many times it re-enters the model along the path.
Default is |
lam |
a user supplied lambda sequence.
If |
nlam |
the length of the |
lambda.factor |
the factor for getting the minimal lambda in |
tol |
tolerance below which a coefficient is treated as zero. Once the convergence criterion is satisfied, each element β_j of the coefficient vector β is set to 0 if |β_j| < tol. |
mu_ratio |
the increasing ratio of the penalty parameter |
outer_maxiter, outer_eps |
|
inner_maxiter, inner_eps |
|
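The arguments lam, nlam and lambda.factor together describe a grid of regularization values. A minimal sketch of how such a grid is commonly built, assuming a log-spaced decreasing sequence (a common convention for lasso-type paths, not confirmed here for FuncompCGL). The value of lam_max is hypothetical; in practice it would be derived from the data.

```r
# Hypothetical inputs: lam_max is a made-up largest lambda for illustration.
lam_max <- 2.0
nlam <- 100
lambda.factor <- 0.001

# Assumed convention: log-spaced grid from lam_max down to
# lambda.factor * lam_max, with nlam values in total.
lam <- exp(seq(log(lam_max), log(lambda.factor * lam_max), length.out = nlam))

range(lam)
```

Under this assumption, a smaller lambda.factor extends the path toward less penalized fits, and nlam controls how finely the path is sampled.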
The functional log-contrast regression model for compositional predictors is defined as
y = 1_nβ_0 + Z_cβ_c + \int_T Z(t)β(t)dt + e, s.t. (1_p)^T β(t)=0 \forall t \in T,
where β_0 is the intercept,
β_c is the regression coefficient vector with length p_c corresponding to the control variables,
β(t) is the functional regression coefficient vector with length p as a function of t,
and e is the zero-mean random error vector with length n.
Moreover, Z(t) is the log-transformed functional compositional data.
If zeros exist in the original functional compositional data, the user should pre-process them.
For example, if count data are provided, the user could replace zeros with 0.5.
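The zero handling described above can be sketched in base R. The count matrix is made up for illustration, and converting counts to proportions before taking logs is an assumption about the workflow. The sketch only illustrates the pre-processing idea and does not show what FuncompCGL does internally.

```r
# Hypothetical count data: 3 samples (rows) by 4 components (columns).
counts <- matrix(c(10,  0,  5, 85,
                    0, 20, 30, 50,
                   25, 25,  0, 50),
                 nrow = 3, byrow = TRUE)

# Replace zero counts with a small pseudo-count (0.5, as suggested in the text).
counts[counts == 0] <- 0.5

# Assumed step: close each row to a composition (proportions summing to 1).
comp <- counts / rowSums(counts)

# The log transform is now finite everywhere.
log_comp <- log(comp)
log_comp
```

Without the replacement step, log(0) would give -Inf and the design matrix would contain non-finite entries.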
After adopting a truncated basis expansion to re-express β(t) as
β(t) = B Φ(t),
where B is a p-by-k unknown but fixed coefficient matrix and Φ(t) consists of basis functions with k degrees of freedom, the functional log-contrast regression model can be written as
y = 1_nβ_0 + Z_cβ_c + Zβ + e, s.t. ∑_{j=1}^{p}β_j=0_k,
where Z is an n-by-pk matrix corresponding to the integral and β=vec(B^T) is a
pk-vector whose j-th k-subvector β_j is the coefficient vector for the j-th
compositional component.
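One way to read "Z corresponds to the integral": because β(t) = B Φ(t), the term \int_T Z(t)β(t)dt equals ∑_j (\int_T z_j(t)Φ(t)^T dt) β_j, so the j-th block of a subject's row of Z is the k-vector \int_T z_j(t)Φ(t)^T dt. The sketch below computes one such row with a B-spline basis from the splines package and the trapezoidal rule. The simulated curves, the time grid and the choice intercept = TRUE are illustrative assumptions, not a description of FuncompCGL's internals.

```r
library(splines)  # ships with R; provides bs()

set.seed(1)
p  <- 3                            # number of compositional components
k  <- 4                            # degrees of freedom of the basis
tt <- seq(0, 1, length.out = 21)   # observation times for one subject

# Hypothetical log-composition curves for one subject: 21 time points by p.
raw  <- matrix(rexp(length(tt) * p), ncol = p)
logZ <- log(raw / rowSums(raw))

# Basis matrix Phi: one row per time point, one column per basis function.
Phi <- bs(tt, df = k, degree = 3, intercept = TRUE)

# Trapezoidal weights w_i, so that sum_i w_i f(t_i) approximates int f(t) dt.
dt <- diff(tt)
w  <- c(dt, 0) / 2 + c(0, dt) / 2

# Row j of Z_blocks approximates int z_j(t) Phi(t)^T dt (a k-vector).
Z_blocks <- t(logZ * w) %*% Phi

# Stack the blocks in the same order as beta = vec(B^T): block 1, block 2, ...
Z_row <- as.vector(t(Z_blocks))
length(Z_row)   # p * k
```

Repeating this for each of the n subjects and binding the rows gives an n-by-pk matrix with the layout described above.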
To enable variable selection, the FuncompCGL
model is estimated via a linearly constrained group lasso,
argmin_{β_0, β_c, β}(\frac{1}{2n}\|y - 1_nβ_0 - Z_cβ_c - Zβ\|_2^2 + λ ∑_{j=1}^{p} \|β_j\|_2), s.t. ∑_{j=1}^{p} β_j = 0_k.
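To make the criterion concrete, the following base R sketch evaluates its value and the zero-sum constraint for a given set of coefficients. The helper cgl_objective and the simulated inputs are hypothetical and exist only for illustration. It evaluates the objective and does not solve the optimization problem.

```r
# Hypothetical helper: value of the constrained group lasso criterion.
# beta is a p*k vector laid out as vec(B^T), i.e. k entries per component.
cgl_objective <- function(y, Z, Zc, beta0, beta_c, beta, lambda, k) {
  n <- length(y)
  resid <- y - beta0 - Zc %*% beta_c - Z %*% beta
  B <- matrix(beta, ncol = k, byrow = TRUE)   # row j is beta_j
  penalty <- sum(sqrt(rowSums(B^2)))          # sum_j ||beta_j||_2
  list(value = sum(resid^2) / (2 * n) + lambda * penalty,
       constraint = colSums(B))               # equals 0_k when feasible
}

set.seed(2)
n <- 20; p <- 4; k <- 3
Z  <- matrix(rnorm(n * p * k), n)
Zc <- matrix(rnorm(n * 2), n)
y  <- rnorm(n)

# Build a feasible B by centering its columns, so that sum_j beta_j = 0_k.
B    <- matrix(rnorm(p * k), p, k)
B    <- sweep(B, 2, colMeans(B))
beta <- as.vector(t(B))

out <- cgl_objective(y, Z, Zc, beta0 = 0.5, beta_c = c(1, -1),
                     beta = beta, lambda = 0.1, k = k)
out$value
out$constraint
```

The penalty acts on whole k-subvectors, so a component is either dropped entirely (β_j = 0_k) or kept as a group. That is what produces selection at the level of compositional components.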
An object with S3 class "FuncompCGL", which is a list containing:
Z |
the integral matrix for the functional compositional predictors with dimension n*(pk). |
lam |
the sequence of |
df |
the number of non-zero groups in the estimated coefficients for
the functional compositional predictors at each value of |
beta |
a matrix of coefficients with |
dim |
dimension of the coefficient matrix. |
sseq |
sequence of the time points. |
call |
the call that produces this object. |
Zhe Sun and Kun Chen
Sun, Z., Xu, W., Cong, X., Li, G. and Chen, K. (2020) Log-contrast regression with functional compositional predictors: linking preterm infant's gut microbiome trajectories to neurobehavioral outcome, Annals of Applied Statistics. https://arxiv.org/abs/1808.02403
Yang, Y. and Zou, H. (2015) A fast unified algorithm for computing group-lasso penalized learning problems, Statistics and Computing 25(6) 1129-1141. https://link.springer.com/article/10.1007/s11222-014-9498-5
Aitchison, J. and Bacon-Shone, J. (1984) Log-contrast models for experiments with mixtures, Biometrika 71 323-330.
cv.FuncompCGL and GIC.FuncompCGL, and the predict, coef, plot and print methods for "FuncompCGL" objects.
## True coefficient matrix: p components with df_beta basis coefficients each;
## only the first four components are non-zero
df_beta <- 5
p <- 30
beta_C_true <- matrix(0, nrow = p, ncol = df_beta)
beta_C_true[1, ] <- c(-0.5, -0.5, -0.5, -1, -1)
beta_C_true[2, ] <- c(0.8, 0.8, 0.7, 0.6, 0.6)
beta_C_true[3, ] <- c(-0.8, -0.8, 0.4, 1, 1)
beta_C_true[4, ] <- c(0.5, 0.5, -0.6, -0.6, -0.6)

## Simulate training data
Data <- Fcomp_Model(n = 50, p = p, m = 0, intercept = TRUE,
                    SNR = 4, sigma = 3, rho_X = 0, rho_T = 0.6, df_beta = df_beta,
                    n_T = 20, obs_spar = 1, theta.add = FALSE,
                    beta_C = as.vector(t(beta_C_true)))

## Fit the regularization path
m1 <- FuncompCGL(y = Data$data$y, X = Data$data$Comp, Zc = Data$data$Zc,
                 intercept = Data$data$intercept, k = df_beta, tol = 1e-10)
print(m1)
plot(m1)
beta <- coef(m1)

## Simulate test data with the same settings but n = 30, then predict
arg_list <- as.list(Data$call)[-1]
arg_list$n <- 30
TEST <- do.call(Fcomp_Model, arg_list)
y_hat <- predict(m1, Znew = TEST$data$Comp, Zcnew = TEST$data$Zc)
plot(y_hat[, floor(length(m1$lam)/2)], TEST$data$y,
     ylab = "Observed Response", xlab = "Predicted Response")

## Coefficients at the 20th value of lam, reshaped to a p-by-df_beta matrix
beta <- coef(m1, s = m1$lam[20])
beta_C <- matrix(beta[1:(p*df_beta)], nrow = p, byrow = TRUE)
## Check the zero-sum constraint: each column should sum to (about) 0
colSums(beta_C)
## Indices of the selected (non-zero) compositional components
Non.zero <- (1:p)[apply(beta_C, 1, function(x) max(abs(x)) > 0)]
Non.zero