Fit the penalized log-contrast regression with functional compositional predictors
proposed by Sun et al. (2020) <arXiv:1808.02403>. The model is estimated by
minimizing a linearly constrained group lasso criterion. The regularization
paths are computed for the group lasso penalty at grid values of the regularization
parameter lam and the degrees of freedom of the basis function, k.
FuncompCGL(y, X, Zc = NULL, intercept = TRUE, ref = NULL,
           k, degree = 3, basis_fun = c("bs", "OBasis", "fourier"),
           insert = c("FALSE", "X", "basis"), method = c("trapezoidal", "step"),
           interval = c("Original", "Standard"), Trange,
           T.name = "TIME", ID.name = "Subject_ID",
           W = rep(1, times = p - length(ref)),
           dfmax = p - length(ref), pfmax = min(dfmax * 1.5, p - length(ref)),
           lam = NULL, nlam = 100, lambda.factor = ifelse(n < p1, 0.05, 0.001),
           tol = 1e-8, mu_ratio = 1.01,
           outer_maxiter = 1e+6, outer_eps = 1e-8,
           inner_maxiter = 1e+4, inner_eps = 1e-8)

y 
response vector with length n. 
X 
data frame or matrix.

Zc 
an n*p_c design matrix of unpenalized variables. Default is NULL. 
intercept 
Boolean, specifying whether to include an intercept. Default is TRUE. 
ref 
reference level (baseline), either an integer between [1,p] or

k 
an integer, degrees of freedom of the basis function. 
degree 
degrees of freedom of the basis function. Default value is 3. 
basis_fun 
method to generate basis:

insert 
a character string specifying the method used to perform functional interpolation.
If 
method 
a character string specifying the method used to approximate the integral.

interval 
a character string specifying the domain of the integral.

Trange 
range of time points 
T.name, ID.name 
a character string specifying names of the time variable and the Subject
ID variable in 
W 
a vector of length p (the total number of groups),
or a matrix with dimension p1*p1, where

dfmax 
limit the maximum number of groups in the model. Useful for handling very large p, if a partial path is desired. Default is p. 
pfmax 
limit the maximum number of groups ever to be nonzero. For example once a group enters the model along the path,
no matter how many times it reenters the model through the path, it will be counted only once.
Default is 
lam 
a user supplied lambda sequence.
If 
nlam 
the length of the 
lambda.factor 
the factor for getting the minimal lambda in 
tol 
tolerance for a coefficient to be considered nonzero. Once the convergence criterion is satisfied, each element β_j of the coefficient vector β is set to 0 if |β_j| < tol. 
mu_ratio 
the increasing ratio of the penalty parameter 
outer_maxiter, outer_eps 

inner_maxiter, inner_eps 

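As a rough illustration of how lam, nlam and lambda.factor could relate, the sketch below builds a decreasing, log-spaced sequence whose smallest value is lambda.factor times its largest. This follows the convention of path solvers such as glmnet and gglasso; that FuncompCGL spaces its grid the same way is an assumption, and make_lam_grid and lam_max are hypothetical names that do not come from the package.

```r
# Hypothetical helper (not part of the package): log-spaced lambda grid.
# Assumes the smallest lambda equals lambda.factor times the largest one.
make_lam_grid <- function(lam_max, nlam = 100, lambda.factor = 0.001) {
  exp(seq(log(lam_max), log(lam_max * lambda.factor), length.out = nlam))
}

# Small grid so the values are easy to inspect.
lam_grid <- make_lam_grid(lam_max = 2, nlam = 5, lambda.factor = 0.01)
lam_grid
```

A sequence built this way could be passed through the lam argument, in which case nlam and lambda.factor would presumably no longer be needed.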
The functional log-contrast regression model for compositional predictors is defined as
y = 1_nβ_0 + Z_cβ_c + \int_T Z(t)β(t)dt + e, s.t. (1_p)^T β(t)=0 \forall t \in T,
where β_0 is the intercept,
β_c is the regression coefficient vector of length p_c corresponding to the control variables,
β(t) is the functional regression coefficient vector of length p as a function of t,
and e is the random error vector of length n with zero mean.
Moreover, Z(t) is the log-transformed functional compositional data.
If zeros exist in the original functional compositional data, the user should preprocess them before fitting.
For example, if count data are provided, the user could replace 0's with 0.5.
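The zero replacement suggested above can be sketched as follows. The count matrix is made up for illustration. Whether FuncompCGL closes the rows and takes logs internally is not stated in this section, so the last two steps are shown only to illustrate what Z(t) represents at a single time point.

```r
# Toy count matrix: 3 samples (rows) by 4 components (columns), made up.
counts <- matrix(c(10,  0,  5, 85,
                    0, 20, 30, 50,
                    7,  3,  0, 90), nrow = 3, byrow = TRUE)

# Replace zeros with a pseudo-count of 0.5, as suggested in the text.
counts_adj <- counts
counts_adj[counts_adj == 0] <- 0.5

# Close each row to proportions, then take logs (illustration only).
comp <- counts_adj / rowSums(counts_adj)
log_comp <- log(comp)
```

Without the replacement step, log(0) would give -Inf and the log-contrast model could not be fitted.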
After adopting a truncated basis expansion approach to re-express β(t) as
β(t) = B Φ(t),
where B is a p-by-k unknown but fixed coefficient matrix and Φ(t) consists of basis functions with k degrees of freedom, we can write the functional log-contrast regression model as
y = 1_nβ_0 + Z_cβ_c + Zβ + e, s.t. ∑_{j=1}^{p}β_j=0_k,
where Z is an n-by-pk matrix corresponding to the integral, and β=vec(B^T) is a
pk-vector in which the jth k-subvector β_j is the coefficient vector for the jth
compositional component.
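To make the integral matrix Z more concrete, the sketch below computes one k-vector block of Z for a single subject and a single component, by integrating the trajectory against a B-spline basis with the trapezoidal rule. It uses splines::bs from base R and a made-up trajectory; the package's own construction (handling of knots, interpolation and irregular time points) may differ, so this is an illustration under those assumptions rather than the package's implementation.

```r
library(splines)

# Hypothetical inputs: one subject's log-composition for one component.
tt <- seq(0, 1, length.out = 21)   # observation times
z  <- sin(2 * pi * tt)             # made-up trajectory z_j(t)
k  <- 5                            # degrees of freedom of the basis

# Basis matrix Phi: one row per time point, one column per basis function.
Phi <- bs(tt, df = k, degree = 3, intercept = TRUE)

# Trapezoidal weights for the time grid.
h <- diff(tt)
w <- c(h[1], h[-1] + h[-length(h)], h[length(h)]) / 2

# k-vector approximating the integral of z_j(t) * Phi(t) over t.
Z_block <- as.vector(t(Phi) %*% (w * z))
Z_block
```

Stacking such blocks side by side for j = 1, ..., p gives one row of Z, which is why Z has pk columns.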
To enable variable selection, the FuncompCGL
model is estimated via the linearly constrained group lasso,
argmin_{β_0, β_c, β} ( \frac{1}{2n} \| y - 1_nβ_0 - Z_cβ_c - Zβ \|_2^2 + λ ∑_{j=1}^{p} \| β_j \|_2 ), s.t. ∑_{j=1}^{p} β_j = 0_k.
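The criterion can be evaluated directly for a given coefficient vector, which is a convenient way to see how the squared-error loss, the group penalty and the zero-sum constraint fit together. The helper cgl_objective below is hypothetical and only evaluates the criterion on toy data; it does not perform the constrained minimization that FuncompCGL carries out.

```r
# Hypothetical helper: value of the penalized criterion for given coefficients.
cgl_objective <- function(y, Z, beta, lambda, k,
                          beta0 = 0, Zc = NULL, beta_c = NULL) {
  n <- length(y)
  fit <- beta0 + as.vector(Z %*% beta)
  if (!is.null(Zc)) fit <- fit + as.vector(Zc %*% beta_c)
  B <- matrix(beta, ncol = k, byrow = TRUE)   # row j holds beta_j
  penalty <- sum(sqrt(rowSums(B^2)))          # sum of group L2 norms
  sum((y - fit)^2) / (2 * n) + lambda * penalty
}

# Toy data with p = 3 components and k = 2 basis functions.
set.seed(1)
n <- 10; p <- 3; k <- 2
Z <- matrix(rnorm(n * p * k), n, p * k)
B <- rbind(c(1, -2), c(-1, 0.5), c(0, 1.5))   # each column sums to zero
beta <- as.vector(t(B))                       # beta = vec(B^T)
y <- as.vector(Z %*% beta)                    # noiseless response

colSums(B)                                    # constraint: sum_j beta_j = 0_k
obj <- cgl_objective(y, Z, beta, lambda = 0.1, k = k)
obj
```

Because the response here is noiseless, the loss term is zero and the value reduces to λ times the sum of the group norms.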
An object with S3 class "FuncompCGL", which is a list containing:
Z 
the integral matrix for the functional compositional predictors with dimension n*(pk). 
lam 
the sequence of 
df 
the number of nonzero groups in the estimated coefficients for
the functional compositional predictors at each value of 
beta 
a matrix of coefficients with 
dim 
dimension of the coefficient matrix. 
sseq 
sequence of the time points. 
call 
the call that produces this object. 
Zhe Sun and Kun Chen
Sun, Z., Xu, W., Cong, X., Li, G. and Chen, K. (2020) Log-contrast regression with functional compositional predictors: linking preterm infant's gut microbiome trajectories to neurobehavioral outcome, https://arxiv.org/abs/1808.02403 Annals of Applied Statistics.
Yang, Y. and Zou, H. (2015) A fast unified algorithm for computing group-lasso penalized learning problems, https://link.springer.com/article/10.1007/s11222-014-9498-5 Statistics and Computing 25(6) 1129-1141.
Aitchison, J. and Bacon-Shone, J. (1984) Log-contrast models for experiments with mixtures, Biometrika 71 323-330.
cv.FuncompCGL and GIC.FuncompCGL, and predict, coef, plot and print methods for "FuncompCGL" object.
df_beta = 5
p = 30
# True coefficient matrix: only the first four components are nonzero,
# and each column sums to zero to satisfy the constraint.
beta_C_true = matrix(0, nrow = p, ncol = df_beta)
beta_C_true[1, ] <- c(-0.5, -0.5, -0.5, -1, -1)
beta_C_true[2, ] <- c(0.8, 0.8, 0.7, 0.6, 0.6)
beta_C_true[3, ] <- c(-0.8, -0.8, 0.4, 1, 1)
beta_C_true[4, ] <- c(0.5, 0.5, -0.6, -0.6, -0.6)
# Simulate training data
Data <- Fcomp_Model(n = 50, p = p, m = 0, intercept = TRUE,
                    SNR = 4, sigma = 3, rho_X = 0, rho_T = 0.6, df_beta = df_beta,
                    n_T = 20, obs_spar = 1, theta.add = FALSE,
                    beta_C = as.vector(t(beta_C_true)))
# Fit the regularization path
m1 <- FuncompCGL(y = Data$data$y, X = Data$data$Comp, Zc = Data$data$Zc,
                 intercept = Data$data$intercept, k = df_beta, tol = 1e-10)
print(m1)
plot(m1)
beta <- coef(m1)
# Simulate test data with the same settings but n = 30
# ([-1] drops the function name from the stored call)
arg_list <- as.list(Data$call)[-1]
arg_list$n <- 30
TEST <- do.call(Fcomp_Model, arg_list)
y_hat <- predict(m1, Znew = TEST$data$Comp, Zcnew = TEST$data$Zc)
plot(y_hat[, floor(length(m1$lam)/2)], TEST$data$y,
     ylab = "Observed Response", xlab = "Predicted Response")
# Coefficients at the 20th lambda, reshaped to a p-by-k matrix
beta <- coef(m1, s = m1$lam[20])
beta_C <- matrix(beta[1:(p*df_beta)], nrow = p, byrow = TRUE)
colSums(beta_C)  # should be numerically zero
# Indices of the selected (nonzero) compositional components
Non.zero <- (1:p)[apply(beta_C, 1, function(x) max(abs(x)) > 0)]
Non.zero
