glog: Generalized linear model constraint on hierarchical structure...
In smog: Structural Modeling by using Overlapped Group Penalty


Fits a generalized linear model constrained by a hierarchical structure, using an overlapped group penalty.

glog(y, x, g, v, lambda, hierarchy, family = "gaussian", rho = 10,
  scale = TRUE, eabs = 0.001, erel = 0.001, LL = 1, eta = 1.25,
  maxitr = 1000L)

`y`	response variable, given as a matrix. When family is "gaussian" or "binomial", `y` should be an n by 1 matrix of observations; when family is "coxph", `y` represents a survival object, that is, an n by 2 matrix containing the survival time and the censoring status. See `Surv`.
`x`	a model matrix of dimensions n by p, in which each column represents a predictor variable.
`g`	a numeric vector of group labels for the predictor variables.
`v`	a numeric vector of binary values indicating whether each predictor variable is penalized: 1 means penalized and 0 means not penalized.
`lambda`	a numeric vector of three penalty parameters corresponding to L2 norm, squared L2 norm, and L1 norm, respectively.
`hierarchy`	a factor with levels 0, 1, 2, which represent different hierarchical structures within groups. When `hierarchy=0`, λ_2 and λ_3 are forced to be zero; when `hierarchy=1`, λ_2 is forced to be zero; when `hierarchy=2`, there is no constraint on the λ's. See `smog`.
`family`	a description of the distribution family for the response variable. For a continuous response, family is "gaussian"; for a multinomial or binary response, family is "binomial"; for a survival response, family is "coxph".
`rho`	the penalty parameter used in the alternating direction method of multipliers algorithm (ADMM). Default is 10.
`scale`	whether or not to scale the design matrix. Default is `TRUE`.
`eabs`	the absolute tolerance used in the ADMM algorithm. Default is 1e-3.
`erel`	the relative tolerance used in the ADMM algorithm. Default is 1e-3.
`LL`	initial value of the Lipschitz constant used to approximate the objective function in the Majorization-Minimization (MM) algorithm (or iterative shrinkage-thresholding algorithm, ISTA). Default is 1.
`eta`	gradient stepsize for the backtrack line search for the Lipschitz continuous constant. Default is 1.25.
`maxitr`	the maximum number of iterations allowed for convergence in the ADMM algorithm. Default is 1000.
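
The way `hierarchy` constrains `lambda`, as described in the arguments above, can be sketched in a few lines of base R. The helper `effective_lambda` is hypothetical and not part of smog; it only mirrors the rule stated for the `hierarchy` argument and does not reproduce what `glog` does internally.

```r
# Hypothetical helper (not part of smog): mirrors the documented rule for
# how `hierarchy` constrains the three penalty parameters in `lambda`.
effective_lambda <- function(lambda, hierarchy) {
  stopifnot(is.numeric(lambda), length(lambda) == 3)
  h <- as.character(hierarchy)       # accepts a factor or a number
  stopifnot(length(h) == 1, h %in% c("0", "1", "2"))
  if (h == "0") lambda[2:3] <- 0     # lambda_2 and lambda_3 forced to zero
  if (h == "1") lambda[2] <- 0       # lambda_2 forced to zero
  lambda                             # hierarchy = 2: no constraint
}

effective_lambda(c(8, 3, 8), hierarchy = 1)
```

Under this rule, passing `lambda = c(8, 0, 8)` with `hierarchy = 1` already satisfies the constraint, so nothing would be overridden.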

A list of

`coefficients`	A data frame of the variable names and the estimated coefficients
`llikelihood`	The log likelihood value based on the ultimate estimated coefficients
`loglike`	The sequence of log likelihood values since the algorithm starts
`PrimalError`	The sequence of primal errors in the ADMM algorithm
`DualError`	The sequence of dual errors in the ADMM algorithm
`converge`	The iteration number at which convergence occurs
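
To illustrate how `PrimalError` and `DualError` relate to `eabs`, `erel`, and `rho`, here is a minimal sketch of the standard ADMM stopping rule for a constraint of the form x = z. It is an assumption that `glog` uses this exact form; the function `admm_converged` and its argument names (`x`, `z`, `z_old`, `u`) are hypothetical and not taken from the smog source.

```r
# Standard ADMM stopping rule for the constraint x = z, with scaled dual
# variable u. Assumed form for illustration; not taken from the smog source.
admm_converged <- function(x, z, z_old, u, rho = 10, eabs = 1e-3, erel = 1e-3) {
  l2 <- function(a) sqrt(sum(a^2))
  p <- length(x)
  primal <- l2(x - z)                # primal residual norm
  dual   <- rho * l2(z - z_old)      # dual residual norm
  # Tolerances combine an absolute part and a relative part
  eps_primal <- sqrt(p) * eabs + erel * max(l2(x), l2(z))
  eps_dual   <- sqrt(p) * eabs + erel * rho * l2(u)
  list(primal = primal, dual = dual,
       converged = primal <= eps_primal && dual <= eps_dual)
}

admm_converged(x = c(1, 2), z = c(1, 2), z_old = c(1, 2), u = c(0, 0))$converged
```

In this sketch, an iteration counts as converged only when both residual norms fall below their tolerances, so tightening `eabs` or `erel` generally requires more iterations.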

Chong Ma, chongma8903@gmail.com.

\insertRef{ma2019structuralsmog}{smog}

cv.smog, smog.default, smog.formula, predict.smog, plot.smog.

library(smog)  # provides glog()
set.seed(2018)
# generate design matrix x
n=50;p=100
s=10
x=matrix(0,n,1+2*p)
x[,1]=sample(c(0,1),n,replace = TRUE)
x[,seq(2,1+2*p,2)]=matrix(rnorm(n*p),n,p)
x[,seq(3,1+2*p,2)]=x[,seq(2,1+2*p,2)]*x[,1]

g=c(p+1,rep(1:p,rep(2,p)))  # groups 
v=c(0,rep(1,2*p))           # penalization status

# generate beta
beta=c(rnorm(13,0,2),rep(0,ncol(x)-13))
beta[c(2,4,7,9)]=0

# generate y
data1=x%*%beta
noise1=rnorm(n)
snr1=as.numeric(sqrt(var(data1)/(s*var(noise1))))
y1=data1+snr1*noise1
lambda = c(8,0,8)
hierarchy = 1
gfit1 = glog(y1,x,g,v,lambda,hierarchy,family = "gaussian")
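
The layout of `x`, `g`, and `v` in the example above is easier to see at a small scale. The sketch below rebuilds the same group and penalization vectors with `p = 3` instead of 100, using base R only. The column labels (`trt`, `x1`, `trt:x1`, ...) are hypothetical names added for illustration: the first column is the binary variable, and each subsequent pair is a continuous predictor and its product with the binary variable.

```r
# Small-scale version of the example's layout (p = 3 instead of 100).
p <- 3

# Hypothetical labels: binary column first, then (main effect, interaction) pairs
cols <- c("trt", paste0(rep(c("x", "trt:x"), p), rep(1:p, each = 2)))

g <- c(p + 1, rep(1:p, rep(2, p)))  # group label for each column of x
v <- c(0, rep(1, 2 * p))            # 0 = not penalized, 1 = penalized

layout <- data.frame(column = cols, group = g, penalized = v)
layout
```

Each predictor shares a group label with its interaction term, while the binary variable sits alone in group `p + 1` and is left unpenalized.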