smog.default: Generalized linear model constraint on hierarchical structure...
In smog: Structural Modeling by using Overlapped Group Penalty


smog fits a model with non-penalized phenotype (demographic) variables such as age, gender, and treatment, together with penalized groups of prognostic effects (main effects) and predictive effects (interaction effects), subject to the hierarchy structure: if a predictive effect is in the model, its prognostic effect must be in the model too. It handles continuous, binomial or multinomial, and survival response variables under the Gaussian, binomial (multinomial), and Cox proportional hazards models, respectively. It accepts a formula and returns a coefficients table, fitted values, and the convergence information produced during the algorithm iterations.
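The hierarchy structure described above can be checked on any coefficient vector with a few lines of base R. This is only a minimal sketch: `check_hierarchy` is a hypothetical helper that is not part of the smog package, and the coefficient values are made up for illustration. It assumes the same `g` and `label` conventions as the arguments documented below.

```r
# Hypothetical helper (not part of smog): TRUE when every group with a
# nonzero predictive effect also has a nonzero prognostic effect.
check_hierarchy <- function(beta, g, label) {
  groups <- unique(g[label == "pred"])
  all(vapply(groups, function(k) {
    pred <- beta[g == k & label == "pred"]
    prog <- beta[g == k & label == "prog"]
    !any(pred != 0) || any(prog != 0)
  }, logical(1)))
}

# One treatment column followed by two (prognostic, predictive) groups
g     <- c(3, 1, 1, 2, 2)
label <- c("t", "prog", "pred", "prog", "pred")

ok  <- check_hierarchy(c(0.5, 1.2, 0.4, 0.0, 0), g, label)  # hierarchy holds
bad <- check_hierarchy(c(0.5, 0.0, 0.4, 0.7, 0), g, label)  # group 1 violates it
```

In the second call the predictive effect of group 1 is nonzero while its prognostic effect is zero, which is exactly the configuration the penalty is meant to rule out.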

## Default S3 method:
smog(x, y, g, v, label, lambda1, lambda2, lambda3,
  family = "gaussian", subset = NULL, rho = 10, scale = TRUE,
  eabs = 0.001, erel = 0.001, LL = 1, eta = 1.25, maxitr = 1000,
  ...)

## S3 method for class 'formula'
smog(formula, data = list(), g, v, label, lambda1,
  lambda2, lambda3, ...)

`x`	a model matrix, or a data frame of dimensions n by p, in which the columns represent the predictor variables.
`y`	response variable, corresponding to the family description. When family is "gaussian" or "binomial", `y` ought to be a numeric vector of observations of length n; when family is "coxph", `y` is a survival object containing the survival time and the censoring status. See `Surv`.
`g`	a vector of group labels for the predictor variables.
`v`	a vector of binary values indicating whether each predictor variable is penalized: 1 means penalized and 0 means not penalized.
`label`	a character vector giving the type of each predictor: "t" for treatment, "prog" for prognostic, and "pred" for predictive effects.
`lambda1`	penalty parameter for the L2 norm of each group of prognostic and predictive effects.
`lambda2`	ridge penalty parameter for the squared L2 norm of each group of prognostic and predictive effects.
`lambda3`	penalty parameter for the L1 norm of predictive effects.
`family`	a description of the distribution family for the response variable. For a continuous response, family is "gaussian"; for a multinomial or binary response, family is "binomial"; for a survival response, family is "coxph".
`subset`	an optional vector specifying a subset of observations to be used in the model fitting. Default is `NULL`.
`rho`	the penalty parameter used in the alternating direction method of multipliers (ADMM) algorithm. Default is 10.
`scale`	whether or not to scale the design matrix. Default is `TRUE`.
`eabs`	the absolute tolerance used in the ADMM algorithm. Default is 1e-3.
`erel`	the relative tolerance used in the ADMM algorithm. Default is 1e-3.
`LL`	initial value of the Lipschitz constant used to approximate the objective function in the Majorization-Minimization (MM) algorithm (or iterative shrinkage-thresholding algorithm, ISTA). Default is 1.
`eta`	gradient step size for the backtracking line search on the Lipschitz constant. Default is 1.25.
`maxitr`	the maximum iterations for convergence in the ADMM algorithm. Default is 1000.
`...`	other relevant arguments that can be supplied to smog.
`formula`	an object of class "formula": a symbolic description of the model to be fitted. Should not include the intercept.
`data`	an optional data frame, containing the variables in the model.
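To make the roles of `LL` and `eta` more concrete, here is a generic ISTA step with backtracking on the Lipschitz estimate, written for a least-squares loss with an L1 penalty. It is a textbook-style sketch and not the package's internal code: `soft`, `ista_step`, the simulated data, and the plain L1 penalty are all assumptions made for illustration.

```r
# Soft-thresholding operator (proximal map of the L1 norm)
soft <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

# One ISTA step for 0.5 * ||y - x b||^2 + lambda * ||b||_1.
# L plays the role of `LL` and is inflated by `eta` until the quadratic
# approximation majorizes the smooth loss at the candidate point.
ista_step <- function(x, y, beta, lambda, L = 1, eta = 1.25) {
  f    <- function(b) 0.5 * sum((y - x %*% b)^2)
  grad <- as.vector(-crossprod(x, y - x %*% beta))
  repeat {
    b_new <- soft(beta - grad / L, lambda / L)
    d <- b_new - beta
    if (f(b_new) <= f(beta) + sum(grad * d) + 0.5 * L * sum(d^2)) break
    L <- eta * L
  }
  list(beta = b_new, L = L)
}

# Small simulated problem (made-up data)
set.seed(1)
x <- matrix(rnorm(50 * 5), 50, 5)
y <- as.vector(x %*% c(2, 0, -1, 0, 0) + rnorm(50))
lambda <- 5
objective <- function(b) 0.5 * sum((y - x %*% b)^2) + lambda * sum(abs(b))

beta <- rep(0, 5)
L <- 1
obj_start <- objective(beta)
for (i in 1:50) {
  step <- ista_step(x, y, beta, lambda, L = L, eta = 1.25)
  beta <- step$beta
  L <- step$L          # carry the accepted Lipschitz estimate forward
}
obj_end <- objective(beta)
```

A larger `eta` reaches an acceptable `L` in fewer backtracking trials but tends to overshoot, which gives smaller steps afterwards.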

The formula has the form response ~ 0 + terms, where terms is a series of predictor variables to be fitted for response. For the gaussian family, the response is a continuous vector. For the binomial family, the response is a factor vector, in which the last level denotes the "pivot". For the coxph family, the response is a Surv object containing the survival time and censoring status.
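As an illustration of the response ~ 0 + terms form, the base R sketch below shows how such a formula expands into a design matrix without an intercept column, together with `g`, `v`, and `label` vectors that line up with its columns. The data frame and its column names (`trt`, `m1`, `m1_trt`) are made up, and the group numbering simply mirrors the convention used in the Examples section; no smog function is called here.

```r
# Toy data: one treatment indicator, one biomarker, and their interaction
dat <- data.frame(
  y   = c(1.2, 0.7, 2.1, 1.5),
  trt = c(0, 1, 0, 1),
  m1  = c(0.3, -1.1, 0.8, 0.2)
)
dat$m1_trt <- dat$m1 * dat$trt

# "0 +" drops the intercept, so the columns are exactly the listed terms
mm <- model.matrix(y ~ 0 + trt + m1 + m1_trt, data = dat)
colnames(mm)

# Matching structure vectors, one entry per column of mm
g     <- c(2, 1, 1)               # treatment in its own group, biomarker pair in group 1
v     <- c(0, 1, 1)               # treatment unpenalized, biomarker terms penalized
label <- c("t", "prog", "pred")   # prognostic term first, then predictive term
```

The three vectors must have one entry per column of the design matrix, in the same order as the terms in the formula.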

smog returns an object of class inheriting from "smog". The generic accessor functions coef, coefficients, fitted.values, and predict can be used to extract various useful features of the value returned by smog. An object of class "smog" is a list containing at least the following components:

coefficients: Data frame containing the indexes, names, and estimates of the predictor variables with nonzero coefficients. When family is "binomial", the estimates have K-1 columns, each column holding the weights for the corresponding response class. The last class serves as the "pivot".
fitted.values: The fitted mean values of the response variable when family is "gaussian". When family is "binomial", the fitted.values are the probabilities for each class; when family is "coxph", the fitted.values are risk scores.
residuals: For family = "gaussian", the ordinary residuals (observed minus fitted values). For family = "binomial", Pearson residuals are returned; for family = "coxph", deviance residuals, i.e., standardized martingale residuals, are returned.
model: A list of estimates for the intercept, treatment effect, and prognostic and predictive effects for the selected biomarkers.
weight: The weights of the predictors resulting from the penalty function, used to calculate the degrees of freedom.
DF: the degrees of freedom. When family = "gaussian", DF = tr(x_{λ}(x_{λ}'x_{λ}+W)^{-1}x_{λ}'). For other families, DF is approximated by diag(1/(1+W)).
criteria: model selection criteria, including the corrected Akaike's Information Criterion (AICc), AIC, the Bayesian Information Criterion (BIC), and the generalized cross-validation score (GCV), respectively. See also cv.smog.
llikelihood: the log-likelihood value for the converged model.
loglike: the penalized log-likelihood values for each iteration in the algorithm.
PrimalError: the averaged norms ||β-Z||/√{p} for each iteration, in the ADMM algorithm.
DualError: the averaged norms ||Z^{t+1}-Z^{t}||/√{p} for each iteration, in the ADMM algorithm.
converge: the number of iterations processed in the ADMM algorithm.
call: the matched call.
formula: the formula supplied.
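The PrimalError and DualError components listed above are Euclidean norms divided by √p. The base R sketch below evaluates both expressions for a single iteration; `primal_error`, `dual_error`, and the numeric vectors are hypothetical and only illustrate the formulas, not how smog stores its iterates.

```r
# ||beta - Z|| / sqrt(p): gap between the coefficient vector and the ADMM copy
primal_error <- function(beta, z) {
  sqrt(sum((beta - z)^2)) / sqrt(length(beta))
}

# ||Z^{t+1} - Z^{t}|| / sqrt(p): change of the ADMM copy between iterations
dual_error <- function(z_new, z_old) {
  sqrt(sum((z_new - z_old)^2)) / sqrt(length(z_new))
}

pe <- primal_error(beta = c(1, 2, 2, 0), z = c(0, 0, 0, 0))          # 3 / 2
de <- dual_error(z_new = c(0, 3, 4, 0), z_old = c(0, 0, 0, 0))       # 5 / 2
```

Both quantities shrink toward zero as the ADMM iterations converge, which is why they are natural candidates for comparison against the `eabs` and `erel` tolerances.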

The regression function contains the non-penalized predictor variables, and many groups of prognostic and predictive terms, where in each group the prognostic term comes first, followed by the predictive term.

Penalty function: Different hierarchical structures within groups can result from adjusting the penalty parameters in the penalty function:

Ω(\mathbf{β}) = λ_1||\mathbf{β}|| + λ_2||\mathbf{β}||^2+λ_3|β_2|

where \mathbf{β}=(β_1,β_2). Here β_1 denotes the prognostic effect (main effect) and β_2 the predictive effect (interaction effect). When λ_2 = 0 and λ_3 = 0, there is no structure within groups. When λ_2 ≠ 0, the penalty function honors the structure within groups such that: predictive effect ≠ 0 ⇒ prognostic effect ≠ 0.
Tuning parameters: rho, eabs, erel, LL, and eta are the corresponding parameters used in the iterative shrinkage-thresholding algorithm (ISTA) and the alternating direction method of multipliers (ADMM) algorithm.
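For a single (prognostic, predictive) pair, the penalty function above can be evaluated directly. The sketch below is a plain transcription of the formula into base R; `omega` is a hypothetical helper name, not a function exported by smog, and the coefficient values are chosen only to make the arithmetic easy to follow.

```r
# Omega(beta) = lambda1 * ||beta|| + lambda2 * ||beta||^2 + lambda3 * |beta_2|
# beta = c(beta_1, beta_2): prognostic effect first, predictive effect second
omega <- function(beta, lambda1, lambda2, lambda3) {
  stopifnot(length(beta) == 2)
  l2 <- sqrt(sum(beta^2))
  lambda1 * l2 + lambda2 * l2^2 + lambda3 * abs(beta[2])
}

group_only <- omega(c(3, 4), lambda1 = 1, lambda2 = 0,   lambda3 = 0)  # 1 * 5
full       <- omega(c(3, 4), lambda1 = 1, lambda2 = 0.5, lambda3 = 2)  # 5 + 12.5 + 8
prog_only  <- omega(c(3, 0), lambda1 = 1, lambda2 = 0,   lambda3 = 2)  # 3 + 0 + 0
```

The last call shows that a group carrying only a prognostic effect pays nothing for the λ_3 term, whereas any nonzero predictive effect is charged by both the group norm and the L1 term.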

Chong Ma, chongma8903@gmail.com.


cv.smog, predict.smog, plot.smog.

 

library(smog)

n=100;p=20
set.seed(2018)
s=10  # divisor used in the noise scaling factor snr1 below
# generate design matrix x
x=matrix(0,n,1+2*p)
x[,1]=sample(c(0,1),n,replace = TRUE)
x[,seq(2,1+2*p,2)]=matrix(rnorm(n*p),n,p)
x[,seq(3,1+2*p,2)]=x[,seq(2,1+2*p,2)]*x[,1]

g=c(p+1,rep(1:p,rep(2,p)))  # groups 
v=c(0,rep(1,2*p))           # penalization status
label=c("t",rep(c("prog","pred"),p))  # type of predictor variables

# generate beta
beta=c(rnorm(13,0,2),rep(0,ncol(x)-13))
beta[c(2,4,7,9)]=0

# generate y
data1=x%*%beta
noise1=rnorm(n)
snr1=as.numeric(sqrt(var(data1)/(s*var(noise1))))
y1=data1+snr1*noise1
lfit1=smog(x,y1,g,v,label,lambda1=8,lambda2=0,lambda3=8,family = "gaussian")

## generate binomial data
prob=exp(as.matrix(x)%*%as.matrix(beta))/(1+exp(as.matrix(x)%*%as.matrix(beta)))
y2=ifelse(prob<0.5,0,1)
lfit2=smog(x,y2,g,v,label,lambda1=0.03,lambda2=0,lambda3=0.03,family = "binomial")

## generate survival data
# Weibull latent event times
lambda = 0.01; rho = 1
V = runif(n)
Tlat = (- log(V) / (lambda*exp(x %*% beta)) )^(1/rho)
C = rexp(n, 0.001)  ## censoring time
time = as.vector(pmin(Tlat, C))
status = as.numeric(Tlat <= C)
y3 = as.matrix(cbind(time = time, status = status))

lfit3=smog(x,y3,g,v,label,lambda1=0.2,lambda2=0,lambda3=0.2,family = "coxph")