glog: Generalized linear model constraint on hierarchical structure...


View source: R/RcppExports.R

Description

Fits a generalized linear model constrained by a hierarchical structure, using an overlapped group penalty.

Usage

glog(y, x, g, v, lambda, hierarchy, family = "gaussian", rho = 10,
  scale = TRUE, eabs = 0.001, erel = 0.001, LL = 1, eta = 1.25,
  maxitr = 1000L)

Arguments

y

response variable, as a matrix. When family is "gaussian" or "binomial", y should be an n by 1 matrix of observations; when family is "coxph", y represents a survival object, that is, an n by 2 matrix containing the survival time and the censoring status. See Surv.
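As a rough illustration of the shapes described above, the following base R sketch builds one response matrix per family. The variable names (y_gauss, y_binom, y_cox) and the simulated values are made up for illustration; the column order for "coxph" (time first, status second) follows the description above, and coding the status as 1 for an observed event is an assumption borrowed from the usual Surv convention.

```r
set.seed(1)
n <- 5

# gaussian: continuous response as an n-by-1 matrix
y_gauss <- matrix(rnorm(n), ncol = 1)

# binomial: 0/1 response as an n-by-1 matrix
y_binom <- matrix(rbinom(n, size = 1, prob = 0.5), ncol = 1)

# coxph: n-by-2 matrix, survival time first, censoring status second
# (assumed coding: 1 = event observed, 0 = censored)
time <- rexp(n, rate = 0.1)
status <- rbinom(n, size = 1, prob = 0.7)
y_cox <- cbind(time, status)
```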

x

a model matrix of dimensions n by p, in which each column represents a predictor variable.

g

a numeric vector of group labels for the predictor variables.

v

a numeric vector of binary values indicating whether each predictor variable is penalized: 1 means penalized and 0 means not penalized.

lambda

a numeric vector of three penalty parameters corresponding to L2 norm, squared L2 norm, and L1 norm, respectively.
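To make the three norms concrete, here is a small base R sketch that computes each quantity for a toy coefficient vector. The vector beta and the labels g are hypothetical, and the sketch does not claim to reproduce how glog combines the terms or which coefficients each norm is applied to internally; it only shows what the three quantities are.

```r
# Hypothetical coefficient vector and group labels (not from the package)
beta <- c(3, 4, 0, 0, 1, -2)
g <- c(1, 1, 2, 2, 3, 3)

# L2 norm of the coefficients within each group
l2_by_group <- tapply(beta, g, function(b) sqrt(sum(b^2)))

# Squared L2 norm and L1 norm of the whole vector
sq_l2 <- sum(beta^2)
l1 <- sum(abs(beta))
```

A group whose coefficients are all zero, such as group 2 here, contributes nothing to the group-wise L2 term.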

hierarchy

a factor with levels 0, 1, and 2, each representing a different hierarchical structure within groups. When hierarchy=0, λ_2 and λ_3 are forced to be zero; when hierarchy=1, λ_2 is forced to be zero; when hierarchy=2, there is no constraint on the λ's. See smog.
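The constraints described above can be sketched in a few lines of base R. The helper constrain_lambda is hypothetical and not part of smog; it only mirrors the stated rule for each hierarchy level, and the lambda values passed to it are arbitrary.

```r
# Hypothetical helper, not part of smog: applies the constraints on
# lambda described for each hierarchy level.
constrain_lambda <- function(lambda, hierarchy) {
  stopifnot(length(lambda) == 3, hierarchy %in% c(0, 1, 2))
  if (hierarchy == 0) {
    lambda[2:3] <- 0   # only the L2-norm penalty remains
  } else if (hierarchy == 1) {
    lambda[2] <- 0     # drop the squared L2-norm penalty
  }
  lambda               # hierarchy == 2: returned unchanged
}

constrain_lambda(c(8, 2, 8), 0)
constrain_lambda(c(8, 2, 8), 1)
constrain_lambda(c(8, 2, 8), 2)
```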

family

a description of the distribution family for the response variable. For a continuous response, family is "gaussian"; for a multinomial or binary response, family is "binomial"; for a survival response, family is "coxph".

rho

the penalty parameter used in the alternating direction method of multipliers algorithm (ADMM). Default is 10.

scale

whether or not to scale the design matrix. Default is TRUE.

eabs

the absolute tolerance used in the ADMM algorithm. Default is 1e-3.

erel

the relative tolerance used in the ADMM algorithm. Default is 1e-3.
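For readers unfamiliar with how absolute and relative tolerances are typically combined in ADMM, the sketch below implements the widely used stopping rule for the splitting x - z = 0 with a scaled dual variable u. Whether glog uses exactly this rule is not stated here, so treat it as an assumption; the helper admm_converged and all of its argument names are hypothetical.

```r
# Hypothetical helper, not part of smog: a common ADMM stopping rule
# for the splitting x - z = 0 with scaled dual variable u.
admm_converged <- function(x, z, z_old, u, rho, eabs = 1e-3, erel = 1e-3) {
  l2 <- function(a) sqrt(sum(a^2))
  n <- length(x)
  primal_res <- l2(x - z)           # primal residual norm
  dual_res <- rho * l2(z - z_old)   # dual residual norm
  # Tolerances mix an absolute part and a part relative to the iterates
  eps_pri <- sqrt(n) * eabs + erel * max(l2(x), l2(z))
  eps_dual <- sqrt(n) * eabs + erel * rho * l2(u)
  primal_res <= eps_pri && dual_res <= eps_dual
}

admm_converged(c(1, 2), c(1, 2), c(1, 2), c(0, 0), rho = 10)
admm_converged(c(1, 2), c(0, 0), c(1, 2), c(0, 0), rho = 10)
```

Under this rule, the iteration stops only when both the primal and the dual residual fall below their respective tolerances.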

LL

initial value of the Lipschitz constant used to approximate the objective function in the Majorization-Minimization (MM) algorithm (or iterative shrinkage-thresholding algorithm, ISTA). Default is 1.

eta

gradient step size for the backtracking line search for the Lipschitz constant. Default is 1.25.
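The roles of LL and eta can be illustrated with a generic backtracking step from the ISTA literature. This is a sketch under assumptions, not the smog implementation: it uses a plain lasso objective, treats eta as a multiplicative growth factor for the Lipschitz estimate L, and the functions soft_threshold and ista_step are hypothetical names introduced here.

```r
# Hypothetical sketch, not the smog implementation: one proximal-gradient
# (ISTA) step for 0.5 * ||y - x b||^2 + lam * ||b||_1, with backtracking
# on the Lipschitz estimate L.
soft_threshold <- function(a, t) sign(a) * pmax(abs(a) - t, 0)

ista_step <- function(x, y, b, lam, L = 1, eta = 1.25) {
  f <- function(b) 0.5 * sum((y - x %*% b)^2)
  grad <- drop(-t(x) %*% (y - x %*% b))
  repeat {
    b_new <- soft_threshold(b - grad / L, lam / L)
    d <- b_new - b
    # Accept L once the quadratic upper bound majorizes f at b_new
    if (f(b_new) <= f(b) + sum(grad * d) + 0.5 * L * sum(d^2)) break
    L <- L * eta   # otherwise enlarge the Lipschitz estimate and retry
  }
  list(b = b_new, L = L)
}

set.seed(1)
x <- matrix(rnorm(40), 20, 2)
y <- drop(x %*% c(1, 0) + 0.1 * rnorm(20))
step <- ista_step(x, y, b = c(0, 0), lam = 0.1)
```

In this sketch, a larger eta reaches an acceptable L in fewer trials but may overshoot, which shortens the resulting step.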

maxitr

the maximum number of iterations for convergence in the ADMM algorithm. Default is 1000.

Value

A list of

coefficients

A data frame of the variable names and the estimated coefficients

llikelihood

The log likelihood value based on the final estimated coefficients

loglike

The sequence of log likelihood values from the start of the algorithm

PrimalError

The sequence of primal errors in the ADMM algorithm

DualError

The sequence of dual errors in the ADMM algorithm

converge

The iteration number at which convergence occurs

Author(s)

Chong Ma, chongma8903@gmail.com.

References

\insertRef{ma2019structuralsmog}{smog}

See Also

cv.smog, smog.default, smog.formula, predict.smog, plot.smog.

Examples

library(smog)
set.seed(2018)
# generate design matrix x
n=50;p=100
s=10
x=matrix(0,n,1+2*p)
x[,1]=sample(c(0,1),n,replace = TRUE)
x[,seq(2,1+2*p,2)]=matrix(rnorm(n*p),n,p)
x[,seq(3,1+2*p,2)]=x[,seq(2,1+2*p,2)]*x[,1]

g=c(p+1,rep(1:p,rep(2,p)))  # groups 
v=c(0,rep(1,2*p))           # penalization status

# generate beta
beta=c(rnorm(13,0,2),rep(0,ncol(x)-13))
beta[c(2,4,7,9)]=0

# generate y
data1=x%*%beta
noise1=rnorm(n)
snr1=as.numeric(sqrt(var(data1)/(s*var(noise1))))
y1=data1+snr1*noise1
lambda = c(8,0,8)
hierarchy = 1
gfit1 = glog(y1,x,g,v,lambda,hierarchy,family = "gaussian")

smog documentation built on Aug. 10, 2020, 5:07 p.m.
