glm.gMIC: The gMIC Function for (Group) Variable Selection in...
In liqun730/gMIC: Group Sparsity via Approximated Information Criteria


View source: R/glm.gMIC.R

The gMIC Function for (Group) Variable Selection in Generalized Linear Model

glm.gMIC(formula, family = c("gaussian", "binomial", "poisson"), data,
  group = NULL, beta0 = NULL, criterion = "BIC", lambda0 = 0,
  a0 = NULL, scale.x = FALSE, orthogonal.x = FALSE,
  rounding.digits = 4, optim.method = "BFGS", lower = NULL,
  upper = NULL, maxit.global = 100, maxit.local = 100,
  epsilon = 1e-06, stepsize = 0.01, details = FALSE)

`formula`	An object of class `formula`, with the response on the left of a `~` operator, and the terms on the right.
`family`	A description of the error distribution and link function to be used in the model. For computational speed, this should preferably be a character string naming one of three family functions: `"gaussian"`, `"binomial"`, or `"poisson"`. Otherwise, it must be a family function, or the result of a call to a family function, that `glm.fit` accepts. See `family` for details of family functions.
`data`	A data.frame in which to interpret the variables named in the `formula` argument.
`group`	A vector indicating the group structure of the model. For example, if X has 4 columns and `group=c(1,1,2,2)`, the first 2 features form one group of variables and the last 2 features form another.
`beta0`	A vector of initial values for the model parameters. Default is `NULL`.
`criterion`	A string indicating the type of information criterion ("AIC" or "BIC") to approximate. Default is "BIC".
`lambda0`	A number, the user-specified penalty parameter for model complexity. If `criterion="AIC"` or `"BIC"`, the value of `lambda0` will be ignored.
`a0`	The approximation parameter of the gMIC method.
`scale.x`	A boolean indicating whether or not to studentize the features. Default is `FALSE`.
`orthogonal.x`	A boolean indicating whether or not to orthogonalize the features within each group. Default is `FALSE`. See `orthogonalize` for details.
`rounding.digits`	Number of digits after the decimal point used when rounding the estimates. Default value is 4.
`optim.method`	Optimization method for gMIC, one of `c("GenSA", "BFGS", "ADAM")`, indicating whether GenSA, BFGS, or ADAM is used for the gMIC optimization. Default is `"BFGS"`. If an unknown method is specified, the default will be used.
`lower`	The lower bounds for the search space in `GenSA`. The default is -10 (p by 1 vector).
`upper`	The upper bounds for the search space in `GenSA`. The default is +10 (p by 1 vector).
`maxit.global`	Maximum number of iterations allowed for the global optimization algorithm `SANN`. Default value is 100.
`maxit.local`	Maximum number of iterations allowed for the local optimization algorithm `BFGS`. Default value is 100.
`epsilon`	The convergence tolerance.
`stepsize`	The stepsize (or learning rate) for optim.method = "GD" and "ADAM".
`details`	Logical value: if `TRUE`, detailed results will be printed out when running `glm.gMIC`.
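
The `group`, `scale.x`, and `orthogonal.x` arguments can be illustrated with base R alone. The sketch below is only an assumption about what studentizing and within-group orthogonalization usually mean (column standardization with `scale()` and a QR decomposition per group). It is not taken from the package source, and the helper name `orthogonalize_within_groups` is hypothetical.

```r
# Toy design matrix with 4 columns in two groups of two, as in the `group` example.
set.seed(1)
X <- matrix(rnorm(40), nrow = 10, ncol = 4)
group <- c(1, 1, 2, 2)

# Column indices belonging to each group.
idx_by_group <- split(seq_along(group), group)

# Studentize: center each column and divide by its standard deviation.
X_std <- scale(X)

# Hypothetical helper (NOT the package's orthogonalize()): replace each
# group's columns with an orthonormal basis of the same column space via QR.
orthogonalize_within_groups <- function(X, group) {
  out <- X
  for (idx in split(seq_along(group), group)) {
    out[, idx] <- qr.Q(qr(X[, idx, drop = FALSE]))
  }
  out
}

X_orth <- orthogonalize_within_groups(X_std, group)

# Within each group the columns are now orthonormal (identity cross-product).
round(crossprod(X_orth[, idx_by_group[["1"]]]), 10)
```

Columns from different groups are left correlated with each other; only the within-group structure changes, which is why the group vector has to line up with the column order of the design matrix.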

A list with the following components:

coefficients: The estimates of gamma, the standard errors of gamma, the p-values of gamma, and the estimates of beta.
group.pvalues: The group-level p-values for each group of variables.

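library(MASS)    # mvrnorm()
library(Matrix)  # bdiag()
library(gMIC)    # glm.gMIC()

n <- 500; a <- 100
# k x k matrix with 1 on the diagonal and rho elsewhere
sig <- function(k, rho) {
  m <- matrix(rho, nrow = k, ncol = k)
  diag(m) <- 1
  return(m)
}
# True coefficients: two active groups of three, followed by 24 zeros
bt <- c(1, 1, 1, .5, .5, .5, rep(0, 24)); p <- length(bt)
group <- c(1, 1, 1, 2, 2, 2, rep(3:10, each = 3))
rho1 <- 0.1; rho2 <- 0.6
# Weak correlation across all features plus stronger correlation within each group
COV <- as.matrix(sig(p, rho1) + bdiag(rep(list(sig(3, rho2)), p / 3)))
set.seed(1234)
X <- mvrnorm(n, rep(0, p), COV)
# Binary response from a logistic model
z <- X %*% bt; pr <- 1 / (1 + exp(-z))
y <- rbinom(n, 1, pr)
Xy <- as.data.frame(cbind(X, y))
dim(Xy)
names(Xy) <- c(paste0("X", 1:p), "y")
fit <- glm.gMIC(y ~ . - 1, group = group, family = "binomial", a0 = a,
                data = Xy, orthogonal.x = TRUE)
# Components described under Value
fit$coefficients
fit$group.pvalues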
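
As a small sanity check on the simulation design above, the truly active groups can be read directly from `bt` and `group`. This sketch repeats just those two definitions so that it runs on its own; the variable name `active` is introduced here for illustration only.

```r
# True coefficients and group labels, copied from the example above.
bt <- c(1, 1, 1, .5, .5, .5, rep(0, 24))
group <- c(1, 1, 1, 2, 2, 2, rep(3:10, each = 3))

# A group is truly active if any of its coefficients is nonzero.
active <- tapply(bt != 0, group, any)

# Labels of the active groups.
names(which(active))
```

Only groups 1 and 2 carry signal in this design, and groups 3 through 10 are null, so this gives a reference point when reading the group-level p-values returned by the fit.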