Description Usage Arguments Details Value References Examples

The package calculate the variable selection deviation (VSD) to measure the uncertainty of the selection in terms of inclusion of predictors in the model.

1 2 3 4 5 6 |

`x` |
Matrix of predictors. |

`y` |
Response variable. |

`n_train` |
Size of training set when the weight function is ARM or ARM with prior. The default value is |

`no_rep` |
Number of replications when the weight function is ARM and ARM with prior. The default value is |

`n_train_bound` |
When computing the weights using |

`n_bound` |
When computing the weights using |

`model_check` |
The index of the model to be assessed by calculating the VSD measures. |

`psi` |
A positive number to control the improvement of the prior weight. The default value is 1. |

`family` |
Choose the family for GLM models. So far only |

`method` |
User chooses one of the |

`candidate_models` |
Only available when |

`weight_type` |
Options for computing weights for VSD measure. User chooses one of the |

`prior` |
Whether use prior in the weight function. The default is |

`reduce_bias` |
If the binomial model is used, occasionally the algorithm might has convergence issue when the problem of so-called complete separation or quasi-complete separation happens. Users can set |

See Reference section.

A "glmvsd" object is retured. The components are:

`VSD` |
Variable selection deviation (VSD) value. |

`VSD_minus` |
The lower VSD value of |

`VSD_plus` |
The upper VSD value of |

`weight` |
The weight for each candidate model. |

`DIFF` |
Counting the variable differences between candidate models and |

`candidate_models_cleaned` |
Cleaned candidate models: the duplicated candidate models are cleaned; When computing VSD weights using AIC and BIC, the models with more than n-2 variables are removed (n is the number of observaitons); When computing VSD weights using ARM, the models with more than n_train-2 variables are removed (n_train is the number of training observations). |

Nan, Y. and Yang, Y. (2013), "Variable Selection Diagnostics Measures for High-dimensional Regression," *Journal of Computational and Graphical Statistics*, 23:3, 636-656.

http://dx.doi.org/10.1080/10618600.2013.829780

BugReport: https://github.com/emeryyi/glmvsd

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 | ```
# REGRESSION CASE
# generate simulation data
n <- 50
p <- 8
beta <- c(3,1.5,0,0,2,0,0,0)
sigma <- matrix(0,p,p)
for(i in 1:p){
for(j in 1:p) sigma[i,j] <- 0.5^abs(i-j)
}
x <- mvrnorm(n, rep(0,p), sigma)
e <- rnorm(n)
y <- x %*% beta + e
# user provide a model to be checked
model_check <- c(0,1,1,1,0,0,0,1)
# compute VSD for model_check using ARM with prior
v_ARM <- glmvsd(x, y, n_train = ceiling(n/2),
no_rep=50, model_check = model_check, psi=1,
family = "gaussian", method = "union",
weight_type = "ARM", prior = TRUE)
# compute VSD for model_check using AIC
v_AIC <- glmvsd(x, y,
model_check = model_check,
family = "gaussian", method = "union",
weight_type = "AIC", prior = TRUE)
# compute VSD for model_check using BIC
v_BIC <- glmvsd(x, y,
model_check = model_check,
family = "gaussian", method = "union",
weight_type = "BIC", prior = TRUE)
# user supplied candidate models
candidate_models = rbind(c(0,0,0,0,0,0,0,1),
c(0,1,0,0,0,0,0,1), c(0,1,1,1,0,0,0,1),
c(0,1,1,0,0,0,0,1), c(1,1,0,1,1,0,0,0),
c(1,1,0,0,1,0,0,0))
v1_BIC <- glmvsd(x, y,
model_check = model_check, psi=1,
family = "gaussian",
method = "customize",
candidate_models = candidate_models,
weight_type = "BIC", prior = TRUE)
# CLASSIFICATION CASE
# generate simulation data
n = 300
p = 8
b <- c(1,1,1,-3*sqrt(2)/2)
x=matrix(rnorm(n*p, mean=0, sd=1), n, p)
feta=x[, 1:4]%*%b
fprob=exp(feta)/(1+exp(feta))
y=rbinom(n, 1, fprob)
# user provide a model to be checked
model_check <- c(0,1,1,1,0,0,0,1)
# compute VSD for model_check using BIC with prior
b_BIC <- glmvsd(x, y, n_train = ceiling(n/2),
family = "binomial",
no_rep=50, model_check = model_check, psi=1,
method = "union", weight_type = "BIC",
prior = TRUE)
candidate_models =
rbind(c(0,0,0,0,0,0,0,1),
c(0,1,0,0,0,0,0,1),
c(1,1,1,1,0,0,0,0),
c(0,1,1,0,0,0,0,1),
c(1,1,0,1,1,0,0,0),
c(1,1,0,0,1,0,0,0),
c(0,0,0,0,0,0,0,0),
c(1,1,1,1,1,0,0,0))
# compute VSD for model_check using AIC
# user supplied candidate models
b_AIC <- glmvsd(x, y,
family = "binomial",
model_check = model_check, psi=1,
method = "customize",
candidate_models = candidate_models,
weight_type = "AIC")
``` |

glmvsd documentation built on May 29, 2017, 11:54 p.m.

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.