Provides functions to perform boosting / functional gradient descent / forward stagewise regression in the grouped-covariate setting using Generalized Estimating Equations.
Package: | sgee |
Type: | Package |
Version: | 0.6-0 |
Date: | 2018-01-08 |
License: | GPL (>= 3) |
sgee provides several stagewise regression approaches designed to address variable selection with grouped covariates in the context of Generalized Estimating Equations. Given a response and a design matrix, stagewise techniques perform a sequence of small learning steps: at each iteration, a subset of the covariates is selected as the most important and is then updated by a small amount, epsilon. Different techniques determine this optimal update in different ways to achieve different structural goals (e.g., groups of covariates are either fully included or fully excluded).
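To make the idea of a stagewise step concrete, the following is a minimal base-R sketch of a grouped forward stagewise update. It is an illustration under simplifying assumptions (Gaussian response, independence working correlation, simulated data), not the algorithm implemented by any sgee function; all variable names here are hypothetical.

```r
## Illustrative grouped forward stagewise update (NOT the sgee implementation).
## Assumption: independence working correlation and a Gaussian response, so the
## estimating function reduces to t(x) %*% (y - x %*% beta).
set.seed(1)
n <- 200
p <- 6
x <- matrix(rnorm(n * p), n, p)
trueBeta <- c(2, 2, 0, 0, 0, 0)
y <- drop(x %*% trueBeta + rnorm(n))
groupID <- rep(1:3, each = 2)   # three groups of two covariates each

epsilon <- 0.05                 # step size
maxIt <- 100                    # number of stagewise iterations
beta <- rep(0, p)
path <- matrix(0, nrow = maxIt, ncol = p)

for (it in seq_len(maxIt)) {
  ## Estimating function evaluated at the current coefficients
  ee <- drop(crossprod(x, y - x %*% beta)) / n
  ## Score each group by the Euclidean norm of its components
  groupNorms <- tapply(ee, groupID, function(u) sqrt(sum(u^2)))
  best <- as.integer(names(which.max(groupNorms)))
  idx <- which(groupID == best)
  ## Move the selected group a distance epsilon along its direction
  beta[idx] <- beta[idx] + epsilon * ee[idx] / sqrt(sum(ee[idx]^2))
  path[it, ] <- beta
}

round(beta, 2)
```

Because the whole selected group moves together, a group's coefficients enter the path jointly, which is the kind of structural goal the paragraph above describes.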
The resulting path can then be analyzed to determine an optimal model along the path of coefficient estimates. The analyzeCoefficientPath function provides such functionality based on various possible metrics, primarily focused on the Mean Squared Error. Furthermore, the plot.sgee function can be used to examine the path of coefficient estimates versus the iteration number, or some desired penalty.
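The idea of choosing a model along a coefficient path can be sketched in base R as follows. This is a hedged illustration only: it builds a simple (ungrouped) stagewise path on simulated data and picks the iteration with the smallest mean squared prediction error on held-out observations. It does not call analyzeCoefficientPath, and the train/validation split and all variable names are assumptions made for the example.

```r
## Illustrative path analysis (NOT the sgee implementation).
set.seed(2)
n <- 150
p <- 5
x <- matrix(rnorm(n * p), n, p)
y <- drop(x %*% c(1.5, -1, 0, 0, 0) + rnorm(n))
train <- 1:100      # observations used to build the path
valid <- 101:150    # held-out observations used to score the path

epsilon <- 0.05
maxIt <- 150
beta <- rep(0, p)
path <- matrix(0, nrow = maxIt, ncol = p)

for (it in seq_len(maxIt)) {
  ## Least-squares estimating function on the training data
  ee <- drop(crossprod(x[train, ], y[train] - x[train, ] %*% beta))
  ## Update the single covariate with the largest absolute component
  j <- which.max(abs(ee))
  beta[j] <- beta[j] + epsilon * sign(ee[j])
  path[it, ] <- beta
}

## Mean squared prediction error on held-out data at each iteration
mse <- apply(path, 1, function(b) mean((y[valid] - x[valid, ] %*% b)^2))
bestIt <- which.min(mse)
bestBeta <- path[bestIt, ]

bestIt
round(bestBeta, 2)
```

Each row of `path` holds the coefficient estimates after one iteration, so selecting a row amounts to selecting a model.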
Gregory Vaughan [aut, cre], Kun Chen [ctb], Jun Yan [ctb]
Maintainer: Gregory Vaughan <gregory.vaughan@uconn.edu>
Vaughan, G., Aseltine, R., Chen, K., and Yan, J. (2017). Stagewise Generalized Estimating Equations with Grouped Variables. Biometrics 73, 1332–1342. URL: http://dx.doi.org/10.1111/biom.12669, doi:10.1111/biom.12669.
Vaughan, G., Aseltine, R., Chen, K., and Yan, J. (2017). Efficient interaction selection for clustered data via stagewise generalized estimating equations. Department of Statistics, University of Connecticut. Technical Report.
Wolfson, J. (2011). EEBoost: A general method for prediction and variable selection based on estimating equations. Journal of the American Statistical Association 106, 296–305.
Tibshirani, R. J. (2015). A general framework for fast stagewise algorithms. Journal of Machine Learning Research 16, 2543–2588.
Simon, N., Friedman, J., Hastie, T., and Tibshirani, R. (2013). A sparse-group lasso. Journal of Computational and Graphical Statistics 22, 231–245.
Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, New York.
Liang, K.-Y. and Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika 73, 13–22.
#####################
## Generate test data
#####################
library(sgee)

## Initialize covariate values
p <- 50
beta <- c(rep(2.4, 5),
          c(1.2, 0, 1.6, 0, .4),
          rep(0.5, 5),
          rep(0, p - 15))
groupSize <- 5
numGroups <- length(beta) / groupSize

generatedData <- genData(numClusters = 50,
                         clusterSize = 4,
                         clusterRho = 0.6,
                         clusterCorstr = "exchangeable",
                         yVariance = 1,
                         xVariance = 1,
                         numGroups = numGroups,
                         groupSize = groupSize,
                         groupRho = 0.3,
                         beta = beta,
                         family = gaussian(),
                         intercept = 0)

coefMat1 <- hisee(y = generatedData$y, x = generatedData$x,
                  family = gaussian(),
                  clusterID = generatedData$clusterID,
                  groupID = generatedData$groupID,
                  corstr = "exchangeable",
                  control = sgee.control(maxIt = 100, epsilon = 0.2))

## interceptLimit allows for compatibility with older R versions
coefMat2 <- bisee(y = generatedData$y, x = generatedData$x,
                  family = gaussian(),
                  clusterID = generatedData$clusterID,
                  groupID = generatedData$groupID,
                  corstr = "exchangeable",
                  control = sgee.control(maxIt = 100, epsilon = 0.2,
                                         interceptLimit = 10),
                  lambda1 = .5,
                  lambda2 = .5)

## Plot the two coefficient paths one above the other
par(mfrow = c(2, 1))
plot(coefMat1)
plot(coefMat2)