cv.SplitGLM | R Documentation |
cv.SplitGLM
Performs the cross-validation (CV) procedure for split generalized linear models.
cv.SplitGLM(
  x,
  y,
  glm_type = "Linear",
  G = 10,
  include_intercept = TRUE,
  alpha_s = 3/4,
  alpha_d = 1,
  n_lambda_sparsity = 50,
  n_lambda_diversity = 50,
  tolerance = 0.001,
  max_iter = 1e+05,
  n_folds = 10,
  active_set = FALSE,
  full_diversity = FALSE,
  n_threads = 1
)
x |
Design matrix. |
y |
Response vector. |
glm_type |
Description of the error distribution and link function to be used for the model. Must be one of "Linear", "Logistic", "Gamma" or "Poisson". |
G |
Number of groups into which the variables are split. Can have more than one value. |
include_intercept |
Logical value indicating whether the model includes an intercept. Default is TRUE. |
alpha_s |
Elastic net mixing parameter. Default is 3/4. |
alpha_d |
Mixing parameter for diversity penalty. Default is 1. |
n_lambda_sparsity |
Number of candidates for the sparsity penalty parameter. Default is 50. |
n_lambda_diversity |
Number of candidates for the diversity penalty parameter. Default is 50. |
tolerance |
Convergence criterion for the coefficients. Default is 1e-3. |
max_iter |
Maximum number of iterations in the algorithm. Default is 1e5. |
n_folds |
Number of cross-validation folds. Default is 10. |
active_set |
Active set convergence for the algorithm. Default is FALSE. |
full_diversity |
Full diversity between the groups. Default is FALSE. |
n_threads |
Number of threads. Default is 1. |
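To make the role of alpha_s concrete, the sketch below evaluates an elastic net penalty for a single coefficient vector. It assumes the common glmnet-style parameterization, lambda * (alpha * L1 + (1 - alpha)/2 * squared L2), which this page does not confirm for SplitGLM. The helper elastic_net_penalty and its lambda argument are hypothetical and are not part of the package.

```r
# Hypothetical helper, NOT part of SplitGLM.
# Assumes the glmnet-style elastic net parameterization:
#   lambda * (alpha * sum(|beta|) + (1 - alpha) / 2 * sum(beta^2))
elastic_net_penalty <- function(beta, lambda, alpha = 3/4) {
  stopifnot(alpha >= 0, alpha <= 1, lambda >= 0)
  l1 <- sum(abs(beta))   # lasso part
  l2 <- sum(beta^2)      # ridge part (squared L2 norm)
  lambda * (alpha * l1 + (1 - alpha) / 2 * l2)
}

beta <- c(0.5, -0.25, 0)

pen_lasso <- elastic_net_penalty(beta, lambda = 1, alpha = 1)  # L1 term only
pen_ridge <- elastic_net_penalty(beta, lambda = 1, alpha = 0)  # L2 term only
pen_mixed <- elastic_net_penalty(beta, lambda = 1)             # alpha = 3/4

print(c(lasso = pen_lasso, ridge = pen_ridge, mixed = pen_mixed))
```

Under this assumed form, alpha = 1 gives a pure lasso penalty and alpha = 0 a pure ridge penalty; the default of 3/4 puts most of the weight on the L1 term.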
An object of class cv.SplitGLM.
Anthony-Alexander Christidis, anthony.christidis@stat.ubc.ca
coef.cv.SplitGLM, predict.cv.SplitGLM
# Data simulation
set.seed(1)
n <- 50
N <- 2000
p <- 1000
beta.active <- c(abs(runif(p, 0, 1/2))*(-1)^rbinom(p, 1, 0.3))
# Parameters
p.active <- 100
beta <- c(beta.active[1:p.active], rep(0, p-p.active))
Sigma <- matrix(0, p, p)
Sigma[1:p.active, 1:p.active] <- 0.5
diag(Sigma) <- 1

# Train data
x.train <- mvnfast::rmvn(n, mu = rep(0, p), sigma = Sigma)
prob.train <- exp(x.train %*% beta)/
              (1+exp(x.train %*% beta))
y.train <- rbinom(n, 1, prob.train)
mean(y.train)
# Test data
x.test <- mvnfast::rmvn(N, mu = rep(0, p), sigma = Sigma)
prob.test <- exp(x.test %*% beta)/
             (1+exp(x.test %*% beta))
y.test <- rbinom(N, 1, prob.test)
mean(y.test)

# SplitGLM - CV (Multiple Groups)
split.out <- cv.SplitGLM(x.train, y.train,
                         glm_type="Logistic",
                         G=10, include_intercept=TRUE,
                         alpha_s=3/4, alpha_d=1,
                         n_lambda_sparsity=50, n_lambda_diversity=50,
                         tolerance=1e-3, max_iter=1e3,
                         n_folds=5,
                         active_set=FALSE,
                         n_threads=1)
split.coef <- coef(split.out)
# Predictions
split.prob <- predict(split.out, newx=x.test, type="prob", group_index=NULL)
split.class <- predict(split.out, newx=x.test, type="class", group_index=NULL)
plot(prob.test, split.prob, pch=20)
abline(h=0.5,v=0.5)
mean((prob.test-split.prob)^2)
mean(abs(y.test-split.class))
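The last two expressions in the example are test-set summaries: the mean squared error between true and predicted probabilities, and the misclassification rate. The sketch below reproduces both on small made-up vectors so it runs without the package. All values are invented for illustration, and the 0.5 cutoff used to form class labels is an assumption, not a documented property of predict.cv.SplitGLM.

```r
# Made-up values for illustration only; not output from cv.SplitGLM.
prob_true <- c(0.9, 0.2, 0.6, 0.4)   # true success probabilities
prob_hat  <- c(0.8, 0.3, 0.4, 0.1)   # predicted probabilities
y_true    <- c(1, 0, 1, 1)           # observed binary responses

# Assumed rule: classify as 1 when the predicted probability exceeds 0.5.
class_hat <- as.integer(prob_hat > 0.5)

# Mean squared error of the predicted probabilities.
mse_prob <- mean((prob_true - prob_hat)^2)

# Misclassification rate: share of observations with the wrong label.
misclass <- mean(abs(y_true - class_hat))

print(c(mse_prob = mse_prob, misclass = misclass))
```

For 0/1 labels, mean(abs(y - class)) equals mean(y != class), so either form gives the misclassification rate.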