glmnet.cme: Misclassification Robust Loss Function for Binomial Glmnet...


Description

Sometimes even the best data manager or data management software goofs, and response labels get flipped. When a binomial elastic net model is used to classify outcomes, flipped labels can result in a poor choice of λ and lead to over-shrinkage, insufficient shrinkage, or selection of the wrong subset of variables. This function applies the misclassification loss function to each step in the coefficient path of a glmnet fit and calculates revised coefficients. Deviances and deviance residuals are then calculated for each set of revised coefficients.
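To make the idea of a misclassification-robust loss concrete, here is a minimal base R sketch of a mislabel-adjusted binomial deviance. It assumes the contamination model of Copas (1988), in which each observed label is flipped with probability gamma; whether glmnet.cme uses exactly this form is an assumption, and the helper name mislabel_deviance is hypothetical.

```r
# Sketch only: mislabel-adjusted binomial deviance under the assumed
# Copas (1988) model, where each observed label is flipped with
# probability gamma:
#   P(y_obs = 1 | x) = (1 - gamma) * p + gamma * (1 - p),  p = plogis(eta)
# mislabel_deviance is a hypothetical helper, not part of cvreg.
mislabel_deviance <- function(eta, y, gamma = 0.05) {
  p <- plogis(eta)                             # probability for the true label
  p_obs <- (1 - gamma) * p + gamma * (1 - p)   # probability for the observed label
  -2 * sum(y * log(p_obs) + (1 - y) * log(1 - p_obs))
}

eta <- c(-8, 8, 8)   # confident linear predictors
y   <- c(0, 1, 0)    # the third label contradicts its prediction

d_plain <- mislabel_deviance(eta, y, gamma = 0)     # ordinary binomial deviance
d_adj   <- mislabel_deviance(eta, y, gamma = 0.05)  # mislabel-adjusted deviance
c(plain = d_plain, adjusted = d_adj)
```

With gamma = 0 the function reduces to the ordinary binomial deviance. With gamma > 0 the observed-label probabilities are bounded away from 0 and 1, so a single contradictory label contributes a bounded amount to the deviance instead of dominating it.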

Usage

glmnet.cme(
  x,
  y,
  alpha = 0.5,
  lambda = NULL,
  nlambda = 15,
  gamma = 0.05,
  tol = 1e-04
)

Arguments

x

a data matrix

y

a binary outcome

alpha

the mixing parameter for combining the L1 and L2 penalties. defaults to 0.5.

lambda

an optional vector of specific penalty values. if supplied, it must contain more than one value. defaults to NULL.

nlambda

the number of lambda values glmnet should generate on its own if lambda is NULL. defaults to 15.

gamma

prior expectation of mislabel probability. defaults to 0.05.

tol

tolerance for convergence. defaults to 1e-4.

Value

a list containing coefficients, deviance residuals, deviances, weights, the vector of lambda values, and the optimal lambda (the value which minimizes the deviance).
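A possible call might look like the sketch below, which simulates a binary outcome, flips roughly 5% of the labels, and fits the model with the defaults from the Usage section. It assumes that the cvreg package is installed and exports glmnet.cme, and that y may be supplied as a numeric 0/1 vector; neither is confirmed on this page. The element names of the returned list are not documented here, so the sketch only inspects the top-level structure.

```r
# Sketch only: simulated data with a known fraction of flipped labels.
set.seed(1)
n <- 200
p <- 10
x <- matrix(rnorm(n * p), n, p)
beta <- c(2, -2, 1.5, rep(0, p - 3))          # three active predictors
y_true <- rbinom(n, 1, plogis(drop(x %*% beta)))

flip <- rbinom(n, 1, 0.05) == 1               # flip about 5% of the labels
y_obs <- ifelse(flip, 1 - y_true, y_true)

# Assumption: cvreg is installed and exports glmnet.cme, and a numeric
# 0/1 vector is an acceptable form for y.
if (requireNamespace("cvreg", quietly = TRUE)) {
  fit <- cvreg::glmnet.cme(x, y_obs, alpha = 0.5, nlambda = 15,
                           gamma = 0.05, tol = 1e-4)
  str(fit, max.level = 1)                     # list the returned components
}
```

Setting gamma near the expected mislabel rate (0.05 here) mirrors the description of gamma as a prior expectation of the mislabel probability.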

References

Copas, J. B. (1988). Binary Regression Models for Contaminated Data. Journal of the Royal Statistical Society: Series B (Methodological), 50(2), 225–253. doi:10.1111/j.2517-6161.1988.tb01723.x

Hung, H., Jou, Z.-Y., & Huang, S.-Y. (2017). Robust mislabel logistic regression without modeling mislabel probabilities. Biometrics, 74(1), 145–154. doi:10.1111/biom.12726


abnormally-distributed/cvreg documentation built on May 3, 2020, 3:45 p.m.