rewlr fits a Rare Event Weighted Logistic Regression model, which handles an imbalanced (unbalanced) response variable in binary classification.
formula - an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted.
data - a data frame or matrix (tibbles are also supported).
tol - positive convergence tolerance ε; the iterations converge when |dev - dev_old| / (|dev| + 0.1) < ε.
iter - an integer giving the maximum number of iterations for parameter estimation.
lambda - a regularization (penalty) term used to obtain better estimates. If the value is missing, lambda is calculated as 1/sd(y).
weight0 - (1 - proportion of events in the sample) divided by (1 - proportion of events in the population).
weight1 - the proportion of events in the sample divided by the proportion of events in the population.
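The stopping rule given for tol and the two weight definitions can be checked by hand. The following is a minimal sketch in base R, not the package's internal code: the helper names has_converged and rare_event_weights are hypothetical, and the proportions 0.09 and 0.02 are the ones used in the example on this page.

```r
# Hypothetical helper: the relative-deviance stopping rule described for `tol`.
has_converged <- function(dev, dev_old, tol = 1e-5) {
  abs(dev - dev_old) / (abs(dev) + 0.1) < tol
}

# Hypothetical helper: the weights as defined for `weight0` and `weight1`.
rare_event_weights <- function(p_sample, p_population) {
  c(weight0 = (1 - p_sample) / (1 - p_population),
    weight1 = p_sample / p_population)
}

# 9 percent events in the sample, 2 percent in the population.
w <- rare_event_weights(p_sample = 0.09, p_population = 0.02)
print(w)

# A tiny change in deviance satisfies the rule; a large change does not.
print(has_converged(dev = 100.0000001, dev_old = 100))
print(has_converged(dev = 90, dev_old = 100))
```

With these proportions, weight1 is 4.5 and weight0 is roughly 0.93, so events are weighted far more heavily than non-events.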
rewlr returns an object similar to the one returned by glm. Use summary() to obtain the coefficient summary and other results. The components are described in the following list:
coefficients - a named vector of coefficients.
fitted.values - the predicted probabilities for the training data.
deviance - up to a constant, minus twice the maximized log-likelihood. Where sensible, the constant is chosen so that a saturated model has deviance zero.
AIC - a version of Akaike's An Information Criterion: minus twice the maximized log-likelihood plus twice the number of parameters.
null.deviance - The deviance for the null model, comparable with deviance. The null model will include the offset, and an intercept if there is one in the model.
df.residual - the residual degrees of freedom.
df.null - the residual degrees of freedom for the null model.
auc - the area under the ROC curve.
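Two of the listed quantities can be reproduced in base R. The sketch below uses stats::glm on the built-in mtcars data as a stand-in for a fitted model (an assumption; it does not call rewlr), and auc_rank is a hypothetical helper that implements the rank-based (Mann-Whitney) formula for the area under the ROC curve. rewlr may compute its auc component differently.

```r
# Stand-in model: ordinary logistic regression on a built-in data set.
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# AIC: minus twice the maximized log-likelihood plus twice the number of parameters.
k <- length(coef(fit))
aic_manual <- -2 * as.numeric(logLik(fit)) + 2 * k

# Hypothetical helper: AUC from the Mann-Whitney rank statistic.
auc_rank <- function(y, p) {
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  r <- rank(p)  # average ranks are used for tied predictions
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

auc_value <- auc_rank(mtcars$am, fitted(fit))
print(c(AIC = aic_manual, AUC = auc_value))
```

The manual AIC should agree with AIC(fit), and the rank-based AUC equals the probability that a randomly chosen event receives a higher fitted probability than a randomly chosen non-event.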
Maalouf M, Siddiqi M. (2014) Weighted logistic regression for large-scale imbalanced and rare events data. Knowledge-Based Systems, 59, 142-148.
summary.rewlr to summarize the fitted model, and predict.rewlr to apply the model to test or new data.
library(rewlr)
data(National_exam_id)
# Suppose the current sample contains 9 percent rare events, while the population contains 2 percent.
(weight0 <- (1 - 0.09) / (1 - 0.02))
(weight1 <- 0.09 / 0.02)
# Maximum number of iterations and convergence tolerance
iter <- 1000; tol <- 0.00001
fit <- rewlr(y ~ ., data = National_exam_id, tol = tol, iter = iter, weights0 = weight0, weights1 = weight1)
summary(fit)
# Predict on the training data
p <- predict(fit, newdata = National_exam_id)