knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%" )
extendedFamily adds new links to R's generalized linear models. These families are drop in additions to existing families.
Links:
For the binomial family, the link is usually the logit but there are other options. The loglog model assigns a lower probability for X ranging from -5 to 2. For X over 2, the models are essentially indistinguishable. This can lead to improved performance when the response rate is much lower than 50%.
library(stats) library(dplyr) library(extendedFamily) library(ggplot2) library(yardstick) logitLink <- binomial(link = "logit") loglogLink <- binomialEF(link = "loglog") temp <- bind_rows( tibble(X = seq(-5, 5, .01)) %>% mutate( Probability = logitLink$linkinv(X), Link = "logit" ), tibble(X = seq(-5, 5, .01)) %>% mutate( Probability = loglogLink$linkinv(X), Link = "loglog" ) ) ggplot(temp, aes(x = X, y = Probability, colour = Link)) + geom_line() + scale_x_continuous(breaks = seq(-5, 5, 1)) + scale_y_continuous(breaks = seq(0, 1, .1))
The heart data contains info on 4,483 heart attack victims. The goal is to predict if a patient died in the next 48 hours following a myocardial infarction. The low death rate makes this dataset a good candidate for the loglog link.
data(heart) heart %>% summarise(deathRate = mean(death))
Only the family object needs to change to use the loglog link.
glmLogit <- glm( formula = death ~ anterior + hcabg + kk2 + kk3 + kk4 + age2 + age3 + age4, data = heart, family = binomial(link = "logit") ) glmLoglog <- glm( formula = death ~ anterior + hcabg + kk2 + kk3 + kk4 + age2 + age3 + age4, data = heart, family = binomialEF(link = "loglog") )
AUC improved by changing the link.
predictions <- heart %>% select(death) %>% mutate( death = factor(death, levels = c("0", "1")), logitProb = predict(object = glmLogit, newdata = heart, type = "response"), loglogProb = predict(object = glmLoglog, newdata = heart, type = "response") ) roc_auc(data = predictions, truth = death, event_level = "second", logitProb) roc_auc(data = predictions, truth = death, event_level = "second", loglogProb)
The family objects integrate with Tidymodels.
library(tidymodels) heart <- heart %>% mutate(death = factor(death, levels = c("0", "1"))) parsnip_fit <- logistic_reg() %>% set_engine("glm", family = binomialEF("loglog")) %>% fit(death ~ anterior + hcabg + kk2 + kk3 + kk4 + age2 + age3 + age4, data = heart) testPredictions <- parsnip_fit %>% predict(new_data = heart, type = "prob") testPredictions <- heart %>% select(death) %>% bind_cols(testPredictions) testPredictions %>% roc_auc(truth = death, event_level = "second", .pred_1)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.