To highlight some more of prettyglm
's functionality we will now build a logistic regression model with a spline.
A critical step for this package to work is to set all categorical predictors as factors. Before we fit a spline we will fit the age variable as a categorical variable to understand the trend.
library(dplyr) library(prettyglm) data('titanic') # Easy way to convert multiple columns to a factor. columns_to_factor <- c('Pclass', 'Sex', 'Cabin', 'Embarked', 'Cabintype') meanage <- base::mean(titanic$Age, na.rm=T) titanic <- titanic %>% dplyr::mutate_at(columns_to_factor, list(~factor(.))) %>% dplyr::mutate(Age = base::ifelse(is.na(Age)==T,meanage,Age)) %>% dplyr::mutate(Age_Cat = prettyglm::cut3(Age, levels.mean = TRUE, g =10)) %>% dplyr::mutate(Fare_Cat = prettyglm::cut3(Fare, levels.mean = TRUE, g =10)) # Build a basic glm survival_model <- stats::glm(Survived ~ Pclass + Sex + Sex:Fare_Cat + Age_Cat + Embarked + SibSp + Parch, data = titanic, family = binomial(link = 'logit'))
By setting plot_factor_as_numeric
equal to TRUE in pretty_relativities()
we can plot understand the bucketed categorical predictors trend. Setting plot_factor_as_numeric
equal to TRUE also works for interactions like Sex:Fare.
We will fit a knot point at 18 and 35.
pretty_relativities(feature_to_plot= 'Age_Cat', model_object = survival_model, relativity_label = 'Liklihood of Survival', plot_factor_as_numeric = T )
We will fit a knot point at 55.
pretty_relativities(feature_to_plot= 'Sex:Fare_Cat', model_object = survival_model, relativity_label = 'Liklihood of Survival', plot_factor_as_numeric = T, iteractionplottype = 'colour', facetorcolourby = 'Sex')
prettyglm
includes some useful functions to assist in building splines. splineit()
takes the minimum and maximum points of a spline to create the splined column. An example workflow using dplyr
is shown below.
titanic <- titanic %>% dplyr::mutate(Age_0_18 = prettyglm::splineit(Age,0,18), Age_18_35 = prettyglm::splineit(Age,18,35), Age_35_120 = prettyglm::splineit(Age,35,120)) %>% dplyr::mutate(Fare_0_55 = prettyglm::splineit(Fare,0,55), Fare_55_600 = prettyglm::splineit(Fare,55,600)) survival_model4 <- stats::glm(Survived ~ Pclass + Sex:Fare_0_55 + Sex:Fare_55_600 + Age_0_18 + Age_18_35 + Age_35_120 + Embarked + SibSp + Parch, data = titanic, family = binomial(link = 'logit'))
For interactions variables are grouped on the left pane under Variables
. It is important to provide the correct spline_seperator
pretty_coefficients(survival_model4, significance_level = 0.1, spline_seperator = '_')
You also need to provide a spline_seperator
input in pretty_relativities()
.
pretty_relativities(feature_to_plot= 'Age', model_object = survival_model4, relativity_label = 'Liklihood of Survival', spline_seperator = '_' )
By default pretty_relativities()
will colour by the factor variable.
pretty_relativities(feature_to_plot= 'Sex:Fare', model_object = survival_model4, relativity_label = 'Liklihood of Survival', spline_seperator = '_', upper_percentile_to_cut = 0.03 )
If you prefer to facet by the factor variable, change iteractionplottype
to "facet"
pretty_relativities(feature_to_plot= 'Sex:Fare', model_object = survival_model4, relativity_label = 'Liklihood of Survival', spline_seperator = '_', upper_percentile_to_cut = 0.03, iteractionplottype = 'facet' )
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.