plot_pred_improve: Plot the Model Performance Improvement of Each Predictor...
In AndrewKostandy/MLtoolkit: Functions to Help with Machine Learning & Feature Engineering Tasks

Description Usage Arguments Value Figures Author(s) References Examples

Plot the model performance improvement of each predictor relative to the null model. If the outcome is categorical, a logistic regression model is used and the area under the ROC curve is used to assess performance. If the outcome is numeric, an ordinary least squares model is used and the root mean squared error (RMSE) is used to assess performance.
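
The comparison against the null model can be illustrated for the numeric case with a minimal base R sketch. It is not the package's internal code: the single train/test split, the `rmse` helper, the use of `mtcars`, and the sign convention (null RMSE minus predictor RMSE, so that positive means better) are all assumptions made for illustration.

```r
# Hypothetical helper: root mean squared error.
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

# A simple deterministic split of a built-in dataset (assumption: the
# function itself uses repeated cross-validation, not a single split).
train <- mtcars[seq(1, nrow(mtcars), by = 2), ]
test  <- mtcars[seq(2, nrow(mtcars), by = 2), ]

# Null model: intercept only. Predictor model: one predictor at a time.
null_fit <- lm(mpg ~ 1, data = train)
pred_fit <- lm(mpg ~ wt, data = train)

null_rmse <- rmse(test$mpg, predict(null_fit, newdata = test))
pred_rmse <- rmse(test$mpg, predict(pred_fit, newdata = test))

# Assumed convention: lower RMSE is better, so improvement is null minus predictor.
improvement <- null_rmse - pred_rmse
improvement
```

For a categorical outcome the same idea would apply with `glm(..., family = binomial)` and the area under the ROC curve, where a higher value is better.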

The results are estimated across resamples and the p-value is determined using a one-sided paired t-test of the predictor results and the null model results in each case. The p-values are adjusted using the Benjamini-Hochberg method to control the false discovery rate.
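
The testing step described above can be sketched with base R's `t.test` and `p.adjust`. The resampled performance values below are simulated, and the predictor names (`strong`, `weak`, `noise`) are hypothetical; only the one-sided paired t-test and the Benjamini-Hochberg adjustment come from the description.

```r
set.seed(1)
n_resamples <- 30  # e.g. 10 folds x 3 repeats

# Simulated per-resample AUC values (illustrative only, not real results).
null_perf <- rnorm(n_resamples, mean = 0.50, sd = 0.02)
pred_perf <- list(
  strong = null_perf + rnorm(n_resamples, mean = 0.20, sd = 0.05),
  weak   = null_perf + rnorm(n_resamples, mean = 0.01, sd = 0.05),
  noise  = null_perf + rnorm(n_resamples, mean = 0.00, sd = 0.05)
)

# One-sided paired t-test per predictor: is the predictor's AUC greater
# than the null model's AUC on the same resamples? For RMSE, where lower
# is better, the direction of the test would be reversed.
p_raw <- sapply(pred_perf, function(perf) {
  t.test(perf, null_perf, paired = TRUE, alternative = "greater")$p.value
})

# Benjamini-Hochberg adjustment to control the false discovery rate.
p_adj <- p.adjust(p_raw, method = "BH")
p_adj
```

Pairing matters here because the predictor model and the null model are evaluated on the same resamples, so the test works on the per-resample differences.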

plot_pred_improve(data, outcome, seed, folds = 10, repeats = 3)

`data`	The dataframe containing the predictors and the outcome.
`outcome`	The outcome variable name.
`seed`	A numeric seed for reproducibility. L'Ecuyer-CMRG is used as the RNG kind.
`folds`	Defaults to 10. The number of folds to use with repeated cross-validation.
`repeats`	Defaults to 3. The number of repeats to use with repeated cross-validation.
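
As a rough illustration of how `seed`, `folds`, and `repeats` interact, the sketch below builds repeated cross-validation fold assignments in base R. The row count `n` and the `fold_ids` structure are hypothetical, and the package may construct its resamples differently.

```r
seed    <- 42
folds   <- 10
repeats <- 3
n       <- 100  # hypothetical number of rows in `data`

# Seed the generator using the RNG kind named in the argument description.
# Note that this changes the RNG kind for the rest of the R session.
set.seed(seed, kind = "L'Ecuyer-CMRG")

# One shuffled fold assignment per repeat: each row gets a fold label 1..folds.
fold_ids <- replicate(
  repeats,
  sample(rep_len(seq_len(folds), n)),
  simplify = FALSE
)

# Each predictor is then evaluated on folds * repeats held-out sets.
n_resamples <- folds * repeats
n_resamples
```

With the defaults this gives 30 paired performance estimates per predictor for the t-test.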

Returns a ggplot object with the improvement value on the x-axis and the negative log10 of the adjusted p-value on the y-axis.

A vertical dashed red line marks the 0 improvement level on the x-axis while a horizontal dashed red line marks the 0.05 p-value level after adjustment.
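
A plot with this layout could be assembled in ggplot2 roughly as follows. The `res` data frame, its column names, and its values are made up for illustration, and the point labels are an assumption; this sketch is not the function's actual plotting code.

```r
library(ggplot2)

# Hypothetical summary table: one row per predictor (values are invented).
res <- data.frame(
  predictor   = c("a", "b", "c"),
  improvement = c(0.20, 0.03, -0.01),
  p_adj       = c(0.0001, 0.04, 0.60)
)

p <- ggplot(res, aes(x = improvement, y = -log10(p_adj), label = predictor)) +
  geom_point() +
  geom_text(vjust = -0.8) +
  # Vertical reference line at zero improvement.
  geom_vline(xintercept = 0, linetype = "dashed", colour = "red") +
  # Horizontal reference line at an adjusted p-value of 0.05.
  geom_hline(yintercept = -log10(0.05), linetype = "dashed", colour = "red") +
  labs(x = "Improvement over null model", y = "-log10(adjusted p-value)")

p
```

Predictors of interest appear in the upper right: positive improvement and a point above the horizontal line.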

Andrew Kostandy (andrew.kostandy@gmail.com)

This technique was discussed in the book Feature Engineering and Selection: A Practical Approach for Predictive Models by Max Kuhn and Kjell Johnson.

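library(tidyverse)
library(mlbench)
library(MLtoolkit)

# Load the data and drop the sample identifier
data(BreastCancer)
dat <- BreastCancer %>% select(-Id)

# Convert the nine predictors from factors to numeric codes and reverse
# the level order of the outcome
dat <- dat %>% modify_at(c(1:9), as.numeric) %>% mutate(Class = fct_rev(Class))

plot_pred_improve(data = dat, outcome = Class,
                  seed = 42, folds = 10, repeats = 3)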

AndrewKostandy/MLtoolkit documentation built on May 7, 2019, 9:51 p.m.
