regression_corrector: Adds a quick layer of flexible univariate regression to fit...


Description

Regresses the actual 'y' on the predicted 'y' of a regression fit using a flexible univariate regressor. This reduces bias and may also reduce other patterns in the residuals.
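The idea can be illustrated with base R alone. The following is a minimal sketch and not the package's implementation: the simulated data, the 'lm' base model and all variable names are assumptions made for illustration.

```r
# Sketch of the correction idea using base R only (illustrative, not package code).
set.seed(1)
n <- 300
x <- runif(n, -2, 2)
y <- exp(x) + rnorm(n, sd = 0.2)   # true relationship is curved
train <- data.frame(x = x, y = y)

# Base model: a straight line, which is biased for this data.
base_fit  <- stats::lm(y ~ x, data = train)
base_pred <- stats::predict(base_fit, train)

# Univariate corrector: actual 'y' as a smooth function of predicted 'y'.
corrector <- stats::smooth.spline(x = base_pred, y = train$y)

# Corrected prediction: apply the corrector to the base predictions.
corrected_pred <- stats::predict(corrector, base_pred)$y

rmse <- function(pred, actual) sqrt(mean((actual - pred)^2))
rmse_base      <- rmse(base_pred, train$y)
rmse_corrected <- rmse(corrected_pred, train$y)
```

Here the corrector is fitted and evaluated on the same data, so the improvement in 'rmse_corrected' is an in-sample figure; a held-out set is needed to judge whether the correction generalizes.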

Usage

regression_corrector(fit, data, actual, predictFun = stats::predict,
  method = "smooth.spline", ...)

Arguments

fit

A regression fit

data

Data passed to 'predictFun' to obtain predictions from 'fit'

actual

Actual (observed) values of 'y', one per row of 'data'

predictFun

A prediction function that is called with 'fit' and 'data' as its first two unnamed arguments

method

Method for the univariate fit. Three methods are implemented: smoothing spline using 'stats::smooth.spline', linear regression using 'lm', and local regression using 'loess'.

...

Additional arguments passed to the function that performs the univariate fit
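To make the three options listed under 'method' concrete, the sketch below fits each univariate regressor directly to a pair of predicted and actual vectors. It is an illustration under assumptions: the simulated 'pred' and 'actual' vectors and all object names are hypothetical, and how the package itself calls these functions is not shown in this page.

```r
# Illustrative comparison of the three univariate fits (not package code).
set.seed(42)
pred   <- sort(runif(200, 0, 10))                   # stand-in for model predictions
actual <- pred + sin(pred) + rnorm(200, sd = 0.3)   # stand-in for observed 'y'
d <- data.frame(pred = pred, actual = actual)

fit_spline <- stats::smooth.spline(d$pred, d$actual)   # smoothing spline
fit_lm     <- stats::lm(actual ~ pred, data = d)       # linear regression
fit_loess  <- stats::loess(actual ~ pred, data = d)    # local regression

# Corrected values for a few new predictions inside the training range.
new_pred <- data.frame(pred = c(2, 5, 8))
corrected <- data.frame(
  smooth.spline = stats::predict(fit_spline, new_pred$pred)$y,
  lm            = unname(stats::predict(fit_lm, new_pred)),
  loess         = unname(stats::predict(fit_loess, new_pred))
)
corrected
```

Note that 'loess' returns 'NA' for predictions outside the range seen during fitting unless it is configured to extrapolate, which matters when test-set predictions exceed the training range.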

Details

The returned object

Value

An object of class 'regressionCorrector' with these components:

Examples

dplyr::glimpse(MASS::Boston)

set.seed(2)
train_sample <- sample.int(nrow(MASS::Boston), 400)
boston_train <- MASS::Boston[train_sample, ]
boston_test  <- MASS::Boston[-train_sample, ]

# Fit a gradient boosting model and plot its training residuals against its predictions
set.seed(500)
fit_gbm  <- gbm::gbm(medv ~ ., data = boston_train, n.trees = 500)
pred_gbm <- predict(fit_gbm, boston_train, n.trees = 500)
plot(pred_gbm, boston_train[["medv"]] - pred_gbm)

rcf <- regression_corrector(
  fit_gbm
  , boston_train
  , boston_train[["medv"]]
  , function(model, data) stats::predict(model, data, n.trees = 500)
  )
rcf

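# Test RMSE before correction
MLmetrics::RMSE(stats::predict(fit_gbm, boston_test, n.trees = 500)
                , boston_test[["medv"]]
                )
# Test RMSE after correction ('n.trees' is already set inside the predict function given to 'rcf')
MLmetrics::RMSE(predict(rcf, boston_test)
                , boston_test[["medv"]]
                )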

old <- ggplot2::qplot(boston_test[["medv"]]
     , boston_test[["medv"]] - predict(fit_gbm, boston_test, n.trees = 500)
     ) +
  ggplot2::geom_hline(yintercept = 0, color = "green") +
  ggplot2::ggtitle("before correction")
new <- ggplot2::qplot(boston_test[["medv"]]
     , boston_test[["medv"]] - predict(rcf, boston_test)
     ) +
  ggplot2::geom_hline(yintercept = 0, color = "green") +
  ggplot2::ggtitle("after correction")
cowplot::plot_grid(old, new, align = "h")

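# Correctors can be stacked: a 'regressionCorrector' can itself be corrected.
# Note: rcf2 and rcf3 are fitted on the full Boston data, which includes the
# test rows, so the test RMSE computed below is optimistic.
rcf
rcf2 <- regression_corrector(rcf, MASS::Boston, MASS::Boston[["medv"]])
rcf2
rcf3 <- regression_corrector(rcf2, MASS::Boston, MASS::Boston[["medv"]])
rcf3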

MLmetrics::RMSE(predict(rcf3, boston_test)
                , boston_test[["medv"]]
                )

talegari/sidekicks documentation built on May 30, 2019, 8:40 a.m.